Originally Posted by Rich Graves
We currently have Zimbra 5.0.4 running in multi-node fashion under Xen 3.1.3. We're using our own custom Xen RPMs (and therefore a custom Xen 188.8.131.52 kernel) on RHEL 5.1 dom0s. Each Zimbra node also runs in a RHEL 5.1 virtual machine. In total, there are 13 virtual machines running in our Xen cluster for our production mail setup:
- Two Zimbra MX Virtual Machines
- Two split-DNS Nameserver Virtual Machines (see Split DNS - Zimbra :: Wiki)
- Two Zimbra Proxy Virtual Machines
- Two Zimbra LDAP Virtual Machines (one master, one replica)
- One Zimbra Logger Virtual Machine
- Four Zimbra Mail Store Virtual Machines
We run a 64-bit Xen hypervisor with 32-bit dom0s and domUs. The hypervisor was built 64-bit so we can utilize the full 32GB of memory on each of the 14 Dell PowerEdge 1950s in the Xen cluster. However, we stuck with 32-bit dom0s and 32-bit domUs because we require the ability to live migrate virtual machines from one dom0 to another. (You can boot a 32-bit domU on a 64-bit dom0, and vice versa, but you cannot migrate the domU. This is a limitation of the xc_save and xc_restore tools; see my post on the Xen-devel mailing list for details: [Xen-devel] Migrate/Save of 32-bit domU Broken on Xen 3.1.2 64-b - Xen Source).
Back when we ran Xen 3.0.4-1 with RHEL 4.x dom0s and domUs, we did see issues with BDB and the components which use it (OpenLDAP, MySQL). Even though we followed the common practice of moving /lib/tls to /lib/tls.disabled on our domUs, we still ran into BDB issues, because those applications make heavy use of thread-local storage (TLS) through the Native POSIX Thread Library (NPTL). (Just a note: to get around BDB woes with MySQL on a RHEL 4.x domU, we would enter "skip-bdb" in /etc/my.cnf and simply not use the BDB MySQL storage engine.)
However, since we've switched to Xen 3.1.3 and RHEL 5.1 for dom0s and domUs, provided we enter that same exact "hwcap 0 nosegneg" directive within /etc/ld.so.conf (and run ldconfig for the change to take effect), we've had zero issues with BDB and friends. From what I can tell, "hwcap 0 nosegneg" just instructs the RHEL-specific (or stock, I'm not sure) dynamic linker to ignore and not execute the offending TLS code. From what I can see, it's essentially the same as compiling one's entire system with the GCC flag "-mno-tls-direct-seg-refs", which is common for us Gentoo folks who use Xen and compile our entire system from scratch. I think Red Hat preferred a more "toggleable" or dynamic way of allowing or disallowing the NPTL/TLS feature without turning it fully off or fully on, so that their users can still leverage it if they aren't using Xen.
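For anyone who wants to script that change, here is a minimal Python sketch, assuming the directive simply needs to be present as its own line in the linker configuration. The function names are made up for illustration, and the code only works on the file's text, so it is safe to try on a copy. You would still need to write the result back to /etc/ld.so.conf and run ldconfig yourself.

```python
HWCAP_DIRECTIVE = "hwcap 0 nosegneg"


def has_directive(conf_text: str, directive: str = HWCAP_DIRECTIVE) -> bool:
    """Return True if the directive appears as a non-comment line.

    Whitespace differences (tabs, repeated spaces) are ignored, and
    anything after a '#' on a line is treated as a comment.
    """
    wanted = directive.split()
    for line in conf_text.splitlines():
        tokens = line.split("#", 1)[0].split()
        if tokens == wanted:
            return True
    return False


def ensure_directive(conf_text: str, directive: str = HWCAP_DIRECTIVE) -> str:
    """Return conf_text with the directive appended if it was missing.

    Appending at the end of the file is an assumption of this sketch;
    the text is returned unchanged when the directive is already there.
    """
    if has_directive(conf_text, directive):
        return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + directive + "\n"


# Example: a typical one-line ld.so.conf gains the directive once,
# and a second call leaves the text untouched.
sample = "include ld.so.conf.d/*.conf\n"
updated = ensure_directive(sample)
print(updated, end="")
```

Calling ensure_directive a second time on its own output changes nothing, so the helper can be run repeatedly from a provisioning script without duplicating the line.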
For reference, from the GCC man page:
Controls whether TLS variables may be accessed with offsets from the TLS segment register (%gs for 32-bit, %fs for 64-bit), or whether the thread base pointer must be added. Whether or not this is legal depends on the operating system, and whether it maps the segment to cover the entire TLS area.
For systems that use GNU libc, the default is on.
Regarding 'quanah's post on executing:
Originally Posted by quanah
to test for BDB woes, we have executed such without any errors or issues. I didn't expect to see problems either, as OpenLDAP, MySQL, or Zimbra would probably not even start if TLS and Xen were conflicting (as is the case with RHEL 4.x and Xen 3.0.4-1). Additionally, Red Hat worked pretty hard to make RHEL 5.x support Xen and be Xen compatible, so I'd expect it to work (as it does).
For those folks interested in the performance of our system, it is quite outstanding. Because of the paravirtualized nature of Xen, we don't run into speed issues with the virtual processors or memory. We run many other applications within our Xen cluster that are quite busy, such as Oracle 10g, MySQL 4.x and 5.x databases, our entire Moodle infrastructure (see Welcome to LATTE - Brandeis University), PeopleSoft Financials 9.0 and Campus Solutions 9.0, RT, and so forth. I/O (network and block) performance is also quite good, as we've taken a unique approach with both:
First, the actual servers (dom0s) have both 1 Gb/s NICs (eth0 and eth1) bonded (EtherChannel, for those Cisco folks out there) to become bond0, in order to provide high availability and up to 2 Gb/s of aggregate throughput (we use LACP for bonding). A VLAN trunk is passed down the bond (bond0) and split out into a separate bridge per VLAN. Virtual machines (domUs) are then connected to whichever bridges they require access to; VMs generally need only one or two network connections out of the 20 VLANs sent to each dom0.
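To make the bond, VLAN, and bridge layering a little more concrete, here is a rough Python sketch of how the names could be planned out. It is only an illustration: the VLAN IDs (101, 102, 103) and the "br<vlan>" bridge naming are hypothetical, and the "bond0.<vlan>" sub-interface naming is just the common Linux convention, not necessarily what we use.

```python
from typing import Dict, List, NamedTuple


class BridgePlan(NamedTuple):
    vlan_id: int
    vlan_iface: str  # tagged sub-interface on the bond, e.g. "bond0.101"
    bridge: str      # bridge that domU virtual interfaces attach to


def plan_bridges(bond: str, vlan_ids: List[int]) -> Dict[int, BridgePlan]:
    """Map each VLAN on the trunk to one sub-interface and one bridge."""
    plans: Dict[int, BridgePlan] = {}
    for vid in vlan_ids:
        if not 1 <= vid <= 4094:
            raise ValueError(f"VLAN ID out of range: {vid}")
        plans[vid] = BridgePlan(vid, f"{bond}.{vid}", f"br{vid}")
    return plans


def bridges_for_domu(plans: Dict[int, BridgePlan], wanted: List[int]) -> List[str]:
    """Return the bridges a domU should attach to for its VLANs."""
    missing = [vid for vid in wanted if vid not in plans]
    if missing:
        raise KeyError(f"VLANs not trunked to this dom0: {missing}")
    return [plans[vid].bridge for vid in wanted]


# Hypothetical trunk carrying three VLANs; a domU that needs two of them.
plans = plan_bridges("bond0", [101, 102, 103])
print(bridges_for_domu(plans, [101, 103]))
```

The point of the one-bridge-per-VLAN layout is that a domU never sees tagged traffic: it just gets one virtual interface per bridge it is attached to.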
Second, for block devices (storage), each virtual machine (domU) has its own set of volumes on our SAN. Xen servers (dom0s) are connected to the SAN via 4 Gb/s Fibre Channel links. We then pass those block devices directly into the virtual machines, so there is no intermediate layer such as a clustered file system (OCFS2, GFS, or VMFS). We do use multipath in order to give the volumes stable names like /dev/mapper/vm_zimbra_store_1 instead of the real block device names like /dev/sdz (because while a volume will appear as sdz on one box, it could be sdm on another).
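As a sketch of the aliasing idea, the following Python snippet renders a "multipaths" section for multipath.conf from a WWID-to-alias map, so the same alias can be pushed to every dom0. The WWIDs shown are made-up placeholders and the helper names are hypothetical; only the alias vm_zimbra_store_1 comes from our actual setup.

```python
from typing import Dict


def render_multipaths(aliases: Dict[str, str]) -> str:
    """Render a multipath.conf 'multipaths' section from {wwid: alias}."""
    lines = ["multipaths {"]
    # Sort by alias so the generated file is identical on every dom0.
    for wwid, alias in sorted(aliases.items(), key=lambda item: item[1]):
        lines.append("    multipath {")
        lines.append(f"        wwid  {wwid}")
        lines.append(f"        alias {alias}")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines) + "\n"


def stable_path(alias: str) -> str:
    """Device path that stays the same regardless of sdX ordering."""
    return f"/dev/mapper/{alias}"


# Placeholder WWIDs, for illustration only.
aliases = {
    "WWID_PLACEHOLDER_A": "vm_zimbra_store_1",
    "WWID_PLACEHOLDER_B": "vm_zimbra_store_2",
}
print(render_multipaths(aliases), end="")
print(stable_path("vm_zimbra_store_1"))
```

Because the alias is tied to the volume's WWID rather than to discovery order, a domU configuration that references the /dev/mapper path stays valid after a live migration to another dom0.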
Rich (or anybody else), if you'd like to take a gander at our custom Xen RPMs, just let me know and I'll send them your way. We built our own to enable quite a few additional features, such as XENFB support and a 64-bit hypervisor, as well as to pick up bug fixes (the Xen API in 3.1.3 is much improved over that in 3.1.0).
I'll also keep this forum updated as to our success with future versions of Xen and Zimbra.
Hope this helps.