[Linux-cluster] virtualization on top of Red Hat cluster
Geoffrey
lists at serioustechnology.com
Fri Jan 23 20:43:02 UTC 2009
Gunther Schlegel wrote:
>
>
> Geoffrey wrote:
>
>> Before going into the actual issues we are running into, I thought we
>> should first find out if there are others who are doing this, or
>> attempting to do this.
>>
>> Our setup includes a separate virtual server for each of the following
>> services: squid, ldap, mail, dns, samba, and a handful of
>> X servers/application servers.
>>
>> The specific hardware includes:
>>
>> 8 Dell 1950 nodes w/ rh5.2 xen
>> Each node has 32GB of RAM and 8 cores (Xeon 2.66)
>
> you need a 64-bit distro for xen to address more than 16GB of RAM.
We do.
>
>> EMC SAN CX-310
>> 2 Brocade 5000 fibre switches
>>
>> Any feedback from anyone attempting or currently running a similar
>> solution (virtualization on top of cluster) would be greatly appreciated.
>
> We have been running something like that for more than a year now
> (tests on rhel5.0, live on rhel5.1, now running rhel5.2).
>
> 45+ paravirtualized VMs on 2 clusters.
> VMs are located on clustered logical volumes, not gfs.
> gfs is used for /etc/xen, though.
Yes, we are using lvm as well.
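For anyone finding this in the archives, the general recipe is along
these lines (the VG/LV names are just examples, and clvmd must already
be running on all nodes):

   # clustered VG on the shared SAN LUN, one LV per VM disk
   vgcreate -c y vg_vms /dev/mapper/mpath0
   lvcreate -L 20G -n vm_squid vg_vms

   # small shared GFS filesystem to hold the xen config files;
   # "mycluster" must match the cluster name in cluster.conf
   lvcreate -L 1G -n xenconf vg_vms
   gfs_mkfs -p lock_dlm -t mycluster:xenconf -j 8 /dev/vg_vms/xenconf
   mount -t gfs /dev/vg_vms/xenconf /etc/xen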
> performance is good, even with io-intensive apps inside the VMs.
> stability is fair (but improving over time).
This is good to hear.
> The main issue has always been the cluster losing quorum for no
> apparent reason. Improved after we added a Quorum Disk.
We are using Quorum disk as well.
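For the record, a minimal qdisk setup looks something like this (the
device, label and heuristic below are examples, not our exact values):

   # label the shared quorum partition once, from one node
   mkqdisk -c /dev/mapper/mpath1 -l xen_qdisk

   # matching cluster.conf fragment; the heuristic pings the gateway
   <quorumd interval="1" tko="10" votes="1" label="xen_qdisk">
       <heuristic program="ping -c1 -w1 192.168.1.254" score="1" interval="2"/>
   </quorumd>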
> Improved further
> after RH Support recommended a couple of not-so-well-documented ;)
> configuration parameters, which also cannot be maintained with
> conga/luci/ricci or system-config-cluster. Improved even further after
> we complained again and RH came up with even more undocumented settings
> to solve the race conditions we experienced when a node left the
> cluster (even intentionally).
Any possible way you can share these undocumented settings with us?
> Another big issue was live migration; it turned out that the bridge in
> Dom0 has a default forward delay of 15 seconds. We may have run into
> that because we use a different xen network-script than the one
> delivered by RH. The original one could not deal with either bonding
> or Vlans, so we had to replace it. This has changed according to the
> rhel5.3 changelog, but we have not tested it yet.
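Good to know. For reference, the delay can be checked and zeroed with
brctl (xenbr0 is just an example; use whatever bridge your
network-script creates):

   # show the current STP forward delay on the bridge
   brctl showstp xenbr0 | grep 'forward delay'

   # drop it to 0 so migrated guests aren't unreachable for 15s
   brctl setfd xenbr0 0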
>
> We also had issues with the DomU time jumping around after live
> migration, but RH fixed that recently; as a workaround one can run
> ntpd inside the DomU.
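Thanks for the heads-up. Besides ntpd, the knob usually mentioned for
rhel5 paravirt guests is the independent wallclock sysctl, run inside
the DomU (worth double-checking on your kernel before relying on it):

   # let the DomU keep its own clock instead of tracking Dom0
   echo 1 > /proc/sys/xen/independent_wallclock

   # or persistently, in /etc/sysctl.conf:
   xen.independent_wallclock = 1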
>
>
> best regards, Gunther
--
Until later, Geoffrey
Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
- Benjamin Franklin