Re: [Linux-cluster] Cluster of XEN guests unstable when rebooting a node under CS5.1

On Wed, 2007-12-12 at 19:23 +0100, Paolo Marini wrote:
> I am repeating my request for help, hoping someone has run into (and 
> hopefully solved) the same issues.
> I am building a cluster of Xen guests, with the root file system residing 
> in a file on a GFS file system (on iSCSI, actually).
> Each cluster node mounts a GFS file system residing on an iSCSI device.
> For performance reasons, both the iSCSI device and the physical nodes 
> (which also form a cluster) use two gigabit Ethernet links with bonding 
> and LACP. 
> For the physical machines, I had to insert a "sleep 30" in the 
> /etc/init.d/iscsi script before the iSCSI login, in order to wait for 
> the bond interface to come up; otherwise the iSCSI devices are not seen 
> and no GFS mount is possible.
> As for the cluster of Xen guests, they work fine, and I am able to 
> migrate each one to a different physical node without problems on the guest.
> When I reboot or fence one of the guests, the guest cluster breaks, i.e. 
> quorum is lost and I have to fence ALL the nodes and reboot 
> them in order for the cluster to restart.

How many guests are there, and what are you using for fencing?

> Does it have to do with the Xen bridge going up and down for a time 
> longer than the heartbeat timeout?

Not sure - it shouldn't be that big of a deal.  If you think that's the
problem, try adding:

   <totem token="30000"/>

to the VM cluster's cluster.conf.
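
For placement, here is a rough sketch of how that might look.  The totem
element is a direct child of <cluster>, and the token value is in
milliseconds, so 30000 means 30 seconds.  The cluster name and
config_version below are made up for illustration; keep your own name
and remember to increment config_version whenever you edit the file.

   <?xml version="1.0"?>
   <!-- name and config_version are placeholders, not real values -->
   <cluster name="vmcluster" config_version="2">
     <!-- token timeout in milliseconds: 30000 ms = 30 s -->
     <totem token="30000"/>
     <!-- existing clusternodes, fencedevices and rm sections stay as they are -->
   </cluster>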

-- Lon

