
[Linux-cluster] vm.sh: vm services depend on xend


The service script 'vm.sh' determines the status of a VM service with the 'xm' command, but 'xm' relies on xend to work at all. If xend is down, the status check fails even though the VM is running, and the resulting recovery action can go as far as destroying the VM.

I would have filed this issue with RH support, but I feel the solution to this problem requires some qualified thinking in the first place.

What happened:

(Environment: production 4 node Xen / RHEL 5.2 cluster running 30+ pv guests, Nagios monitoring, VM services configured to "Restart" failover)

a) xenconsoled died (this happens from time to time, monitored by Nagios).

b) Operations guy ran "service xend restart" to bring xenconsoled back up. The restart operation implies that xend is down for a short period of time.

c) rgmanager checked 3 VMs within the time frame xend was down. In vm.sh

xm list $OCF_RESKEY_name &> /dev/null

failed because xm could not communicate with xend. As a result rgmanager tried to stop and restart these 3 VMs. Because xend was down only briefly, it was up again by the time rgmanager ran "vm.sh stop" on the 3 VMs, so they were shut down properly and came back up afterwards.

This was bad enough, but in fact we were lucky, as I learned when reproducing the issue in our test environment. A notable difference is that the test cluster is currently set to "Relocate" service recovery. I also had to shut down xend by hand for the test, so it was down significantly longer than on the production cluster.

Background information on xend: xend is not required for Xen VMs to run; it is only required to control them. Restarting xend while VMs are running is a safe operation.

As a result of the longer xend downtime, "vm.sh stop" could not shut down the VM, as the stop operation again uses 'xm' to communicate with xend.

Afterwards rgmanager started the VM on another cluster node, where it came up perfectly well.

But the VM was never shut down on the cluster node where xend was not running. As a result the VM (which is installed on shared storage) was running on two different nodes at once, and its ext3 filesystems were mounted read-write by both VM instances.

Any production server's filesystems would not have survived this for more than a couple of seconds. So there is the risk of severe damage here, especially as "relocate" is the default failover configuration.
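
The double start follows from a stop that could not confirm the VM was down. A minimal sketch of a stop that reports failure unless xend confirms the domain is gone is shown here. vm_stop_verified, XM_CMD and STOP_POLL_SECONDS are hypothetical names and not part of vm.sh, and the sketch assumes that a non-zero exit from "stop" keeps rgmanager from starting the service on another node, which should be verified against the rgmanager version in use.

```shell
# Sketch only: vm_stop_verified, XM_CMD and STOP_POLL_SECONDS are
# hypothetical names, not taken from vm.sh.
XM_CMD=${XM_CMD:-xm}

# Returns 0 only if xend answers and no longer lists the domain.
# Returns 1 if xend is unreachable or the domain is still listed
# after the given number of polls.
vm_stop_verified() {
    local name="$1" tries="${2:-30}"

    # Without xend the VM cannot be controlled, so the stop cannot succeed.
    "$XM_CMD" info > /dev/null 2>&1 || return 1

    "$XM_CMD" shutdown "$name" > /dev/null 2>&1

    while [ "$tries" -gt 0 ]; do
        if ! "$XM_CMD" list "$name" > /dev/null 2>&1; then
            # A failed listing counts as "down" only if xend still answers.
            "$XM_CMD" info > /dev/null 2>&1 && return 0
            return 1
        fi
        sleep "${STOP_POLL_SECONDS:-1}"
        tries=$((tries - 1))
    done
    return 1
}
```

Called as "vm_stop_verified $OCF_RESKEY_name", the function never reports a VM as stopped while xend is unreachable, so the state "unknown" is not mistaken for "down".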

As a workaround I propose to change vm.sh:

+       xm info &> /dev/null || return 0
        xm list $OCF_RESKEY_name &> /dev/null
        if [ $? -eq 0 ]; then
                return 0
        fi
        xm list migrating-$OCF_RESKEY_name &> /dev/null
        return $?

However, this is not good enough: xend may vanish between 'xm info' and 'xm list', which leads to the scenario described above.
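
One way to narrow that window, though not to close it, is to reverse the order: list the VM first, ask xend only when the listing fails, and retest once if xend answers. The following is a sketch of that variation, not the workaround above. vm_status_guarded and XM_CMD are hypothetical names. Note that it reports "running" whenever xend is unreachable, so a VM that really died while xend was down goes unnoticed until xend is back.

```shell
# Sketch only: vm_status_guarded and XM_CMD are hypothetical names,
# not taken from vm.sh.
XM_CMD=${XM_CMD:-xm}

# Returns 0 if the VM (or its migrating- alias) is listed, or if xend
# cannot be reached (state unknown, so no recovery is triggered).
# Returns 1 only if xend answered and two listings in a row failed.
vm_status_guarded() {
    local name="$1" attempt
    for attempt in 1 2; do
        "$XM_CMD" list "$name" > /dev/null 2>&1 && return 0
        "$XM_CMD" list "migrating-$name" > /dev/null 2>&1 && return 0
        # The listing failed. If xend is unreachable, that says nothing
        # about the VM, so report it as running.
        "$XM_CMD" info > /dev/null 2>&1 || return 0
        # xend answers now; it may have just come back, so retest once.
    done
    return 1
}
```

A failure is only reported when xend was reachable after both failed listings. If xend flaps in exactly the wrong rhythm the check can still be fooled, which is why a dependency handled by rgmanager itself remains the cleaner solution.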

Therefore xend should be a cluster service, and the VM services would have to depend on the xend service. If a VM fails, rgmanager would have to additionally check xend, and only act on the VM if xend has not failed and the VM fails a second test (xend may have just come up again, so we need to retest the VM).

If a VM has failed and it turns out that xend has failed as well, rgmanager should try to reactivate xend.

If xend cannot be started, the cluster node has to be fenced. As xend is not required for VMs to run, the VMs may be perfectly fine and must not be restarted on another node unless they are guaranteed to be down.
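
The decision order proposed above can be sketched as a small shell function. This is only an illustration of the logic; the real change would have to be made in rgmanager. decide_recovery and the three hooks vm_ok, xend_ok and xend_restart are hypothetical names, and the action names printed are placeholders.

```shell
# Sketch only: all function names here are hypothetical. The hooks use
# the commands mentioned in this mail and can be overridden.
vm_ok()        { xm list "$1" > /dev/null 2>&1; }
xend_ok()      { xm info > /dev/null 2>&1; }
xend_restart() { service xend restart > /dev/null 2>&1; }

# Prints the action to take for a VM whose status check was requested:
#   none        VM is fine, nothing to do
#   recover-vm  xend answers and the VM failed a second test
#   fence-node  xend is down and could not be brought back
decide_recovery() {
    local name="$1"

    if vm_ok "$name"; then
        echo none
        return 0
    fi

    if ! xend_ok; then
        # xend is down, so the VM state is unknown. Try to bring xend
        # back. A real implementation would wait for xend to settle.
        xend_restart
        if ! xend_ok; then
            echo fence-node
            return 0
        fi
    fi

    # xend answers (possibly only just again): retest before acting.
    if vm_ok "$name"; then
        echo none
    else
        echo recover-vm
    fi
}
```

The VM is only acted on after a failed test made while xend was reachable, and the node is fenced before any VM is started elsewhere, so a VM that is still running without xend is never started a second time.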

Any comment is welcome.

best regards, Gunther

Gunther Schlegel
Manager IT Infrastructure

Riege Software International GmbH  Fon: +49 (2159) 9148 0
Mollsfeld 10                       Fax: +49 (2159) 9148 11
40670 Meerbusch                    Web: www.riege.com
Germany                            E-Mail: schlegel riege com
---                                ---
Handelsregister:                   Managing Directors:
Amtsgericht Neuss HRB-NR 4207      Christian Riege
USt-ID-Nr.: DE120585842            Gabriele  Riege
                                  Johannes  Riege
