[Linux-cluster] VM Resource Failover

I have been trying to simulate a Xen VM failover. I have a two-node cluster with two VMs running. If I issue “xm destroy ID”, the VM automatically restarts on the other node. But if I reboot one of the cluster nodes to simulate a machine failure, the VM never boots back up until the rebooted node comes back online. So here are my questions:


1. How do I get the cluster to start the failed VM on the surviving node when one of the cluster nodes is down?

2. When I do an “xm destroy ID”, the cluster always restarts the VM on the other cluster node. Is there any way to have it boot back on the node it is supposed to be running on, without my having to do a manual migrate? Can it auto-migrate back to its original node over time?



Here is the output of clustat during a reboot of one of the cluster nodes:


Cluster Status for Xen @ Thu Aug 14 10:11:21 2008
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 xen1.smartechcorp.net                       1 Online, Local, rgmanager
 xen2.smartechcorp.net                       2 Offline

 Service Name                   Owner (Last)                   State
 ------- ----                   ----- ------                   -----
 vm:Linux1                      xen2.smartechcorp.net          stopping
 vm:Windows1                    xen1.smartechcorp.net          started


Here is my cluster.conf:


<?xml version="1.0"?>
<cluster alias="Xen" config_version="29" name="Xen">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="-1"/>
        <clusternodes>
                <clusternode name="xen1.smartechcorp.net" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="manual" nodename="xen1.smartechcorp.net"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="xen2.smartechcorp.net" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="manual" nodename="xen2.smartechcorp.net"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_manual" name="manual"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="bias-xen1" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="xen1.smartechcorp.net" priority="1"/>
                                <failoverdomainnode name="xen2.smartechcorp.net" priority="2"/>
                        </failoverdomain>
                        <failoverdomain name="bias-xen2" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="xen1.smartechcorp.net" priority="2"/>
                                <failoverdomainnode name="xen2.smartechcorp.net" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <vm autostart="1" domain="bias-xen1" exclusive="0" migrate="live" name="Windows1" path="/var/lib/xen/images" recovery="relocate"/>
                <vm autostart="1" domain="bias-xen2" exclusive="0" migrate="live" name="Linux1" path="/var/lib/xen/images" recovery="relocate"/>
        </rm>
</cluster>

Thanks for any help, this is driving me crazy!




