[Linux-cluster] Guest is not relocating under cluster
Srija
swap_project at yahoo.com
Sat Aug 20 23:35:58 UTC 2011
Hi,
I have a six-node test cluster running RHEL 5.7 (x86_64).
The nodes run under Xen. I am trying to get a guest relocated when the node
it is running on fails, but instead of relocating, the guest just gets stopped.
The versions of cman and rgmanager are:
cman-2.0.115-85.el5
rgmanager-2.0.52-21.el5
Here is the cluster.conf
--------------------------------------
<?xml version="1.0"?>
<cluster alias="newtest" config_version="26" name="newtest">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="node1" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device action="reboot" name="ilo-node1"/>
                </method>
            </fence>
        </clusternode>
        ........
        <snip>
    </clusternodes>
    <cman>
        <multicast addr="xxx.1.5.1"/>
    </cman>
    <totem token="20000"/>
    <fencedevices>
        <fencedevice agent="fence_ilo" hostname="node1r" login="Admin" name="ilo-node1" passwd="xxxxx"/>
        ........
        <snip>
    </fencedevices>
    <rm log_level="7" log_facility="local4">
        <failoverdomains>
            <failoverdomain name="nd1-nd2-nd3-nd4-nd5-nd6" nofailback="1" ordered="1" restricted="1">
                <failoverdomainnode name="node1" priority="1"/>
                <failoverdomainnode name="node2" priority="2"/>
                <failoverdomainnode name="node3" priority="3"/>
                <failoverdomainnode name="node4" priority="4"/>
                <failoverdomainnode name="node5" priority="5"/>
                <failoverdomainnode name="node6" priority="6"/>
            </failoverdomain>
        </failoverdomains>
        <resources/>
        <vm autostart="1" name="guest1" migrate="live" recovery="relocate"/>
    </rm>
    <cman/>
</cluster>
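To make the resource manager part easier to read at a glance, here is a minimal Python sketch that lists the failover domain members and the attributes of each vm entry. It is not part of cman or rgmanager; the function name summarize_rm and the RM_XML string are made up for illustration, and RM_XML is a trimmed, well-formed excerpt of the <rm> section above (only two of the six nodes are kept), since the snipped config itself will not parse as XML.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed excerpt of the <rm> section from the config above.
# Assumption: only two of the six failover domain nodes are kept for brevity.
RM_XML = """
<rm log_level="7" log_facility="local4">
  <failoverdomains>
    <failoverdomain name="nd1-nd2-nd3-nd4-nd5-nd6" nofailback="1" ordered="1" restricted="1">
      <failoverdomainnode name="node1" priority="1"/>
      <failoverdomainnode name="node2" priority="2"/>
    </failoverdomain>
  </failoverdomains>
  <resources/>
  <vm autostart="1" name="guest1" migrate="live" recovery="relocate"/>
</rm>
"""


def summarize_rm(rm_xml):
    """Return (domains, vms) parsed from an <rm> element given as a string.

    domains maps each failover domain name to a list of (node, priority).
    vms is a list of attribute dictionaries, one per <vm> element.
    """
    root = ET.fromstring(rm_xml)
    domains = {
        fd.get("name"): [
            (node.get("name"), int(node.get("priority")))
            for node in fd.findall("failoverdomainnode")
        ]
        for fd in root.iter("failoverdomain")
    }
    vms = [dict(vm.attrib) for vm in root.iter("vm")]
    return domains, vms


if __name__ == "__main__":
    domains, vms = summarize_rm(RM_XML)
    for name, members in domains.items():
        print(f"failover domain {name}: {members}")
    for vm in vms:
        # The vm entry in the posted config carries no "domain" attribute.
        print(f"vm {vm['name']}: recovery={vm.get('recovery')}, "
              f"domain={vm.get('domain', '(not set)')}")
```

This only summarizes what is already in the posted configuration; it makes no judgment about which setting, if any, is responsible for the failed start.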
Here are a few lines from the log file:
--------------------------------------------------------------
Aug 20 18:51:09 node clurgmgrd[7431]: <debug> Event: Port Opened
Aug 20 18:51:09 node clurgmgrd[7431]: <info> State change: node3 UP
Aug 20 18:51:14 node clurgmgrd[7431]: <debug> Evaluating RG vm:guest1, state stopped, owner none
Aug 20 18:51:14 node clurgmgrd[7431]: <debug> Event (0:3:1) Processed
Aug 20 18:51:19 node clurgmgrd[7431]: <debug> 1 events processed
Aug 20 18:51:35 node clurgmgrd[7431]: <debug> No other nodes have seen vm:guest1
Aug 20 18:51:35 node clurgmgrd[7431]: <notice> Starting stopped service vm:guest1
Aug 20 18:51:36 node clurgmgrd: [7431]: <debug> virsh -c xen:/// start guest1
Aug 20 18:51:37 node clurgmgrd[7431]: <notice> start on vm "guest1" returned 1 (generic error)
Aug 20 18:51:37 node clurgmgrd[7431]: <warning> #68: Failed to start vm:guest1; return value: 1
Aug 20 18:51:37 node clurgmgrd[7431]: <debug> Stopping failed service vm:guest1
Aug 20 18:51:37 node clurgmgrd[7431]: <notice> Stopping service vm:guest1
Aug 20 18:51:37 node clurgmgrd: [7431]: <debug> Virtual machine guest1 is
Aug 20 18:51:38 node clurgmgrd[7431]: <notice> Service vm:guest1 is recovering
Aug 20 18:51:38 node clurgmgrd[7431]: <warning> #71: Relocating failed service vm:guest1
Aug 20 18:51:38 node clurgmgrd[7431]: <debug> Sent remote-start request to 6
Aug 20 18:51:49 node clurgmgrd[7431]: <debug> 4 events processed
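To make the sequence above easier to follow, here is a small Python sketch that filters a clurgmgrd log down to the notice and warning lines. It is only an illustration, not an rgmanager tool: the names LINE_RE, filter_log and SAMPLE are made up, and the regular expression assumes the syslog layout seen in the excerpt above (including the variant with a colon before the PID).

```python
import re

# Matches syslog lines written by clurgmgrd in the layout shown above, e.g.
# "Aug 20 18:51:37 node clurgmgrd[7431]: <warning> #68: Failed to start ..."
# and the variant "clurgmgrd: [7431]: <debug> ...".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+clurgmgrd:?\s*\[\d+\]:\s+"
    r"<(?P<level>\w+)>\s+(?P<msg>.*)$"
)


def filter_log(lines, levels=("notice", "warning", "err")):
    """Return (timestamp, level, message) tuples for the selected levels."""
    selected = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in levels:
            selected.append(
                (match.group("ts"), match.group("level"), match.group("msg"))
            )
    return selected


# A few lines copied from the log excerpt above.
SAMPLE = """\
Aug 20 18:51:35 node clurgmgrd[7431]: <notice> Starting stopped service vm:guest1
Aug 20 18:51:36 node clurgmgrd: [7431]: <debug> virsh -c xen:/// start guest1
Aug 20 18:51:37 node clurgmgrd[7431]: <notice> start on vm "guest1" returned 1 (generic error)
Aug 20 18:51:37 node clurgmgrd[7431]: <warning> #68: Failed to start vm:guest1; return value: 1
Aug 20 18:51:38 node clurgmgrd[7431]: <warning> #71: Relocating failed service vm:guest1
""".splitlines()

if __name__ == "__main__":
    for ts, level, msg in filter_log(SAMPLE):
        print(f"{ts} [{level}] {msg}")
```

With the debug lines dropped, what remains is the start attempt, the failed start (return value 1), and the relocation attempt that follows it.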
Any advice is really appreciated.
Thanks in advance.