[Linux-cluster] migrating to better node
Jakub Suchy
jakub.suchy at enlogit.cz
Tue Feb 26 12:48:34 UTC 2008
Hi,
I am currently testing the following cluster of VM machines:
two nodes, shared storage, and an APC fencing device.
cluster.conf follows.
I am encountering this situation:
1) the VM is running on clu2
2) clu2 gets fenced and the VM fails over to clu1
3) after clu2 is started again, the VM is automatically migrated back to clu2
according to the logs, and this migration fails.
Does anybody know why this fails?
Feb 26 13:25:27 clu1 fenced[2973]: fence "clu2.test-cluster.cz" success
Feb 26 13:25:32 clu1 kernel: GFS: fsid=adler:virtdata.1: jid=0: Trying
to acquire journal lock...
...
Feb 26 13:25:32 clu1 kernel: GFS: fsid=adler:virtdata.1: jid=0: Done
...
Feb 26 13:25:32 clu1 clurgmgrd[3885]: <notice> Taking over service
vm:win2003 from down member clu2.test-cluster.cz
...clu2 rejoins...
Feb 26 13:27:32 clu1 clurgmgrd[3885]: <notice> Migrating vm:win2003 to
better node clu2.test-cluster.cz
Feb 26 13:27:35 clu1 kernel: peth0: received packet with own address as
source address
Feb 26 13:27:37 clu1 kernel: dlm: connecting to 1
-----> Feb 26 13:27:47 clu1 clurgmgrd[3885]: <err> #75: Failed changing service
status
cluster.conf:
<?xml version="1.0"?>
<cluster alias="adler" config_version="13" name="adler">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="clu2.test-cluster.cz" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="apc" port="3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="clu1.test-cluster.cz" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="apc" port="1"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="192.168.0.54" login="apc" name="apc" passwd="apc"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="clu1" ordered="0" restricted="0">
        <failoverdomainnode name="clu1.test-cluster.cz" priority="1"/>
      </failoverdomain>
      <failoverdomain name="clu2" ordered="0" restricted="0">
        <failoverdomainnode name="clu2.test-cluster.cz" priority="1"/>
      </failoverdomain>
      <failoverdomain name="clu" ordered="0" restricted="1">
        <failoverdomainnode name="clu2.test-cluster.cz" priority="1"/>
        <failoverdomainnode name="clu1.test-cluster.cz" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <fs device="/dev/sdb8" force_fsck="0" force_unmount="1" fsid="62307" fstype="ext3" mountpoint="/mnt/data" name="data" self_fence="0"/>
      <clusterfs device="/dev/mapper/gfs1-gfsdata" force_unmount="0" fsid="59408" fstype="gfs" mountpoint="/mnt/gfs" name="gfs"/>
    </resources>
    <vm autostart="1" domain="clu" exclusive="0" name="sybase" path="/mnt/gfs/" recovery="restart"/>
    <vm autostart="1" domain="clu2" exclusive="0" name="win2003" path="/mnt/gfs" recovery="relocate"/>
  </rm>
</cluster>
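Since vm:win2003 lives in the single-member failover domain "clu2", rgmanager treats clu2 as the preferred node and pulls the VM back as soon as clu2 rejoins ("Migrating ... to better node"). If the failback itself is the problem, one untested idea is to suppress it with the nofailback attribute on the domain (this is only a sketch, and it assumes my rgmanager version already supports nofailback):

```xml
<!-- Untested sketch: keep vm:win2003 on whichever node it recovered to,
     instead of migrating it back the moment clu2 rejoins the cluster.
     Assumes this rgmanager version supports the nofailback attribute. -->
<failoverdomain name="clu2" ordered="1" restricted="0" nofailback="1">
  <failoverdomainnode name="clu2.test-cluster.cz" priority="1"/>
</failoverdomain>
```

That would still not explain why the migration itself fails, though.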
Thanks,
Jakub Suchy
--
Jakub Suchý <jakub.suchy at enlogit.cz>
GSM: +420 - 777 817 949
Enlogit s.r.o, U Cukrovaru 509/4, 400 07 Ústí nad Labem
tel.: +420 - 474 745 159, fax: +420 - 474 745 160
e-mail: info at enlogit.cz, web: http://www.enlogit.cz