[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

[Linux-cluster] Fence_xvmd/fence_xvm problem



Hi,

 

I was trying to configure Xen guests as virtual services under Cluster Suite. My configuration is simple:

 

Node one, "d1", runs a xen guest as virtual service "vm_service1", and node two, "d2", runs virtual service "vm_service2".

 

The /etc/cluster/cluster.conf file is below:

 

<?xml version="1.0"?>

<cluster alias="VM_Data_Cluster" config_version="112" name="VM_Data_Cluster">

        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="300"/>

        <clusternodes>

                <clusternode name="d1" nodeid="1" votes="1">

                        <multicast addr="225.0.0.1" interface="eth0"/>

                        <fence>

                                <method name="1">

                                        <device name="apc_power_switch" port="1"/>

                                </method>

                        </fence>

                </clusternode>

                <clusternode name="d2" nodeid="2" votes="1">

                        <multicast addr="225.0.0.1" interface="eth0"/>

                        <fence>

                                <method name="1">

                                        <device name="apc_power_switch" port="2"/>

                                </method>

                        </fence>

                </clusternode>

        </clusternodes>

        <cman expected_votes="1" two_node="1">

                <multicast addr="225.0.0.1"/>

        </cman>

        <fencedevices>

                <fencedevice agent="fence_apc" ipaddr="X.X.X.X" login="apc" name="apc_power_switch"   passwd="apc"/>

        </fencedevices>

        <rm>

                <failoverdomains>

                        <failoverdomain name="VM_d1_failover" ordered="0" restricted="0">

                                <failoverdomainnode name="d1" priority="1"/>

                        </failoverdomain>

                        <failoverdomain name="VM_d2_failover" ordered="0" restricted="0">

                                <failoverdomainnode name="d2" priority="1"/>

                        </failoverdomain>

                </failoverdomains>

                <resources/>

                <vm autostart="1" domain="VM_d1_failover" exclusive="0" name="vm_service1"       

                    path="/virts/service1" recovery="relocate"/>

                <vm autostart="1" domain="VM_d2_failover" exclusive="0" name="vm_service2"

                    path="/virts/service2" recovery="relocate"/>

        </rm>

        <totem consensus="4800" join="60" token="10000" token_retransmits_before_loss_const="20"/>

        <fence_xvmd family="ipv4"/>

</cluster>

 

On the guests “vm_service1” and “vm_service2” I have configured a second cluster:

 

<cluster alias="SV_Data_Cluster" config_version="29" name="SV_Data_Cluster">

        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>

        <clusternodes>

                <clusternode name="d11" nodeid="1" votes="1">

                        <fence>

                                <method name="1">

                                        <device domain="d11" name="virtual_fence"/>

                                </method>

                        </fence>

                </clusternode>

                <clusternode name="d12" nodeid="2" votes="1">

                        <fence>

                                <method name="1">

                                        <device domain="d12" name="virtual_fence"/>

                                </method>

                        </fence>

                </clusternode>

        </clusternodes>

        <cman expected_votes="1" two_node="1"/>

        <fencedevices>

                <fencedevice agent="fence_xvm" name="virtual_fence"/>

        </fencedevices>

        <rm>

        </rm>

</cluster>

 

The problem is that the fence_xvmd/fence_xvm mechanism doesn’t work, probably due to a misconfiguration of multicast.

 

Physical nodes “d1” and “d2” and xen guests “vm_service1” and “vm_service2” have two ethernet interfaces: a private one, 10.0.200.x (eth0), and a public one (eth1).

 

On the physical nodes, the “fence_xvmd” daemon by default listens on the eth1 interface:

[root@d2 ~]# netstat -g

IPv6/IPv4 Group Memberships

Interface       RefCnt Group

--------------- ------ ---------------------

lo              1      ALL-SYSTEMS.MCAST.NET

eth0            1      225.0.0.1

eth0            1      ALL-SYSTEMS.MCAST.NET

eth1            1      225.0.0.12

eth1            1      ALL-SYSTEMS.MCAST.NET

virbr0          1      ALL-SYSTEMS.MCAST.NET

lo              1      ff02::1

….
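If the daemon has joined 225.0.0.12 on the wrong interface, one generic workaround (a sketch, not something from a confirmed working setup) is to add a multicast route on the hosts so the kernel joins the group via eth0, the interface the guests can actually reach over the 10.0.200.x network:

```shell
# Route the fence_xvm multicast group via the private interface (eth0), so
# sockets that do not explicitly bind a device join the group there instead
# of eth1. (Assumes eth0 carries the 10.0.200.x network shared with the guests.)
ip route add 225.0.0.12/32 dev eth0

# Verify which interface now carries the group membership:
netstat -g | grep 225.0.0.12
```

fence_xvmd would then need to be restarted so it re-joins the group on the new interface.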

 

Next, when I run a test from xen guest “vm_service1” to fence guest “vm_service2”, I get:

 

[root@d11 cluster]# /sbin/fence_xvm -H d12 -ddddd

Debugging threshold is now 5

-- args @ 0xbf8aea70 --

  args->addr = 225.0.0.12

  args->domain = d12

  args->key_file = /etc/cluster/fence_xvm.key

  args->op = 2

  args->hash = 2

  args->auth = 2

  args->port = 1229

  args->family = 2

  args->timeout = 30

  args->retr_time = 20

  args->flags = 0

  args->debug = 5

-- end args --

Reading in key file /etc/cluster/fence_xvm.key into 0xbf8ada1c (4096 max size)

Actual key length = 4096 bytes
Opening /dev/urandom

Sending to 225.0.0.12 via 127.0.0.1

Opening /dev/urandom

Sending to 225.0.0.12 via X.X.X.X

Opening /dev/urandom

Sending to 225.0.0.12 via 10.0.200.124

Waiting for connection from XVM host daemon.

….

Waiting for connection from XVM host daemon.

Timed out waiting for response
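The “Waiting for connection from XVM host daemon” lines suggest that fence_xvm, after multicasting the request, listens for a TCP connection back from fence_xvmd, so the reply path from the host to the guest must also be open. A hedged check (assuming the default port 1229 shown in the args dump) of the firewall rules on both dom0 and the guest:

```shell
# On dom0 (d2) and on the guest (d11): check whether port 1229 is filtered.
iptables -L -n | grep 1229

# If nothing matches and a default DROP/REJECT policy is in place, allow the
# fence_xvm traffic (UDP multicast request in, TCP callback in):
iptables -I INPUT -p udp --dport 1229 -j ACCEPT
iptables -I INPUT -p tcp --dport 1229 -j ACCEPT
```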

 

On node “d2”, where “vm_service2” is running, I get:

 

[root@d2 ~]# /sbin/fence_xvmd -fddd

Debugging threshold is now 3

-- args @ 0xbfc54e3c --

  args->addr = 225.0.0.12

  args->domain = (null)

  args->key_file = /etc/cluster/fence_xvm.key

  args->op = 2

  args->hash = 2

  args->auth = 2

  args->port = 1229

  args->family = 2

  args->timeout = 30

  args->retr_time = 20

  args->flags = 1

  args->debug = 3

-- end args --

Reading in key file /etc/cluster/fence_xvm.key into 0xbfc53e3c (4096 max size)

Actual key length = 4096 bytes
Opened ckpt vm_states

My Node ID = 1

Domain                   UUID                                 Owner State

------                   ----                                 ----- -----

Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001

vm_service2      2dd8193f-e4d4-f41c-a4af-f5b30d19fe00 00001 00001

Storing vm_service2

Domain                   UUID                                 Owner State

------                   ----                                 ----- -----

Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001

vm_service2     2dd8193f-e4d4-f41c-a4af-f5b30d19fe00 00001 00001

Storing vm_service2

Request to fence: d12.

Evaluating Domain: d12   Last Owner/State Unknown

Domain                   UUID                                 Owner State

------                   ----                                 ----- -----

Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001

vm_service2      2dd8193f-e4d4-f41c-a4af-f5b30d19fe00 00001 00001

Storing vm_service2

Request to fence: d12

Evaluating Domain: d12   Last Owner/State Unknown

 

So it looks like fence_xvmd and fence_xvm cannot communicate with each other.

But “fence_xvm” on “vm_service1” sends multicast packets through all interfaces, and node “d2” can receive them. Tcpdump on node “d2” shows that the packets arrive:

 

[root@d2 ~]# tcpdump -i peth0 -n host 225.0.0.12

listening on peth0, link-type EN10MB (Ethernet), capture size 96 bytes

17:50:47.972477 IP 10.0.200.124.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

17:50:49.960841 IP 10.0.200.124.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

17:50:51.977425 IP 10.0.200.124.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

 

[root@d2 ~]# tcpdump -i peth1 -n host 225.0.0.12

listening on peth1, link-type EN10MB (Ethernet), capture size 96 bytes

17:51:26.168132 IP X.X.X.X.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

17:51:28.184802 IP X.X.X.X.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

17:51:30.196875 IP X.X.X.X.filenet-pch > 225.0.0.12.novell-zfs: UDP, length 176

 

But I can’t see node “d2” sending anything back to xen guest “vm_service1”, so “fence_xvm” times out.
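To confirm whether the reply path is the problem, the expected callback can be watched for directly on the guest (a sketch; the interface name and the default port 1229 are assumptions based on the args dump above):

```shell
# On the guest that issued fence_xvm (d11): watch for the TCP connection
# attempt that fence_xvmd on dom0 should make back to us after receiving
# the multicast request.
tcpdump -i eth0 -n tcp port 1229
```

If nothing arrives here while the multicast request is visible on dom0, the daemon is either not receiving the request on the interface it joined, or its reply is being dropped on the way back.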

What am I doing wrong?

 

Cheers

 

Agnieszka Kukałowicz

NASK, Polska.pl

