[Linux-cluster] Fence agents not working as expected

Ben Yarwood ben.yarwood at juno.co.uk
Wed Feb 15 16:51:10 UTC 2006


I am running FC4 with the latest released updates and am having problems
getting the WTI fence agent to work.  I suspect a configuration error.  I
have a three-node cluster.  When a node needs fencing, the following
messages appear:

From member jrmedia-c:

Feb 15 10:54:46 jrmedia-c kernel: CMAN: removing node jrmedia-a from the cluster : Missed too many heartbeats
Feb 15 10:54:47 jrmedia-c fenced[2069]: fencing deferred to jrmedia-b

From member jrmedia-b:

Feb 15 10:54:45 jrmedia-b fenced[2055]: jrmedia-a not a cluster member after 0 sec post_fail_delay
Feb 15 10:54:45 jrmedia-b fenced[2055]: fencing node "jrmedia-a"
Feb 15 10:54:45 jrmedia-b fence_manual: Node jrmedia-a needs to be reset before recovery can procede.  Waiting for jrmedia-a to rejoin the cluster or for manual acknowledgement that it has been reset (i.e. fence_ack_manual -n jrmedia-a)

After running "fence_ack_manual -n jrmedia-a", the following message appears:

Feb 15 10:56:38 jrmedia-b fenced[2055]: fence "jrmedia-a" success

The logs never mention any attempt to use the WTI fence agent; fenced
appears to go straight to fence_manual.  Note that the WTI fence device has
no password, and that I am not currently trying to use the Brocade devices
that are also defined in the configuration.  Below is an extract from my
cluster.conf file.

        <clusternodes>

                <clusternode name="jrmedia-a">
                        <fence>
                                <!-- "power" method is tried before all others -->
                                <method name="power">
                                        <device name="wti" port="16"/>
                                </method>
                                <method name="human">
                                        <device name="last_resort" ipaddr="jrmedia-a"/>
                                </method>
                        </fence>
                </clusternode>

                <clusternode name="jrmedia-b">
                        <fence>
                                <!-- "power" method is tried before all others -->
                                <method name="power">
                                        <device name="wti" port="15"/>
                                </method>
                                <method name="human">
                                        <device name="last_resort" ipaddr="jrmedia-b"/>
                                </method>
                        </fence>
                </clusternode>

                <clusternode name="jrmedia-c">
                        <fence>
                                <!-- "power" method is tried before all others -->
                                <method name="power">
                                        <device name="wti" port="13"/>
                                </method>
                                <method name="human">
                                        <device name="last_resort" ipaddr="jrmedia-c"/>
                                </method>
                        </fence>
                </clusternode>

        </clusternodes>

        <fencedevices>

                <!-- The WTI fence device requires no login name -->
                <fencedevice name="ibm_3534_a" agent="fence_brocade" ipaddr="10.0.1.67" login="admin" passwd="xxxx"/>
                <fencedevice name="ibm_3534_b" agent="fence_brocade" ipaddr="10.0.1.68" login="admin" passwd="xxxx"/>
                <fencedevice name="wti" agent="fence_wti" ipaddress="10.0.1.40" passwd=""/>
                <fencedevice name="last_resort" agent="fence_manual"/>

        </fencedevices>
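
One way to double-check how the methods and devices in the extract above
line up is a small script that walks the XML and, for each node, lists the
fence methods in document order together with the agent each device name
resolves to.  This is only a sketch and not part of the cluster tools: the
function name fence_plan, the enclosing <cluster> root element, and the
trimmed one-node config are assumptions made for illustration, and merging
the attributes of <device> and <fencedevice> is done here purely for
display.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed copy of the extract above, wrapped in a <cluster>
# root element (an assumption) so that it parses as a single document.
CLUSTER_CONF = """
<cluster>
  <clusternodes>
    <clusternode name="jrmedia-a">
      <fence>
        <method name="power">
          <device name="wti" port="16"/>
        </method>
        <method name="human">
          <device name="last_resort" ipaddr="jrmedia-a"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="wti" agent="fence_wti" ipaddress="10.0.1.40" passwd=""/>
    <fencedevice name="last_resort" agent="fence_manual"/>
  </fencedevices>
</cluster>
"""


def fence_plan(conf_text):
    """Map each node name to its (method, device, agent, attrs) steps,
    in the order they appear in the configuration."""
    root = ET.fromstring(conf_text)
    # Index the <fencedevice> definitions by name for lookup.
    devices = {d.get("name"): d for d in root.iter("fencedevice")}
    plan = {}
    for node in root.iter("clusternode"):
        steps = []
        for method in node.iter("method"):
            for dev in method.iter("device"):
                fencedev = devices.get(dev.get("name"))
                # agent stays None when the device name is not defined.
                agent = fencedev.get("agent") if fencedev is not None else None
                attrs = dict(fencedev.attrib) if fencedev is not None else {}
                attrs.update(dev.attrib)  # per-node attributes such as port
                steps.append((method.get("name"), dev.get("name"), agent, attrs))
        plan[node.get("name")] = steps
    return plan


if __name__ == "__main__":
    for node, steps in fence_plan(CLUSTER_CONF).items():
        for method, device, agent, attrs in steps:
            print(node, method, device, agent, sorted(attrs))
```

Listing the attribute names next to each agent makes it easy to compare
them against what that agent's man page says it expects.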


Ben Yarwood
Technical Director
Juno Records
t - 020 7424 2804
m - 07930 922 333
e - ben.yarwood at juno.co.uk 






