
[Linux-cluster] manual fencing problem



While running my test last night, I got an I/O error
from the disk subsystem that caused one of the nodes to
panic.

The other 2 nodes removed the dead node from membership, but
the fencing did not work.

cl030 /var/log/messages:

Dec 13 21:54:26 cl030 kernel: CMAN: no HELLO from cl032a, removing from the cluster
Dec 13 21:54:27 cl030 fenced[12121]: fencing node "cl032a"
Dec 13 21:54:27 cl030 fenced[12121]: fence "cl032a" failed
Dec 13 21:54:28 cl030 fenced[12121]: fencing node "cl032a"
Dec 13 21:54:28 cl030 fenced[12121]: fence "cl032a" failed
Dec 13 21:54:29 cl030 fenced[12121]: fencing node "cl032a"

This goes on all night.

cl031 /var/log/messages:
Dec 13 21:54:27 cl031 fenced[11850]: fencing deferred to 1

[root@cl030 root]# fence_ack_manual -s cl032a
                                                                                
Warning:  If the node "cl032a" has not been manually fenced
(i.e. power cycled or disconnected from shared storage devices)
the GFS file system may become corrupted and all its data
unrecoverable!  Please verify that the node shown above has
been reset or disconnected from storage.
                                                                                
Are you certain you want to continue? [yN] y
can't open /tmp/fence_manual.fifo: No such file or directory

I've attached my cluster.conf file.

Do I have fencing set up correctly?  Any ideas on why
fenced is failing to fence?

Thanks,

Daniel


<?xml version="1.0"?>
<cluster name="gfs_cluster" config_version="1">

<cman>
</cman>

<clusternodes>
<clusternode name="cl030a" votes="1">
	<fence>
		<method name="single">
			<device name="human" ipaddr="cl030a"/>
		</method>
	</fence>
</clusternode>

<clusternode name="cl031a" votes="1">
	<fence>
		<method name="single">
			<device name="human" ipaddr="cl031a"/>
		</method>
	</fence>
</clusternode>

<clusternode name="cl032a" votes="1">
	<fence>
		<method name="single">
			<device name="human" ipaddr="cl032a"/>
		</method>
	</fence>
</clusternode>

</clusternodes>

<fencedevices>
	<device name="human" agent="fence_manual"/>
</fencedevices>

</cluster>
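
One cheap sanity check on a file like the one above is to confirm that every fence device a node references is actually defined under <fencedevices>. The following is a minimal sketch in Python, assuming only the element layout shown in this cluster.conf; the function name check_fence_devices is hypothetical and not part of any cluster tool. It checks cross-references only, so it says nothing about whether fenced accepts the per-device attributes (such as ipaddr), and it would not by itself explain the failure in the logs.

```python
import xml.etree.ElementTree as ET


def check_fence_devices(conf_xml):
    """Return a list of problems in the fencing sections of a cluster.conf string.

    Hypothetical helper: it only verifies that each <clusternode> references
    at least one fence device and that the referenced name is defined under
    <fencedevices>. It does not validate agent-specific attributes.
    """
    root = ET.fromstring(conf_xml)

    # Names of all devices declared in the <fencedevices> section.
    defined = {d.get("name") for d in root.findall("./fencedevices/device")}

    problems = []
    for node in root.findall("./clusternodes/clusternode"):
        name = node.get("name")
        refs = node.findall("./fence/method/device")
        if not refs:
            problems.append(f"{name}: no fence device configured")
        for ref in refs:
            if ref.get("name") not in defined:
                problems.append(
                    f"{name}: device {ref.get('name')!r} not in <fencedevices>"
                )
    return problems


if __name__ == "__main__":
    # Assumes the configuration has been saved locally as cluster.conf.
    with open("cluster.conf", "rb") as fh:
        for problem in check_fence_devices(fh.read()):
            print(problem)
```

Run against the configuration above, this check should report nothing, since all three nodes reference the "human" device and that device is defined. That suggests the cross-references are consistent and the cause of the failed fence lies elsewhere.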
