[Linux-cluster] Cluster "averts" fencing

David Challoner dchalloner at alchemy.net
Wed Oct 21 05:15:37 UTC 2009


Hello,

I have a four-node cluster using fence_scsi, and when I purposely fail
any node other than the primary (node 1), the cluster always seems to
"avert" fencing. If I fail the primary node, the cluster correctly
fences it.

 

From node1's 'group_tool dump fence' when I fail node4:

1255903014 start default 103 members 2 3 1  
1255903014 do_recovery stop 98 start 103 finish 98 
1255903014 add node 4 to list 1 
1255903014 averting fence of node 192.168.105.16 
1255903014 finish default 103

 

The node doesn't get fenced and it retains its SCSI registrations.

 

From the source:
http://git.fedorahosted.org/git/fence.git?p=fence.git;a=blob;f=fence/fenced/recover.c

 

It looks like the conditions for averting the fence are:

    cpg_member = is_clean_daemon_member(node->nodeid);
    ext = is_fenced_external(fd, node->nodeid);
    if ((cluster_member && cpg_member) || ext) {
        log_debug("averting fence of node %s "
                  "cluster member %d cpg member %d external %d",
                  node->name, cluster_member, cpg_member, ext);

 

I don't think either "is_clean_daemon_member" or "is_fenced_external"
should be true. fenced isn't started as a clean daemon, and
is_fenced_external (I believe) means that the node was fenced externally
by another fenced daemon, which shouldn't be true either.
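
To make the condition easier to reason about, here is a minimal sketch of it as a standalone function. The names fence_averted and avert_self_check are hypothetical, and the three inputs are plain booleans standing in for the values fenced computes; nothing here calls the real fenced code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the check quoted from recover.c.
 * The inputs are plain flags, not results of real fenced calls. */
static bool fence_averted(bool cluster_member, bool cpg_member, bool ext)
{
    return (cluster_member && cpg_member) || ext;
}

/* Sanity checks on the cases discussed in this post. */
static void avert_self_check(void)
{
    /* Node rejoined both the cluster and the fenced group: averted. */
    assert(fence_averted(true, true, false));

    /* Node is gone and was not fenced externally: must be fenced. */
    assert(!fence_averted(false, false, false));

    /* Only one of the two membership flags set: still must be fenced. */
    assert(!fence_averted(true, false, false));
    assert(!fence_averted(false, true, false));

    /* External flag alone is enough to avert, whatever the membership. */
    assert(fence_averted(false, false, true));
}
```

If the failed node has really left the cluster, cluster_member would presumably be 0, so under this condition only a nonzero ext could avert the fence. That is an inference from the quoted snippet, not something I have confirmed in the running daemon.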

 

Any ideas what could be going on here?  Help or suggestions would be
appreciated!

 

Sincerely,

David Challoner

 
