[Linux-cluster] clearing a fence from a node.

Noam Meltzer noam at emet.co.il
Tue Jan 23 16:27:27 UTC 2007


Hello,

I have the following problem with RHCS4 u4:
1. My cluster configuration has two nodes and I'm using DLM.
2. For some arbitrary reason I decide to take one of the nodes down (/sbin/poweroff).
3. Cluster services are migrated to the other node.
4. When I try to power on the first node again, it fails to join the cluster because it is fenced out.
5. This means I am left with only one node in the cluster.
6. The only solution I have found so far is to reboot both nodes simultaneously, making sure they boot at about the same time (a gap of ~3 seconds at most). When the DLM service then starts on both nodes at the same time, the cluster is reestablished with two members.
7. Failing to boot both nodes at the same time puts me back at the problem described in item 4.

I am looking for a manual way to clear the fence that prevents the first node from rejoining the cluster when it boots. Any ideas?

Best regards,
Noam Meltzer
Software Support Engineer & RHCE
E&M Computing
