[Cluster-devel] cluster/cman/man cman_tool.8

Thu Nov 8 09:39:11 UTC 2007

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL5
Changes by:	pcaulfield at sourceware.org	2007-11-08 09:39:10

Modified files:
	cman/man       : cman_tool.8 

Log message:
	add an explanation of the "cman_tool nodes" states and some detail about the
	"disallowed" state.
	bz#323931

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman/man/cman_tool.8.diff?cvsroot=cluster&only_with_tag=RHEL5&r1=1.9.2.4&r2=1.9.2.5

--- cluster/cman/man/cman_tool.8	2007/10/12 18:53:57	1.9.2.4
+++ cluster/cman/man/cman_tool.8	2007/11/08 09:39:10	1.9.2.5
@@ -1,4 +1,4 @@
-.TH CMAN_TOOL 8 "Nov 23 2004" "Cluster utilities"
+.TH CMAN_TOOL 8 "Nov 8 2007" "Cluster utilities"
 
 .SH NAME
 cman_tool \- Cluster Management Tool
@@ -267,3 +267,69 @@
 .br
 16 Interaction with OpenAIS
 .br
+.SH NOTES
+.br
+The
+.B nodes
+subcommand shows a list of nodes known to cman. The state is one of the following:
+.br
+M	The node is a member of the cluster
+.br
+X	The node is not a member of the cluster
+.br
+d	The node is known to the cluster but disallowed access to it.
+.br
+.SH DISALLOWED NODES
+Occasionally (but, I hope, very infrequently) you may see nodes marked as "Disallowed" in cman_tool status or "d" in cman_tool nodes. This is a bit of a nasty hack to get around a mismatch between what the upper layers expect of the cluster manager and what OpenAIS provides.
+.P
+If a node experiences a momentary lack of connectivity, but one that is long enough to trigger the token timeouts, then it will be removed from the cluster. When connectivity is restored OpenAIS will happily let it rejoin the cluster with no fuss. Sadly the upper layers don't like this very much. They may (indeed probably will) have changed their internal state while the other node was away, and there is no straightforward way to bring the rejoined node up-to-date with that state. When this happens the node is marked "Disallowed" and is not permitted to take part in cman operations.
+.P
+If the remainder of the cluster is quorate then the node will be sent a kill message and it will be forced to leave the cluster that way. Note that fencing should kick in to remove the node permanently anyway, but it may take longer than the network outage for this to complete.
+
+If the remainder of the cluster is inquorate then we have a problem. The likelihood is that we will have two (or more) partitioned clusters and we cannot decide which is the "right" one. In this case we need to defer to the system administrator to kill an appropriate selection of nodes to restore the cluster to sensible operation.
+
+The latter scenario should be very rare and may indicate a bug somewhere in the code. If the local network is very flaky or busy it may be necessary to increase some of the protocol timeouts for OpenAIS. We are trying to think of better solutions to this problem.
+
+Recovering from this state can, unfortunately, be complicated. Fortunately, in the majority of cases, fencing will do the job for you, and the disallowed state will only be temporary. If it persists, the recommended approach is to run cman_tool nodes on all systems in the cluster and determine the largest common subset of nodes that are valid members to each other. Then reboot the others and let them rejoin correctly. In the case of a single-node disconnection this should be straightforward; with a large cluster that has experienced a network partition it could get very complicated!
+
+Example:
+
+In this example we have a five-node cluster that has experienced a network partition. Here is the output of cman_tool nodes from all systems:
+.nf
+Node  Sts   Inc   Joined               Name
+   1   M   2372   2007-11-05 02:58:55  node-01.example.com
+   2   d   2376   2007-11-05 02:58:56  node-02.example.com
+   3   d   2376   2007-11-05 02:58:56  node-03.example.com
+   4   M   2376   2007-11-05 02:58:56  node-04.example.com
+   5   M   2376   2007-11-05 02:58:56  node-05.example.com
+
+Node  Sts   Inc   Joined               Name
+   1   d   2372   2007-11-05 02:58:55  node-01.example.com
+   2   M   2376   2007-11-05 02:58:56  node-02.example.com
+   3   M   2376   2007-11-05 02:58:56  node-03.example.com
+   4   d   2376   2007-11-05 02:58:56  node-04.example.com
+   5   d   2376   2007-11-05 02:58:56  node-05.example.com
+
+Node  Sts   Inc   Joined               Name
+   1   d   2372   2007-11-05 02:58:55  node-01.example.com
+   2   M   2376   2007-11-05 02:58:56  node-02.example.com
+   3   M   2376   2007-11-05 02:58:56  node-03.example.com
+   4   d   2376   2007-11-05 02:58:56  node-04.example.com
+   5   d   2376   2007-11-05 02:58:56  node-05.example.com
+
+Node  Sts   Inc   Joined               Name
+   1   M   2372   2007-11-05 02:58:55  node-01.example.com
+   2   d   2376   2007-11-05 02:58:56  node-02.example.com
+   3   d   2376   2007-11-05 02:58:56  node-03.example.com
+   4   M   2376   2007-11-05 02:58:56  node-04.example.com
+   5   M   2376   2007-11-05 02:58:56  node-05.example.com
+
+Node  Sts   Inc   Joined               Name
+   1   M   2372   2007-11-05 02:58:55  node-01.example.com
+   2   d   2376   2007-11-05 02:58:56  node-02.example.com
+   3   d   2376   2007-11-05 02:58:56  node-03.example.com
+   4   M   2376   2007-11-05 02:58:56  node-04.example.com
+   5   M   2376   2007-11-05 02:58:56  node-05.example.com
+.fi
+In this scenario we should kill nodes node-02 and node-03. Of course, the three-node cluster of node-01, node-04 & node-05 should remain quorate and be able to fence the two rejoined nodes anyway, but it is possible that the cluster has a qdisk setup that precludes this.
+
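
Not part of the patch: the "largest common subset" step described in the DISALLOWED NODES text could be sketched roughly as below. This is only an illustration in Python under some assumptions: the function names are hypothetical, the output of cman_tool nodes is assumed to have been collected from every system by hand, and the column layout is assumed to match the example in the man page text above (node ID first, state second, name last). It also assumes a clean partition, where every node in a group reports exactly the same set of "M" members.

```python
# Output of "cman_tool nodes" as seen on node-01 in the man page example.
SAMPLE = """\
Node  Sts   Inc   Joined               Name
   1   M   2372   2007-11-05 02:58:55  node-01.example.com
   2   d   2376   2007-11-05 02:58:56  node-02.example.com
   3   d   2376   2007-11-05 02:58:56  node-03.example.com
   4   M   2376   2007-11-05 02:58:56  node-04.example.com
   5   M   2376   2007-11-05 02:58:56  node-05.example.com
"""


def parse_nodes_output(text):
    """Return {node_name: state} for each data row (assumed column layout)."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric node ID; the header row does not.
        if len(fields) >= 4 and fields[0].isdigit():
            states[fields[-1]] = fields[1]
    return states


def member_view(states):
    """The set of nodes that one system reports as 'M' (member)."""
    return frozenset(name for name, sts in states.items() if sts == "M")


def largest_consistent_group(views):
    """views maps each reporting node to the set of nodes it sees as 'M'.

    A group is consistent when every node in it reports that same group.
    On a tie in size the choice is arbitrary; the administrator decides.
    """
    best = frozenset()
    for group in set(views.values()):
        consistent = all(views.get(node) == group for node in group)
        if consistent and len(group) > len(best):
            best = group
    return best


# Build the five views from the example: node-01, node-04 and node-05 see
# each other as members; node-02 and node-03 see only each other.
states = parse_nodes_output(SAMPLE)
side_a = member_view(states)
side_b = frozenset(set(states) - side_a)
views = {}
views.update({node: side_a for node in side_a})
views.update({node: side_b for node in side_b})

keep = largest_consistent_group(views)
reboot = sorted(set(views) - keep)
print("keep:  ", sorted(keep))
print("reboot:", reboot)
```

For the five-node example this selects node-01, node-04 and node-05 as the group to keep, leaving node-02 and node-03 to be rebooted, which matches the conclusion in the man page text. A messier partition, where views overlap without agreeing, would yield no consistent group and still needs a human decision.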