[Linux-cluster] quorum lost in spite of 'leave remove'

Kadlecsik Jozsi kadlec at sunserv.kfki.hu
Tue Sep 4 09:26:18 UTC 2007


On Fri, 31 Aug 2007, Kadlecsik Jozsi wrote:

> In spite of having 'fence_tool leave' and 'cman_tool leave remove' in the 
> 'cman' init script, the five-member cluster loses quorum while it is being 
> stopped, once only two machines still run the cluster components:
> 
> root@web1:~# cman_tool status
> Version: 6.0.1
> Config Version: 6
> Cluster Name: kfki
> Cluster Id: 1583
> Cluster Member: Yes
> Cluster Generation: 748
> Membership state: Cluster-Member
> Nodes: 2
> Expected votes: 5
> Total votes: 2
> Quorum: 3 Activity blocked
> Active subsystems: 7
> Flags: 
> Ports Bound: 0 11  
> Node name: web1-gfs
> Node ID: 4
> Multicast addresses: 224.0.0.3 
> Node addresses: 192.168.192.6 
> 
> root@web1:~# cman_tool nodes
> Node  Sts   Inc   Joined               Name
>    1   X    728                        lxserv0-gfs
>    2   M    728   2007-08-31 09:19:09  lxserv1-gfs
>    3   X    728                        web0-gfs
>    4   M    724   2007-08-31 09:18:48  web1-gfs
>    5   X    728                        saturn-gfs
> 
> '/etc/init.d/cman stop' was issued and executed successfully on the three 
> other nodes.

As far as I can see, this happens because the 'expected_votes' values of 
the remaining nodes are not adjusted when nodes are removed. So even when 
the quorum is allowed to decrease, the highest expected_votes value among 
the members keeps the quorum from going down.
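
To make the arithmetic concrete, here is a minimal sketch in C. It is a
simplified model and not the actual cman code: the function names are made
up for illustration, and it assumes that the quorum is the larger of
(highest expected_votes + 2) / 2 and (total votes + 2) / 2, which matches
the 'cman_tool status' output quoted above.

```c
#include <assert.h>

/* Simplified model, NOT the cman source: quorum derived from the highest
 * expected_votes among the members and the votes currently present.
 * Integer division is intended. */
int model_quorum(int highest_expected, int total_votes)
{
	assert(highest_expected >= 0 && total_votes >= 0);

	int q1 = (highest_expected + 2) / 2;
	int q2 = (total_votes + 2) / 2;

	return q1 > q2 ? q1 : q2;
}

/* The cluster is quorate when the votes present reach the quorum. */
int model_is_quorate(int highest_expected, int total_votes)
{
	return total_votes >= model_quorum(highest_expected, total_votes);
}
```

With the values from the status output (expected votes 5, total votes 2) 
the model gives a quorum of 3, so activity is blocked. If expected votes 
had been lowered to 2 as the three nodes left, the quorum would be 2 and 
the two remaining nodes would stay quorate.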

I wrote the attached patch to adjust expected_votes when a node is removed 
(and when it appears again). Please review it and apply if you agree with 
it.
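
The adjustment itself can be sketched in isolation as follows. The two 
helper names are hypothetical and exist only for this illustration; the 
arithmetic mirrors what the attached patch does when a node leaves with the 
'remove' flag and when such a node joins again, including the lower bound 
of 1 for expected_votes.

```c
#include <assert.h>

/* Hypothetical helper: expected_votes after a node with node_votes votes
 * leaves with the 'remove' flag. The result never drops below 1. */
int expected_after_remove(int expected_votes, int node_votes)
{
	assert(expected_votes >= 1 && node_votes >= 0);

	return expected_votes > node_votes ? expected_votes - node_votes : 1;
}

/* Hypothetical helper: expected_votes after a previously removed node
 * with node_votes votes joins the cluster again. */
int expected_after_rejoin(int expected_votes, int node_votes)
{
	assert(expected_votes >= 1 && node_votes >= 0);

	return expected_votes + node_votes;
}
```

Starting from expected_votes 5 with one vote per node, three removals in a 
row give 4, 3 and 2, and a later rejoin of one node brings the value back 
to 3.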

Best regards,
Jozsef
--
E-mail : kadlec at sunserv.kfki.hu, kadlec at blackhole.kfki.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
         H-1525 Budapest 114, POB. 49, Hungary
-------------- next part --------------
diff -urN --exclude=deb cluster-2.01.00.orig/cman/daemon/commands.c cluster-2.01.00/cman/daemon/commands.c
--- cluster-2.01.00.orig/cman/daemon/commands.c	2007-06-26 11:09:13.000000000 +0200
+++ cluster-2.01.00/cman/daemon/commands.c	2007-09-04 10:43:27.000000000 +0200
@@ -1867,7 +1867,7 @@
 	}
 }
 
-void override_expected(int newexp)
+void reset_expected(int may_increase, int newexp)
 {
 	struct list *nodelist;
 	struct cluster_node *node;
@@ -1875,13 +1875,12 @@
 	list_iterate(nodelist, &cluster_members_list) {
 		node = list_item(nodelist, struct cluster_node);
 		if (node->state == NODESTATE_MEMBER
-		    && node->expected_votes > newexp) {
+		    && (node->expected_votes > newexp || may_increase)) {
 			node->expected_votes = newexp;
 		}
 	}
 }
 
-
 /* Add a node from CCS, note that it may already exist if user has simply updated the config file */
 void add_ccs_node(char *nodename, int nodeid, int votes, int expected_votes)
 {
@@ -1942,6 +1941,8 @@
 		node->incarnation = incarnation;
 		node->state = NODESTATE_MEMBER;
 		cluster_members++;
+		if ((node->leave_reason & 0xF) == CLUSTER_LEAVEFLAG_REMOVED)
+			reset_expected(1, us->expected_votes + node->votes);
 		recalculate_quorum(0);
 	}
 }
@@ -1983,9 +1984,11 @@
 		node->state = NODESTATE_DEAD;
 		cluster_members--;
 
-		if ((node->leave_reason & 0xF) == CLUSTER_LEAVEFLAG_REMOVED)
+		if ((node->leave_reason & 0xF) == CLUSTER_LEAVEFLAG_REMOVED) {
+			override_expected(us->expected_votes > node->votes ?
+					  us->expected_votes - node->votes : 1);
 			recalculate_quorum(1);
-		else
+		} else
 			recalculate_quorum(0);
 		break;
 
diff -urN --exclude=deb cluster-2.01.00.orig/cman/daemon/commands.h cluster-2.01.00/cman/daemon/commands.h
--- cluster-2.01.00.orig/cman/daemon/commands.h	2006-08-17 15:22:39.000000000 +0200
+++ cluster-2.01.00/cman/daemon/commands.h	2007-09-04 10:28:17.000000000 +0200
@@ -29,12 +29,12 @@
 extern void add_ais_node(int nodeid, uint64_t incarnation, int total_members);
 extern void del_ais_node(int nodeid);
 extern void add_ccs_node(char *name, int nodeid, int votes, int expected_votes);
-extern void override_expected(int expected);
+extern void reset_expected(int may_increase, int expected);
 extern void cman_send_confchg(unsigned int *member_list, int member_list_entries,
 			      unsigned int *left_list, int left_list_entries,
 			      unsigned int *joined_list, int joined_list_entries);
 
-
+#define override_expected(expected)	reset_expected(0, expected)
 
 /* Startup stuff called from cmanccs: */
 extern int cman_set_nodename(char *name);

