[Linux-cluster] cman startup after update to 5.3

Gunther Schlegel schlegel at riege.com
Tue Jan 27 21:03:35 UTC 2009


Hello,

I updated one node from 5.2 to 5.3 using yum update, and now cman no 
longer starts. It looks like ccsd is having problems:

[root@motel6 /]# /sbin/ccsd -4 -n
Starting ccsd 2.0.98:
  Built: Dec  3 2008 16:32:30
  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
   IP Protocol:: IPv4 only
   No Daemon:: SET

Cluster is not quorate.  Refusing connection.
Error while processing connect: Connection refused
Cluster is not quorate.  Refusing connection.
Error while processing connect: Connection refused
Unable to connect to cluster infrastructure after 30 seconds.
Unable to connect to cluster infrastructure after 60 seconds.


When I start ccsd via /etc/init.d/cman, it reports that all three nodes 
are on cluster.conf version 78, so I guess this is not a network 
connectivity problem.
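
Since the init script reports a version number, it may be worth comparing it with the config_version attribute of the file on disk on each node. A minimal sketch of that check follows, assuming Python 2.5 or later (xml.etree is not part of the stock RHEL 5 interpreter, so this would probably be run against a copy of the file elsewhere). The helper name `config_version` is hypothetical, and the sample string only mirrors the root element of the cluster.conf quoted in this message.

```python
import xml.etree.ElementTree as ET


def config_version(xml_text):
    """Return the config_version attribute of the <cluster> root as an int."""
    root = ET.fromstring(xml_text)
    if root.tag != "cluster":
        raise ValueError("root element is <%s>, expected <cluster>" % root.tag)
    return int(root.attrib["config_version"])


# Inline sample so the sketch runs anywhere; in practice the argument would be
# the contents of /etc/cluster/cluster.conf copied from each node.
sample = '<cluster alias="RSIXENCluster2" config_version="87" name="RSIXENCluster2"/>'
print(config_version(sample))
```

Running this against the file from every node and comparing the results with the version reported at startup would show whether all nodes really hold the same configuration.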

The other two nodes (still on 5.2z) of the cluster are up and running 
with quorum. Openais is talking to those 2 other nodes and it looks fine 
to me:

Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] Members Joined:
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] #011r(0) ip(10.11.5.22)
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] #011r(0) ip(10.11.5.23)
Jan 27 21:05:26 motel6 openais[1278]: [SYNC ] This node is within the primary component and will provide service.
Jan 27 21:05:26 motel6 openais[1278]: [TOTEM] entering OPERATIONAL state.
Jan 27 21:05:26 motel6 openais[1278]: [CMAN ] quorum regained, resuming activity
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.21
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.22
Jan 27 21:05:26 motel6 openais[1278]: [CLM  ] got nodejoin message 10.11.5.23


I am a bit lost...

cluster.conf:
[root@motel6 init.d]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="RSIXENCluster2" config_version="87" name="RSIXENCluster2">
	<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
	<clusternodes>
		<clusternode name="concorde.riege.de" nodeid="1" votes="1">
			<fence>
				<method name="1">
					<device name="Concorde_IPMI"/>
				</method>
			</fence>
		</clusternode>
		<clusternode name="motel6.riege.de" nodeid="2" votes="1">
			<fence>
				<method name="1">
					<device name="Motel6_IPMI"/>
				</method>
			</fence>
		</clusternode>
		<clusternode name="mercure.riege.de" nodeid="3" votes="1">
			<fence>
				<method name="1">
					<device name="Mercure_IPMI"/>
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice agent="fence_ipmilan" ipaddr="10.11.5.132" login="root" name="Concorde_IPMI" passwd="XXX"/>
		<fencedevice agent="fence_ipmilan" ipaddr="10.11.5.131" login="root" name="Motel6_IPMI" passwd="xxx"/>
		<fencedevice agent="fence_ipmilan" ipaddr="10.11.5.133" login="root" name="Mercure_IPMI" passwd="XXX"/>
	</fencedevices>
	<rm>
		<failoverdomains>
			<failoverdomain name="Earth" nofailback="1" ordered="1" restricted="1">
				<failoverdomainnode name="concorde.riege.de" priority="1"/>
				<failoverdomainnode name="motel6.riege.de" priority="1"/>
				<failoverdomainnode name="mercure.riege.de" priority="1"/>
			</failoverdomain>
			<failoverdomain name="Europe" nofailback="0" ordered="1" restricted="0">
				<failoverdomainnode name="concorde.riege.de" priority="2"/>
			</failoverdomain>
			<failoverdomain name="North America" nofailback="0" ordered="1" restricted="0">
				<failoverdomainnode name="motel6.riege.de" priority="2"/>
			</failoverdomain>
			<failoverdomain name="Africa" nofailback="0" ordered="1" restricted="0">
				<failoverdomainnode name="mercure.riege.de" priority="1"/>
			</failoverdomain>
		</failoverdomains>
		<resources/>
		<vm autostart="1" domain="Africa" exclusive="0" migrate="live" name="vm64.test.riege.de_64" path="/etc/xen" recovery="restart"/>
		<vm autostart="1" domain="North America" exclusive="0" migrate="pause" name="rt.test.riege.de_32" path="/etc/xen" recovery="restart"/>
		<vm autostart="1" domain="Africa" exclusive="0" migrate="pause" name="poincare.riege.de_32" path="/etc/xen" recovery="restart"/>
		<vm autostart="1" domain="North America" exclusive="0" migrate="live" name="jboss.dev.riege.de_64" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="Africa" exclusive="0" migrate="live" name="master.cc3.dev.riege.de_64" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="Europe" exclusive="0" migrate="pause" name="test.alphatrans.scope.riege.com_32" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="North America" exclusive="0" migrate="live" name="slave.cc3.dev.riege.de_64" path="/etc/xen" recovery="restart"/>
		<vm autostart="1" domain="North America" exclusive="0" migrate="live" name="webmail.riege.com_64" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="Europe" exclusive="0" migrate="live" name="live.rsi.scope.riege.com_64" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="Europe" exclusive="0" migrate="pause" name="qa-16.rsi.scope.riege.com_32" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="Africa" exclusive="0" migrate="pause" name="qa-18.rsi.scope.riege.com_32" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="Africa" exclusive="0" migrate="pause" name="vm32.test.riege.de_32" path="/etc/xen" recovery="restart"/>
		<vm autostart="1" domain="Europe" exclusive="0" migrate="pause" name="qa-head.rsi.scope.riege.com_32" path="/etc/xen" recovery="restart"/>
		<vm autostart="1" domain="North America" exclusive="0" migrate="live" name="mq.dev.riege.de_64" path="/etc/xen" recovery="relocate"/>
		<vm autostart="1" domain="Europe" exclusive="0" migrate="live" name="archive.dev.riege.de_64" path="/etc/xen" recovery="restart"/>
	</rm>
	<cman quorum_dev_poll="50000"/>
	<totem consensus="4800" join="60" token="60000" token_retransmits_before_loss_const="20"/>
	<quorumd device="/dev/mapper/Quorum_Partition" interval="3" min_score="1" tko="10" votes="2"/>
</cluster>
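
For reference, the vote arithmetic implied by this configuration can be sketched as follows. This is only an illustration under two assumptions: that cman uses its usual majority rule (expected votes divided by two, rounded down, plus one) and that the quorumd votes are added to the node votes when expected votes are computed. The function names `expected_votes` and `quorum_threshold` are hypothetical, and the sample is a trimmed copy of the file above that keeps only the vote-carrying elements.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf above: only elements that carry votes.
SAMPLE = """<cluster alias="RSIXENCluster2" config_version="87" name="RSIXENCluster2">
  <clusternodes>
    <clusternode name="concorde.riege.de" nodeid="1" votes="1"/>
    <clusternode name="motel6.riege.de" nodeid="2" votes="1"/>
    <clusternode name="mercure.riege.de" nodeid="3" votes="1"/>
  </clusternodes>
  <quorumd device="/dev/mapper/Quorum_Partition" interval="3" min_score="1" tko="10" votes="2"/>
</cluster>"""


def expected_votes(xml_text):
    """Sum the node votes and, if present, the quorum disk votes."""
    root = ET.fromstring(xml_text)
    # A clusternode without a votes attribute is treated as one vote.
    total = sum(int(n.get("votes", "1")) for n in root.findall("clusternodes/clusternode"))
    qdisk = root.find("quorumd")
    if qdisk is not None:
        # Assumption: a missing votes attribute on quorumd counts as zero here.
        total += int(qdisk.get("votes", "0"))
    return total


def quorum_threshold(votes):
    """Majority rule: more than half of the expected votes."""
    return votes // 2 + 1


total = expected_votes(SAMPLE)
print("expected votes: %d, quorum: %d" % (total, quorum_threshold(total)))
```

Under those assumptions the three nodes plus the quorum disk give 5 expected votes and a quorum of 3, which can be compared with what cman_tool status reports on the running nodes.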

best regards, Gunther

-- 
.............................................................
Riege Software International GmbH  Fon: +49 (2159) 9148 0
Mollsfeld 10                       Fax: +49 (2159) 9148 11
40670 Meerbusch                    Web: www.riege.com
Germany                            E-Mail: schlegel at riege.com
---                                ---
Handelsregister:                   Managing Directors:
Amtsgericht Neuss HRB-NR 4207      Christian Riege
USt-ID-Nr.: DE120585842            Gabriele  Riege
                                   Johannes  Riege
.............................................................
           YOU CARE FOR FREIGHT, WE CARE FOR YOU          


