[Linux-cluster] Cluster services die when nonactive node is rebooted
Eric Schneider
eschneid at uccs.edu
Wed Jul 28 18:47:53 UTC 2010
I made the change and I will try it today during our scheduled maintenance.
Thanks,
Eric
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of umesh susvirkar
Sent: Sunday, July 25, 2010 12:18 AM
To: linux clustering
Subject: Re: [Linux-cluster] Cluster services die when nonactive node is
rebooted
Try setting the following in your cluster.conf file:
<cman expected_votes="3" quorum_dev_poll="35000">
  <multicast addr="224.0.0.1" interface="eth0"/>
</cman>
---
Calculation for quorum_dev_poll:
quorum_dev_poll > (interval * tko)
With the quorumd settings below, 5 * 6 = 30 seconds, so use 35 seconds (35000 ms):
<quorumd interval="5" label="delta_qdisk" min_score="1" tko="6" votes="1">
  <heuristic interval="5" program="ping -t1 -c1 192.168.1.1" score="1"/>
</quorumd>
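The same check can be scripted. The following is only a rough sketch, not part of any cluster tooling: the helper name check_quorum_dev_poll and the SAMPLE string are made up for illustration, and it assumes that quorumd's interval is in seconds, that quorum_dev_poll is in milliseconds, and that cman falls back to 10000 ms when the attribute is absent (please verify that default against cman(5) on your release).

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down cluster.conf used only for this illustration.
SAMPLE = """
<cluster alias="delta_cluster" config_version="41" name="delta_cluster">
  <quorumd interval="5" label="delta_qdisk" min_score="1" tko="6" votes="1"/>
  <cman expected_votes="3" quorum_dev_poll="35000"/>
</cluster>
"""

# Assumption: value cman uses when quorum_dev_poll is not set (milliseconds).
ASSUMED_DEFAULT_POLL_MS = 10000


def check_quorum_dev_poll(conf_xml):
    """Return (quorum_dev_poll_ms, qdisk_timeout_ms, ok) for a cluster.conf string."""
    root = ET.fromstring(conf_xml)

    # qdiskd declares a node dead after interval * tko seconds.
    quorumd = root.find("quorumd")
    qdisk_timeout_ms = int(quorumd.get("interval")) * int(quorumd.get("tko")) * 1000

    # cman's poll timeout for the quorum device, in milliseconds.
    cman = root.find("cman")
    poll_ms = int(cman.get("quorum_dev_poll", ASSUMED_DEFAULT_POLL_MS))

    # The rule from this mail: quorum_dev_poll > (interval * tko).
    return poll_ms, qdisk_timeout_ms, poll_ms > qdisk_timeout_ms


if __name__ == "__main__":
    poll_ms, timeout_ms, ok = check_quorum_dev_poll(SAMPLE)
    print("quorum_dev_poll=%d ms, qdisk timeout=%d ms, ok=%s" % (poll_ms, timeout_ms, ok))
```

To try it against a real file, read /etc/cluster/cluster.conf into a string and pass it to the function. If ok comes back False, cman may give up on the quorum device before qdiskd has finished timing out the other node.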
For more info, read the following docs:
https://access.redhat.com/kb/docs/DOC-2882
http://people.redhat.com/lhh/cmanvsqdisk.png
On Sat, Jul 24, 2010 at 3:50 AM, Eric Schneider <eschneid at uccs.edu> wrote:
I have a few 2 node clusters and I notice that recently the clusters lose
quorum when I reboot the node without running services. I could do this in
the past without any problems. CentOS 5.5 on ESX 4.0 u1. Maybe a bug with
a new kernel or cman software?
I get the following right away when the node reboots:
Jul 23 16:02:32 happy5 clurgmgrd[4269]: <notice> Member 2 shutting down
Jul 23 16:02:52 happy5 qdiskd[3562]: <info> Node 2 shutdown
Jul 23 16:03:02 happy5 qdiskd[3562]: <info> Assuming master role
Jul 23 16:03:03 happy5 clurgmgrd[4269]: <emerg> #1: Quorum Dissolved
Jul 23 16:03:03 happy5 openais[3533]: [CMAN ] lost contact with quorum device
Jul 23 16:03:03 happy5 openais[3533]: [CMAN ] quorum lost, blocking activity
Jul 23 16:03:03 happy5 ccsd[3493]: Cluster is not quorate. Refusing connection.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing connect: Connection refused
Jul 23 16:03:03 happy5 ccsd[3493]: Cluster is not quorate. Refusing connection.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing connect: Connection refused
Jul 23 16:03:03 happy5 ccsd[3493]: Invalid descriptor specified (-111).
Jul 23 16:03:03 happy5 ccsd[3493]: Someone may be attempting something evil.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing get: Invalid request descriptor
Jul 23 16:03:03 happy5 ccsd[3493]: Invalid descriptor specified (-111).
Jul 23 16:03:03 happy5 ccsd[3493]: Someone may be attempting something evil.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing get: Invalid request descriptor
<?xml version="1.0"?>
<cluster alias="delta_cluster" config_version="40" name="delta_cluster">
  <fence_daemon post_fail_delay="5" post_join_delay="120"/>
  <quorumd interval="5" label="delta_qdisk" min_score="1" tko="6" votes="1">
    <heuristic interval="5" program="ping -t1 -c1 192.168.1.1" score="1"/>
  </quorumd>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="3">
    <multicast addr="224.0.0.1" interface="eth0"/>
  </cman>
  <fencedevices>
    <fencedevice agent="fence_manual" name="fence_manual"/>
    <fencedevice agent="fence_vmware" ipaddr="bob" login="username" name="node1" passwd="password" port="node1"/>
    <fencedevice agent="fence_vmware" ipaddr="bob" login="username" name="node2" passwd="password" port="node2"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="node1" ordered="0" restricted="1">
        <failoverdomainnode name="node1" priority="1"/>
      </failoverdomain>
      <failoverdomain name="node2" restricted="1">
        <failoverdomainnode name="node2" priority="1"/>
      </failoverdomain>
      <failoverdomain name="failover_pro-http" restricted="0">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
  </rm>
  <totem token="21000"/>
</cluster>
Thanks,
Eric
--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster