
[Linux-cluster] Cannot join two nodes together



I have two nodes on the same subnet. Both are up, can ping each other, and are configured as members of a two-node cluster.  When I start cman on both nodes at the same time, I get “X not a cluster member after 60 sec post_join_delay”.  On each node, clustat shows the other node as “Offline” and the local node as “Online, Local”.  The nodes then fence each other and power each other off.

 

Please help me determine why these nodes will not join. Below is some information from my systems. Red Hat Support has not been able to find the cause so far.

 

Thanks

 

----------------------------------

 

Node1: bplmft11

Node2: bplmft12

 

uname -a -> Linux bplmft11 2.6.18-8.1.10.el5 #1 SMP Thu Aug 30 20:43:28 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

 

[root@bplmft11 ~]# clustat

msg_open: No such file or directory

Member Status: Quorate

 

  Member Name                        ID   Status

  ------ ----                        ---- ------

  bplmft12                              1 Offline

  bplmft11                              2 Online, Local

 

 

/etc/cluster/cluster.conf file (with the fencing levels removed):

 

<?xml version="1.0" ?>

<cluster alias="plm_test" config_version="16" name="plm_test">

        <fence_daemon post_fail_delay="0" post_join_delay="60"/>

        <clusternodes>

                <clusternode name="bplmft12" nodeid="1" votes="1">

                        <fence>

                                <method name="1"/>

                        </fence>

                </clusternode>

                <clusternode name="bplmft11" nodeid="2" votes="1">

                        <fence>

                                <method name="1"/>

                        </fence>

                </clusternode>

        </clusternodes>

        <cman expected_votes="1" two_node="1"/>

        <fencedevices>

                <fencedevice agent="fence_ilo" hostname="ilo-bplmft12" login="redhat_cluster_user" name="ilo-bplmft12" passwd="PASSWORD"/>

                <fencedevice agent="fence_ilo" hostname="ilo-bplmft11" login="redhat_cluster_user" name="ilo-bplmft11" passwd="PASSWORD"/>

        </fencedevices>

        <rm>

                <failoverdomains/>

                <resources/>

        </rm>

</cluster>
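
For anyone comparing their own file against the one above, the two-node settings can be sanity-checked by parsing the XML. The sketch below is Python with only the standard library. It assumes the usual cman rule that two_node="1" goes together with expected_votes="1" and exactly two clusternode entries. The helper name check_two_node and the trimmed SAMPLE string are illustrative and not part of any cluster tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted cluster.conf (fence, fencedevices and rm omitted).
SAMPLE = """<?xml version="1.0" ?>
<cluster alias="plm_test" config_version="16" name="plm_test">
        <clusternodes>
                <clusternode name="bplmft12" nodeid="1" votes="1"/>
                <clusternode name="bplmft11" nodeid="2" votes="1"/>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
</cluster>"""


def check_two_node(xml_text):
    """Return a list of problems with the two-node settings (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    cman = root.find("./cman")
    if cman is None:
        return ["no <cman> element found"]

    problems = []
    nodes = root.findall("./clusternodes/clusternode")

    # Assumption: two_node="1" is only valid with expected_votes="1"
    # and exactly two nodes defined.
    if cman.get("two_node") == "1":
        if cman.get("expected_votes") != "1":
            problems.append('two_node="1" but expected_votes is not "1"')
        if len(nodes) != 2:
            problems.append('two_node="1" but %d nodes are defined' % len(nodes))

    # Node IDs must be unique across the cluster.
    ids = [n.get("nodeid") for n in nodes]
    if len(set(ids)) != len(ids):
        problems.append("duplicate nodeid values")

    return problems


if __name__ == "__main__":
    for problem in check_two_node(SAMPLE) or ["two-node settings look consistent"]:
        print(problem)
```

The posted file passes these checks, so they do not by themselves explain the failed join; the sketch only rules out one class of configuration mistake.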

 

 

/var/log/messages after doing a “service cman start” on both nodes:

 

Sep 26 13:33:04 bplmft11 ccsd[31407]: Cluster is not quorate.  Refusing connection.

Sep 26 13:33:04 bplmft11 ccsd[31407]: Error while processing connect: Connection refused

Sep 26 13:33:04 bplmft11 ccsd[31407]: Initial status:: Quorate

Sep 26 13:33:10 bplmft11 snmpd[2616]: Connection from UDP: [127.0.0.1]:32771

Sep 26 13:33:10 bplmft11 snmpd[2616]: Received SNMP packet(s) from UDP: [127.0.0.1]:32771

Sep 26 13:33:25 bplmft11 snmpd[2616]: Connection from UDP: [127.0.0.1]:32771

Sep 26 13:33:55 bplmft11 last message repeated 2 times

Sep 26 13:34:06 bplmft11 fenced[31436]: bplmft12 not a cluster member after 60 sec post_join_delay

Sep 26 13:34:06 bplmft11 fenced[31436]: fencing node "bplmft12"

Sep 26 13:34:06 bplmft11 fenced[31436]: fence "bplmft12" failed
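
To pull only one daemon's messages (for example fenced) out of a longer /var/log/messages, a small filter along these lines may help. It is a Python sketch that assumes the syslog layout seen above, "Mon DD HH:MM:SS host daemon[pid]: message". The names LINE_RE and daemon_messages are illustrative, and the sample lines are copied from the log excerpt.

```python
import re

# Assumed layout: "Sep 26 13:34:06 host daemon[pid]: message"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<daemon>[\w.-]+)\[(?P<pid>\d+)\]:\s(?P<msg>.*)$"
)


def daemon_messages(lines, daemon):
    """Return (timestamp, message) pairs logged by the given daemon."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Lines without a "daemon[pid]:" tag (e.g. "last message repeated")
        # do not match and are skipped.
        if match and match.group("daemon") == daemon:
            found.append((match.group("stamp"), match.group("msg")))
    return found


SAMPLE_LOG = [
    "Sep 26 13:33:04 bplmft11 ccsd[31407]: Initial status:: Quorate",
    "Sep 26 13:33:55 bplmft11 last message repeated 2 times",
    "Sep 26 13:34:06 bplmft11 fenced[31436]: bplmft12 not a cluster member after 60 sec post_join_delay",
    'Sep 26 13:34:06 bplmft11 fenced[31436]: fencing node "bplmft12"',
    'Sep 26 13:34:06 bplmft11 fenced[31436]: fence "bplmft12" failed',
]

if __name__ == "__main__":
    for stamp, msg in daemon_messages(SAMPLE_LOG, "fenced"):
        print(stamp, msg)
```

Running the same filter on both nodes and lining the timestamps up side by side makes it easier to see which node gave up waiting first.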



