
[Linux-cluster] Trouble adding back in an old node



I'm running CentOS 5.2 with the Cluster Suite + GFS1. An EMC CX600 provides shared storage for some LUNs, and I'm using Brocade port fencing.

I'm experiencing a problem trying to add a previously removed node back into the cluster. The node was having hardware RAM issues, so it was removed from the cluster completely (i.e. removed from cluster.conf and removed from the storage zoning as well). I then added 3 more nodes to the cluster. Now that the bad RAM has been identified and removed, I wanted to add the node back in. I followed the same procedure I had used for the previous 3 nodes: configure the node with system-config-cluster, save and propagate the cluster.conf, manually copy the cluster.conf to the newly added node, and then start cman and clvmd. However, when I try to start cman with "service cman start", the process hangs while starting cman. I did some digging, and in /var/log/messages on the node I'm attempting to add I get the following:

Jan 23 15:41:39 node004 ccsd[9342]: Initial status:: Inquorate
Jan 23 15:41:40 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:40 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:45 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:45 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:50 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:50 node004 ccsd[9342]: Error while processing connect: Connection refused

I suspect that this is at least part of the problem. However, I'm a bit confused, because the cluster it's attempting to join is most definitely quorate, at least according to clustat -f:

Cluster Status for rsph_centos_5 @ Fri Jan 23 17:00:45 2009
Member Status: Quorate

Member Name                                                  ID   Status
------ ----                                                  ---- ------
head1.clus.sph.emory.edu                                         1 Online, Local
node002.clus.sph.emory.edu                                       2 Online
node003.clus.sph.emory.edu                                       3 Online
node004.clus.sph.emory.edu                                       4 Offline
node005.clus.sph.emory.edu                                       5 Online
node006.clus.sph.emory.edu                                       6 Online
node007.clus.sph.emory.edu                                       7 Online
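
For what it's worth, the vote arithmetic from the cluster.conf below agrees with clustat. A minimal sketch (the votes are copied from the config file, not queried from the live cluster, and it assumes CMAN's default quorum of floor(total_votes / 2) + 1 with no expected_votes override):

```python
# Quorum math for the votes in the cluster.conf below.
# Assumption: default CMAN quorum = total_votes // 2 + 1.
votes = {"head1": 7, "node002": 1, "node003": 1, "node004": 1,
         "node005": 1, "node006": 1, "node007": 1}
total = sum(votes.values())            # 13
quorum = total // 2 + 1                # 7
online = total - votes["node004"]      # node004 shows Offline above
print(total, quorum, online >= quorum) # -> 13 7 True
```

So even with node004 offline, the remaining 12 votes are well over the quorum of 7, which matches the "Member Status: Quorate" line.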


I'm thinking that there is something subtle that I'm missing that I can change to make this work. I really don't want to have to reinstall and reconfigure the machine to get this working. That is something that you do in the Windows world :-)


So here is my cluster.conf file. Passwords changed to protect the guilty.

<?xml version="1.0"?>
<cluster alias="rsph_centos_5" config_version="41" name="rsph_centos_5">
        <fence_daemon clean_start="1" post_fail_delay="30" post_join_delay="90"/>
        <clusternodes>
                <clusternode name="head1.clus.sph.emory.edu" nodeid="1" votes="7">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="1"/>
                                        <device name="sanclusb1.sph.emory.edu" port="1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node002.clus.sph.emory.edu" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="2"/>
                                        <device name="sanclusb1.sph.emory.edu" port="2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node003.clus.sph.emory.edu" nodeid="3" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="3"/>
                                        <device name="sanclusb1.sph.emory.edu" port="3"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node005.clus.sph.emory.edu" nodeid="5" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="5"/>
                                        <device name="sanclusb1.sph.emory.edu" port="5"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node006.clus.sph.emory.edu" nodeid="6" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="6"/>
                                        <device name="sanclusb1.sph.emory.edu" port="6"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node007.clus.sph.emory.edu" nodeid="7" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="7"/>
                                        <device name="sanclusb1.sph.emory.edu" port="7"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node004.clus.sph.emory.edu" nodeid="4" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="4"/>
                                        <device name="sanclusb1.sph.emory.edu" port="4"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
                <fencedevice agent="fence_brocade" ipaddr="170.140.183.87" login="admin" name="sanclusa1.sph.emory.edu" passwd="mypasshere"/>
                <fencedevice agent="fence_brocade" ipaddr="170.140.183.88" login="admin" name="sanclusb1.sph.emory.edu" passwd="mypasshere"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
</cluster>
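
In case anyone wants to sanity-check the file itself, the node entries above can be verified with a few lines of Python. This is just a sketch: the conf string is trimmed to the clusternode entries from the post (names, nodeids, and votes copied verbatim), and it only validates the XML, not the running cluster:

```python
# Sanity-check the clusternode entries pasted above: nodeids should be
# 1..7 with no gaps or duplicates, and node004's entry should be present.
import xml.etree.ElementTree as ET

conf = """<?xml version="1.0"?>
<cluster alias="rsph_centos_5" config_version="41" name="rsph_centos_5">
  <clusternodes>
    <clusternode name="head1.clus.sph.emory.edu" nodeid="1" votes="7"/>
    <clusternode name="node002.clus.sph.emory.edu" nodeid="2" votes="1"/>
    <clusternode name="node003.clus.sph.emory.edu" nodeid="3" votes="1"/>
    <clusternode name="node005.clus.sph.emory.edu" nodeid="5" votes="1"/>
    <clusternode name="node006.clus.sph.emory.edu" nodeid="6" votes="1"/>
    <clusternode name="node007.clus.sph.emory.edu" nodeid="7" votes="1"/>
    <clusternode name="node004.clus.sph.emory.edu" nodeid="4" votes="1"/>
  </clusternodes>
</cluster>"""

root = ET.fromstring(conf)
ids = sorted(int(n.get("nodeid")) for n in root.iter("clusternode"))
names = {n.get("name") for n in root.iter("clusternode")}
print(ids)                                            # -> [1, 2, 3, 4, 5, 6, 7]
print("node004.clus.sph.emory.edu" in names)          # -> True
print(sum(int(n.get("votes")) for n in root.iter("clusternode")))  # -> 13
```

Nothing looks out of place to me there, which is why I suspect the problem is on node004's side (the local ccsd refusing connections before the node has actually joined) rather than in the config.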

