[Linux-cluster] 3 node cluster problems

Dalton, Maurice bobby.m.dalton at nasa.gov
Thu Mar 27 15:54:04 UTC 2008


I have removed the 3rd server. As long as I am running with 2 nodes and
the qdisk, I am not seeing any problems.

When I add the 3rd server, my problems begin.


-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
Sent: Thursday, March 27, 2008 10:28 AM
To: linux clustering
Subject: Re: [Linux-cluster] 3 node cluster problems

Are you using a private VLAN for your cluster communications? If not,
you should be; the communication between the clustered nodes is very
chatty. Just my opinion.
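
For example, a minimal /etc/hosts sketch of that kind of layout (the
10.0.0.x addresses here are only placeholders for a private heartbeat
subnet); the point is that the cluster node names resolve to the private
interface on every node:

    # /etc/hosts (identical on all three nodes)
    10.0.0.101   csarcsys1-eth0
    10.0.0.102   csarcsys2-eth0
    10.0.0.103   csarcsys3-eth0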

These are my opinions and experiences.

Any views or opinions presented are solely those of the author and do
not necessarily represent those of Raytheon unless specifically stated.
Electronic communications, including email, might be monitored by
Raytheon for operational or business reasons.


Dalton, Maurice wrote:
> Cisco 3550
>
>
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
> Sent: Thursday, March 27, 2008 9:53 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] 3 node cluster problems
>
> What is the switch brand? I have read that RHCS has problems with
> certain switches.
>
> Dalton, Maurice wrote:
>   
>> Switches
>>
>> Storage is fiber
>>
>>
>> -----Original Message-----
>> From: linux-cluster-bounces at redhat.com
>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
>> Sent: Thursday, March 27, 2008 9:04 AM
>> To: linux clustering
>> Subject: Re: [Linux-cluster] 3 node cluster problems
>>
>> How are your cluster connections connected? (i.e. are you using a
>> hub, a switch, or direct-connecting the heartbeat cables?)
>>
>> Dalton, Maurice wrote:
>>> Still having the problem. I can't figure it out. 
>>>
>>> I just upgraded to the latest 5.1 cman. No help!
>>>
>>>
>>> -----Original Message-----
>>> From: linux-cluster-bounces at redhat.com
>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
>>> Sent: Tuesday, March 25, 2008 10:57 AM
>>> To: linux clustering
>>> Subject: Re: [Linux-cluster] 3 node cluster problems
>>>
>>>
>>> Glad they are working. I have not used lvm with our Clusters. You have
>>> piqued my curiosity and I will have to try building one. So were you also
>>> using GFS?
>>>
>>> Dalton, Maurice wrote:
>>>> Sorry but security here will not allow me to send host files
>>>>
>>>> BUT.
>>>>
>>>>
>>>> I was getting this in /var/log/messages on csarcsys3
>>>>
>>>> Mar 25 15:26:11 csarcsys3-eth0 ccsd[7448]: Cluster is not quorate.
>>>> Refusing connection.
>>>> Mar 25 15:26:11 csarcsys3-eth0 ccsd[7448]: Error while processing
>>>> connect: Connection refused
>>>> Mar 25 15:26:12 csarcsys3-eth0 dlm_controld[7476]: connect to ccs error
>>>> -111, check ccsd or cluster status
>>>> Mar 25 15:26:12 csarcsys3-eth0 ccsd[7448]: Cluster is not quorate.
>>>> Refusing connection.
>>>> Mar 25 15:26:12 csarcsys3-eth0 ccsd[7448]: Error while processing
>>>> connect: Connection refused
>>>>
>>>>
>>>> I had /dev/vg0/gfsvol on these systems.
>>>>
>>>> I did an lvremove.
>>>>
>>>> Restarted cman on all systems and for some strange reason my clusters
>>>> are working.
>>>>
>>>> It doesn't make any sense.
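>>>>
>>>> (Roughly, the steps above were -- the LV path is the one mentioned
>>>> earlier in this message, and the service name assumes the stock RHEL 5
>>>> cman init script:
>>>>
>>>>     lvremove /dev/vg0/gfsvol     # remove the leftover clustered LV
>>>>     service cman restart         # restart the cluster manager on each node
>>>>     clustat                      # confirm the members come back online
>>>> )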
>>>>
>>>> I can't thank you enough for your help!
>>>>
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: linux-cluster-bounces at redhat.com
>>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
>>>> Sent: Tuesday, March 25, 2008 10:27 AM
>>>> To: linux clustering
>>>> Subject: Re: [Linux-cluster] 3 node cluster problems
>>>>
>>>> I am currently running several 3-node clusters without a quorum disk.
>>>> However, if you want your cluster to run when only one node is up, then
>>>> you will need a quorum disk. Can you send your /etc/hosts file for all
>>>> systems? Also, could there be another node named csarcsys3-eth0 in your
>>>> NIS or DNS?
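>>>>
>>>> (As a rough worked example of why the qdisk helps there, assuming one
>>>> vote per node and a two-vote qdisk as in the config later in this
>>>> thread: expected votes = 3 + 2 = 5, so quorum needs floor(5/2) + 1 = 3
>>>> votes; a single surviving node (1 vote) plus the qdisk (2 votes) still
>>>> reaches 3 and stays quorate.)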
>>>>
>>>> I configured some using Conga and some with system-config-cluster. When
>>>> using system-config-cluster I basically run the config on all nodes, just
>>>> adding the node names and cluster name. I reboot all nodes to make sure
>>>> they see each other, then go back and modify the config files.
>>>>
>>>> The file /var/log/messages should also shed some light on the problem.
>>>> Dalton, Maurice wrote:
>>>>> Same problem.
>>>>>
>>>>> I now have qdiskd running.
>>>>>
>>>>> I have run diffs on all three cluster.conf files; all are the same.
>>>>> [root at csarcsys1-eth0 cluster]# more cluster.conf
>>>>>
>>>>> <?xml version="1.0"?>
>>>>> <cluster config_version="6" name="csarcsys5">
>>>>>         <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>>>>         <clusternodes>
>>>>>                 <clusternode name="csarcsys1-eth0" nodeid="1" votes="1">
>>>>>                         <fence/>
>>>>>                 </clusternode>
>>>>>                 <clusternode name="csarcsys2-eth0" nodeid="2" votes="1">
>>>>>                         <fence/>
>>>>>                 </clusternode>
>>>>>                 <clusternode name="csarcsys3-eth0" nodeid="3" votes="1">
>>>>>                         <fence/>
>>>>>                 </clusternode>
>>>>>         </clusternodes>
>>>>>         <cman/>
>>>>>         <fencedevices/>
>>>>>         <rm>
>>>>>                 <failoverdomains>
>>>>>                         <failoverdomain name="csarcsysfo" ordered="0" restricted="1">
>>>>>                                 <failoverdomainnode name="csarcsys1-eth0" priority="1"/>
>>>>>                                 <failoverdomainnode name="csarcsys2-eth0" priority="1"/>
>>>>>                                 <failoverdomainnode name="csarcsys3-eth0" priority="1"/>
>>>>>                         </failoverdomain>
>>>>>                 </failoverdomains>
>>>>>                 <resources>
>>>>>                         <ip address="172.24.86.177" monitor_link="1"/>
>>>>>                         <fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
>>>>>                 </resources>
>>>>>         </rm>
>>>>>         <quorumd interval="4" label="csarcsysQ" min_score="1" tko="30" votes="2"/>
>>>>> </cluster>
>>>>>
>>>>> More info from csarcsys3
>>>>>
>>>>> [root at csarcsys3-eth0 cluster]# clustat
>>>>> msg_open: No such file or directory
>>>>> Member Status: Inquorate
>>>>>
>>>>>   Member Name          ID   Status
>>>>>   ------ ----          ---- ------
>>>>>   csarcsys1-eth0       1    Offline
>>>>>   csarcsys2-eth0       2    Offline
>>>>>   csarcsys3-eth0       3    Online, Local
>>>>>   /dev/sdd1            0    Offline
>>>>>
>>>>> [root at csarcsys3-eth0 cluster]# mkqdisk -L
>>>>>
>>>>> mkqdisk v0.5.1
>>>>>
>>>>> /dev/sdd1:
>>>>>
>>>>> Magic: eb7a62c2
>>>>>
>>>>> Label: csarcsysQ
>>>>>
>>>>> Created: Wed Feb 13 13:44:35 2008
>>>>>
>>>>> Host: csarcsys1-eth0.xxx.xxx.nasa.gov
>>>>>
>>>>> [root at csarcsys3-eth0 cluster]# ls -l /dev/sdd1
>>>>>
>>>>> brw-r----- 1 root disk 8, 49 Mar 25 14:09 /dev/sdd1
>>>>>
>>>>> clustat from csarcsys1
>>>>>
>>>>> msg_open: No such file or directory
>>>>> Member Status: Quorate
>>>>>
>>>>>   Member Name          ID   Status
>>>>>   ------ ----          ---- ------
>>>>>   csarcsys1-eth0       1    Online, Local
>>>>>   csarcsys2-eth0       2    Online
>>>>>   csarcsys3-eth0       3    Offline
>>>>>   /dev/sdd1            0    Offline, Quorum Disk
>>>>>
>>>>> [root at csarcsys1-eth0 cluster]# ls -l /dev/sdd1
>>>>>
>>>>> brw-r----- 1 root disk 8, 49 Mar 25 14:19 /dev/sdd1
>>>>>
>>>>> mkqdisk v0.5.1
>>>>>
>>>>> /dev/sdd1:
>>>>>
>>>>> Magic: eb7a62c2
>>>>>
>>>>> Label: csarcsysQ
>>>>>
>>>>> Created: Wed Feb 13 13:44:35 2008
>>>>>
>>>>> Host: csarcsys1-eth0.xxx.xxx.nasa.gov
>>>>>
>>>>> Info from csarcsys2
>>>>>
>>>>> [root at csarcsys2-eth0 cluster]# clustat
>>>>> msg_open: No such file or directory
>>>>> Member Status: Quorate
>>>>>
>>>>>   Member Name          ID   Status
>>>>>   ------ ----          ---- ------
>>>>>   csarcsys1-eth0       1    Offline
>>>>>   csarcsys2-eth0       2    Online, Local
>>>>>   csarcsys3-eth0       3    Offline
>>>>>   /dev/sdd1            0    Online, Quorum Disk
>>>>>
>>>>> *From:* linux-cluster-bounces at redhat.com
>>>>> [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of *Panigrahi,
>>>>> Santosh Kumar
>>>>> *Sent:* Tuesday, March 25, 2008 7:33 AM
>>>>> *To:* linux clustering
>>>>> *Subject:* RE: [Linux-cluster] 3 node cluster problems
>>>>>
>>>>> If you are configuring your cluster with system-config-cluster then there
>>>>> is no need to run ricci/luci. Ricci/luci are needed for configuring the
>>>>> cluster using Conga. You can configure it either way.
>>>>>
>>>>> On seeing your clustat command outputs, it seems the cluster is
>>>>> partitioned (split brain) into 2 sub-clusters [Sub 1: (csarcsys1-eth0,
>>>>> csarcsys2-eth0); Sub 2: csarcsys3-eth0]. Without a quorum device you can
>>>>> face this situation more often. To avoid this you can configure a quorum
>>>>> device with a heuristic such as a ping. Use the link
>>>>> (http://www.redhatmagazine.com/2007/12/19/enhancing-cluster-quorum-with-qdisk/)
>>>>> for configuring a quorum disk in RHCS.
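>>>>>
>>>>> (Roughly what a ping heuristic looks like in cluster.conf -- the address
>>>>> the heuristic pings is only a placeholder for something like your default
>>>>> gateway on the cluster network, and the intervals are just a sketch:
>>>>>
>>>>>         <quorumd interval="2" tko="10" votes="2" label="csarcsysQ" min_score="1">
>>>>>                 <heuristic program="ping -c1 -w1 172.24.86.1" score="1" interval="2" tko="3"/>
>>>>>         </quorumd>
>>>>> )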
>>>>>
>>>>> Thanks,
>>>>>
>>>>> S
>>>>>
>>>>> -----Original Message-----
>>>>> From: linux-cluster-bounces at redhat.com 
>>>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Dalton, Maurice
>>>>> Sent: Tuesday, March 25, 2008 5:18 PM
>>>>> To: linux clustering
>>>>> Subject: RE: [Linux-cluster] 3 node cluster problems
>>>>>
>>>>> Still no change. Same as below.
>>>>>
>>>>> I completely rebuilt the cluster using system-config-cluster
>>>>>
>>>>> The Cluster software was installed from RHN; luci and ricci are running.
>>>>>
>>>>> This is the new config file and it has been copied to the 2 other
>>>>> systems:
>>>>>
>>>>> [root at csarcsys1-eth0 cluster]# more cluster.conf
>>>>>
>>>>> <?xml version="1.0"?>
>>>>> <cluster config_version="5" name="csarcsys5">
>>>>>         <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>>>>         <clusternodes>
>>>>>                 <clusternode name="csarcsys1-eth0" nodeid="1" votes="1">
>>>>>                         <fence/>
>>>>>                 </clusternode>
>>>>>                 <clusternode name="csarcsys2-eth0" nodeid="2" votes="1">
>>>>>                         <fence/>
>>>>>                 </clusternode>
>>>>>                 <clusternode name="csarcsys3-eth0" nodeid="3" votes="1">
>>>>>                         <fence/>
>>>>>                 </clusternode>
>>>>>         </clusternodes>
>>>>>         <cman/>
>>>>>         <fencedevices/>
>>>>>         <rm>
>>>>>                 <failoverdomains>
>>>>>                         <failoverdomain name="csarcsysfo" ordered="0" restricted="1">
>>>>>                                 <failoverdomainnode name="csarcsys1-eth0" priority="1"/>
>>>>>                                 <failoverdomainnode name="csarcsys2-eth0" priority="1"/>
>>>>>                                 <failoverdomainnode name="csarcsys3-eth0" priority="1"/>
>>>>>                         </failoverdomain>
>>>>>                 </failoverdomains>
>>>>>                 <resources>
>>>>>                         <ip address="172.xx.xx.xxx" monitor_link="1"/>
>>>>>                         <fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
>>>>>                 </resources>
>>>>>         </rm>
>>>>> </cluster>
>>>>>
>>>>> -----Original Message-----
>>>>>
>>>>> From: linux-cluster-bounces at redhat.com
>>>>>
>>>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
>>>>> Sent: Monday, March 24, 2008 4:17 PM
>>>>>
>>>>> To: linux clustering
>>>>>
>>>>> Subject: Re: [Linux-cluster] 3 node cluster problems
>>>>>
>>>>> Did you load the Cluster software via Conga or manually? You would have
>>>>> had to load luci on one node and ricci on all three.
>>>>>
>>>>> Try copying the modified /etc/cluster/cluster.conf from csarcsys1 to the
>>>>> other two nodes.
>>>>>
>>>>> Make sure you can ping the private interface to/from all nodes and
>>>>> reboot. If this does not work, post your /etc/cluster/cluster.conf file again.
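>>>>>
>>>>> (For the ping check, something as simple as running, from every node,
>>>>>
>>>>>     ping -c 3 csarcsys1-eth0
>>>>>     ping -c 3 csarcsys2-eth0
>>>>>     ping -c 3 csarcsys3-eth0
>>>>>
>>>>> and confirming that each name resolves to the interface the cluster is
>>>>> supposed to use, will do.)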
>>>>>
>>>>> Dalton, Maurice wrote:
>>>>>
>>>>>> Yes
>>>>>>       
>>>>>> I also rebooted again just now to be sure.
>>>>>>       
>>>>>> -----Original Message-----
>>>>>>       
>>>>>> From: linux-cluster-bounces at redhat.com
>>>>>>       
>>>>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
>>>>>>       
>>>>>> Sent: Monday, March 24, 2008 3:33 PM
>>>>>>       
>>>>>> To: linux clustering
>>>>>>       
>>>>>> Subject: Re: [Linux-cluster] 3 node cluster problems
>>>>>>       
>>>>>> When you changed the nodenames in /etc/cluster/cluster.conf and made
>>>>>> sure the /etc/hosts file had the correct nodenames (i.e. 10.0.0.100
>>>>>> csarcsys1-eth0 csarcsys1-eth0.xxxx.xxxx.xxx), did you reboot all the
>>>>>> nodes at the same time?
>>>>>>       
>>>>>> Dalton, Maurice wrote:
>>>>>>> No luck. It seems as if csarcsys3 thinks it's in its own cluster.
>>>>>>>
>>>>>>> I renamed all config files and rebuilt from system-config-cluster.
>>>>>>>
>>>>>>> Clustat command from csarcsys3
>>>>>>>         
>>>>>>> [root at csarcsys3-eth0 cluster]# clustat
>>>>>>> msg_open: No such file or directory
>>>>>>> Member Status: Inquorate
>>>>>>>
>>>>>>>   Member Name          ID   Status
>>>>>>>   ------ ----          ---- ------
>>>>>>>   csarcsys1-eth0       1    Offline
>>>>>>>   csarcsys2-eth0       2    Offline
>>>>>>>   csarcsys3-eth0       3    Online, Local
>>>>>>>         
>>>>>>> clustat command from csarcsys2
>>>>>>>         
>>>>>>> [root at csarcsys2-eth0 cluster]# clustat
>>>>>>> msg_open: No such file or directory
>>>>>>> Member Status: Quorate
>>>>>>>
>>>>>>>   Member Name          ID   Status
>>>>>>>   ------ ----          ---- ------
>>>>>>>   csarcsys1-eth0       1    Online
>>>>>>>   csarcsys2-eth0       2    Online, Local
>>>>>>>   csarcsys3-eth0       3    Offline
>>>>>>>         
>>>>>>> -----Original Message-----
>>>>>>>         
>>>>>>> From: linux-cluster-bounces at redhat.com
>>>>>>>         
>>>>>>> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Bennie Thomas
>>>>>>> Sent: Monday, March 24, 2008 2:25 PM
>>>>>>>         
>>>>>>> To: linux clustering
>>>>>>>         
>>>>>>> Subject: Re: [Linux-cluster] 3 node cluster problems
>>>>>>>         
>>>>>>> You will also need to make sure the clustered nodenames are in your
>>>>>>> /etc/hosts file.
>>>>>>>
>>>>>>> Also, make sure your cluster network interface is up on all nodes and
>>>>>>> that the /etc/cluster/cluster.conf is the same on all nodes.
>>>>>>>         
>>>>>>> Dalton, Maurice wrote:
>>>>>>>         
>>>>>>>> The last post is incorrect.
>>>>>>>>           
>>>>>>>> Fence is still hanging at start up.
>>>>>>>>           
>>>>>>>> Here's another log message.
>>>>>>>>           
>>>>>>>> Mar 24 19:03:14 csarcsys3-eth0 ccsd[6425]: Error while processing
>>>>>>>> connect: Connection refused
>>>>>>>> Mar 24 19:03:15 csarcsys3-eth0 dlm_controld[6453]: connect to ccs
>>>>>>>> error -111, check ccsd or cluster status
>>>>>>>>           
>>>>>>>> *From:* linux-cluster-bounces at redhat.com
>>>>>>>>           
>>>>>>>> [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of *Bennie Thomas
>>>>>>>> *Sent:* Monday, March 24, 2008 11:22 AM
>>>>>>>>           
>>>>>>>> *To:* linux clustering
>>>>>>>>           
>>>>>>>> *Subject:* Re: [Linux-cluster] 3 node cluster problems
>>>>>>>>           
>>>>>>>> try removing the fully qualified hostname from the cluster.conf file.
>>>>>           
>>>>>>>> Dalton, Maurice wrote:
>>>>>>>>           
>>>>>>>> I have NO fencing equipment.
>>>>>>>>
>>>>>>>> I have been tasked to set up a 3 node cluster.
>>>>>>>>
>>>>>>>> Currently I am having problems getting cman (fence) to start.
>>>>>>>>
>>>>>>>> Fence will try to start up during cman startup but will fail.
>>>>>>>>
>>>>>>>> I tried to run /sbin/fenced -D and I get the following:
>>>>>>>>
>>>>>>>> 1206373475 cman_init error 0 111
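>>>>>>>>
>>>>>>>> (Error 111 is most likely "connection refused", i.e. fenced cannot talk
>>>>>>>> to ccsd/cman yet. As a rough first check on each node, using the standard
>>>>>>>> cman tools -- nothing specific to this setup:
>>>>>>>>
>>>>>>>>     service cman status    # does the cman init script report running?
>>>>>>>>     ps ax | grep ccsd      # is ccsd itself up?
>>>>>>>>     cman_tool status       # quorum state, expected votes, member count
>>>>>>>>     cman_tool nodes        # which members this node can actually see
>>>>>>>> )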
>>>>>>>>           
>>>>>>>> Here's my cluster.conf file
>>>>>>>>           
>>>>>>>> <?xml version="1.0"?>
>>>>>>>> <cluster alias="csarcsys51" config_version="26" name="csarcsys51">
>>>>>>>>         <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>>>>>>>>         <clusternodes>
>>>>>>>>                 <clusternode name="csarcsys1-eth0.xxx.xxxx.nasa.gov" nodeid="1" votes="1">
>>>>>>>>                         <fence/>
>>>>>>>>                 </clusternode>
>>>>>>>>                 <clusternode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" nodeid="2" votes="1">
>>>>>>>>                         <fence/>
>>>>>>>>                 </clusternode>
>>>>>>>>                 <clusternode name="csarcsys3-eth0.xxx.xxxxnasa.gov" nodeid="3" votes="1">
>>>>>>>>                         <fence/>
>>>>>>>>                 </clusternode>
>>>>>>>>         </clusternodes>
>>>>>>>>         <cman/>
>>>>>>>>         <fencedevices/>
>>>>>>>>         <rm>
>>>>>>>>                 <failoverdomains>
>>>>>>>>                         <failoverdomain name="csarcsys-fo" ordered="1" restricted="0">
>>>>>>>>                                 <failoverdomainnode name="csarcsys1-eth0.xxx.xxxx.nasa.gov" priority="1"/>
>>>>>>>>                                 <failoverdomainnode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" priority="1"/>
>>>>>>>>                                 <failoverdomainnode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" priority="1"/>
>>>>>>>>                         </failoverdomain>
>>>>>>>>                 </failoverdomains>
>>>>>>>>                 <resources>
>>>>>>>>                         <ip address="xxx.xxx.xxx.xxx" monitor_link="1"/>
>>>>>>>>                         <fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
>>>>>>>>                         <nfsexport name="csarcsys-export"/>
>>>>>>>>                         <nfsclient name="csarcsys-nfs-client" options="no_root_squash,rw" path="/csarc-test" target="xxx.xxx.xxx.*"/>
>>>>>>>>                 </resources>
>>>>>>>>         </rm>
>>>>>>>> </cluster>
>>>>>>>>           
>>>>>>>> Messages from the logs
>>>>>>>>           
>>>>>>>> Mar 24 13:24:19 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.
>>>>>>>> Refusing connection.
>>>>>>>> Mar 24 13:24:19 csarcsys2-eth0 ccsd[24888]: Error while processing
>>>>>>>> connect: Connection refused
>>>>>>>> Mar 24 13:24:20 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.
>>>>>>>> Refusing connection.
>>>>>>>> Mar 24 13:24:20 csarcsys2-eth0 ccsd[24888]: Error while processing
>>>>>>>> connect: Connection refused
>>>>>>>> Mar 24 13:24:21 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.
>>>>>>>> Refusing connection.
>>>>>>>> Mar 24 13:24:21 csarcsys2-eth0 ccsd[24888]: Error while processing
>>>>>>>> connect: Connection refused
>>>>>>>> Mar 24 13:24:22 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.
>>>>>>>> Refusing connection.
>>>>>>>> Mar 24 13:24:22 csarcsys2-eth0 ccsd[24888]: Error while processing
>>>>>>>> connect: Connection refused
>>>>>>>> Mar 24 13:24:23 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.
>>>>>>>> Refusing connection.
>>>>>>>> Mar 24 13:24:23 csarcsys2-eth0 ccsd[24888]: Error while processing
>>>>>>>> connect: Connection refused


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster




More information about the Linux-cluster mailing list