
Re: [Linux-cluster] Two node cluster, start CMAN fence the other node



Alex,
 
What exactly did you configure for IGMP?
Did you also separate the cluster interconnect traffic in its own VLAN?
 
Thanks and regards,
 
Chris


From: linux-cluster-bounces redhat com [mailto:linux-cluster-bounces redhat com] On Behalf Of Alex Re
Sent: Friday, 16 April 2010 21:25
To: linux clustering
Subject: Re: [Linux-cluster] Two node cluster, start CMAN fence the other node

Good morning,

thanks for your replies!
Multicast was definitely my problem. I couldn't use a crossover cable as Jeff suggested, because these servers are blades, but after checking and configuring the IGMP properties on the switch ports, the cluster started working fine!

Thanks again!
Alex.

On 04/15/2010 08:34 PM, Jeff Sturm wrote:

For two node clusters there's a convenient workaround:  crossover cable.

You'll need a spare Ethernet port but that's easier than getting certain switches to do multicast correctly.  (At least in my experience.)

From: linux-cluster-bounces redhat com [mailto:linux-cluster-bounces redhat com] On Behalf Of Jason_Henderson Mitel com
Sent: Thursday, April 15, 2010 1:44 PM
To: linux clustering
Cc: linux-cluster redhat com; linux-cluster-bounces redhat com
Subject: Re: [Linux-cluster] Two node cluster,start CMAN fence the other node


Most likely the multicast packet communication between the 2 nodes is not getting through your network.
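
One way to test this theory independently of cman is a small multicast send/receive check run between the two nodes. The following is only a minimal sketch: the group address 239.192.0.1, the UDP port 50405, and the script name are hypothetical values chosen for illustration, not the ones cman actually uses on this cluster, and the port should be one that no running cluster daemon is bound to.

```python
import socket
import sys

GROUP = "239.192.0.1"  # hypothetical test group (assumption, not cman's real address)
PORT = 50405           # hypothetical spare UDP port (assumption)


def membership_request(group, local_ip="0.0.0.0"):
    """Pack the 8-byte ip_mreq structure used by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def receive(group=GROUP, port=PORT, local_ip="0.0.0.0", timeout=30.0):
    """Join the group and wait for one datagram; return None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group makes the kernel send an IGMP membership report,
    # which is what an IGMP-snooping switch needs to see.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, local_ip))
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(1024)
    except socket.timeout:
        return None
    finally:
        sock.close()


def send(message=b"multicast test", group=GROUP, port=PORT, ttl=1):
    """Send one datagram to the group; TTL 1 keeps it on the local segment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    sock.sendto(message, (group, port))
    sock.close()


if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else ""
    if mode == "recv":
        result = receive()
        print("nothing received" if result is None else "received from %s" % (result[1],))
    elif mode == "send":
        send()
    else:
        print("usage: mcast_check.py recv|send")
```

Run it with "recv" on one node and then with "send" on the other, then swap roles. If ping works but the receiver times out in either direction, the switch is probably not forwarding multicast between the two ports.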

linux-cluster-bounces redhat com wrote on 04/15/2010 01:05:01 PM:

> Good afternoon,
> I'm trying to form my first cluster of two nodes, using iLO fence
> devices. I need some help because I can't find what I've missed.
> My main problem is that the "service cman start" reboots the other
> node and I can't form the two nodes cluster.
> I'm using the following on both nodea and nodeb (they are on the same
> VLAN and can ping each other):
>
> [root nodea ~]# uname -a
> Linux nodea 2.6.18-164.15.1.el5 #1 SMP Wed Mar 17 11:30:06 EDT 2010
> x86_64 x86_64 x86_64 GNU/Linux
> [root nodea ~]# rpm -qa |grep cman
> cman-2.0.115-1.el5_4.9
>
> [root nodea ~]# cat /etc/cluster/cluster.conf (nodeb has the same file)
> <?xml version="1.0" ?>
> <cluster alias="VCluster" config_version="5" name="VCluster">
>     <fence_daemon post_fail_delay="0" post_join_delay="25"/>
>     <clusternodes>
>         <clusternode name="nodea" nodeid="1" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="nodeaILO"/>
>                 </method>
>             </fence>
>         </clusternode>
>         <clusternode name="nodeb" nodeid="2" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="nodebILO"/>
>                 </method>
>             </fence>
>         </clusternode>
>     </clusternodes>
>     <cman expected_votes="1" two_node="1"/>
>     <fencedevices>
>         <fencedevice agent="fence_ilo" hostname="nodeacn"
> login="user" name="nodeaILO" passwd="hp"/>
>         <fencedevice agent="fence_ilo" hostname="nodebcn"
> login="user" name="nodebILO" passwd="hp"/>
>     </fencedevices>
>     <rm>
>         <failoverdomains/>
>         <resources/>
>     </rm>
> </cluster>
>
> When I start the cman service, it hangs up for some time at the
> "Starting fencing..." step and after those configured 25secs it
> fences nodeb and reboots it.
> [root nodea ~]# service cman start
> Starting cluster:
>    Loading modules... done
>    Mounting configfs... done
>    Starting ccsd... done
>    Starting cman... done
>    Starting daemons... done
>    Starting fencing... done
>                                                            [  OK  ]
>
> "nodeb" gets rebooted:
> [root nodeb ~]#
> Broadcast message from root (Thu Apr 15 18:42:24 2010):
>
> The system is going down for system halt NOW!
>
> In the syslog I can only find:
> Apr 15 18:40:59 nodea ccsd[16930]: Initial status:: Quorate
> Apr 15 18:40:59 nodea openais[16936]: [CLM  ] Members Left:
> Apr 15 18:40:59 nodea openais[16936]: [CLM  ] Members Joined:
> Apr 15 18:40:59 nodea openais[16936]: [CLM  ] CLM CONFIGURATION CHANGE
> Apr 15 18:41:00 nodea openais[16936]: [CLM  ] New Configuration:
> Apr 15 18:41:00 nodea openais[16936]: [CLM  ]     r(0) ip(10.192.16.42)  
> Apr 15 18:41:00 nodea openais[16936]: [CLM  ] Members Left:
> Apr 15 18:41:00 nodea openais[16936]: [CLM  ] Members Joined:
> Apr 15 18:41:00 nodea openais[16936]: [CLM  ]     r(0) ip(10.192.16.42)  
> Apr 15 18:41:00 nodea openais[16936]: [SYNC ] This node is within
> the primary component and will provide service.
> Apr 15 18:41:00 nodea openais[16936]: [TOTEM] entering OPERATIONAL state.
> Apr 15 18:41:00 nodea openais[16936]: [CMAN ] quorum regained,
> resuming activity
> Apr 15 18:41:00 nodea openais[16936]: [CLM  ] got nodejoin message
> 10.192.16.42
> Apr 15 18:42:11 nodea fenced[16955]: nodeb not a cluster member
> after 25 sec post_join_delay
> Apr 15 18:42:11 nodea fenced[16955]: fencing node "nodeb"
> Apr 15 18:42:23 nodea fenced[16955]: fence "nodeb" success
>
> [root nodea ~]# clustat
> Cluster Status for VCluster @ Thu Apr 15 18:55:23 2010
> Member Status: Quorate
>
>  Member Name                                                     ID   Status
>  ------ ----                                                     ---- ------
>  nodea                                                              1 Online, Local
>  nodeb                                                              2 Offline
>
> Then when nodeb starts again, I try to start cman there to join the
> cluster... but it again fences "nodea":
> [root nodeb ~]# clustat
> Could not connect to CMAN: No such file or directory
> [root nodeb ~]# service cman start
> Starting cluster:
>    Loading modules... done
>    Mounting configfs... done
>    Starting ccsd... done
>    Starting cman... done
>    Starting qdiskd... done
>    Starting daemons... done
>    Starting fencing... (wait for 25secs again) done
>                                                            [  OK  ]
> "nodea" gets rebooted:
> [root nodea ~]#
> Broadcast message from root (Thu Apr 15 18:58:40 2010):
>
> The system is going down for system halt NOW!
>
> Apr 15 18:57:31 nodeb openais[11789]: [CLM  ] Members Joined:
> Apr 15 18:57:31 nodeb openais[11789]: [CLM  ]     r(0) ip(10.192.16.44)  
> Apr 15 18:57:31 nodeb openais[11789]: [SYNC ] This node is within
> the primary component and will provide service.
> Apr 15 18:57:31 nodeb openais[11789]: [TOTEM] entering OPERATIONAL state.
> Apr 15 18:57:31 nodeb openais[11789]: [CMAN ] quorum regained,
> resuming activity
> Apr 15 18:57:31 nodeb openais[11789]: [CLM  ] got nodejoin message
> 10.192.16.44
> Apr 15 18:57:34 nodeb qdiskd[10323]: <info> Quorum Daemon Initializing
> Apr 15 18:57:34 nodeb qdiskd[10323]: <crit> Initialization failed
> Apr 15 18:58:42 nodeb fenced[11816]: nodea not a cluster member
> after 25 sec post_join_delay
> Apr 15 18:58:42 nodeb fenced[11816]: fencing node "nodea"
> Apr 15 18:58:54 nodeb fenced[11816]: fence "nodea" success
>
> I can't get the two nodes to join the cluster.
> I guess I'm missing something in the cluster.conf file? I can't
> find what I'm doing wrong.
>
> Thanks for any help!
>
> Alex Re
> --
> Linux-cluster mailing list
> Linux-cluster redhat com
> https://www.redhat.com/mailman/listinfo/linux-cluster
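
For anyone who wants to rule out the configuration file first, as the original question does, a quick consistency check of a two-node cluster.conf can be sketched as follows. It is an illustrative sketch only: the function name and the checks are my own choice, not a Red Hat tool, and it only verifies that two_node="1" is paired with expected_votes="1" and that every fence device a node references is defined. The sample is the configuration quoted above, without the XML declaration.

```python
import xml.etree.ElementTree as ET

# The cluster.conf from the quoted message, minus the XML declaration.
SAMPLE = """
<cluster alias="VCluster" config_version="5" name="VCluster">
    <fence_daemon post_fail_delay="0" post_join_delay="25"/>
    <clusternodes>
        <clusternode name="nodea" nodeid="1" votes="1">
            <fence><method name="1"><device name="nodeaILO"/></method></fence>
        </clusternode>
        <clusternode name="nodeb" nodeid="2" votes="1">
            <fence><method name="1"><device name="nodebILO"/></method></fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_ilo" hostname="nodeacn" login="user" name="nodeaILO" passwd="hp"/>
        <fencedevice agent="fence_ilo" hostname="nodebcn" login="user" name="nodebILO" passwd="hp"/>
    </fencedevices>
</cluster>
"""


def check_two_node_conf(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text.strip())
    nodes = root.findall("./clusternodes/clusternode")
    cman = root.find("cman")

    # A two-node cluster needs the special two_node mode with one expected vote.
    if len(nodes) == 2:
        if cman is None or cman.get("two_node") != "1":
            problems.append('two nodes defined but <cman two_node="1"> is missing')
        elif cman.get("expected_votes") != "1":
            problems.append('two_node="1" requires expected_votes="1"')

    # Every device referenced by a node must exist under <fencedevices>.
    defined = {d.get("name") for d in root.findall("./fencedevices/fencedevice")}
    for node in nodes:
        used = [d.get("name") for d in node.findall("./fence/method/device")]
        if not used:
            problems.append("%s: no fence device configured" % node.get("name"))
        for name in used:
            if name not in defined:
                problems.append("%s: fence device %s is not defined" % (node.get("name"), name))
    return problems


print(check_two_node_conf(SAMPLE))
```

The quoted configuration passes these checks, which fits the outcome of the thread: the file was not the cause, multicast was.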

