[Linux-cluster] Two node cluster, start CMAN fence the other node

Alex Re are at gmx.es
Fri Apr 16 11:25:03 UTC 2010


Good morning,

thanks for your replies!
Multicast was definitely my problem. I couldn't use a crossover cable as 
suggested by Jeff, because these servers are blades, but after 
checking and configuring the IGMP properties on the switch ports, the 
cluster started working fine!
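
For anyone who lands on this thread with the same symptom, below is a minimal sketch of a way to check whether multicast passes between two nodes, independent of cman. It assumes Python 3 is available on both hosts. TEST_GROUP and TEST_PORT are placeholder values chosen for illustration (they are not taken from this cluster), so substitute the multicast address your cluster is configured to use and any free UDP port. The function names are hypothetical helpers, not part of any cluster tool.

```python
import ipaddress
import socket

# Placeholder values (assumptions, not from this thread's cluster).
TEST_GROUP = "239.192.0.1"
TEST_PORT = 5999


def is_multicast(addr):
    """Return True if addr is an IPv4/IPv6 multicast address."""
    return ipaddress.ip_address(addr).is_multicast


def membership_request(group, iface_ip="0.0.0.0"):
    """Pack the 8-byte ip_mreq value expected by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def listen(group, port, timeout=30.0):
    """Join the group and wait for one datagram.

    Returns the sender's IP address, or None if nothing arrives
    before the timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group))
        sock.settimeout(timeout)
        _data, sender = sock.recvfrom(1024)
        return sender[0]
    except socket.timeout:
        return None
    finally:
        sock.close()


def send(group, port, payload=b"mcast-test"):
    """Send one datagram to the group with TTL 1 (stays on the local segment)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(payload, (group, port))
    finally:
        sock.close()
```

To use it, call listen(TEST_GROUP, TEST_PORT) on one node and, while it is waiting, call send(TEST_GROUP, TEST_PORT) on the other. If listen() returns None even though the nodes can ping each other, the switch is probably not forwarding the multicast traffic, which is consistent with the IGMP settings being the culprit here.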

Thanks again!
Alex.

On 04/15/2010 08:34 PM, Jeff Sturm wrote:
>
> For two node clusters there's a convenient workaround:  crossover cable.
>
> You'll need a spare Ethernet port but that's easier than getting 
> certain switches to do multicast correctly.  (At least in my experience.)
>
> *From:* linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] *On Behalf Of 
> *Jason_Henderson at Mitel.com
> *Sent:* Thursday, April 15, 2010 1:44 PM
> *To:* linux clustering
> *Cc:* linux-cluster at redhat.com; linux-cluster-bounces at redhat.com
> *Subject:* Re: [Linux-cluster] Two node cluster,start CMAN fence the 
> other node
>
>
> Most likely the multicast packet communication between the 2 nodes is 
> not getting through your network.
>
> linux-cluster-bounces at redhat.com wrote on 04/15/2010 01:05:01 PM:
>
> > Good afternoon,
> > I'm trying to form my first two-node cluster, using iLO fence
> > devices. I need some help because I can't find what I've missed.
> > My main problem is that "service cman start" reboots the other
> > node, so I can't form the two-node cluster.
> > I'm using the following on both nodea and nodeb (they are on the
> > same VLAN and can ping each other):
> >
> > [root at nodea ~]# uname -a
> > Linux nodea 2.6.18-164.15.1.el5 #1 SMP Wed Mar 17 11:30:06 EDT 2010
> > x86_64 x86_64 x86_64 GNU/Linux
> > [root at nodea ~]# rpm -qa |grep cman
> > cman-2.0.115-1.el5_4.9
> >
> > [root at nodea ~]# cat /etc/cluster/cluster.conf (nodeb has the same file)
> > <?xml version="1.0" ?>
> > <cluster alias="VCluster" config_version="5" name="VCluster">
> >   <fence_daemon post_fail_delay="0" post_join_delay="25"/>
> >   <clusternodes>
> >     <clusternode name="nodea" nodeid="1" votes="1">
> >       <fence>
> >         <method name="1">
> >           <device name="nodeaILO"/>
> >         </method>
> >       </fence>
> >     </clusternode>
> >     <clusternode name="nodeb" nodeid="2" votes="1">
> >       <fence>
> >         <method name="1">
> >           <device name="nodebILO"/>
> >         </method>
> >       </fence>
> >     </clusternode>
> >   </clusternodes>
> >   <cman expected_votes="1" two_node="1"/>
> >   <fencedevices>
> >     <fencedevice agent="fence_ilo" hostname="nodeacn"
> >                  login="user" name="nodeaILO" passwd="hp"/>
> >     <fencedevice agent="fence_ilo" hostname="nodebcn"
> >                  login="user" name="nodebILO" passwd="hp"/>
> >   </fencedevices>
> >   <rm>
> >     <failoverdomains/>
> >     <resources/>
> >   </rm>
> > </cluster>
> >
> > When I start the cman service, it hangs for some time at the
> > "Starting fencing..." step, and after the configured 25 seconds it
> > fences nodeb and reboots it.
> > [root at nodea ~]# service cman start
> > Starting cluster:
> >    Loading modules... done
> >    Mounting configfs... done
> >    Starting ccsd... done
> >    Starting cman... done
> >    Starting daemons... done
> >    Starting fencing... done
> >                                                            [  OK  ]
> >
> > "nodeb" gets rebooted:
> > [root at nodeb ~]#
> > Broadcast message from root (Thu Apr 15 18:42:24 2010):
> >
> > The system is going down for system halt NOW!
> >
> > In the syslog I can only find:
> > Apr 15 18:40:59 nodea ccsd[16930]: Initial status:: Quorate
> > Apr 15 18:40:59 nodea openais[16936]: [CLM  ] Members Left:
> > Apr 15 18:40:59 nodea openais[16936]: [CLM  ] Members Joined:
> > Apr 15 18:40:59 nodea openais[16936]: [CLM  ] CLM CONFIGURATION CHANGE
> > Apr 15 18:41:00 nodea openais[16936]: [CLM  ] New Configuration:
> > Apr 15 18:41:00 nodea openais[16936]: [CLM  ]     r(0) ip(10.192.16.42)
> > Apr 15 18:41:00 nodea openais[16936]: [CLM  ] Members Left:
> > Apr 15 18:41:00 nodea openais[16936]: [CLM  ] Members Joined:
> > Apr 15 18:41:00 nodea openais[16936]: [CLM  ]     r(0) ip(10.192.16.42)
> > Apr 15 18:41:00 nodea openais[16936]: [SYNC ] This node is within
> > the primary component and will provide service.
> > Apr 15 18:41:00 nodea openais[16936]: [TOTEM] entering OPERATIONAL state.
> > Apr 15 18:41:00 nodea openais[16936]: [CMAN ] quorum regained,
> > resuming activity
> > Apr 15 18:41:00 nodea openais[16936]: [CLM  ] got nodejoin message
> > 10.192.16.42
> > Apr 15 18:42:11 nodea fenced[16955]: nodeb not a cluster member
> > after 25 sec post_join_delay
> > Apr 15 18:42:11 nodea fenced[16955]: fencing node "nodeb"
> > Apr 15 18:42:23 nodea fenced[16955]: fence "nodeb" success
> >
> > [root at nodea ~]# clustat
> > Cluster Status for VCluster @ Thu Apr 15 18:55:23 2010
> > Member Status: Quorate
> >
> >  Member Name                        ID   Status
> >  ------ ----                        ---- ------
> >  nodea                                 1 Online, Local
> >  nodeb                                 2 Offline
> >
> > Then, when nodeb comes back up, I try to start cman there to join the
> > cluster... but this time nodeb fences "nodea":
> > [root at nodeb ~]# clustat
> > Could not connect to CMAN: No such file or directory
> > [root at nodeb ~]# service cman start
> > Starting cluster:
> >    Loading modules... done
> >    Mounting configfs... done
> >    Starting ccsd... done
> >    Starting cman... done
> >    Starting qdiskd... done
> >    Starting daemons... done
> >    Starting fencing... (wait for 25secs again) done
> >                                                            [  OK  ]
> > "nodea" gets rebooted:
> > [root at nodea ~]#
> > Broadcast message from root (Thu Apr 15 18:58:40 2010):
> >
> > The system is going down for system halt NOW!
> >
> > Apr 15 18:57:31 nodeb openais[11789]: [CLM  ] Members Joined:
> > Apr 15 18:57:31 nodeb openais[11789]: [CLM  ]     r(0) ip(10.192.16.44)
> > Apr 15 18:57:31 nodeb openais[11789]: [SYNC ] This node is within
> > the primary component and will provide service.
> > Apr 15 18:57:31 nodeb openais[11789]: [TOTEM] entering OPERATIONAL state.
> > Apr 15 18:57:31 nodeb openais[11789]: [CMAN ] quorum regained,
> > resuming activity
> > Apr 15 18:57:31 nodeb openais[11789]: [CLM  ] got nodejoin message
> > 10.192.16.44
> > Apr 15 18:57:34 nodeb qdiskd[10323]: <info> Quorum Daemon Initializing
> > Apr 15 18:57:34 nodeb qdiskd[10323]: <crit> Initialization failed
> > Apr 15 18:58:42 nodeb fenced[11816]: nodea not a cluster member
> > after 25 sec post_join_delay
> > Apr 15 18:58:42 nodeb fenced[11816]: fencing node "nodea"
> > Apr 15 18:58:54 nodeb fenced[11816]: fence "nodea" success
> >
> > And I can't get the two nodes to join the cluster...
> > I guess I'm missing something in the cluster.conf file? I can't
> > find what I'm doing wrong.
> >
> > Thanks for any help!
> >
> > Alex Re
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster