[Linux-cluster] cannot add 3rd node to running cluster

Senol Erdogan alkol6 at gmail.com
Fri Jan 22 16:00:08 UTC 2010


hi,
maybe your cluster has a votes problem.

Run "cman_tool status", and if you see a message like "Quorum: 2 Activity
blocked", then run "cman_tool expected -e 1". Run "cman_tool status" again
and you will see that "Activity blocked" has been removed from the
"Quorum:" line.

This command lowers the cluster's quorum so the cluster can keep running
(expected quorum is n/2+1 votes). This is only a temporary workaround for
an inadequate cluster quorum. (Before that, check config_version="{ver.num}";
as you know, the number must be the same on all nodes.)
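
A minimal illustration of the sequence (output trimmed to the relevant
lines; the vote counts shown are just an example):

  # before: quorum needs 2 votes but only 1 is present, so activity is blocked
  cman_tool status
    ...
    Expected votes: 3
    Total votes: 1
    Quorum: 2 Activity blocked

  # declare that a single expected vote is enough to be quorate
  cman_tool expected -e 1

  # after: the quorum requirement is met and "Activity blocked" is gone
  cman_tool status
    ...
    Expected votes: 1
    Total votes: 1
    Quorum: 1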

I hope this helps with your problem.

(yep, I know... my English is ultra-professional-imba :) )


2010/1/22 Terry <td3201 at gmail.com>

> On Fri, Jan 22, 2010 at 9:00 AM, King, Adam <adam.king at intechnology.com>
> wrote:
> > I'm assuming you have read this?
> > http://sources.redhat.com/cluster/wiki/FAQ/CMAN#cman_2to3
> >
> >
> >
> >
> > Adam King
> > Systems Administrator
> > adam.king at intechnology.com
> >
> >
> > InTechnology plc
> > Support 0845 120 7070
> > Telephone 01423 850000
> > Facsimile 01423 858866
> > www.intechnology.com
> >
> >
> > -----Original Message-----
> >
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Terry
> > Sent: 22 January 2010 14:45
> > To: linux clustering
> > Subject: Re: [Linux-cluster] cannot add 3rd node to running cluster
> >
> > On Mon, Jan 4, 2010 at 1:34 PM, Abraham Alawi <a.alawi at auckland.ac.nz>
> >> wrote:
> >>
> >> On 1/01/2010, at 5:13 AM, Terry wrote:
> >>
> >>> On Wed, Dec 30, 2009 at 10:13 AM, Terry <td3201 at gmail.com> wrote:
> >>>> On Tue, Dec 29, 2009 at 5:20 PM, Jason W. <jwellband at gmail.com>
> >>>>> wrote:
> >>>>> On Tue, Dec 29, 2009 at 2:30 PM, Terry <td3201 at gmail.com> wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> I have a working two-node cluster that I am trying to add a third
> >>>>>> node to.  I am trying to use Red Hat's conga (luci) to add the node
> >>>>>> in but
> >>>>>
> >>>>> If you have a two-node cluster with two_node=1 in cluster.conf -
> >>>>> such as two nodes with no quorum device to break a tie - you'll need
> >>>>> to bring the cluster down, change two_node to 0 on both nodes (and
> >>>>> rev the cluster version at the top of cluster.conf), bring the
> >>>>> cluster up and then add the third node.
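> >>>>>
> >>>>> For example (just a sketch; the cluster name and vote count are
> >>>>> illustrative), the relevant lines would go from:
> >>>>>
> >>>>>   <cluster alias="mycluster" config_version="53" name="mycluster">
> >>>>>   <cman expected_votes="1" two_node="1"/>
> >>>>>
> >>>>> to:
> >>>>>
> >>>>>   <cluster alias="mycluster" config_version="54" name="mycluster">
> >>>>>   <cman expected_votes="3" two_node="0"/>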
> >>>>>
> >>>>> For troubleshooting any cluster issue, take a look at syslog
> >>>>> (/var/log/messages by default). It can help to watch it on a
> >>>>> centralized syslog server that all of your nodes forward logs to.
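> >>>>>
> >>>>> e.g. (a sketch, assuming classic syslogd; the hostname is a
> >>>>> placeholder) one line in /etc/syslog.conf on each node does the
> >>>>> forwarding:
> >>>>>
> >>>>>   *.* @loghost.example.com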
> >>>>>
> >>>>> --
> >>>>> HTH, YMMV, HANW :)
> >>>>>
> >>>>> Jason
> >>>>>
> >>>>> The path to enlightenment is /usr/bin/enlightenment.
> >>>>
> >>>> Thank you for the response.  /var/log/messages doesn't have any
> >>>> errors.  It says cman started, then says it can't connect to the
> >>>> cluster infrastructure after a few seconds.  My cluster does not have
> >>>> the two_node=1 config now; Conga took that out for me.  That bit me
> >>>> last night because I needed to put it back in.
> >>>>
> >>>
> >>> CMAN still will not start and gives no debug information.  Anyone know
> >>> why cman_tool -d join would not print any output at all?
> >>> Troubleshooting this is kind of a nightmare.  I verified that two_node
> >>> is not in play.
> >>>
> >>
> >>
> >> Try this line in your cluster.conf file:
> >> <logging debug="on" logfile="/var/log/rhcs.log" to_file="yes"/>
> >>
> >> Also, if you are sure your cluster.conf is correct, then copy it
> >> manually to all the nodes, add clean_start="1" to the fence_daemon
> >> line in cluster.conf, and run 'service cman start' simultaneously on
> >> all the nodes (probably a good idea to do that from runlevel 1, but
> >> make sure you have the network up first).
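> >>
> >> For example (a sketch; the other attributes shown are just the common
> >> defaults):
> >>
> >>   <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>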
> >>
> >> Cheers,
> >>
> >>  -- Abraham
> >>
> >> ''''''''''''''''''''''''''''''''''''''''''''''''''''''
> >> Abraham Alawi
> >>
> >> Unix/Linux Systems Administrator
> >> Science IT
> >> University of Auckland
> >> e: a.alawi at auckland.ac.nz
> >> p: +64-9-373 7599, ext#: 87572
> >>
> >> ''''''''''''''''''''''''''''''''''''''''''''''''''''''
> >>
> >>
> >
> > I am still battling this.  I stopped the cluster completely, modified
> > the config and then started it, but that didn't work either.  Same
> > issue.  I noticed clurgmgrd wasn't staying up, so I then tried this:
> >
> > [root at omadvnfs01c ~]# clurgmgrd -d -f
> > [7014] notice: Waiting for CMAN to start
> >
> > Then in another window I issued:
> > [root at omadvnfs01c ~]# cman_tool join
> >
> >
> > Then back in the other window below "[7014] notice: Waiting for CMAN
> > to start", I got:
> > failed acquiring lockspace: Transport endpoint is not connected
> > Locks not working!
> >
> > Anyone know what could be going on?
> >
>
> I hadn't, but I performed those steps anyway.  As it sits, I have a
> three-node cluster with only two nodes in it, which is bad too, but it
> is what it is until I figure this out.  Here's my cluster.conf, just
> for completeness:
>
> <cluster alias="omadvnfs01" config_version="53" name="omadvnfs01">
>         <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>         <clusternodes>
>                 <clusternode name="omadvnfs01a.sec.jel.lc" nodeid="1" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="omadvnfs01a-drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="omadvnfs01b.sec.jel.lc" nodeid="2" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="omadvnfs01b-drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="omadvnfs01c.sec.jel.lc" nodeid="3" votes="1">
>                         <fence>
>                                 <method name="1">
>                                         <device name="omadvnfs01c-drac"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>         </clusternodes>
>         <cman/>
>         <fencedevices>
>                 <fencedevice agent="fence_drac" ipaddr="10.98.1.211" login="root" name="omadvnfs01a-drac" passwd="foo"/>
>                 <fencedevice agent="fence_drac" ipaddr="10.98.1.212" login="root" name="omadvnfs01b-drac" passwd="foo"/>
>                 <fencedevice agent="fence_drac" ipaddr="10.98.1.213" login="root" name="omadvnfs01c-drac" passwd="foo"/>
>         </fencedevices>
>         <rm>
>                 <failoverdomains>
>                         <failoverdomain name="fd_omadvnfs01a-nfs" nofailback="1" ordered="1" restricted="0">
>                                 <failoverdomainnode name="omadvnfs01a.sec.jel.lc" priority="1"/>
>                         </failoverdomain>
>                         <failoverdomain name="fd_omadvnfs01b-nfs" nofailback="1" ordered="1" restricted="0">
>                                 <failoverdomainnode name="omadvnfs01b.sec.jel.lc" priority="2"/>
>                         </failoverdomain>
>                         <failoverdomain name="fd_omadvnfs01c-nfs" nofailback="1" ordered="1" restricted="0">
>                                 <failoverdomainnode name="omadvnfs01c.sec.jel.lc" priority="1"/>
>                         </failoverdomain>
>                 </failoverdomains>
>
> I am not sure if I did a restart after I did the work, though.  When it
> says "shutdown cluster software", is that simply a 'service cman stop'
> on Red Hat?  I want to make sure I don't need to kill any other
> components before updating the configuration manually; my unverified
> guess at the full stop order is sketched below.  I appreciate the help.
> I am probably going to try it again this afternoon to double-check my
> work.
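>
>   # best-guess stop order (unverified; only the services actually
>   # installed on the nodes apply)
>   service rgmanager stop   # resource manager first
>   service gfs stop         # if GFS mounts are in use
>   service clvmd stop       # if clustered LVM is in use
>   service cman stop        # the cluster manager itself, last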
>