
Re: [Linux-cluster] CS4 U2 / problem to configure a 3 nodes cluster



 

> -----Original Message-----
> From: linux-cluster-bounces redhat com 
> [mailto:linux-cluster-bounces redhat com] On behalf of Alain Moulle
> Sent: Tuesday, 11 April 2006 12:59
> To: linux-cluster redhat com
> Subject: Re: [Linux-cluster] CS4 U2 / problem to configure a 3 
> nodes cluster
> 
> >>>> Hi,
> >>>>
> >>>> I'm trying to configure a simple 3-node cluster with simple
> >>>> test scripts, but I can't start cman; it remains stalled with
> >>>> this message in syslog:
> >>>>
> >>>> Apr 10 11:37:44 s_sys yack21 ccsd: startup succeeded
> >>>> Apr 10 11:38:00 s_kernel yack21 kernel: CMAN 2.6.9-39.5 (built Sep 20 2005 16:04:34) installed
> >>>> Apr 10 11:38:00 s_kernel yack21 kernel: NET: Registered protocol family 30
> >>>> Apr 10 11:38:00 s_sys yack21 ccsd[25004]: cluster.conf (cluster name = HA_METADATA_3N, version = 8) found.
> >>>> Apr 10 11:38:00 s_kernel yack21 kernel: CMAN: Waiting to join or form a Linux-cluster
> >>>> Apr 10 11:38:01 s_sys yack21 ccsd[25004]: Connected to cluster infrastruture via: CMAN/SM Plugin v1.1.2
> >>>> Apr 10 11:38:01 s_sys yack21 ccsd[25004]: Initial status:: Inquorate
> >>>> Apr 10 11:38:32 s_kernel yack21 kernel: CMAN: forming a new cluster
> >>>>
> >>>> and nothing more.
> >>>>
> >>>> The graphical tool does not detect any error in the configuration.
> >>>> I've attached my cluster.conf for the three nodes: I want two
> >>>> nodes (yack10 and yack21) running their applications and the
> >>>> third one (yack23) acting as a backup for yack10 and/or yack21,
> >>>> but I don't want any failover between yack10 and yack21.
> >>>>
> >>>> PS: I've verified all ssh connections between the 3 nodes, and
> >>>> all the fence paths as described in the cluster.conf.
> >>>> Thanks again for your help.
> >>>>
> >>>> Alain
> >>>
> >>> Are you starting cman on all three nodes at the same time? A node
> >>> doesn't finish starting until the other nodes are starting too.
> >>> Timing is important during booting.
> >>>
> >>> Leandro
> >>
> >> Hi, no I wasn't ...
> >> I've tried that now, and it is OK on yack21 and yack23, but not on
> >> yack10. Is there something wrong in the cluster.conf that explains
> >> this behavior? On yack10, cman keeps trying to:
> >> CMAN: forming a new cluster
> >> but fails with a timeout ...
> >>
> >> Thanks
> >> Alain
> >> --
> >>
> 
> 
> > Maybe this timeout is due to a firewall setup, as already discussed
> > on the list. A tcpdump between yack10 and the other nodes may help
> > you catch the bug.
> >
> > Leandro
> 
> There is no firewall setup on yack10, nor on yack21 or yack23.
> Besides, the ssh connections are all valid between the three nodes,
> in all combinations, without any password request. And still the
> problem remains ...
> Any other idea?
> Is my cluster.conf correct?
> 
> Also, with regard to your first answer, I've tested on yack21 and
> yack23:
> if I start cman only on yack21, it ends in a timeout.
> And if I start cman at roughly the same time on yack21 and yack23,
> it works on both nodes.
> I haven't found any recommendation about this point in the
> documentation.
> Besides, if one node is broken down, that means we will never be able
> to reboot the other node and launch CS4 again with all the
> applications ... sounds strange, doesn't it?
> 
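For context on the quorum question raised above: with CMAN each node contributes its `votes` (1 by default), and the cluster is quorate only when more than half of the expected votes are present, so one node out of three can never form a quorate cluster on its own. A minimal sketch of the relevant cluster.conf fragment (the cluster name and version are taken from the syslog output quoted earlier; the vote attributes are assumed defaults, not Alain's actual file):

```xml
<cluster name="HA_METADATA_3N" config_version="8">
  <!-- expected_votes = 3, so quorum = 2: a lone node stays Inquorate -->
  <cman expected_votes="3"/>
  <clusternodes>
    <clusternode name="yack10" votes="1"/>
    <clusternode name="yack21" votes="1"/>
    <clusternode name="yack23" votes="1"/>
  </clusternodes>
</cluster>
```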

No, this doesn't sound strange. A cluster must be quorate to operate. Quorum can be reduced while a node is down, by fencing the node or simply removing it, either through cman or by hand-editing cluster.conf.

Try this: boot all the nodes without starting cman, GFS and the other GFS-suite services. Then start the init scripts by hand, one at a time on each node: ccsd, cman, lock_gulmd (if you use GULM), fenced, clvmd and rgmanager. After each one, check the /var/log/messages output and the connectivity between the nodes. Unfortunately your configuration is quite different from the one I use, so I cannot help you further.
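The one-service-at-a-time startup described above can be sketched as a small script. The init-script names are the standard CS4 ones (lock_gulmd is left out; add it between cman and fenced only if you run GULM instead of DLM), and the log check is just a tail, so treat this as an outline rather than a tested procedure:

```shell
#!/bin/sh
# Sketch: start the CS4 daemons one at a time, in dependency order,
# checking syslog after each step before moving on.
SERVICES="ccsd cman fenced clvmd rgmanager"

for svc in $SERVICES; do
    echo "== starting $svc =="
    if [ -x "/etc/init.d/$svc" ]; then
        /etc/init.d/"$svc" start || {
            echo "$svc failed to start; see /var/log/messages"
            exit 1
        }
        # Look for join/quorum/fencing errors before starting the next one.
        tail -n 5 /var/log/messages
    else
        echo "   (no /etc/init.d/$svc on this host; skipping)"
    fi
done
```

Run it on each node in turn, and compare the messages each node logs while the others are joining.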

Leandro

