[Linux-cluster] clvmd hangs when third node tries to connect to cluster

s.c.graham at gmail.com
Wed Oct 31 19:26:12 UTC 2007


> > I have a cluster with three nodes (all clone HP DL380 G4s) attached to
> > a Fibre SAN (HP MSA1000) and serving a number of GFS filesystems.  My
> > OS is Ubuntu Dapper (6.06) and my kernel is 2.6.15-29-amd64-server.
> > These machines have been working nicely for a long time.
> >
> > On the weekend I upgraded with apt-get to the latest version of the Dapper
> > redhat-cluster-suite package (1.20060222-0ubuntu6.1).  Now, when the
> > cluster boots, the first two nodes to come up are able to see the GFS
> > filesystem. However, the third node to come up hangs at the point of
> > starting the clvm service.  Concomitantly, I see the following message
> > in /var/log/syslog of one of the other machines in the cluster:
> >
> > Oct 28 14:42:18 machinea kernel: [ 1681.325152] CMAN: node machinec rejoining
> > Oct 28 14:42:20 machinea kernel: [ 1683.528299] Extra connection from node 2 attempted
> >
> > It does not seem to matter which order the nodes come up in - it is
> > always the third node to boot that will hang when starting clvmd.  I
> > have included my cluster.conf file below for reference - I can include
> > any additional diagnostics as required.
> >
> > Any help would be most appreciated!
>
> That sounds like a bug that has already been fixed. I don't have the reference
> to hand as I've just returned from holiday, sorry.

Does anyone else remember this bug, or could someone please point me
to the relevant Bugzilla entry so I can try to track it down
myself?

Thanks,

Stephen



