
Re: [Linux-cluster] GFS join hang



On Thu, Apr 20, 2006 at 09:56:20AM +0200, Fernando Nino wrote:
> I am running GFS 6.1 with dlm on a cluster (4 nodes + front-end) of 
> dual-headed Opterons and RHEL4U3. Because of some problems (kernel 
> panic...) I had to hard boot some nodes of the cluster.  Now, some gfs 
> partitions won't mount.  They will simply keep waiting forever for the 
> "join" of the GFS group:
> 
> So... three questions:
> 
> - What is the join exactly doing ? Cluster status is fine, everybody is 
> member ...

From all 5 nodes it would be good to see:
- cman_tool services
- /var/log/messages
- /proc/cluster/lock_dlm/debug

> - What does the status code mean in the cman_tool output ?
> S-2,2,4

S-2: join event state is SEST_JOIN_ACKWAIT
,2: join event flag is SEFL_ALLOW_JOIN
,4: number of acks to our join request is 4

So the node is waiting for acks to its join request.  It needs 5 but has
received only 4, so one node hasn't replied for some reason.  We might be
able to figure out which node and why given all the info from the other
nodes.  Rebooting the node that hasn't replied might resolve things.
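
As a rough illustration of the decoding above, here is a minimal Python
sketch that splits a status code such as S-2,2,4 into its three fields and
reports how many acks are still outstanding.  The function name and the
lookup tables are hypothetical; they contain only the two values explained
in this message, and any other value is reported as unknown.  The flag field
is kept as text because its numeric base isn't stated here.

```python
import re

# Only the values explained in this thread are mapped (assumption: all
# other states and flags are simply reported as unknown).
KNOWN_STATES = {"2": "SEST_JOIN_ACKWAIT"}
KNOWN_FLAGS = {"2": "SEFL_ALLOW_JOIN"}


def decode_status(code, expected_acks):
    """Split 'S-<state>,<flags>,<acks>' and count the missing acks."""
    match = re.fullmatch(r"S-(\w+),(\w+),(\d+)", code.strip())
    if match is None:
        raise ValueError(f"unrecognised status code: {code!r}")
    state, flags, acks = match.group(1), match.group(2), int(match.group(3))
    return {
        "state": KNOWN_STATES.get(state, f"unknown ({state})"),
        "flags": KNOWN_FLAGS.get(flags, f"unknown ({flags})"),
        "acks": acks,
        # Never report a negative number of outstanding replies.
        "missing_acks": max(expected_acks - acks, 0),
    }


if __name__ == "__main__":
    # 5 nodes in the cluster, so 5 acks are expected.
    print(decode_status("S-2,2,4", expected_acks=5))
```

For the code in question this reports one missing ack.  It does not say
which node failed to reply; that still needs the output from the other
nodes.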

Dave

