[Linux-cluster] GFS join hang

Fernando Nino Fernando.Nino at medias.cnes.fr
Thu Apr 20 07:56:20 UTC 2006


Dear all,


 I am running GFS 6.1 with dlm on a cluster (4 nodes + front-end) of 
dual-headed Opterons under RHEL4U3. Because of some problems (kernel 
panic...) I had to hard-reboot some nodes of the cluster.  Now some GFS 
partitions won't mount: the mount simply waits forever for the "join" 
of the GFS group (see the output below).

So... three questions:

 - What exactly is the join doing? The cluster status is fine and every 
node is a member.
 - What does the status code mean in the cman_tool output?
 - What can I do to restart this cluster?

NB: Before running the test below, I rebooted the complete cluster and 
gfs_fsck'ed all nodes with everything unmounted.

---------------------------------------------------------------------------------------------------- 

root # service clvmd start

root # service gfs start
Mounting GFS filesystems:    # forever !

In another console I get:
root # dmesg | tail
...
GFS: fsid=globcover:baieGC2b.0: jid=14: Done
GFS: fsid=globcover:baieGC2b.0: jid=15: Trying to acquire journal lock...
GFS: fsid=globcover:baieGC2b.0: jid=15: Looking at journal...
GFS: fsid=globcover:baieGC2b.0: jid=15: Done
GFS: Trying to join cluster "lock_dlm", "globcover:baieGC3a"


root #  cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                          11   2 run       -
[1 5 4 3 2]

DLM Lock Space:  "clvmd"                            12   3 run       -
[1 5 4 3 2]

DLM Lock Space:  "baieGC2b"                         13   4 run       -
[1 5]

DLM Lock Space:  "baieGC3a"                         15   6 run       -
[1 5 2 4 3]

GFS Mount Group: "baieGC2b"                         14   5 run       -
[1 5]

GFS Mount Group: "baieGC3a"                          0   7 join      S-2,2,4
[]


root # cman_tool status
Protocol version: 5.0.1
Config version: 8
Cluster name: globcover
Cluster ID: 53692
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 5
Expected_votes: 5
Total_votes: 5
Quorum: 3
Active subsystems: 9
Node name: globcover-fe
Node addresses: 10.1.1.1

root # cman_tool nodes
Node  Votes Exp Sts  Name
   1    1    5   M   globcover-fe
   2    1    5   M   compute-0-3
   3    1    5   M   compute-0-2
   4    1    5   M   compute-0-1
   5    1    5   M   compute-0-0

---------------------------------------------------------------------------------------------------- 
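
To pick out the stuck group quickly, the services listing can be filtered 
for anything whose state is not "run". This is only a minimal sketch: 
stuck_groups is a hypothetical helper name, and it assumes the 
"cman_tool services" layout shown above (quoted group name followed by 
GID, LID, State and Code).

```shell
# List service groups whose State column is not "run".
# Assumption: the group name is the first double-quoted field on the line,
# and the text after it starts with GID, LID, State (as in the output above).
stuck_groups() {
    awk -F'"' 'NF >= 3 {
        split($3, f, " ")          # f[1]=GID  f[2]=LID  f[3]=State
        if (f[3] != "" && f[3] != "run") print $2, f[3]
    }'
}

# Usage on a cluster node:
#   cman_tool services | stuck_groups
```

The filter reads from standard input, so it can also be run against a 
saved copy of the listing.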




 Thanks,
-- 
------------------------------------------------------------------------
Fernando NIÑO 	CNES - BPi 2102
Medias-France/IRD 	18, Av. Edouard Belin
Tél: 05.61.27.40.74 	31401 Toulouse Cedex 9





