[Linux-cluster] GFS join hang

Fernando Nino Fernando.Nino at medias.cnes.fr
Wed Apr 19 16:26:05 UTC 2006


Dear all,


  I am running GFS 6.1 with dlm on a cluster (4 nodes + front-end) of 
dual-headed Opterons running RHEL4U3. Because of some problems (kernel 
panic...) I had to hard-reboot some nodes of the cluster.  Now some GFS 
partitions simply won't mount: on some nodes, the mount waits forever 
for the join of the GFS mount group (details below).

So three questions:

  - What exactly is the join waiting for? Cluster status is fine and 
every node is a member.
  - What does the status code in the cman_tool output mean?
  - What can I do to restart this cluster?

NB: Before testing this (below) I rebooted the complete cluster and 
gfs_fsck'ed all nodes with everything unmounted.

----------------------------------------------------------------------------------------------------
root # service clvmd start

root # service gfs start
Mounting GFS filesystems:    # hangs here forever!

In another console I get:
root # dmesg | tail
...
GFS: fsid=globcover:baieGC2b.0: jid=14: Done
GFS: fsid=globcover:baieGC2b.0: jid=15: Trying to acquire journal lock...
GFS: fsid=globcover:baieGC2b.0: jid=15: Looking at journal...
GFS: fsid=globcover:baieGC2b.0: jid=15: Done
GFS: Trying to join cluster "lock_dlm", "globcover:baieGC3a"


root #  cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                          11   2 run       -
[1 5 4 3 2]

DLM Lock Space:  "clvmd"                            12   3 run       -
[1 5 4 3 2]

DLM Lock Space:  "baieGC2b"                         13   4 run       -
[1 5]

DLM Lock Space:  "baieGC3a"                         15   6 run       -
[1 5 2 4 3]

GFS Mount Group: "baieGC2b"                         14   5 run       -
[1 5]

GFS Mount Group: "baieGC3a"                          0   7 join      S-2,2,4
[]
----------------------------------------------------------------------------------------------------
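
To pick the stuck group out of output like the above, a small filter along these lines may help. It is only a sketch: it assumes the column layout shown in this post (State second to last, Code last), and `stuck_groups` is a hypothetical helper name, not part of cman.

```shell
# Hypothetical helper: print every service group line whose State is not "run".
# Assumes the "cman_tool services" layout shown above, where each group line
# ends with the columns "... GID LID State Code".
stuck_groups() {
    awk '/^(Fence Domain|DLM Lock Space|GFS Mount Group):/ {
        state = $(NF - 1)          # second-to-last column is State
        if (state != "run") print  # keep the whole line, including Code
    }'
}
```

It reads from standard input, so it can be fed live output with `cman_tool services | stuck_groups`. On the listing above it would print only the "baieGC3a" GFS Mount Group line, the one still in the join state.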



  Thanks,
-- 
------------------------------------------------------------------------
Fernando NIÑO 	CNES - BPi 2102
Medias-France/IRD 	18, Av. Edouard Belin
Tél: 05.61.27.40.74 	31401 Toulouse Cedex 9





