
[Linux-cluster] GFS join hang



Dear all,


I am running GFS 6.1 with DLM on a cluster (4 nodes + front-end) of dual-headed Opterons running RHEL4U3. Because of some problems (kernel panics...) I had to hard-reboot some nodes of the cluster. Now some GFS partitions won't mount: the mount simply waits forever for the "join" of the GFS mount group to complete (see the output below).

So... three questions:

- What exactly is the join doing? The cluster status is fine and every node is a member...
- What does the status code in the cman_tool output mean?
- What can I do to restart this cluster?

NB: Before testing this (below), I rebooted the complete cluster and ran gfs_fsck on all nodes with everything unmounted.

----------------------------------------------------------------------------------------------------
root # service clvmd start

root # service gfs start
Mounting GFS filesystems:    # hangs forever

In another console I get:
root # dmesg | tail
...
GFS: fsid=globcover:baieGC2b.0: jid=14: Done
GFS: fsid=globcover:baieGC2b.0: jid=15: Trying to acquire journal lock...
GFS: fsid=globcover:baieGC2b.0: jid=15: Looking at journal...
GFS: fsid=globcover:baieGC2b.0: jid=15: Done
GFS: Trying to join cluster "lock_dlm", "globcover:baieGC3a"


root #  cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                          11   2 run       -
[1 5 4 3 2]

DLM Lock Space:  "clvmd"                            12   3 run       -
[1 5 4 3 2]

DLM Lock Space:  "baieGC2b"                         13   4 run       -
[1 5]

DLM Lock Space:  "baieGC3a"                         15   6 run       -
[1 5 2 4 3]

GFS Mount Group: "baieGC2b"                         14   5 run       -
[1 5]

GFS Mount Group: "baieGC3a"                          0   7 join      S-2,2,4
[]


root # cman_tool status
Protocol version: 5.0.1
Config version: 8
Cluster name: globcover
Cluster ID: 53692
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 5
Expected_votes: 5
Total_votes: 5
Quorum: 3
Active subsystems: 9
Node name: globcover-fe
Node addresses: 10.1.1.1

root # cman_tool nodes
Node  Votes Exp Sts  Name
  1    1    5   M   globcover-fe
  2    1    5   M   compute-0-3
  3    1    5   M   compute-0-2
  4    1    5   M   compute-0-1
  5    1    5   M   compute-0-0

----------------------------------------------------------------------------------------------------
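To pick out the stuck group without reading the whole listing, the `cman_tool services` output can be filtered for any group whose state is not `run`. This is only a minimal sketch: it assumes the column layout shown above (service type, quoted name, GID, LID, state, code), and the function name `find_stuck_services` is hypothetical, not part of any cluster tool.

```python
import re
import sys

# Matches a service line of the form shown above, for example:
#   GFS Mount Group: "baieGC3a"   0   7 join      S-2,2,4
# Assumption: the layout is exactly as in the listing in this post.
SERVICE_LINE = re.compile(
    r'^(?P<service>[^:]+):\s+"(?P<name>[^"]+)"\s+'
    r'(?P<gid>\d+)\s+(?P<lid>\d+)\s+(?P<state>\S+)\s+(?P<code>\S+)\s*$'
)


def find_stuck_services(output):
    """Return (service, name, state, code) for every group not in 'run' state."""
    stuck = []
    for line in output.splitlines():
        match = SERVICE_LINE.match(line.strip())
        # The header row and the "[1 5 4 3 2]" member rows do not match
        # the pattern, so they are skipped.
        if match and match.group("state") != "run":
            stuck.append((
                match.group("service"),
                match.group("name"),
                match.group("state"),
                match.group("code"),
            ))
    return stuck


if __name__ == "__main__":
    # Reads the listing from standard input.
    for service, name, state, code in find_stuck_services(sys.stdin.read()):
        print(f"{service} {name}: state={state} code={code}")
```

Saved under a hypothetical name such as find_stuck.py, it could be run as `cman_tool services | python find_stuck.py`; on the listing above it would report only the "baieGC3a" GFS mount group.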



Thanks,
--
------------------------------------------------------------------------
Fernando NIÑO 	CNES - BPi 2102
Medias-France/IRD 	18, Av. Edouard Belin
Tél: 05.61.27.40.74 	31401 Toulouse Cedex 9



