[Linux-cluster] clvmd hangs

David Teigland teigland at redhat.com
Fri May 4 13:38:54 UTC 2007


On Thu, May 03, 2007 at 11:27:08AM +0200, Sebastian Walter wrote:
> Does anybody have a solution for this? Is there any documentation about 
> the Code messages?
> 
> 
> Sebastian Walter wrote:
> >Thanks for your help. These are /proc/cluster/services:
> >
> >###master
> >Service          Name                              GID LID State     Code
> >Fence Domain:    "default"                           6   2 run       -
> >[3 2 1]
> >
> >DLM Lock Space:  "clvmd"                             5   3 join      S-6,20,3
> >[3 2 1]
> >
> >### node1:
> >Service          Name                              GID LID State     Code
> >Fence Domain:    "default"                           6   2 run       -
> >[3 2 1]
> >
> >DLM Lock Space:  "clvmd"                             5   3 update    U-4,1,1
> >[2 3 1]
> >
> >### node2:
> >Service          Name                              GID LID State     Code
> >Fence Domain:    "default"                           6   3 run       -
> >[3 2 1]
> >
> >DLM Lock Space:  "clvmd"                             5   4 update    U-4,1,1
> >[2 3 1]

This says that the dlm is stuck in recovery on all the nodes.
Which version of the code are you using?
Has this happened more than once?
Does the cluster have quorum? (cman_tool status)
What does /proc/cluster/dlm_debug show from all nodes?
What are the dlm threads waiting on? (ps ax -o pid,stat,wchan,cmd | grep dlm)
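
If it helps when comparing the services output across nodes, here is a rough Python sketch that parses text in the /proc/cluster/services layout quoted above and lists every service whose State is not "run". It is only an illustration: the names parse_services and not_running are made up for this example, the layout is inferred purely from the quoted output, and treating "run" as the only healthy state is an assumption. It also copes with the Code value wrapping onto its own line, as a mail client may do.

```python
import re

# One service row: `Type:  "name"  GID LID State [Code]`.
# The layout is inferred from the output quoted in this thread (assumption).
SERVICE_RE = re.compile(
    r'^(?P<service>[^":]+):\s+"(?P<name>[^"]+)"\s+'
    r'(?P<gid>\d+)\s+(?P<lid>\d+)\s+(?P<state>\S+)\s*(?P<code>\S*)\s*$'
)


def parse_services(text):
    """Return one dict per service found in services-style text."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Service"):
            continue  # skip blank lines and the column header
        match = SERVICE_RE.match(line)
        if match:
            current = match.groupdict()
            current["members"] = []
            entries.append(current)
        elif current is not None and line.startswith("["):
            # Member list such as "[3 2 1]"
            current["members"] = [int(n) for n in line.strip("[]").split()]
        elif current is not None and not current["code"]:
            # Code value that wrapped onto its own line
            current["code"] = line
    return entries


def not_running(entries):
    """Services in any state other than "run" (assumed to be the normal state)."""
    return [e for e in entries if e["state"] != "run"]


# Sample copied from the "master" node output quoted above.
SAMPLE = '''\
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           6   2 run       -
[3 2 1]

DLM Lock Space:  "clvmd"                             5   3 join
S-6,20,3
[3 2 1]
'''

if __name__ == "__main__":
    for e in not_running(parse_services(SAMPLE)):
        print(f'{e["service"]} "{e["name"]}" state={e["state"]} code={e["code"]}')
```

Running it on each node's output and comparing the results shows quickly which lockspaces are stuck and whether the nodes agree on the member list. It does not interpret the Code values.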

Dave



