[Linux-cluster] 2 node rm hang more info - dlm hang?

Sat Dec 11 02:20:36 UTC 2004

On Fri, Dec 10, 2004 at 05:07:42PM -0800, Daniel McNeil wrote:
> cl032.ld.decipher Glock (rgrp[3], 17)
>   gl_flags = lock[1] dirty[5]
>   gl_count = 6
>   gl_state = exclusive[1]
>   lvb_count = 1
>   object = yes
>   aspace = 5
>   reclaim = no
>   Request
>     owner = none[-1]
>     gh_state = unlocked[0]
>     gh_flags = try[0]
>     error = 0
>     gh_iflags = demote[2] alloced[4] dealloc[5]
>   Waiter2
>     owner = none[-1]
>     gh_state = unlocked[0]
>     gh_flags = try[0]
>     error = 0
>     gh_iflags = demote[2] alloced[4] dealloc[5]
>   Waiter3
>     owner = 23528
>     gh_state = exclusive[1]
>     gh_flags = local_excl[5]
>     error = 0
>     gh_iflags = promote[1]

Given that the dirty bit is still set on the glock, this is probably
stuck between the time that GFS marks the request as being in progress
and the time it calls down into the lock module.  I've looked at the
code and I don't see anything obviously wrong.  But...

You wouldn't be able to install KDB or something that will let you
get backtraces of the processes in the runnable state, would you?
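
In case it helps when sifting through a larger lock dump, here is a rough
Python sketch that parses text in the format quoted above and picks out
glocks that look like this one. It is not part of GFS. The function names
(parse_glocks, flag_names, suspects) are made up for illustration, and the
"lock and dirty both set, with at least one Waiter queued" test is only an
assumed heuristic based on the dump above, not a definitive diagnosis.

```python
import re

# Header line such as "cl032.ld.decipher Glock (rgrp[3], 17)".
GLOCK_RE = re.compile(r"Glock \((\w+)\[\d+\], (\w+)\)")


def parse_glocks(text):
    """Parse a glock dump into a list of dicts (hypothetical helper)."""
    glocks = []
    section = None
    for raw in text.splitlines():
        # Drop mail quoting ("> ") and indentation.
        line = raw.lstrip("> ").strip()
        if not line:
            continue
        m = GLOCK_RE.search(line)
        if m:
            glocks.append({"type": m.group(1), "number": m.group(2),
                           "fields": {}, "sections": []})
            section = None
            continue
        if not glocks:
            continue  # text before the first glock header
        if " = " in line:
            key, value = line.split(" = ", 1)
            target = section["fields"] if section else glocks[-1]["fields"]
            target[key] = value
        else:
            # A bare word such as "Request" or "Waiter3" opens a sub-block.
            section = {"name": line, "fields": {}}
            glocks[-1]["sections"].append(section)
    return glocks


def flag_names(value):
    """Turn 'lock[1] dirty[5]' into ['lock', 'dirty']."""
    return re.findall(r"(\w+)\[-?\d+\]", value)


def suspects(glocks):
    """Assumed heuristic: lock and dirty set while waiters are queued."""
    found = []
    for g in glocks:
        flags = flag_names(g["fields"].get("gl_flags", ""))
        waiting = any(s["name"].startswith("Waiter") for s in g["sections"])
        if "lock" in flags and "dirty" in flags and waiting:
            found.append(g)
    return found


if __name__ == "__main__":
    import sys
    for g in suspects(parse_glocks(sys.stdin.read())):
        print(g["type"], g["number"], g["fields"].get("gl_state"))
```

Saved as a script (any file name will do), it reads the dump on stdin and
prints one line per glock that matches the heuristic.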

-- 
Ken Preslan <kpreslan at redhat.com>