[Linux-cluster] GFS2/DLM deadlock

Bob Peterson rpeterso at redhat.com
Sat Sep 8 13:38:42 UTC 2012


----- Original Message -----
| A question on the inode numbers in the hangalyzer output.
| 
| In the glock dump for node2 you have these lines:
| G:  s:SH n:2/81523 f:dq t:SH d:UN/0 l:0 a:0 r:4 m:100
|     I: n:126/529699 t:4 f:0x10 d:0x00000001 s:3864/3864
| 
| From docs I've read I understand that the glock field 'n:2/81523'
| tells me that 81523 is the inode number in hex (if the type is 2 or
| 5).
| What do the fields in the inode line following the glock mean (at
| least the n: field)?

The value after n: is the glock identifier. It consists of the glock
type (2 for inode, 3 for rgrp, 5 for i_open, plus several special
ones), followed by "/", followed by the glock number in hex. For
inode glocks (types 2 and 5) that number is the inode number (the
disk inode's block address).

After f: are the glock flags, one letter per flag. For example, "q"
means the glock is queued. There are many other flags, each with its
own meaning.

s: is the current glock state.
t: is the target glock state, that is, the lock state it's trying to
achieve. SH is for a shared lock, EX is exclusive, UN is unlocked, etc.
d: is the demote glock state; the lock state it needs to transition
to when the lock is demoted. In this case, demote to UNlocked.
The number after the slash is the demote time.
a: is active items count, or the number of "live" buffers to be written.
r: is the revoke count, or the number of journal items needing to be
   revoked due to delete, etc.
m: is the minimum hold time for the glock, in milliseconds.
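
As a rough illustration of the fields above, here is a minimal Python
sketch that splits a G: line into its key:value fields and decodes the
n: and d: values. The helper name parse_glock_line and the small
type-name table are made up for this example; they are not part of any
GFS2 tool. It assumes the field layout of the dump quoted above, which
will not hold for every release.

```python
# Hypothetical helper for illustration only; not part of any GFS2 tool.
# Only the glock types named in this message are listed; others exist.
GLOCK_TYPES = {2: "inode", 3: "rgrp", 5: "i_open"}


def parse_glock_line(line):
    """Split a 'G:' line from a glock dump into a dict of fields."""
    tag, _, rest = line.strip().partition(":")
    if tag != "G":
        raise ValueError("not a glock line: %r" % line)

    # Every remaining token has the form key:value.
    fields = {}
    for token in rest.split():
        key, _, value = token.partition(":")
        fields[key] = value

    # n: is <type>/<number>; the number is written in hex.
    gtype, _, number = fields["n"].partition("/")
    fields["type"] = int(gtype)
    fields["type_name"] = GLOCK_TYPES.get(int(gtype), "other")
    fields["number"] = int(number, 16)

    # d: is <demote state>/<demote time>.
    dstate, _, dtime = fields["d"].partition("/")
    fields["demote_state"] = dstate
    fields["demote_time"] = int(dtime)
    return fields


g = parse_glock_line("G:  s:SH n:2/81523 f:dq t:SH d:UN/0 l:0 a:0 r:4 m:100")
print(g["type_name"], g["number"], g["t"], g["demote_state"], g["m"])
```

For the line quoted above this prints "inode 529699 SH UN 100".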

On the next line, I: indicates this glock is for an inode.
n:126 is a formal inode number (can be ignored). The number after the
slash, 529699, is the inode disk address in decimal.
t: is the mode, f: are the inode flags, d: are the disk flags, and
s: is the inode's size in decimal. Before the slash is the size stored 
in one of our internal structures. After the slash is the size
according to the vfs inode. In almost all cases they should be the same.
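
One quick sanity check when reading a dump is that the decimal disk
address on the I: line equals the hex number in the glock's n: field.
The Python sketch below shows that check for the two lines quoted
above. The function name parse_inode_line and the dictionary keys are
invented for this example, and the code assumes this release's I: line
layout, including the two-part s: field.

```python
# Hypothetical helper for illustration only; assumes the I: line layout
# shown in this message (n:, t:, f:, d:, and a two-part s: field).
def parse_inode_line(line):
    """Decode an 'I:' line from a glock dump into a dict."""
    tokens = line.split()
    if not tokens or tokens[0] != "I:":
        raise ValueError("not an inode line: %r" % line)

    fields = dict(tok.split(":", 1) for tok in tokens[1:])
    formal, addr = fields["n"].split("/")
    size_internal, size_vfs = fields["s"].split("/")
    return {
        "formal_ino": int(formal),          # formal inode number
        "disk_addr": int(addr),             # disk address, decimal
        "mode": fields["t"],
        "inode_flags": fields["f"],
        "disk_flags": fields["d"],
        "size_internal": int(size_internal),
        "size_vfs": int(size_vfs),
    }


ino = parse_inode_line("    I: n:126/529699 t:4 f:0x10 d:0x00000001 s:3864/3864")

# The glock number from 'n:2/81523' is hex; the I: line address is decimal.
glock_number = int("81523", 16)
print(ino["disk_addr"] == glock_number)            # True: 0x81523 == 529699
print(ino["size_internal"] == ino["size_vfs"])     # True: 3864 == 3864
```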

Note that the format of these fields, the flags, and everything
differs from release to release. For example, newer versions of GFS2
don't have two different numbers for inode size.

| We can't move the production clusters to 6.3 because other product
| integration issues prevent that.
| Would I need more than the updated kernel in 6.3 to get the extra
| tracing? Perhaps we could compile the updated 6.3 kernel for the 5.8
| release? There have been a lot of kernel build changes so I don't
| know if that is even possible at this point.

I don't think it's possible. There are too many interdependencies.

| Thank you for the input. We want to be able to gather enough info to
| submit a bug report, if it turns out to be that, so the suggestions
| on what else to capture are very valuable. FYI, we only have self
| support licenses from RedHat at this point which is why we have not
| engaged RedHat support directly yet, but we are highly motivated to
| find the problem.
| 
| Jason

Regards,

Bob Peterson
Red Hat File Systems



