
Re: [Linux-cluster] Question about gfs2_tools lockdump



----- "Scooter Morris" <scooter cgl ucsf edu> wrote:
| Hi all,
|      I've got a 5 node RHEL 5.5 cluster with a number of gfs2
| filesystems.  After a lot of effort (and help from Red Hat) we've
| gotten to the stage where the cluster is quite stable, but now we're
| starting to see some performance degradation.  In investigating this,
| I've been poking around and I'm seeing some things that I can't
| explain.  In particular, on a quiet filesystem (no processes according
| to lsof on all nodes), a gfs2_tool lockdump gives thousands of lock
| entries (G: lines).  Of those, several have R: entries (resource
| group?) and several have H: entries.  The H: entries are particularly
| strange because all H: entries are of the form:
|      H: s:EX f:H e:0 p:8953 [(ended)] ...
|
| My understanding is that this indicates a lock holder with an
| exclusive lock, but the process has ended (?).  Why aren't these locks
| going away?  Shouldn't they be cleared after the process ends
| (particularly since some of them are exclusive locks...)?  Any help in
| understanding these entries would be very helpful.
| 
| -- scooter

Hi Scooter,

There are lots of different types of glocks, and the type is the
number given before the slash in the glock name.  Type 2 is inode,
so 2/9009 is for a disk inode located at block 0x9009 (the number
after the slash is in hex).  Type 3 is for resource groups, so
3/170003 is for the resource group starting at block 0x170003.
Type 5 is for i_open glocks, which also correspond mostly to files.
So if you open a file and write some data, you can get both an inode
glock for 2/9009 and a corresponding i_open glock for 5/9009.
The inode glocks will also have a corresponding "I:" entry.
The resource group glocks may have an R: entry as well.
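
As a rough illustration of the naming scheme above, here is a small
Python sketch that splits a glock name such as 2/9009 into its type and
block number.  The function name describe_glock and the GLOCK_TYPES
table are made up for this example; the table covers only the three
types mentioned here, and anything else is reported as "other".

```python
# Only the glock types described in this message; other types exist
# but are deliberately left out of this illustrative table.
GLOCK_TYPES = {
    2: "inode",
    3: "resource group",
    5: "i_open",
}


def describe_glock(name):
    """Split a glock name like '2/9009' into (type, block, kind).

    The part before the slash is the glock type (decimal); the part
    after the slash is the block number, written in hex.
    """
    type_str, block_str = name.strip().split("/", 1)
    gl_type = int(type_str)
    block = int(block_str, 16)
    kind = GLOCK_TYPES.get(gl_type, "other (type %d)" % gl_type)
    return gl_type, block, kind


if __name__ == "__main__":
    for name in ("2/9009", "3/170003", "5/9009"):
        gl_type, block, kind = describe_glock(name)
        print("%-10s %-15s block 0x%x" % (name, kind, block))
```

An inode glock and an i_open glock for the same file share the block
number, so grouping names by the second field is one way to pair them.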

Each "H:" corresponds to a process that is holding or trying to hold
that particular glock.  A holder may persist even after a process
has ended.  For example, if I'm the first process to write to a
gfs2 file system, I could cause all the resource groups to be read in,
but the resource groups and their corresponding glocks will stay
in memory.

A holder record is said to be holding the glock if it has the
f:H flag.  It's waiting for the lock if it has the f:W flag.
If it says "s:SH", that's a shared hold.  If it says "s:EX"
that's an exclusive hold on the glock, etc.  So for example,
"s:EX f:W" corresponds to someone waiting for an exclusive lock
for that glock.
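
To make the flag reading above concrete, here is a hedged Python sketch
that pulls the fields out of a holder line of the form quoted in the
original question (H: s:EX f:H e:0 p:8953 [(ended)] ...).  The
parse_holder function and its regular expression are assumptions based
only on that one sample line; real lockdump output may carry extra
flag letters or fields that this does not interpret.

```python
import re

# Pattern modeled on the sample line "H: s:EX f:H e:0 p:8953 [(ended)] ...".
# Anything after the bracketed process name is ignored.
HOLDER_RE = re.compile(
    r"^\s*H:\s+s:(?P<state>\S+)\s+f:(?P<flags>\S*)"
    r"\s+e:(?P<error>-?\d+)\s+p:(?P<pid>\d+)\s+\[(?P<comm>[^\]]*)\]"
)


def parse_holder(line):
    """Return a dict describing one H: line, or None if it does not match."""
    m = HOLDER_RE.match(line)
    if m is None:
        return None
    flags = m.group("flags")
    comm = m.group("comm")
    return {
        "state": m.group("state"),    # e.g. SH (shared) or EX (exclusive)
        "holding": "H" in flags,      # f:H means the glock is held
        "waiting": "W" in flags,      # f:W means waiting for the glock
        "pid": int(m.group("pid")),
        "comm": comm,
        "ended": comm == "(ended)",   # the recorded process no longer exists
    }


if __name__ == "__main__":
    sample = "H: s:EX f:H e:0 p:8953 [(ended)] ..."
    print(parse_holder(sample))
```

Filtering a lockdump for entries where "waiting" is true is usually more
interesting for performance questions than the held entries, since held
glocks with an ended process are expected, as described above.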

Another complication is that some versions of gfs2 sometimes
did not keep track of the process id (pid) when a glock was
transferred.  So some older versions report the pid of the
previous holder, which may have ended, and not the current holder.
That made debugging glock issues difficult, but it didn't hurt
anything.  I think that issue is fixed in RHEL 5.5 or 5.6.

It's a lot more complicated than that, but those are the basics.

I think Steve Whitehouse wrote a paper on glocks, but I don't
have the info handy.

Regards,

Bob Peterson
Red Hat File Systems

