[Linux-cluster] GFS locks recovery

David Teigland teigland at redhat.com
Thu Mar 29 19:31:15 UTC 2007


On Thu, Mar 29, 2007 at 08:52:18PM +0200, Christos Triantafillou wrote:
> Hello,
> 
> I am using a RHEL4 cluster with 2 nodes, GFS and fencing.
> 
> As a test, I started a process on node2 that got an fcntl() exclusive
> lock on a GFS file, and then started the same program on node1, where
> it waited for an exclusive lock on the same GFS file.
> Node2 was then switched off and rebooted.
> 
> What I observed was that node1 did not acquire the lock immediately
> after the switch-off but only when node2 finished rebooting.
> 
> A few questions:
> 1. when a node goes down, shouldn't all its GFS locks be (almost)
> immediately released as part of the fencing process or the GFS recovery
> on the other nodes?

Yes.  Did the remaining node have quorum when you killed the other?  If
not, then you should set two_node=1 in cluster.conf so that a single
surviving node keeps quorum.  Fencing, dlm recovery and gfs recovery won't
happen unless there's quorum; once that recovery has completed, the locks
you want should be granted (regardless of whether the other node has
rebooted or not).

> 2. during the lock wait, it was impossible to interrupt/kill the process on 
> node1.  Is it possible to interrupt a process waiting on a POSIX lock?

No.
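
Since a blocking wait can't be interrupted, one possible workaround is to
avoid blocking in the first place: request the lock with F_SETLK (which
returns instead of waiting) and retry from user space, where the process
can still take signals between attempts.  The following is only a sketch
under assumptions: try_exclusive_lock() and lock_with_retries() are
hypothetical helper names, the one-second retry interval is arbitrary, and
whether a non-blocking request returns promptly on GFS while the cluster
is recovering is something you would have to test.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: ask for an exclusive whole-file POSIX lock
 * without blocking.
 * Returns 0 if the lock was acquired, 1 if another process holds a
 * conflicting lock, -1 on any other error (errno is set). */
static int try_exclusive_lock(int fd)
{
    struct flock fl = {0};

    assert(fd >= 0);
    fl.l_type = F_WRLCK;    /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "through end of file" */

    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 0;
    if (errno == EACCES || errno == EAGAIN)
        return 1;           /* lock is held elsewhere */
    return -1;
}

/* Hypothetical helper: poll for the lock up to max_tries times,
 * sleeping one second after each attempt that found the lock busy.
 * Same return values as try_exclusive_lock(). */
static int lock_with_retries(int fd, int max_tries)
{
    int i;

    assert(max_tries > 0);
    for (i = 0; i < max_tries; i++) {
        int rc = try_exclusive_lock(fd);
        if (rc != 1)
            return rc;      /* acquired, or a real error */
        sleep(1);           /* signals can be handled here */
    }
    return 1;               /* still busy after all tries */
}
```

A caller would open the file read-write and call, say,
lock_with_retries(fd, 30); a return value of 1 means the lock was still
held elsewhere after 30 attempts.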

> 3. if the above is not possible, would it be preferable to use POSIX
> locks on an NFS file instead?
> Or would you recommend using DLM?

Either, possibly; you'd have to try it out.  GFS works much better with
flock (although that's not interruptible either), if that's an option.
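
If flock is an option, a minimal sketch of taking an exclusive lock with it
could look like the code below.  The helper names (open_lock_file,
try_flock_exclusive, release_flock) are made up for illustration, and the
use of LOCK_NB to avoid an uninterruptible wait is an assumption that
would need testing on GFS.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) the file to lock.
 * Returns a file descriptor, or -1 on error. */
static int open_lock_file(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Hypothetical helper: try for an exclusive flock() without blocking.
 * Returns 0 if acquired, 1 if the lock is held through another open
 * of the file, -1 on any other error (errno is set). */
static int try_flock_exclusive(int fd)
{
    assert(fd >= 0);
    if (flock(fd, LOCK_EX | LOCK_NB) == 0)
        return 0;
    return (errno == EWOULDBLOCK) ? 1 : -1;
}

/* Release the lock explicitly; closing the last descriptor that
 * refers to the open file also releases it. */
static int release_flock(int fd)
{
    return flock(fd, LOCK_UN);
}
```

Note that flock locks belong to the open file rather than to the process,
so two separate open() calls on the same file conflict with each other
even within one program, unlike fcntl() locks.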

Dave



