[Linux-cluster] fcntl locking lockup (dlm 1.07, GFS 6.1.5, kernel 2.6.9-67.EL)

Charlie Brady charlieb-linux-cluster at e-smith.com
Wed Jan 9 03:43:16 UTC 2008


On Tue, 8 Jan 2008, David Teigland wrote:

> On Fri, Jan 04, 2008 at 04:18:45PM -0500, Charlie Brady wrote:
> > We've reduced the application code to a simple test case. The following
> > code run on each node will soon block, and doesn't receive signals until
> > the peer node is shut down:
...
> Yes, this stresses a problematic design limitation in the RHEL4 dlm where
> the dlm master node is ping-ponging all over the place and becomes so
> unstable that everything comes to a halt.  One possible work-around is to
> modify the program to hold a lock on filedes to keep the master stable,
> e.g. hold a zero-length lock at some unused offset like 0xFFFFFF.

Thanks. I've passed the advice on.
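
For the archive, here is a minimal sketch of how I read that suggestion,
written in Python for brevity. The function names are hypothetical, and
the choice of a shared (read) lock is my assumption, made so that every
node can hold the anchor lock at the same time; David did not specify a
lock mode. It also assumes the application never locks bytes at or beyond
0xFFFFFF, since a zero-length POSIX lock extends from its start offset to
the end of the file.

```python
import fcntl
import os

# Offset suggested in the thread; assumed to lie beyond every byte range
# that the application itself locks.
ANCHOR_OFFSET = 0xFFFFFF


def hold_anchor_lock(fd):
    """Take a long-lived shared fcntl lock at ANCHOR_OFFSET.

    len=0 is the POSIX "zero-length" form: it covers everything from the
    start offset to the end of the file. The fd must be open for reading.
    """
    fcntl.lockf(fd, fcntl.LOCK_SH, 0, ANCHOR_OFFSET, os.SEEK_SET)


def release_anchor_lock(fd):
    """Drop the anchor lock, e.g. just before the program exits."""
    fcntl.lockf(fd, fcntl.LOCK_UN, 0, ANCHOR_OFFSET, os.SEEK_SET)


def with_range_locked(fd, start, length, work):
    """Run work() while holding an exclusive lock on [start, start+length).

    This stands in for the application's normal short-lived locking; the
    anchor lock stays held across every call.
    """
    fcntl.lockf(fd, fcntl.LOCK_EX, length, start, os.SEEK_SET)
    try:
        return work()
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
```

The idea would be to call hold_anchor_lock() once, right after opening the
file, and keep that lock for the life of the process, so that there is
always at least one lock outstanding on the file while the short-lived
locks come and go. Whether that is enough to keep the master stable on our
setup is something we still have to test.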

--
Charlie



