[Cluster-devel] [GFS2 patch] fix hangup when multiple processes are trying to write to the same file

Fri Feb 23 18:13:37 UTC 2007

Hi,

Now in the -nmw git tree. Thanks,

Steve.

On Fri, 2007-02-23 at 12:49 -0500, Josef Whiter wrote:
> On Fri, Feb 23, 2007 at 04:29:19PM +0000, Steven Whitehouse wrote:
> > Hi,
> > 
> > On Fri, 2007-02-23 at 09:05 -0500, Josef Whiter wrote:
> > > On Fri, Feb 23, 2007 at 08:38:32AM -0500, Josef Whiter wrote:
> > > > This fixes a problem I encountered while running bonnie++.  When you have one
> > > > thread that opens a file and starts to write to it, and then another thread that
> > > > tries to open and write to the same file, the second thread will loop forever
> > > > trying to grab the inode lock for that inode.  Basically we come in through
> > > > generic_buffered_file_write, which calls gfs2_prepare_write, which then attempts
> > > > to grab the glock.  Because we don't own the lock, gfs2_prepare_write gets
> > > > GLR_TRYFAILED, which returns AOP_TRUNCATED_PAGE to generic_buffered_file_write.
> > > > At this point generic_buffered_file_write loops around again and immediately
> > > > retries the prepare_write.  This means that the second process never gets off of
> > > > the processor in order to allow the process that holds the lock to finish its
> > > > work and let go of the lock.  This patch makes gfs2_glock_nq schedule() if it
> > > > gets back a GLR_TRYFAILED, which resolves this problem.  Please let me know if
> > > > this is in the wrong place, ie if it should go in gfs2_prepare_write instead of
> > > > gfs2_glock_nq, or if its completely wrong :).
> > > >
> > > 
> > > Ok at the suggestion of Steven I have made it so that we just yield in the
> > > places where this would be a problem, so in gfs2_prepare_write and
> > > gfs2_readpage.
> > > 
> > > Signed-off-by: Josef Whiter <jwhiter at redhat.com>
> > > 
> > Thats looks good, but one small comment:
> > 
> > > --- linux-2.6/fs/gfs2/ops_address.c.josef	2007-02-23 09:51:40.000000000 -0500
> > > +++ linux-2.6/fs/gfs2/ops_address.c	2007-02-23 09:54:27.000000000 -0500
> > > @@ -266,8 +266,10 @@ skip_lock:
> > >  out:
> > >  	return error;
> > >  out_unlock:
> > > -	if (error == GLR_TRYFAILED)
> > > +	if (error == GLR_TRYFAILED) {
> > >  		error = AOP_TRUNCATED_PAGE;
> > > +		yield();
> > > +	}
> > >  	unlock_page(page);
> > Can you move the above unlock_page to directly after the out_unlock: ?
> > That way we won't schedule while holding the page lock. The other part
> > of the patch does that in the right order already and looks good,
> > 
> > Steve.
> > 
> 
> Doh sorry about that.  Here's the correct version
> 
> 
> --- linux-2.6/fs/gfs2/ops_address.c.josef	2007-02-23 09:51:40.000000000 -0500
> +++ linux-2.6/fs/gfs2/ops_address.c	2007-02-23 13:44:53.000000000 -0500
> @@ -266,9 +266,11 @@ skip_lock:
>  out:
>  	return error;
>  out_unlock:
> -	if (error == GLR_TRYFAILED)
> -		error = AOP_TRUNCATED_PAGE;
>  	unlock_page(page);
> +	if (error == GLR_TRYFAILED) {
> +		error = AOP_TRUNCATED_PAGE;
> +		yield();
> +	}
>  	if (do_unlock)
>  		gfs2_holder_uninit(&gh);
>  	goto out;
> @@ -364,6 +366,7 @@ static int gfs2_prepare_write(struct fil
>  		if (error == GLR_TRYFAILED) {
>  			unlock_page(page);
>  			error = AOP_TRUNCATED_PAGE;
> +			yield();
>  		}
>  		goto out_uninit;
>  	}