[Cluster-devel] [PATCH] dlm: send_bast_queue() skip list loop not only sending basts to convertqueue

David Teigland teigland at redhat.com
Tue Jan 4 21:27:38 UTC 2011


On Tue, Jan 04, 2011 at 06:06:51PM -0200, cmaiolino at redhat.com wrote:
> The resource groups got corrupted without this patch:

I could see an extraneous bast leading to confusion in gfs2 about the lock
state, but gfs2 should probably be asserting somewhere before it actually
corrupts anything...

> diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> index 64e5f3e..565c519 100644
> --- a/fs/dlm/lock.c
> +++ b/fs/dlm/lock.c
> @@ -1847,7 +1847,7 @@ static void send_bast_queue(struct dlm_rsb *r, struct list_head *head,
>  
>  	list_for_each_entry(gr, head, lkb_statequeue) {
>  		/* skip self when sending basts to convertqueue */
> -		if (gr == lkb)
> +		if (head == &r->res_grantqueue && gr == lkb)
>  			continue;
>  		if (gr->lkb_bastfn && modes_require_bast(gr, lkb)) {
>  			queue_bast(r, gr, lkb->lkb_rqmode);

I haven't been able to figure out the problem or the fix; some printk's
around the case in question would be revealing.

This is the specific case where a TRY_1CB (NOQUEUEBAST) conversion fails.
Here's how I step through the code for that case:

_convert_lock(lkb)
    error = do_convert(lkb)
        when error equals -EAGAIN, lkb remains on grantqueue
    do_convert_effects(lkb, -EAGAIN)
        -EAGAIN and NOQUEUEBAST -> send_blocking_asts_all ->
        send_bast_queue(grantqueue, lkb)
           [lkb is expected to be here, skip sending bast to self]
        send_bast_queue(convertqueue, lkb):
           [lkb should not be on here, but your patch implies there
            are cases where it can be?  I think that would be a bug.]
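
To make the walk-through above concrete, here is a minimal userspace sketch
of the skip-self loop. It is only a model, not the code in fs/dlm/lock.c:
model_lkb, model_queue and model_send_bast_queue are hypothetical names, the
queues are plain arrays rather than list_heads, and the modes_require_bast()
check is left out (the model assumes every other lock with a bast function
needs a bast). The return value reports whether lkb was seen on the queue,
which is the same thing a printk around the "gr == lkb" test would show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct dlm_lkb; not the real structure. */
struct model_lkb {
	int id;
	bool has_bastfn;	/* stands in for gr->lkb_bastfn != NULL */
	int basts_received;	/* counts basts the model "queued" to this lock */
};

/* Hypothetical stand-in for one state queue (grantqueue or convertqueue). */
struct model_queue {
	const char *name;
	struct model_lkb **entries;
	size_t count;
};

/*
 * Walk one queue on behalf of lkb, skipping lkb itself.
 * Returns true if lkb was found on the queue. When expect_self is false,
 * finding lkb is reported on stderr, since that is the suspicious case.
 * Assumption: every other lock's mode conflicts, so no mode check is done.
 */
static bool model_send_bast_queue(struct model_queue *q,
				  struct model_lkb *lkb, bool expect_self)
{
	bool saw_self = false;
	size_t i;

	for (i = 0; i < q->count; i++) {
		struct model_lkb *gr = q->entries[i];

		if (gr == lkb) {
			saw_self = true;
			if (!expect_self)
				fprintf(stderr, "unexpected: lkb %d on %s\n",
					lkb->id, q->name);
			continue;	/* never send a bast to self */
		}
		if (gr->has_bastfn)
			gr->basts_received++;
	}
	return saw_self;
}
```

In the failed TRY_1CB case I would expect a call on the grantqueue to return
true and a call on the convertqueue to return false. If the convertqueue call
ever returned true for the converting lkb, that would point at the bug I
mention above rather than at the skip test.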

Dave