[Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock

Steven Whitehouse swhiteho at redhat.com
Mon Oct 8 12:56:29 UTC 2018


Hi,


On 08/10/18 13:36, Mark Syms wrote:
> During a VM stress test we encountered a system lockup and kern.log contained
>
> kernel: [21389.462707] INFO: task python:15480 blocked for more than 120 seconds.
> kernel: [21389.462749] Tainted: G O 4.4.0+10 #1
> kernel: [21389.462763] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kernel: [21389.462783] python D ffff88019628bc90 0 15480 1 0x00000000
> kernel: [21389.462790] ffff88019628bc90 ffff880198f11c00 ffff88005a509c00 ffff88019628c000
> kernel: [21389.462795] ffffc90040226000 ffff88019628bd80 fffffffffffffe58 ffff8801818da418
> kernel: [21389.462799] ffff88019628bca8 ffffffff815a1cd4 ffff8801818da5c0 ffff88019628bd68
> kernel: [21389.462803] Call Trace:
> kernel: [21389.462815] [<ffffffff815a1cd4>] schedule+0x64/0x80
> kernel: [21389.462877] [<ffffffffa0663624>] find_insert_glock+0x4a4/0x530 [gfs2]
> kernel: [21389.462891] [<ffffffffa0660c20>] ? gfs2_holder_wake+0x20/0x20 [gfs2]
> kernel: [21389.462903] [<ffffffffa06639ed>] gfs2_glock_get+0x3d/0x330 [gfs2]
> kernel: [21389.462928] [<ffffffffa066cff2>] do_flock+0xf2/0x210 [gfs2]
> kernel: [21389.462933] [<ffffffffa0671ad0>] ? gfs2_getattr+0xe0/0xf0 [gfs2]
> kernel: [21389.462938] [<ffffffff811ba2fb>] ? cp_new_stat+0x10b/0x120
> kernel: [21389.462943] [<ffffffffa066d188>] gfs2_flock+0x78/0xa0 [gfs2]
> kernel: [21389.462946] [<ffffffff812021e9>] SyS_flock+0x129/0x170
> kernel: [21389.462948] [<ffffffff815a57ee>] entry_SYSCALL_64_fastpath+0x12/0x71
>
> On examination of the code it was determined that this code path
> is only taken if the selected glock is marked as dead. The supposition,
> therefore, is that by the time schedule was called the glock had been
> cleaned up, and therefore nothing woke the schedule. Instead of calling
> schedule, call schedule_timeout(HZ) so that we at least get a chance to
> re-evaluate.
>
> On repeating the stress test, the printk message was seen once in the logs
> across four servers, but there were no further occurrences, nor were there
> any stuck-task log entries. This indicates that when the timeout occurred
> the code repeated the lookup and did not find the same glock entry; since
> we had not been woken before the timeout, we would never have been woken.
>
> Signed-off-by: Mark Syms <mark.syms at citrix.com>
> Signed-off-by: Tim Smith <tim.smith at citrix.com>
> ---
>   fs/gfs2/glock.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> index 4614ee2..0a59a01 100644
> --- a/fs/gfs2/glock.c
> +++ b/fs/gfs2/glock.c
> @@ -758,7 +758,8 @@ static struct gfs2_glock *find_insert_glock(struct lm_lockname *name,
>   	}
>   	if (gl && !lockref_get_not_dead(&gl->gl_lockref)) {
>   		rcu_read_unlock();
> -		schedule();
> +		if (schedule_timeout(HZ) == 0)
> +			printk(KERN_INFO "find_insert_glock schedule timed out\n");
>   		goto again;
>   	}
>   out:

That is a bit odd. In fact that whole function looks odd. I wonder why 
it needs to wait in the first place. It should be a simple comparison 
here. If the glock is dead then nothing else should touch it, so we are 
safe to add a new one into the hash table. The wait is almost certainly 
a bug.

Steve.



