[Cluster-devel] [PATCH 2/2] GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads

Steven Whitehouse swhiteho at redhat.com
Mon Oct 8 12:57:02 UTC 2018


Hi,


On 08/10/18 13:36, Mark Syms wrote:
> From: Tim Smith <tim.smith at citrix.com>
>
> Flushing the workqueue can cause operations to happen which might
> call gfs2_log_reserve(), or get stuck waiting for locks taken by such
> operations.  gfs2_log_reserve() can io_schedule(). If this happens, it
> will never wake because the only thing which can wake it is gfs2_logd(),
> which has already been stopped by the time the workqueue is flushed.
>
> This causes umount of a gfs2 filesystem to wedge permanently if, for
> example, the umount immediately follows a large delete operation.
>
> When this occurred, the following stack trace was obtained from the
> umount command:
>
> [<ffffffff81087968>] flush_workqueue+0x1c8/0x520
> [<ffffffffa0666e29>] gfs2_make_fs_ro+0x69/0x160 [gfs2]
> [<ffffffffa0667279>] gfs2_put_super+0xa9/0x1c0 [gfs2]
> [<ffffffff811b7edf>] generic_shutdown_super+0x6f/0x100
> [<ffffffff811b7ff7>] kill_block_super+0x27/0x70
> [<ffffffffa0656a71>] gfs2_kill_sb+0x71/0x80 [gfs2]
> [<ffffffff811b792b>] deactivate_locked_super+0x3b/0x70
> [<ffffffff811b79b9>] deactivate_super+0x59/0x60
> [<ffffffff811d2998>] cleanup_mnt+0x58/0x80
> [<ffffffff811d2a12>] __cleanup_mnt+0x12/0x20
> [<ffffffff8108c87d>] task_work_run+0x7d/0xa0
> [<ffffffff8106d7d9>] exit_to_usermode_loop+0x73/0x98
> [<ffffffff81003961>] syscall_return_slowpath+0x41/0x50
> [<ffffffff815a594c>] int_ret_from_sys_call+0x25/0x8f
> [<ffffffffffffffff>] 0xffffffffffffffff
Good spotting! Definitely need to fix this :-)

Steve.

> Signed-off-by: Tim Smith <tim.smith at citrix.com>
> Signed-off-by: Mark Syms <mark.syms at citrix.com>
> ---
>   fs/gfs2/super.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index c212893..a971862 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -854,10 +854,10 @@ static int gfs2_make_fs_ro(struct gfs2_sbd *sdp)
>   	if (error && !test_bit(SDF_SHUTDOWN, &sdp->sd_flags))
>   		return error;
>   
> +	flush_workqueue(gfs2_delete_workqueue);
>   	kthread_stop(sdp->sd_quotad_process);
>   	kthread_stop(sdp->sd_logd_process);
>   
> -	flush_workqueue(gfs2_delete_workqueue);
>   	gfs2_quota_sync(sdp->sd_vfs, 0);
>   	gfs2_statfs_sync(sdp->sd_vfs, 0);
>   
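
To illustrate why the ordering in the diff above matters, here is a minimal
user-space sketch using pthreads. It is only an analogy, not GFS2 code: the
names logd(), delete_work(), log_space and shutdown_in_patched_order() are
hypothetical stand-ins for gfs2_logd(), a queued delete work item, the log
space that gfs2_log_reserve() waits for, and the patched part of
gfs2_make_fs_ro(). It assumes a single work item that needs one unit of log
space.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these exist in the kernel. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t space_wanted = PTHREAD_COND_INITIALIZER;
static pthread_cond_t space_freed = PTHREAD_COND_INITIALIZER;
static int log_space;    /* units of free "log space" */
static bool want_space;  /* a worker is waiting for space */
static bool logd_stop;   /* request for the logd thread to exit */

/* Plays the role of gfs2_logd(): the only thread that frees log space. */
static void *logd(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	while (!logd_stop) {
		if (want_space) {
			log_space++;
			want_space = false;
			pthread_cond_broadcast(&space_freed);
		} else {
			pthread_cond_wait(&space_wanted, &lock);
		}
	}
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Plays the role of a queued delete work item that reserves log space. */
static void *delete_work(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	want_space = true;
	pthread_cond_signal(&space_wanted);
	while (log_space == 0)
		pthread_cond_wait(&space_freed, &lock); /* only logd wakes us */
	log_space--;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Drain the work first, then stop logd. Returns the leftover log space. */
int shutdown_in_patched_order(void)
{
	pthread_t logd_thread, work_thread;
	int err;

	/* Reset state; no other threads are running at this point. */
	log_space = 0;
	want_space = false;
	logd_stop = false;

	err = pthread_create(&logd_thread, NULL, logd, NULL);
	assert(err == 0);
	err = pthread_create(&work_thread, NULL, delete_work, NULL);
	assert(err == 0);

	/* Step 1, like flush_workqueue(): wait while logd is still alive. */
	pthread_join(work_thread, NULL);

	/* Step 2, like kthread_stop(): only now ask logd to exit. */
	pthread_mutex_lock(&lock);
	logd_stop = true;
	pthread_cond_signal(&space_wanted);
	pthread_mutex_unlock(&lock);
	pthread_join(logd_thread, NULL);

	return log_space;
}
```

In this sketch, doing step 2 before step 1 can leave delete_work() asleep in
pthread_cond_wait() with nobody left to signal it, so the join in step 1 never
returns. That is the same shape as the umount hang in the trace above.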
