[dm-devel] [PATCH]dm-mpath: fix for race condition between multipath_dtr and pg_init_done.

Mike Snitzer snitzer at redhat.com
Thu Oct 17 21:10:16 UTC 2013


On Thu, Oct 17 2013 at  5:47pm -0400,
Hannes Reinecke <hare at suse.de> wrote:

> On 10/17/2013 08:53 PM, Mike Snitzer wrote:
> >Thanks for reporting this.  Much appreciated.  More comments below.
> >
> >On Thu, Oct 17 2013 at  1:31pm -0400,
> >Merla, ShivaKrishna <ShivaKrishna.Merla at netapp.com> wrote:
> >
> >>Whenever multipath_dtr is happening, we should prevent queueing any further path
> >>activation work. There was a kernel panic where after pg_init_done() decrements
> >>pg_init_in_progress to 0, wait_for_pg_init_completion call assumes there are no
> >>more pending path management commands. But if pg_init_required is set by
> >>pg_init_done call due to retriable mode_select errors , then process_queued_ios()
> >>will again queue the path activation work. If free_multipath call has been
> >>completed by the time activate_path work is called, kernel panic was seen on
> >>accessing multipath members.
> >
> >Your locking looks suspect to me, see comment inlined below multipath_dtr
> >
> >But shouldn't we just train multipath_wait_for_pg_init_completion() to
> >look at m->pg_init_required?  Have it wait for both pg_init_required and
> >pg_init_in_progress to be zero?  We'd also have to audit that
> >pg_init_required cannot be set while pg_init_in_progress.
> >
> Hmm.
> 
> We _could_ try to resolve it by pushing I/O back onto the request queue
> (cf my earlier post 'requeue I/O during pg_init').
> 
> I was hoping to excite some comments with that, but seems to be my
> fate nowadays to send out patches with no reply.

patchwork caught it:
https://patchwork.kernel.org/patch/2969111/

I've just been distracted with other stuff the past week; but I'll be
looking closer at this issue (and your earlier patch) shortly and we'll
get a fix queued for 3.13.
 
> Anyway, maybe this will be giving it some more attention.
> It definitely would avoid this problem, by virtue of not having to
> queue I/O internally during pg_init, so we could easily tear down
> the queue.

Sounds good.




More information about the dm-devel mailing list