[dm-devel] RE: panic in dm_pg_init_complete+0x10

goggin, edward egoggin at emc.com
Wed Mar 16 01:04:34 UTC 2005


On Thursday, Mar 010, 2005 at 17:29 UTC, Kergon, Alasdair wrote:
> > Looks to me like two issues here -- (1) probably should not allow
> > concurrent pg_init i/os for the same multipath and 
> 
> Some statements that the code should be checked against.
> 
> Only one pg_init can happen at once.
>   - enforced by pg_init_required only getting set to 1 in one place,
>     then set to 0 when the init is issued.

Yes.  I misread this code.  I agree it is solid.

>   - BUG? it looks possible for __switch_pg to get called 
>     while there's still a pg_init outstanding.
>

Yes, during recent testing I trapped a __switch_pg() while a previously
initiated pg_init request was still outstanding.  Not clear how to fix
this scenario -- (1) fail subsequent pg_inits if one is outstanding,
(2) serialize the pg_inits in __switch_pg(), or (3) allow multiple
pg_inits to be outstanding.  The patch I sent before was an attempt
at (3).  Overall, I think option (2) is best, but this would likely
require blocking and releasing the multipath spin lock in __switch_pg()
and having to revalidate the multipath state already acquired in map_io().
The included patch does __NOT__ include a fix for this problem.
 
> Suspend cannot complete until there is no I/O outstanding, including
> any pg_init I/O.
>   - queueing is disabled while suspended
>   - the last I/O through cannot complete with an error until the last 
>     path has failed, which can't happen until the pg_init for its PG
>     has completed.
>

I disagree with your last assertion.  I think there are two other
use cases besides the __switch_pg() case described above whereby
a pg_init i/o can be left outstanding after its multipath structure's
memory is freed.  I have definitely caught the first case below with
traps.  I forget whether I've actually seen the 2nd case though :((

First, the most common case -- it is possible for an uncompleted PG
switch to be pending with no i/os queued to the multipath
when a dev_suspend initiated i/o suspension followed by table swap
occurs.  process_queued_ios() when called by multipath_presuspend(),
will initiate a pg_init even though there are no queued user i/os.
Because there are no i/os queued, the suspend will not block
waiting for the pg_init to complete.  If the dm_table_put() call
on the old table frees the multipath structure memory before the
pg_init i/o completes (likely case), the problem occurs.  This case
can be prevented simply by having process_queued_ios() not initiate
a pg_init i/o if there are no queued i/os pended on the multipath
structure.  The included patch includes this fix.

Second, a somewhat less common case -- a pg_init i/o is outstanding when
a dev_suspend() call is made on a multipath for which the kernel thinks
it currently has no valid paths.  I think the sequence of events for the
use case is as enumerated below.   I think this case can be prevented
by not calling dispatch_queued_ios() if there is a oustanding pg_init.
This state is detected by the conditional logic 

	(m->queue_io && !m->pg_init_required)

In this case, any queued i/os will be dispatched only after the
outstanding pg_init i/o completes via the call to process_queued_io()
from dm_pg_init_complete().  dm_pg_init_complete() must also be
changed to reset the queue_io flag if the suspended flag is set
even in the case of an error on the pg_init i/o, otherwise the
i/os would never be dispatched in the "all-paths-down" use case.
The included patch includes this fix.

(1)	pg_init i/o is sent and causes queuing of one or more user i/os
(2)	all paths are detected as failed in kernel
(3)	multipath reloads new table causing via dev_suspend() ioctl causing
	i/o suspend followed by table swap
	(a)	pre-suspend causes even the last i/o to fail on
		the last path -- even before the pg_init for its
		PG has completed

> multipath_dtr can only be called when there is no I/O in-flight
>   - dm_swap_table checks the device is suspended before 
> calling __unbind
>   - BUG? dm_put only calls pre/post suspend but fails to wait 
> for I/O to clear

----------------------------------PATCH -------------------------------
*** drivers/md/dm-mpath.c.orig	2005-03-15 06:19:06.000000000 -0500
--- drivers/md/dm-mpath.c.patched	2005-03-15 14:34:36.000000000 -0500
***************
*** 374,385 ****
  	}
  }
  
  static void process_queued_ios(void *data)
  {
  	struct multipath *m = (struct multipath *) data;
  	struct hw_handler *hwh = &m->hw_handler;
  	struct pgpath *pgpath;
! 	unsigned init_required, must_queue = 0;
  	unsigned long flags;
  
  	spin_lock_irqsave(&m->lock, flags);
--- 374,387 ----
  	}
  }
  
+ #define	PG_INIT_OUTSTANDING(qio, init_required)	(qio &&
!init_required)
+ 
  static void process_queued_ios(void *data)
  {
  	struct multipath *m = (struct multipath *) data;
  	struct hw_handler *hwh = &m->hw_handler;
  	struct pgpath *pgpath;
! 	unsigned init_required, must_queue = 0, qio;
  	unsigned long flags;
  
  	spin_lock_irqsave(&m->lock, flags);
***************
*** 389,394 ****
--- 391,398 ----
  
  	pgpath = m->current_pgpath;
  
+ 	qio = m->queue_io;
+ 
  	if ((pgpath && m->queue_io) ||
  	    (!pgpath && m->queue_if_no_path && !m->suspended))
  		must_queue = 1;
***************
*** 397,408 ****
  	if (init_required)
  		m->pg_init_required = 0;
  
! 	spin_unlock_irqrestore(&m->lock, flags);
! 
! 	if (init_required)
! 		hwh->type->pg_init(hwh, pgpath->pg->bypassed,
&pgpath->path);
  
! 	if (!must_queue)
  		dispatch_queued_ios(m);
  }
  
--- 401,425 ----
  	if (init_required)
  		m->pg_init_required = 0;
  
! 	if (init_required) {
! 		// only send pg_init if there are queued i/os, otherwise the
! 		// pg_init i/o is not accounted for during i/o suspension
! 		if (m->queue_size == 0) {
! 			spin_unlock_irqrestore(&m->lock, flags);
! 			hwh->type->pg_init(hwh, pgpath->pg->bypassed,
! 					   &pgpath->path);
! 			init_required = 0;
! 		}
! 		else {
! 			m->pg_init_required = 1;
! 			spin_unlock_irqrestore(&m->lock, flags);
! 		}
! 	}
! 	else
! 		spin_unlock_irqrestore(&m->lock, flags);
  
! 	// must not dispatch i/os if there is a pg_init outstanding
! 	if (!must_queue && !PG_INIT_OUTSTANDING(qio, init_required))
  		dispatch_queued_ios(m);
  }
  
***************
*** 751,756 ****
--- 768,775 ----
  static void multipath_dtr(struct dm_target *ti)
  {
  	struct multipath *m = (struct multipath *) ti->private;
+ 
+ 	flush_scheduled_work();
  	free_multipath(m);
  }
  
***************
*** 965,970 ****
--- 984,992 ----
  	if (!err_flags)
  		m->queue_io = 0;
  	else {
+ 		// reset queue_io in order to dispatch queued i/os if
suspended
+ 		if (m->suspended)
+ 			m->queue_io = 0;
  		m->current_pgpath = NULL;
  		m->current_pg = NULL;
  	}
***************
*** 986,992 ****
  
  	spin_lock(&m->lock);
  	if (!m->nr_valid_paths) {
! 		if (!m->queue_if_no_path) {
  			spin_unlock(&m->lock);
  			return -EIO;
  		} else {
--- 1008,1014 ----
  
  	spin_lock(&m->lock);
  	if (!m->nr_valid_paths) {
! 		if (!m->queue_if_no_path || m->suspended) {
  			spin_unlock(&m->lock);
  			return -EIO;
  		} else {




More information about the dm-devel mailing list