[dm-devel] Re: [PATCH] dm mpath: Try recover from I/O failure by re-initializing the PG if device is running on one path

Grant Grundler grundler at google.com
Wed Apr 22 17:41:44 UTC 2009


On Mon, Apr 20, 2009 at 11:05 AM, Moger, Babu <Babu.Moger at lsi.com> wrote:
> This patch introduces the mechanism to recover from I/O failures by re-initializing the path if the device is running on only one path.
>
> Problem: Device mapper fails the path for every I/O error.
> It does not care about the type of error.

This is the fundamental problem.  Different layers of the block IO
path have to agree on how to handle each possible type of error that
can be returned. I don't know where to find such an agreement and
think an implementation that does discriminate is needed.

> Certain errors can be recovered from by re-initializing the path. I have seen this problem while testing the rdac device handler: I/O errors occur when LUN ownership changes. When that happens, the device returns a check condition with sense 0x05/0x94/0x01 (SK/ASC/ASCQ, meaning LUN ownership changed). Currently, device mapper fails the path for this error, which eventually leads to an I/O error. We don't want to see an I/O error for this reason.

1) This patch isn't discriminating between transport, media, or other
device errors. Wouldn't it make sense to discriminate?
"LUN ownership changed" sounds like one of the events that a
multi-initiator environment would want to be notified about and
perhaps even take some action on (renegotiate access to

2) Will this result in resetting a SATA device?
I ask because a device reset may result in data loss when the write
cache is enabled (WCE). I just don't know the higher parts of the block
SW stack and how errors flow up the stack.
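
To make point 1 a bit more concrete, here is a rough user-space sketch of
what "discriminating" could look like. Every name in it (io_err_action,
sense_triple, classify_io_error) is made up for illustration and is not
dm-mpath or SCSI midlayer API. The only value taken from the patch
description is the 0x05/0x94/0x01 triple. The policy for medium errors and
for errors without sense data is my assumption, not an agreed convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical action codes; not part of any kernel interface. */
enum io_err_action {
	ERR_FAIL_PATH,	/* treat the error as a path failure */
	ERR_REINIT_PG,	/* re-initialize the path group, then requeue */
	ERR_FAIL_IO	/* report the error upward; the path is not at fault */
};

/* Hypothetical container for the SK/ASC/ASCQ values of a check condition. */
struct sense_triple {
	uint8_t sk;	/* sense key */
	uint8_t asc;	/* additional sense code */
	uint8_t ascq;	/* additional sense code qualifier */
};

static bool is_lun_ownership_change(const struct sense_triple *s)
{
	/* 0x05/0x94/0x01 is the triple quoted in the patch description. */
	return s->sk == 0x05 && s->asc == 0x94 && s->ascq == 0x01;
}

static enum io_err_action classify_io_error(const struct sense_triple *s,
					    bool have_sense)
{
	/* Assumption: no sense data means the transport is suspect. */
	if (!have_sense)
		return ERR_FAIL_PATH;

	if (is_lun_ownership_change(s))
		return ERR_REINIT_PG;

	/* Assumption: MEDIUM ERROR (sense key 0x03) is not a path problem. */
	if (s->sk == 0x03)
		return ERR_FAIL_IO;

	/* Everything else keeps today's behaviour: fail the path. */
	return ERR_FAIL_PATH;
}
```

With something along these lines, the re-initialization would be keyed off
the error itself instead of off the number of valid paths, and a media
error on the last path would not trigger a PG re-init.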

thanks,
grant

>
> The patch sets the flag pg_init_required if the device is running on a single path. process_queued_ios will then re-initialize the path if required. I have tested this patch on the LSI rdac handler.
>
> Signed-off-by: Babu Moger <babu.moger at lsi.com>
> ---
>
> --- linux-2.6.30-rc2/drivers/md/dm-mpath.c.orig 2009-04-17 16:49:33.000000000 -0500
> +++ linux-2.6.30-rc2/drivers/md/dm-mpath.c      2009-04-17 17:09:51.000000000 -0500
> @@ -1152,6 +1152,15 @@ static int do_end_io(struct multipath *m
>                return error;
>
>        spin_lock_irqsave(&m->lock, flags);
> +       /*
> +        * If this is the only path left, then lets try to
> +        * re-initialize the PG one last time..
> +        */
> +       if (m->nr_valid_paths == 1 && m->hw_handler_name) {
> +               m->pg_init_required = 1;
> +               spin_unlock_irqrestore(&m->lock, flags);
> +               goto requeue;
> +       }
>        if (!m->nr_valid_paths) {
>                if (__must_push_back(m)) {
>                        spin_unlock_irqrestore(&m->lock, flags);
>
>
>