[dm-devel] [RFC] training mpath to discern between SCSI errors (was: Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush)

Mike Snitzer snitzer at redhat.com
Wed Aug 25 15:59:18 UTC 2010


On Wed, Aug 25 2010 at  4:00am -0400,
Kiyoshi Ueda <k-ueda at ct.jp.nec.com> wrote:

> > I'm not sure how to proceed here.  How much work would
> > discerning between transport and IO errors take?  If it can't be done
> > quickly enough the retry logic can be kept around to keep the old
> > behavior but that already was a broken behavior, so...  :-(
> 
> I'm not sure how long will it take.

We first need to understand what direction we want to go with this.  We
currently have 2 options.  But any other ideas are obviously welcome.

1)
Mike Christie has a patchset that introduces more specific
target/transport/host error codes.  Mike shared these pointers, but he'd
have to put the work in to refresh them:
http://marc.info/?l=linux-scsi&m=112487427230642&w=2
http://marc.info/?l=linux-scsi&m=112487427306501&w=2
http://marc.info/?l=linux-scsi&m=112487431524436&w=2
http://marc.info/?l=linux-scsi&m=112487431524350&w=2

errno.h new EXYZ
http://marc.info/?l=linux-kernel&m=107715299008231&w=2

add block layer blkdev.h error values
http://marc.info/?l=linux-kernel&m=107961883915068&w=2

add block layer blkdev.h error values (v2 convert more drivers)
http://marc.info/?l=linux-scsi&m=112487427230642&w=2

I think that patchset's approach is fairly disruptive just to be able to
train upper layers (e.g. mpath) to differentiate between errors.  But in
the end maybe that change takes the code in a more desirable direction?
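
To make the idea behind option 1 concrete, here is a rough userspace
sketch of what differentiated error classes could let mpath decide.  The
enum, its values, and mpath_should_failover() are hypothetical names
chosen for illustration; they are not taken from Mike Christie's patches,
and the policy (fail over on transport/host errors only) is an assumption.

```c
#include <stdbool.h>

/* Hypothetical error classes; names are illustrative only. */
enum io_err_class {
	IO_ERR_NONE,      /* IO completed without error */
	IO_ERR_TARGET,    /* the device itself failed or rejected the IO */
	IO_ERR_TRANSPORT, /* the link between host and target failed */
	IO_ERR_HOST,      /* host-side (e.g. HBA) failure */
};

/*
 * Assumed policy: another path can only help when the failure is not
 * in the target itself.  A target error would be expected to repeat
 * on every path, so it is reported upward instead of being retried.
 */
static bool mpath_should_failover(enum io_err_class c)
{
	switch (c) {
	case IO_ERR_TRANSPORT:
	case IO_ERR_HOST:
		return true;
	case IO_ERR_TARGET:
	case IO_ERR_NONE:
	default:
		return false;
	}
}
```

With something like this, a medium error would be classified as
IO_ERR_TARGET and failed immediately, rather than retried on every path.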

2)
Another option is Hannes' approach of having DM consume req->errors and
SCSI sense more directly.

I've refreshed Hannes' previous patchset against 2.6.36-rc2 but I
haven't finished testing it yet (should be OK: it boots, but there is
still a FIXME to move scsi_uld_should_retry to scsi_error.c):
http://people.redhat.com/msnitzer/patches/dm-scsi-sense/

Would be great if James, Hannes and others had a look at this
refreshed RFC patchset.  It's clearly not polished but it gives an idea
of the approach.  Does this look worthwhile?

Follow-on work is needed to refine scsi_uld_should_retry further.  Keep
in mind that scsi_error.c is the intended location for this code.
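
For anyone unfamiliar with what "consuming SCSI sense" means for a retry
decision, here is a minimal userspace sketch.  It is not the code from
the patchset: sketch_sense_key() and sketch_retry_on_other_path() are
hypothetical names, and the set of sense keys treated as non-retryable is
an assumption.  Only the sense key values and their position in
fixed-format and descriptor-format sense data come from the SCSI spec.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sense key values as defined by SPC. */
#define SK_MEDIUM_ERROR    0x3
#define SK_ILLEGAL_REQUEST 0x5
#define SK_DATA_PROTECT    0x7
#define SK_BLANK_CHECK     0x8

/* Return the sense key, or -1 if the buffer is not valid sense data. */
static int sketch_sense_key(const uint8_t *sb, size_t len)
{
	if (sb == NULL || len < 3)
		return -1;

	switch (sb[0] & 0x7f) {
	case 0x70:
	case 0x71:
		return sb[2] & 0x0f;	/* fixed format: key in byte 2 */
	case 0x72:
	case 0x73:
		return sb[1] & 0x0f;	/* descriptor format: key in byte 1 */
	default:
		return -1;
	}
}

/*
 * Assumed policy: errors that describe the medium or the request
 * itself would repeat on any path, so do not retry them.  Anything
 * else, including missing or invalid sense, keeps the old behavior
 * and is retried on another path.
 */
static bool sketch_retry_on_other_path(const uint8_t *sb, size_t len)
{
	switch (sketch_sense_key(sb, len)) {
	case SK_MEDIUM_ERROR:
	case SK_ILLEGAL_REQUEST:
	case SK_DATA_PROTECT:
	case SK_BLANK_CHECK:
		return false;
	default:
		return true;
	}
}
```

Defaulting to "retry" for unknown cases is deliberate in this sketch: it
preserves today's mpath behavior for everything that has not been
explicitly classified.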

James, please note that I've attempted to make REQ_TYPE_FS set
req->errors only for "genuine errors" by (ab)using
scsi_decide_disposition:
http://people.redhat.com/msnitzer/patches/dm-scsi-sense/scsi-Always-pass-error-result-and-sense-on-request-completion.patch

If others think this may be worthwhile I can finish testing, clean up the
patches further, and post them.

Mike



