[dm-devel] [RFC] training mpath to discern between SCSI errors

Hannes Reinecke hare at suse.de
Mon Oct 18 11:55:26 UTC 2010


On 10/18/2010 10:09 AM, Jun'ichi Nomura wrote:
> Hi Hannes,
> 
> Thank you for working on this issue and sorry for very late reply...
> 
> (08/30/10 23:52), Hannes Reinecke wrote:
>> From: Hannes Reinecke <hare at suse.de>
>> Date: Mon, 30 Aug 2010 16:21:10 +0200
>> Subject: [RFC][PATCH] scsi: Detailed I/O errors
>>
>> Instead of just passing 'EIO' for any I/O errors we should be
>> notifying the upper layers with some more details about the cause
>> of this error.
>> This patch updates the possible I/O errors to:
>>
>> - ENOLINK: Link failure between host and target
>> - EIO: Retryable I/O error
>> - EREMOTEIO: Non-retryable I/O error
>>
>> 'Retryable' in this context means that an I/O error _might_ be
>> restricted to the I_T_L nexus (vulgo: path), so retrying on another
>> nexus / path might succeed.
> 
> Does 'retryable' of EIO mean retryable in multipath layer?
> If so, what is the difference between EIO and ENOLINK?
> 
Yes, EIO is intended for errors which should be retried at the
multipath layer. This does _not_ include transport errors, which are
signalled by ENOLINK.

Basically, ENOLINK is a transport error, and EIO just means
something is wrong and we weren't able to classify it properly.
If we were, it'd be either ENOLINK or EREMOTEIO.
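
To illustrate the classification above, here is a minimal sketch of how a
multipath-like consumer might dispatch on the three codes. It is not taken
from the patch: the enum, its values, and the function name mp_classify are
hypothetical, and the choice of action per code (in particular, failing the
path on ENOLINK and treating unknown codes like EIO) is an assumption. It
assumes the Linux errno values and the kernel convention of passing errors
as negative numbers.

```c
#include <errno.h>

/* Hypothetical actions for a multipath-like layer; not part of the patch. */
enum mp_action {
	MP_COMPLETE,     /* no error: complete the I/O normally */
	MP_FAIL_PATH,    /* transport error: assume the path is down, use another */
	MP_RETRY_OTHER,  /* unclassified error: retrying on another path might help */
	MP_FAIL_IO       /* non-retryable error: report the failure upwards */
};

/* Map a negative errno value to an action (assumed policy, see lead-in). */
static enum mp_action mp_classify(int error)
{
	switch (error) {
	case 0:
		return MP_COMPLETE;
	case -ENOLINK:   /* link failure between host and target */
		return MP_FAIL_PATH;
	case -EREMOTEIO: /* non-retryable I/O error */
		return MP_FAIL_IO;
	case -EIO:       /* retryable, could not be classified further */
	default:         /* assumption: treat unknown codes like EIO */
		return MP_RETRY_OTHER;
	}
}
```

With this sketch, mp_classify(-EREMOTEIO) yields MP_FAIL_IO, so the I/O would
not be retried on another path, while mp_classify(-EIO) yields MP_RETRY_OTHER.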

> I've heard of a case where just retrying within path-group is
> preferred to (relatively costly) switching group.
> So, if EIO (or other error code) can be used to indicate such type
> of errors, it's nice.
> 
Yes, that was one of the intentions.

> 
> Also (although this might be a bit off topic from your patch),
> can we expand such a distinction to what should be logged?
> Currently, it's difficult to distinguish important SCSI/block errors
> and less important ones in kernel log.
> For example, when I get a link failure on sda, kernel prints something
> like below, regardless of whether the I/O is recovered by multipathing or not:
>   end_request: I/O error, dev sda, sector XXXXX
> 
Indeed, with the error codes above we could modify that message,
e.g. to

end_request: transport error, dev sda, sector XXXXX

or

end_request: target error, dev sda, sector XXXXX

which would improve the output noticeably.
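
As a rough userspace sketch of that idea (not the actual kernel code), the
message prefix could be selected from the error code. The helper names
error_kind and format_end_request are hypothetical, and mapping EREMOTEIO to
"target error" is an assumption; only ENOLINK is described above as a
transport error.

```c
#include <errno.h>
#include <stdio.h>

/* Pick a description for the error; anything unclassified stays "I/O error". */
static const char *error_kind(int error)
{
	switch (error) {
	case -ENOLINK:
		return "transport error";
	case -EREMOTEIO:  /* assumed to correspond to "target error" */
		return "target error";
	default:
		return "I/O error";
	}
}

/* Format an end_request-style line into buf; returns snprintf()'s result. */
static int format_end_request(char *buf, size_t len, int error,
			      const char *disk, unsigned long long sector)
{
	return snprintf(buf, len, "end_request: %s, dev %s, sector %llu",
			error_kind(error), disk, sector);
}
```

Calling format_end_request() with -EIO keeps the existing "I/O error"
wording, so only the classified cases change what is logged.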

> Setting REQ_QUIET in dm-multipath could mask the message
> but also other important ones in SCSI.
> 
Hmm. Not sure about that, but I think the above modifications will
be useful already.

I'll be sending an updated patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare at suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)



