[dm-devel] [RESUBMIT][Patch] scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn

James Bottomley James.Bottomley at suse.de
Tue Oct 26 19:58:59 UTC 2010


On Wed, 2010-10-27 at 01:02 +0530, Shyam_Iyer at Dell.com wrote:
> -----Original Message-----
> > From: linux-scsi-owner at vger.kernel.org [mailto:linux-scsi-owner at vger.kernel.org] On Behalf Of James
> > Bottomley
> > Sent: Tuesday, October 26, 2010 3:22 PM
> > To: Mike Christie
> > Cc: device-mapper development; Chauhan, Vijay; James Bottomley; linux-scsi at vger.kernel.org
> > Subject: Re: [dm-devel] [RESUBMIT][Patch] scsi_dh_rdac: retry IO for 06/3f/03 in rdac_check_sense fn
> > 
> > On Tue, 2010-10-26 at 14:18 -0500, Mike Christie wrote:
> > > On 10/26/2010 08:53 AM, Chauhan, Vijay wrote:
> > > > Resubmitting this patch to get the attention.
> > > >
> > > > This patch adds a retry for IO returned with the 06/3f/03 (INQUIRY_DATA_CHANGED) sense code in
> > > > rdac_check_sense(). IO returned with 06/3f/03 from the controller is currently failed by the SCSI mid layer;
> > > > as a result, a momentary path failure is noticed by DM multipath.
> > > >
> > >
> > > Is it getting failed by accident? In scsi_io_completion we check for UAs
> > > and will retry if the removable bit is not set. That check is after
> > > scsi_end_request though (is the scsi_end_request call failing the IO?).
> > >
> > > Did you guys also want REPORTED_LUNS_DATA_HAS_CHANGED to be retried too?
> > > I think scsi_dh_alua's REPORTED_LUNS_DATA_HAS_CHANGED maybe should be
> > > generically retried, because it seems for both errors we will want to
> > > retry for all devices.
> > 
> > So my primary worry about patches like this is that it eats AENs ...
> > this is fine because, as Mike says, we should just ignore them.
> > 
> > However, the moment we start processing AENs (as another set of dm
> > people promise they have in process) we'll lose them from rdac arrays
> > and people will get unhappy.
> > 
> > If the generic UA retry isn't working, let's fix it there rather than
> > these hacks that would be hard to spot and pull out when (if) we ever
> > get a generic AEN infrastructure.
> > 
> > James
> 
> Sometimes the default way to handle a UA may not be the correct one.
> One array's implementation of its response to the UA could be different
> from another array's.

I don't quite understand this observation in the context of this patch.
What the patch does is retry the command (i.e. ignore the UA) which
should pretty much be our default response.

> Example: A thin provisioning threshold exceeded check condition. The
> device handler infrastructure can be a savior with such hacks.

I'm going to regret asking this (especially given all the noise there's
been on thin provisioning thresholds):  Which arrays don't actually
issue them correctly?

James
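
For readers following the thread, the decision being discussed (retry a
command that comes back with unit attention 06/3f/03 instead of failing it)
can be sketched roughly as below. This is not the patch itself: the struct,
enum, macro, and function names are hypothetical stand-ins chosen so the
example compiles outside the kernel, and the real rdac_check_sense() works
with the kernel's own sense header and return codes.

```c
#include <stdint.h>

/* Hypothetical stand-in for a decoded sense header (not the kernel type). */
struct sense_hdr {
    uint8_t sense_key;
    uint8_t asc;
    uint8_t ascq;
};

/* Hypothetical stand-ins for the handler's possible verdicts. */
enum sense_action {
    ACTION_NOT_HANDLED, /* leave the decision to the generic mid-layer code */
    ACTION_RETRY        /* requeue the command instead of failing it */
};

#define SENSE_KEY_UNIT_ATTENTION 0x06

/*
 * Sketch of a device-handler sense check: only the 06/3f/03 triple
 * (unit attention, INQUIRY data has changed) is claimed and retried.
 * Everything else is passed back untouched.
 */
static enum sense_action check_sense_sketch(const struct sense_hdr *hdr)
{
    if (hdr->sense_key == SENSE_KEY_UNIT_ATTENTION &&
        hdr->asc == 0x3f && hdr->ascq == 0x03)
        return ACTION_RETRY;

    return ACTION_NOT_HANDLED;
}
```

Note that a handler written this way consumes the unit attention before any
generic code sees it, which is the concern raised above about losing AENs.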




