[dm-devel] blk_abort_queue on failed paths?

Mike Anderson andmike at linux.vnet.ibm.com
Fri Jun 5 07:56:54 UTC 2009


Mike Christie <michaelc at cs.wisc.edu> wrote:
> Mike Christie wrote:
>> adding linux-scsi and Mike Anderson
>>
>> David Strand wrote:
>>> After updating to kernel 2.6.28 I found that when I performed some
>>> cable break testing during device i/o, I would get unwanted device or
>>> host resets. Ultimately I traced it back to this patch:
>>>
>>> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.29.y.git;a=commit;h=224cb3e981f1b2f9f93dbd49eaef505d17d894c2
>>>
>>> The call to blk_abort_queue causes the block layer to call
>>> scsi_times_out for pending i/o, which can (or will) ultimately lead to
>>> device, and/or bus and/or host resets, which of course cause all the
>>> other devices significant disruption.
>>>
>>
>> What driver were you using? 
>
> Oh yeah, I do not think this should happen in new kernels if the driver  
> is failing the IO with DID_TRANSPORT_DISRUPTED when it is deleting the  
> rport. That should cause the IO to requeue and wait for fast io fail to  
> fire.
>
> Maybe we just need to convert some more drivers?

Yes, I am seeing this in my test runs using a DS4K storage device and the
RDAC device handler:

Jun  5 00:39:58 elm3c244 kernel: [  873.180267] sd 1:0:0:1: [sdd] Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
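
For reference, a minimal userspace sketch (not kernel code) of how the
hostbyte and driverbyte in that log line are packed into the SCSI result
word. It assumes the bit layout and constant values from
include/scsi/scsi.h; hostbyte_name() is a hypothetical helper added only
for illustration.

```c
#include <stdio.h>

/* Values assumed to match include/scsi/scsi.h. */
#define DID_OK                   0x00
#define DID_TRANSPORT_DISRUPTED  0x0e  /* transport error disrupted execution */
#define DID_TRANSPORT_FAILFAST   0x0f  /* transport class fastfailed the io */
#define DRIVER_OK                0x00

/* Same bit layout as the kernel's host_byte()/driver_byte() macros. */
static unsigned int host_byte(unsigned int result)
{
    return (result >> 16) & 0xff;
}

static unsigned int driver_byte(unsigned int result)
{
    return (result >> 24) & 0xff;
}

/* Hypothetical helper: name the host bytes discussed in this thread. */
static const char *hostbyte_name(unsigned int result)
{
    switch (host_byte(result)) {
    case DID_OK:
        return "DID_OK";
    case DID_TRANSPORT_DISRUPTED:
        return "DID_TRANSPORT_DISRUPTED";
    case DID_TRANSPORT_FAILFAST:
        return "DID_TRANSPORT_FAILFAST";
    default:
        return "other";
    }
}
```

With that layout, a driver setting the result to DID_TRANSPORT_DISRUPTED << 16
yields hostbyte=DID_TRANSPORT_DISRUPTED and driverbyte=DRIVER_OK, as in the
log line above.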

-andmike
--
Michael Anderson
andmike at linux.vnet.ibm.com



