Tore Anderson wrote:
> * Hannes Reinecke
>> That's the dev_loss_tmo setting. Just increase it to something to your liking.
>
> Oh, sweet. This knob won't affect how long the layer will hold I/O before failing it (like lpfc_nodev_tmo), I assume? (I'm worried about it taking longer for dm-multipath to detect failed paths).

With newer versions of lpfc you can set /sys/class/fc_rport/rportXYZ/fast_io_fail_tmo to a low value so that IO is failed quickly, and then set the dev_loss_tmo to a high value so the device is not removed quickly.
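
A minimal sketch of that setup, assuming the remote ports appear as directories whose names start with "rport" under /sys/class/fc_rport and that both attributes are writable (this needs root). The helper names and the values 5 and 600 are only illustrative and are not from this thread; pick timeouts that suit your fabric.

```python
from pathlib import Path


def set_rport_timeouts(rport_dir, fast_io_fail_tmo=5, dev_loss_tmo=600):
    """Write both timeouts (in seconds) for a single remote port directory.

    The default values are illustrative assumptions, not recommendations.
    """
    rport = Path(rport_dir)
    # Low value: IO on a failed path is failed back quickly.
    (rport / "fast_io_fail_tmo").write_text(f"{fast_io_fail_tmo}\n")
    # High value: the device itself is not removed quickly.
    (rport / "dev_loss_tmo").write_text(f"{dev_loss_tmo}\n")


def set_all_rports(sysfs_root="/sys/class/fc_rport", **timeouts):
    """Apply the timeouts to every rport* directory under sysfs_root.

    Returns the names of the remote ports that were updated.
    """
    updated = []
    for rport in sorted(Path(sysfs_root).glob("rport*")):
        if rport.is_dir():
            set_rport_timeouts(rport, **timeouts)
            updated.append(rport.name)
    return updated
```

Calling set_all_rports() as root would apply the defaults to every remote port it finds; passing a different sysfs_root lets you try it against a scratch directory first.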
The only problem may be a race where dm-multipath is queueing IO to the scsi layer while the scsi layer is reporting a failure. The IO that was being queued will then sit in the scsi layer until dev_loss_tmo fires. That is fixed with this patchset
http://marc.info/?l=linux-scsi&m=117399843216280&w=2 but I never finished testing it out.