Re: [dm-devel] Failed path will not be recovered when disabling/enabling remote port
- From: Hannes Reinecke <hare suse de>
- To: device-mapper development <dm-devel redhat com>
- Subject: Re: [dm-devel] Failed path will not be recovered when disabling/enabling remote port
- Date: Thu, 02 Jul 2009 13:44:18 +0200
Christian May wrote:
> Hi,
>
> I've set up an IBM z10 LPAR (mainframe server) with a 2.6.30 kernel.
> Attached to the System z10 was an IBM DS8000 storage server. 10x SCSI
> LUNs were assigned to the LPAR via two paths:
>
> Example:
> 36005076303ffc1040000000000001269 dm-9 IBM,2107900
> size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=-2 status=active
> |- 0:0:0:1080639506 sdw 65:96 active undef running
> `- 1:0:1:1080639506 sdt 65:48 active undef running
>
> Special parameter setting: dev_loss_tmo=90sec; fast_io_fail_tmo=5sec
>
> multipath tools: multipath-tools v0.4.9 (04/04, 2009)
> device-mapper: device-mapper-1.02.27-7.fc10.s390x,
> device-mapper-libs-1.02.27-7.fc10.s390x
>
> When removing a remote port (disabling a port on the BROCADE FC switch)
> one path failed.
>
> [root h42lp26/ESAME:~]
>> multipath -l
> 36005076303ffc1040000000000001268 dm-8 ,
> size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=-2 status=active
> |- #:#:#:# - #:# failed undef running
> `- 1:0:1:1080573970 sdr 65:16 active undef running
>
> After a while (>90sec) SCSI LUNs were removed from system:
>
[ .. ]
>
> When re-enabling the path, SCSI LUNS were reassigned to system but path
> didn't recover:
>
[ .. ]
>
>
> [root h42lp26/ESAME:~]
>> multipath -l
> 36005076303ffc1040000000000001268 dm-8 ,
> size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=-2 status=active
> |- #:#:#:# - #:# failed undef running
> `- 1:0:1:1080573970 sdr 65:16 active undef running
>
>
> Running the "multipath" command will recover the failed path but that's not
> the way it should be...can somebody help to fix this? Why is the path not
> recovered automatically?
>
It should, really.
The problem is that the paths have _not_ been reconnected;
the hashes indicate that the in-kernel multipath code references
a device for which no information is available.
And the new device has _not_ been reconnected, as otherwise
you'd end up with _three_ paths here.
Probably missing udev integration.
I really have to push my patches upstream ... sigh.
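For spotting this state from a script, here is a minimal sketch that scans
saved "multipath -l" output for paths shown as '#:#:#:#', i.e. paths the
kernel map still references but for which no device information is
available. It assumes the output layout shown in the listing above; the
function name find_stale_paths is hypothetical and not part of
multipath-tools, and the sketch only parses text, it does not run
multipath itself.

```python
import re

# Map header line, e.g. "36005076303ffc1040000000000001268 dm-8 ,"
MAP_RE = re.compile(r"^(\S+)\s+(dm-\d+)\b")

# Path line, e.g. "|- #:#:#:# - #:# failed undef running".
# The policy line ("`-+- policy=...") does not match because no
# whitespace follows its leading "`-".
PATH_RE = re.compile(r"^[\s|]*[|`]-\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)")

PLACEHOLDER = "#:#:#:#"


def find_stale_paths(output):
    """Return a list of (map_name, hctl, state) for placeholder paths.

    A path counts as stale when its H:C:T:L column is the '#:#:#:#'
    placeholder instead of a real SCSI address.
    """
    stale = []
    current_map = None
    for line in output.splitlines():
        path = PATH_RE.match(line)
        if path:
            hctl, _dev, _majmin, state = path.groups()
            if hctl == PLACEHOLDER:
                stale.append((current_map, hctl, state))
            continue
        header = MAP_RE.match(line)
        if header:
            current_map = header.group(1)
    return stale


if __name__ == "__main__":
    sample = (
        "36005076303ffc1040000000000001268 dm-8 ,\n"
        "size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw\n"
        "`-+- policy='round-robin 0' prio=-2 status=active\n"
        "  |- #:#:#:# - #:# failed undef running\n"
        "  `- 1:0:1:1080573970 sdr 65:16 active undef running\n"
    )
    for name, hctl, state in find_stale_paths(sample):
        print(name, hctl, state)
```

A map that still lists such a placeholder after the port is back has not
picked up the new device, which matches the two-path listing above.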
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare suse de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)