
Re: [dm-devel] different LUN numbers under the same dm device



On 06/06/2012 10:59 PM, Brian Bunker wrote:
> Mike,
> 
> The devices for LUN 12 are failed and correspond to LUNs not currently shared
> to the initiator at all. They were at one point and were likely used by dm-11
> for its underlying paths. The inquiry data of those LUNs when the problem
> happened was like this:
> 
> [root r13init32 ~]# sg_inq /dev/sde
> standard INQUIRY: [qualifier indicates no connected LU]
>   PQual=1  Device_type=31  RMB=0  version=0x06  [SPC-4]
>   [AERC=0]  [TrmTsk=0]  NormACA=0  HiSUP=0  Resp_data_format=2
>   SCCS=0  ACC=0  TPGS=0  3PC=0  Protect=0  BQue=0
>   EncServ=0  MultiP=1 (VS=0)  [MChngr=0]  [ACKREQQ=0]  Addr16=0
>   [RelAdr=0]  WBus16=0  Sync=0  Linked=0  [TranDis=0]  CmdQue=1
>   [SPI: Clocking=0x0  QAS=0  IUS=0]
>     length=96 (0x60)   Peripheral device type: no physical device on this lu
>  Vendor identification: PURE    
>  Product identification: FlashArray      
>  Product revision level: 100 
> 
> There is no NAA number (page code 0x83) or LUN serial number (page code 0x80)
> available, since there is no LUN 12 attached as a disk device at the time
> multipath -ll was run.
> Different LUNs from our array would never have the same NAA value, which is
> what I think you are calling the UUID.
> 
Yep. Hmm. So the devices are unmapped from the storage, but still
visible from the initiator?
Have you run 'rescan-scsi-bus.sh -r' here?
That should clean up these devices.
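
The INQUIRY data quoted above reports PQual=1 and Device_type=31, which sg_inq
annotates as "no connected LU". As a rough illustration, such devices could be
spotted by scanning saved `sg_inq` output for that pair of values. This is only
a sketch: the function names are made up for this example, and it assumes the
output has the same `NAME=value` layout as the listing quoted above.

```python
import re


def parse_standard_inquiry(text):
    """Extract PQual and Device_type from `sg_inq` standard INQUIRY text."""
    fields = {}
    for name in ("PQual", "Device_type"):
        match = re.search(rf"\b{name}=(\d+)", text)
        if match:
            fields[name] = int(match.group(1))
    return fields


def is_unconnected_lu(text):
    """True if the text shows PQual=1 and Device_type=31, as in the quoted output."""
    fields = parse_standard_inquiry(text)
    return fields.get("PQual") == 1 and fields.get("Device_type") == 31


# First two lines of the sg_inq output quoted in this thread.
SAMPLE = """\
standard INQUIRY: [qualifier indicates no connected LU]
  PQual=1  Device_type=31  RMB=0  version=0x06  [SPC-4]
"""

if __name__ == "__main__":
    print(is_unconnected_lu(SAMPLE))
```

A device node that reports this state is a candidate for removal by the rescan.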

> The sequence is something like this: share a LUN from the array with two paths
> to the initiator; a dm device gets created, presumably like this at first
> (except that the status would be active and ready, not failed and faulty):
> 
>   3624a93700a14254d729923840001000b dm-11 PURE,FlashArray
>   size=500G features='0' hwhandler='0' wp=rw
>    `-+- policy='round-robin 0' prio=1 status=active
>    |- 1:0:0:12 sde  8:64   failed faulty running
>    |- 0:0:0:12 sdd  8:48   failed faulty running
> 
> Then that LUN 12 is taken away from the initiator and the dm device dm-11 is
> reused later by LUN 10 when it is shared to the initiator, but the LUN 12
> devices still remain as part of the dm device. Then I would expect:
> 
>  3624a93700a14254d729923840001000b dm-11 PURE,FlashArray
>  size=500G features='0' hwhandler='0' wp=rw
>  `-+- policy='round-robin 0' prio=1 status=active
>    |- 0:0:0:10 sdar 66:176 active ready  running
   `- 1:0:0:10 sdba 67:64  active ready  running
> 

Yeah, but still: it means that at one point LUN 12 had the same NAA
value as LUN 10, correct?
It _might_ happen that multipath created a dm-device for LUN12, set
its paths to 'faulty' during unsharing, and then added the then-new LUN10
to the same device, given that the NAA number is identical.

So the point still stands: LUN10 must have had the same NAA value
as LUN12 now has.
So unless the original LUN10 referred to the same storage entity as
LUN12 now does, this is a definite no-no.
And if it does, we're pretty much in the clear, as then LUN10 would
now refer to a stale device (with status 'failed faulty'), which
should be cleaned up with 'rescan-scsi-bus.sh -r'.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare suse de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

