[dm-devel] HDS multipathing prioritizer not doing what it should
Hannes Reinecke
hare at suse.de
Thu May 10 12:48:09 UTC 2012
On 05/10/2012 09:28 AM, Christian Schausberger wrote:
> Hi all,
>
>
> I think I found a bug in the HDS prioritizer module at
> http://git.kernel.org/gitweb.cgi?p=linux/storage/multipath/hare/multipath-tools.git;a=blob_plain;f=libmultipath/prioritizers/hds.c;hb=HEAD
>
> In there the following is stated for assigning the priority:
>
> * CONTROLLER ODD and LDEV ODD: PRIORITY 1
> * CONTROLLER ODD and LDEV EVEN: PRIORITY 0
> * CONTROLLER EVEN and LDEV ODD: PRIORITY 0
> * CONTROLLER EVEN and LDEV EVEN: PRIORITY 1
>
> When running multipath with debug output, one can see that the
> controllers returned are 1 and 2:
>
> May 08 14:44:00 | sdo: hds prio: VENDOR: HITACHI
> May 08 14:44:00 | sdo: hds prio: PRODUCT: DF600F
> May 08 14:44:00 | sdo: hds prio: SERIAL: 0x0089
> May 08 14:44:00 | sdo: hds prio: LDEV: 0x0004
> May 08 14:44:00 | sdo: hds prio: CTRL: 1    <= This is really controller 0
> May 08 14:44:00 | sdo: hds prio: PORT: C
> May 08 14:44:00 | sdo: hds prio: CTRL ODD, LDEV EVEN, PRIO 0
> May 08 14:44:00 | sdo: hds prio = 0
>
> May 08 14:44:00 | sdk: hds prio: VENDOR: HITACHI
> May 08 14:44:00 | sdk: hds prio: PRODUCT: DF600F
> May 08 14:44:00 | sdk: hds prio: SERIAL: 0x0089
> May 08 14:44:00 | sdk: hds prio: LDEV: 0x0004
> May 08 14:44:00 | sdk: hds prio: CTRL: 2    <= This is really controller 1
> May 08 14:44:00 | sdk: hds prio: PORT: C
> May 08 14:44:00 | sdk: hds prio: CTRL EVEN, LDEV EVEN, PRIO 1
> May 08 14:44:00 | sdk: hds prio = 1
>
> This looks fine, but as far as I know HDS starts counting controllers
> from 0 (so I actually have controllers 0 and 1). So when assigning LUN
> ownership in the storage, a LUN with an active/passive path will
> always be accessed through the wrong (non-owning) controller. This
> carries a huge performance penalty when the system is under stress,
> because of the additional overhead of routing every I/O through the
> non-owning controller.
>
>
Have you tested whether the situation improves when the priority is
reversed? I'd be very surprised if it did, though.
I rather suspect the internal queue size of the Hitachi is the problem
here; I've seen instances where we overloaded the internal queue,
causing the array to seize up.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare at suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)