[dm-devel] HDS multipathing prioritizer not doing what it should
Hannes Reinecke
hare at suse.de
Thu May 10 12:48:09 UTC 2012
On 05/10/2012 09:28 AM, Christian Schausberger wrote:
> Hi all,
>
>
> I think I found a bug in the HDS prioritizer module at
> http://git.kernel.org/gitweb.cgi?p=linux/storage/multipath/hare/multipath-tools.git;a=blob_plain;f=libmultipath/prioritizers/hds.c;hb=HEAD
>
> In there the following is stated for assigning the priority:
>
> * CONTROLLER ODD and LDEV ODD: PRIORITY 1
> * CONTROLLER ODD and LDEV EVEN: PRIORITY 0
> * CONTROLLER EVEN and LDEV ODD: PRIORITY 0
> * CONTROLLER EVEN and LDEV EVEN: PRIORITY 1
>
> When running multipath with debug output, one can see that the
> controllers returned are 1 and 2:
>
> May 08 14:44:00 | sdo: hds prio: VENDOR: HITACHI
> May 08 14:44:00 | sdo: hds prio: PRODUCT: DF600F
> May 08 14:44:00 | sdo: hds prio: SERIAL: 0x0089
> May 08 14:44:00 | sdo: hds prio: LDEV: 0x0004
> May 08 14:44:00 | sdo: hds prio: CTRL: 1    <= This is really controller 0
> May 08 14:44:00 | sdo: hds prio: PORT: C
> May 08 14:44:00 | sdo: hds prio: CTRL ODD, LDEV EVEN, PRIO 0
> May 08 14:44:00 | sdo: hds prio = 0
>
> May 08 14:44:00 | sdk: hds prio: VENDOR: HITACHI
> May 08 14:44:00 | sdk: hds prio: PRODUCT: DF600F
> May 08 14:44:00 | sdk: hds prio: SERIAL: 0x0089
> May 08 14:44:00 | sdk: hds prio: LDEV: 0x0004
> May 08 14:44:00 | sdk: hds prio: CTRL: 2    <= This is really controller 1
> May 08 14:44:00 | sdk: hds prio: PORT: C
> May 08 14:44:00 | sdk: hds prio: CTRL EVEN, LDEV EVEN, PRIO 1
> May 08 14:44:00 | sdk: hds prio = 1
>
> This looks fine, but as far as I know HDS starts counting controllers
> from 0 (so I actually have controllers 0 and 1). So when assigning LUN
> ownership in the storage, a LUN with an active/passive path will
> always be accessed through the wrong (non-owning) controller. This
> carries a huge performance penalty when the system is under stress,
> because of the additional overhead of routing every I/O through the
> non-owning controller.
>
>
Have you tested whether the situation improves when the priority is
reversed? I'd be very surprised if it did, though.
I rather suspect the internal queue size of the Hitachi is the problem
here; I've seen instances where we overloaded the internal queue,
causing the array to seize up.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare at suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)