[dm-devel] Designing a new prio_callout

Ethan John ethan.john at gmail.com
Thu Aug 16 17:30:28 UTC 2007


I don't think it's possible to get the IP of the iSCSI session from within
multipath. If anyone knows a way, I could easily write a dumb version of a
callout like you describe.

I'm not convinced, though, that I could do much better than prio_random with
rr_min_io > 2 billion without some extensive work. You'd have to re-check
path priorities for each call to the script, which would involve walking
through all existing paths and priorities and deciding on path priorities in
some intelligent way. It would get hairy pretty quickly.

Thanks for the tidbit on how round robin works. Great to know!

On 8/16/07, Stefan Bader <Stefan.Bader at de.ibm.com> wrote:
>
> > Now what happens? If mytarget has multiple LUs associated with it, the
> > multipath output will look like it did below if failover is being used --
> > two paths for each of two devices. The problem for us is that by default,
> > multipath just uses the first path that it sees, which means that for every
> > device in mytarget, all data will be read and written across just the first
> > path -- 10.53.151.22, in this case.
> >
> > We need a way to load balance connections across all available connections.
> >
> > There are several ways that I can see to do this. Ideally, we would
> > implement ALUA on our end and advise people to use mpath_prio_alua as their
> > callout. But this has a development cost. We could also implement a custom
> > system as you suggest, but this also has a development cost.
> >
> > If we could advise users to manually set priorities on the client side, that
> > would be acceptable, but this is impossible with the current version of
> > multipath.
> >
>
> Can you find the IP address and UID of a device with the node name?
> For example you get /dev/sdc and then look for UID (can be retrieved
> with scsi_id) and the IP address of the connection (not sure this is
> possible). Then manually create a file containing mappings:
>
> <uid>:<ip>:<priority>
> ...
>
> Create a script that is used as the callout, which takes a node name,
> looks into the file and prints out the priority. This way the priority
> of a path does not change like it does with random priorities. The
> other path will only be used on failure and switched back as soon as
> the other one is back again (with failback immediate).
>
>
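For what it's worth, the lookup half of such a callout is simple. Below is a
minimal Python sketch, assuming the <uid>:<ip>:<priority> file format described
above and IPv4 addresses. The function names, the example UID, and the second
IP address are made up for illustration. Mapping a node name such as /dev/sdc
to its UID and session IP is the unsolved part, so the sketch takes those two
values as inputs rather than pretending to discover them.

```python
def load_priorities(lines):
    """Parse '<uid>:<ip>:<priority>' lines into a {(uid, ip): priority} dict."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments (comment syntax is an assumption)
        # Split from the right so only the last two colons act as separators.
        uid, ip, prio = line.rsplit(":", 2)
        table[(uid, ip)] = int(prio)
    return table


def path_priority(table, uid, ip, default=0):
    """Return the configured priority for a path, or `default` if unlisted."""
    return table.get((uid, ip), default)


# Hypothetical mapping file contents; the UID and the .23 address are invented.
example = [
    "# uid:ip:priority",
    "3600a0b80001234:10.53.151.22:10",
    "3600a0b80001234:10.53.151.23:5",
]
table = load_priorities(example)
# A real callout would print this number on stdout for multipath to read.
print(path_priority(table, "3600a0b80001234", "10.53.151.22"))
```

Because the priorities come from a static file, they stay the same every time
a path is re-checked, unlike with random priorities.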
> > On a related note, I've read the reports of people experiencing higher
> > levels of performance with lower settings of rr_min_io, but it seems to me
> > that as rr_min_io gets smaller, the system becomes less like active/passive
> > MPIO and more like active/active MPIO, so users experiencing this
> > performance improvement would be better off using group_by_serial, so that
> > all paths are excitable simultaneously.
> >
>
> The setting of rr_min_io only matters if you have more than one path
> per path group. Otherwise you can only use one path at a time and
> there is no round-robin. If you have more than one path in a group
> then lower values help since paths are more likely to be used
> concurrently. The default of 1000 is too high. Kiyoshi Ueda and
> Jun'ichi Nomura have done some measurements while looking for a way to
> improve performance more generally
> (https://ols2006.108.redhat.com/2007/Reprints/ueda-Reprint.pdf). But
> again, rr_min_io is only relevant to load-balance paths within the
> same path group (multibus or as you mentioned group_by_serial). The
> reason for your path changes (except for real failures) might be rather
> that random_prio results in different priorities whenever any priority
> value is checked again.
>
> Stefan
>
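To make the rr_min_io point concrete, here is a toy Python model of a
round-robin selector within one path group. It only illustrates the behaviour
described above and is not the kernel's round-robin code; the class name, the
helper, and the path name sdd are invented for the example.

```python
class RoundRobinGroup:
    """Toy model: send rr_min_io I/Os down one path, then move to the next."""

    def __init__(self, paths, rr_min_io):
        if not paths or rr_min_io < 1:
            raise ValueError("need at least one path and rr_min_io >= 1")
        self.paths = list(paths)
        self.rr_min_io = rr_min_io
        self.current = 0             # index of the path currently in use
        self.remaining = rr_min_io   # I/Os left before switching paths

    def next_path(self):
        """Return the path that the next I/O request would use."""
        if self.remaining == 0:
            self.current = (self.current + 1) % len(self.paths)
            self.remaining = self.rr_min_io
        self.remaining -= 1
        return self.paths[self.current]


def io_pattern(paths, rr_min_io, n):
    """List the paths used by n consecutive I/O requests."""
    group = RoundRobinGroup(paths, rr_min_io)
    return [group.next_path() for _ in range(n)]


# One path in the group: rr_min_io makes no difference.
print(io_pattern(["sdc"], 1000, 4))
# Two paths in the group: a small rr_min_io interleaves them quickly.
print(io_pattern(["sdc", "sdd"], 2, 6))
```

With a single path per group the pattern is the same for any rr_min_io, and
with several paths a smaller value spreads consecutive requests across them
sooner.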



-- 
Ethan John
http://www.flickr.com/photos/thaen/
(206) 841.4157