[dm-devel] [Multipath] Round-robin performance limit

Adam Chasen adam at chasen.name
Tue Oct 4 20:19:08 UTC 2011


Unfortunately, even after experimenting with various settings, queues,
and other techniques, I was never able to exceed the bandwidth of a
single Ethernet link when accessing a single multipathed LUN.

When communicating with two different multipathed LUNs, which present
as two different multipath devices, I can saturate two links, but it
is still a one-to-one ratio of multipath devices to saturated links.

After further research on multipathing, it appears people are using md
raid to achieve multipathed devices. My initial testing with an md-raid
raid0 device produces the behavior I expect from a multipathed device:
I can easily saturate both links during read operations.
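
For reference, the raid0 test was roughly the following, striping across
the two dm-multipath devices (the device names are placeholders for my
setup):

    # Stripe the two multipathed LUNs together
    mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=64 \
          /dev/mapper/lun0 /dev/mapper/lun1

    # Sequential read against the stripe; reads fan out across both LUNs
    # and therefore across both Ethernet links
    dd if=/dev/md0 of=/dev/null bs=1M count=4096 iflag=direct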

I feel that md-raid is a less elegant solution than dm-multipath, but
it will have to suffice until someone can offer some additional
guidance.

Thanks,
Adam

On Mon, Oct 3, 2011 at 11:08 PM, Adam Chasen <adam at chasen.name> wrote:
> Malahal,
> After your mention of bio-based vs. request-based multipath, I tried to
> determine whether my kernel contains the request-based mpath code. It
> seems that in 2.6.31 all mpath was switched to request-based. My kernels
> are 2.6.31 or later (2.6.35 and 2.6.38, specifically), so I believe I
> have request-based mpath.
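>
> A quick way to double-check, assuming the 2.6.31 cutover above holds:
>
>     uname -r          # should report 2.6.31 or later
>     dmsetup targets   # shows the multipath target (and its version) if loaded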
>
> All,
> There also appears to be a new multipath configuration option
> documented in the RHEL 6 beta documentation:
> rr_min_io_rq    Specifies the number of I/O requests to route to a path
> before switching to the next path in the current path group, using
> request-based device-mapper-multipath. This setting should be used on
> systems running current kernels. On systems running kernels older than
> 2.6.31, use rr_min_io. The default value is 1.
>
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/html/DM_Multipath/config_file_multipath.html
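>
> Something along these lines in /etc/multipath.conf is what I plan to
> try (the values here are only a starting point, not a recommendation):
>
>     defaults {
>         path_grouping_policy    multibus
>         path_selector           "round-robin 0"
>         # request-based kernels (2.6.31+) use this knob
>         rr_min_io_rq            1
>         # older, BIO-based kernels fall back to this one
>         rr_min_io               100
>     }
>
> followed by multipathd -k"reconfigure" (or a multipathd restart) to pick
> up the change.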
>
> I have not yet tested this setting against rr_min_io, or even confirmed
> that my system supports the configuration directive.
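>
> The only check I have come up with so far is dumping the live
> configuration and grepping for the keyword (assuming multipathd parses
> and echoes it back):
>
>     multipathd -k"show config" | grep -i rr_min_io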
>
> If I trust the claims made for several VMware ESX iSCSI multipath
> setups, it is possible (possibly using different software) to scale
> throughput multiplicatively by adding Ethernet links. This makes me
> hopeful that we can do the same with open-iscsi and dm-multipath.
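>
> As far as I can tell, those ESX setups bind one iSCSI session to each
> NIC, which open-iscsi can also do. A sketch (the interface names and
> portal address are placeholders):
>
>     # One iface per NIC, each bound to its physical interface
>     iscsiadm -m iface -I iface-eth0 --op=new
>     iscsiadm -m iface -I iface-eth0 --op=update -n iface.net_ifacename -v eth0
>     iscsiadm -m iface -I iface-eth1 --op=new
>     iscsiadm -m iface -I iface-eth1 --op=update -n iface.net_ifacename -v eth1
>
>     # Discover and log in through both ifaces so dm-multipath sees two paths
>     iscsiadm -m discovery -t sendtargets -p 192.168.1.10 -I iface-eth0 -I iface-eth1
>     iscsiadm -m node -L all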
>
> It could be something obvious I am missing, but a lot of people appear
> to run into this same issue.
>
> Thanks,
> Adam
>
> On Tue, May 3, 2011 at 6:12 AM, John A. Sullivan III
> <jsullivan at opensourcedevel.com> wrote:
>> On Mon, 2011-05-02 at 22:04 -0700, Malahal Naineni wrote:
>>> John A. Sullivan III [jsullivan at opensourcedevel.com] wrote:
>>> > I'm also very curious about your findings on rr_min_io.  I cannot find
>>> > my benchmarks but we tested various settings heavily.  I do not recall
>>> > if we saw more even scaling with 10 or 100.  I remember being surprised
>>> > that performance with it set to 1 was poor.  I would have thought that,
>>> > in a bonded environment, changing paths per iSCSI command would give
>>> > optimal performance.  Can anyone explain why it does not?
>>>
>>> rr_min_io of 1 will give poor performance if your multipath kernel
>>> module doesn't support request-based multipath. With BIO-based
>>> multipath, the multipath layer receives 4KB BIOs, and such requests
>>> can't be coalesced into larger ones if they are sent down different
>>> paths.
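>>>
>>> As an illustration, with rr_min_io set to 1 on a BIO-based kernel you
>>> can watch the average request size on the individual paths stay pinned
>>> at 4KB (the path device names are placeholders):
>>>
>>>     # avgrq-sz is in 512-byte sectors, so a value stuck around 8 means
>>>     # the 4KB BIOs are reaching the paths unmerged
>>>     iostat -x sda sdb 2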
>> <snip>
>> Ah, that makes perfect sense, and it explains why 3 seems to be the
>> magic number in Linux (roughly 4096 / 1460, or whatever the IP payload
>> works out to). Does that change with jumbo frames? And how would that
>> be optimized in Linux?
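>>
>> Back-of-the-envelope, assuming TCP/IPv4 with no options and ignoring
>> the iSCSI PDU header:
>>
>>     MTU 1500: 1500 - 40 = 1460 bytes payload -> ceil(4096 / 1460) = 3 frames per 4KB page
>>     MTU 9000: 9000 - 40 = 8960 bytes payload -> one frame carries a full 4KB page (two fit)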
>>
>> 9KB seems to be a common jumbo frame size across vendors, and that
>> should hold two pages, but I would guess Linux can't take advantage of
>> it if each block must be independently acknowledged. Is that correct?
>> Would a frame size of a little over 4KB therefore be optimal for Linux?
>>
>> Would that mean that rr_min_io of 1 becomes optimal? However, if each
>> block needs to be acknowledged before the next is sent, I would think
>> we are still latency bound: even if I can send four requests down four
>> separate paths, I cannot send the second until the first has been
>> acknowledged. Since I can easily place four packets on the same path
>> within the latency period of four packets, multibus gives me no
>> performance advantage for a single iSCSI stream and only proves useful
>> once I start multiplexing multiple iSCSI streams.
>>
>> Is that analysis correct? If so, what constitutes a separate iSCSI
>> stream? Are two separate file requests from the same file system to the
>> same iSCSI device considered two iSCSI streams, which can be
>> multiplexed and benefit from multipath, or are they considered part of
>> the same stream? If they are one stream, do they become two if they
>> reside on different partitions and thus different file systems? If not,
>> do we only see multibus performance gains between a single file system
>> host and a single iSCSI host when we use virtualization, with each
>> virtual machine making its own iSCSI connection (as opposed to making
>> the iSCSI connections in the underlying host and exposing them to the
>> virtual machines as local storage)?
>>
>> I hope I'm not hijacking this thread, and I realize I've asked some
>> convoluted questions, but optimizing multibus over bonded links for a
>> single large host is still a bit of a mystery to me.  Thanks - John
>>
>> --
>> dm-devel mailing list
>> dm-devel at redhat.com
>> https://www.redhat.com/mailman/listinfo/dm-devel
>>
>



