Re: [dm-devel] Designing a new prio_callout

Thanks again, Hannes. We really appreciate your time on this.

Stefan's suggestion will be a great option for round-robin failover for our first release. We'll try to figure out a way to do that. It also sounds like using the ALUA callout is going to be the best long-term solution, which answers the original question that I posed.

As for putting failover paths on a different subnet, that will be up to the user to manage. We won't prevent that sort of thing, and network configurations are extremely flexible on our systems. Subnets also don't necessarily affect the physical network paths on our system.

Again, thanks so much for your help!

On 8/27/07, Hannes Reinecke <hare suse de> wrote:
Ethan John wrote:
> For the record, setting rr_min_io to something extremely large (we're using
> 2 billion now, since I'm assuming it's a C integer) solves the immediate
> problem that we're having (overhead in path switching causing poor
> performance). Telling people to use mpath_prio_random is still less than
> ideal for any small number of iSCSI targets, but it is a better short-term
> solution for us than nothing.
In setting rr_min_io to something extremely large you effectively
disable the round-robin scheduler in multipathing.
That's okay for the failover scenario you have (as you only have
one path per group), but whenever you have more than one path
in a group that wouldn't work anymore.
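
As a rough illustration of the point above, here is a toy model of a round-robin selector that sends rr_min_io requests down one path before moving to the next. It is only a sketch of the behaviour described, not the kernel's actual dm-round-robin code; the function name and the counting of "switches" are assumptions made for the example.

```python
def simulate_round_robin(paths, rr_min_io, total_ios):
    """Toy model: issue rr_min_io I/Os on one path, then move to the next.

    Returns (per-path I/O counts, number of path switches).
    """
    counts = {p: 0 for p in paths}
    switches = 0
    current = 0          # index of the path currently in use
    sent_on_current = 0  # I/Os issued on that path since the last switch
    for _ in range(total_ios):
        if sent_on_current == rr_min_io:
            current = (current + 1) % len(paths)
            sent_on_current = 0
            switches += 1
        counts[paths[current]] += 1
        sent_on_current += 1
    return counts, switches


# Two paths in ONE group: a moderate value spreads the load ...
print(simulate_round_robin(["sdc", "sde"], 100, 1000))
# ... while a huge value never leaves the first path.
print(simulate_round_robin(["sdc", "sde"], 2_000_000_000, 1000))
```

With two paths in the same group, the second call never switches, so the second path carries no I/O at all. That is harmless with one path per group, but it defeats load balancing as soon as a group holds more than one path.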

> On 8/10/07, Ethan John <ethan john gmail com> wrote:
>> Hannes, thanks again for your help with this.
>> I haven't noticed that failback does the right thing, but I'll try it out
>> again. Could be something we're doing wrong. In any case, there's very
>> little documentation on all this, and I'm trying to develop some kind of
>> strategy for our Linux customers to use until we get ALUA implemented.
>> Being able to set path priorities manually would be ideal, but it seems
>> like this is impossible, right?
>> Here's the situation we have right now. I initiate two connections to one
>> target, across two sessions with two different IPs, with two LUs. Multipath
>> looks like this:
>> mpath45 (20002c9020020001a00151b6b46bb57b0) dm-1 company,iSCSI target
>> [size=15G][features=0][hwhandler=0]
>> \_ round-robin 0 [prio=1][active]
>>  \_ 22:0:0:1 sdc 8:32  [active][ready]
>> \_ round-robin 0 [prio=1][enabled]
>>  \_ 23:0:0:1 sde 8:64  [active][ready]
>> mpath44 (20002c9020020001200151b6b46bb57ae) dm-0 company,iSCSI target
>> [size=15G][features=0][hwhandler=0]
>> \_ round-robin 0 [prio=1][enabled]
>>  \_ 22:0:0:0 sdb 8:16  [active][ready]
>> \_ round-robin 0 [prio=1][enabled]
>>  \_ 23:0:0:0 sdd 8:48  [active][ready]
>> Note that there are only two active sessions:
>> # iscsiadm -m session
>> tcp: [20] ,1 iqn.2001-07.com.company:qaiscsi2:blah1
>> tcp: [21] ,2 iqn.2001-07.com.company:qaiscsi2:blah1
>> So the result is that all activity is routed to the first session that was
>> initiated. I want to change the priorities of the paths to allow for traffic
>> to go to the first IP for mpath45 and the second IP for mpath44.
That's a matter of the IP routing. Having both targets on the same (sub-)net
doesn't work very well with multipathing. Please set up your system with
each iSCSI target port in a different subnet, e.g.
,1 iqn.2001-07.com.company:qaiscsi2:blah1
,2 iqn.2001-07.com.company:qaiscsi2:blah1

then you'll have one iSCSI target port per subnet and you can actually
do failover etc.
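
A quick way to check that advice for a given setup is to compare the networks the portals live in. This is a minimal sketch using Python's standard ipaddress module; the portal addresses below are hypothetical documentation-range values, since the real ones do not appear in this thread.

```python
import ipaddress


def on_distinct_subnets(portals):
    """Return True if every portal ('address/prefix') sits in its own subnet."""
    networks = [ipaddress.ip_interface(p).network for p in portals]
    return len(set(networks)) == len(networks)


# Hypothetical portal addresses, one per iSCSI target port.
print(on_distinct_subnets(["192.0.2.10/25", "192.0.2.140/25"]))  # separate subnets
print(on_distinct_subnets(["192.0.2.10/24", "192.0.2.20/24"]))   # same subnet
```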

>> Obviously ALUA is the way to go for this in the future, but we won't have
>> the resources to implement that, so I'm looking for an interim solution that
>> will scale to thousands of clients. Right now, the only thing I can tell
>> people is to manually initiate connections to certain targets through
>> certain IP addresses -- basically, doing the load balancing themselves. Is
>> there a better way?
No, not really. But I'm not a network guru. You may want to ask on
the open-iscsi mailing list.

And you can get all information you need via sysfs, so it should
be possible to create a script like Stefan Bader suggested.
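
One possible shape for such a script, as a sketch only: it reads the SCSI host:channel:target:lun address of a path from the /sys/block/<dev>/device symlink and prints a priority. The policy (LUN n prefers the n-th host, modulo the number of hosts), the script name, and the command-line form are all assumptions made for illustration; it also assumes the callout convention of printing a single integer on stdout, with a higher value meaning a more preferred path.

```python
import os
import sys


def scsi_address(dev, sysfs_root="/sys"):
    """Return (host, channel, target, lun) for a block device such as 'sdc'."""
    link = os.path.join(sysfs_root, "block", dev, "device")
    # The symlink target's last component is the H:C:T:L string, e.g. '22:0:0:1'.
    hctl = os.path.basename(os.path.realpath(link))
    host, channel, target, lun = (int(x) for x in hctl.split(":"))
    return host, channel, target, lun


def path_priority(dev, hosts, sysfs_root="/sys"):
    """Hypothetical policy: LUN n prefers the (n mod len(hosts))-th SCSI host."""
    host, _, _, lun = scsi_address(dev, sysfs_root)
    preferred = sorted(hosts)[lun % len(hosts)]
    return 2 if host == preferred else 1


def main(argv):
    # Hypothetical usage: prio_by_lun.py sdc 22 23
    dev = os.path.basename(argv[1])
    hosts = [int(h) for h in argv[2:]]
    print(path_priority(dev, hosts))


if __name__ == "__main__" and len(sys.argv) >= 3:
    main(sys.argv)
```

Applied to the listing quoted above with hosts 22 and 23, this policy would give sdb (22:0:0:0) and sde (23:0:0:1) priority 2, and sdc and sdd priority 1, so mpath44 and mpath45 would each prefer a different session. It would need to be wired in as the prio_callout together with a priority-based grouping policy; the exact configuration syntax depends on the multipath-tools version in use.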


Dr. Hannes Reinecke                   zSeries & Storage
hare suse de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

Ethan John
(206) 841.4157