[dm-devel] 2.6.10-rc1-udm1: multipath work in progress

christophe.varoqui at free.fr christophe.varoqui at free.fr
Tue Nov 2 16:07:58 UTC 2004


Quoting Alasdair G Kergon <agk at redhat.com>:

> On Tue, Nov 02, 2004 at 12:03:01AM +0100, christophe varoqui wrote:
> > > But a table load flag to suppress the device size checks does sound OK.
> > Or just move the test at the end of the PG switch-on procedure for all
> > multipath tables ? It would keep the API less complex ...
> But if the test fails then what?  Drop the whole table?
> Rather I/O to the missing part gets errored i.e. same as always
> suppressing the check.  (And it's necessary for dm to cope with
> device resizing anyway.)
>
Anyway.
I won't argue on this one.

> > What problem do we try to solve here ? Planned outages, like controller
> > restart or firmware upgrades.
> ie all paths fail simultaneously, but recover quickly
>
> > If so, I guess we can go for queue_if_no_path for all and just ask
> > userspace the time & queued ios threshold before failing.
> Thought about a timer, but not persuaded it gains us anything:
> In the current model, only userspace can reinstate a path, so userspace
> is required to intervene to resolve the situation - it might as
> well handle any timeouts it wants itself.
>
> Having it as 'feature args' means we could implement alternatives
> later and add other things without breaking the API.
>
I respectfully disagree.

As you said to Stefan Bader, we need a concrete example of use of the "feature
arguments" API: the only behaviour I see as useful is "queue_if_no_path". Your
suggestion that userspace should take care of timers means no argument is
needed at all, which I see as a good thing.
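
To make the division of labour concrete, here is a rough sketch of the model
being discussed: the map queues I/O while no path is usable, and userspace
owns both path reinstatement and the "give up" decision. It is a toy
simulation only, not kernel or multipath-tools code. The names QueueingMap,
userspace_tick, max_wait and max_queued are hypothetical and chosen for
illustration.

```python
from collections import deque


class QueueingMap:
    """Toy model of a multipath map behaving as with queue_if_no_path."""

    def __init__(self, paths):
        self.usable = set(paths)   # paths currently usable
        self.queued = deque()      # (enqueue_time, io) held while no path is usable
        self.completed = []
        self.failed = []

    def submit(self, io, now):
        if self.usable:
            self.completed.append(io)
        else:
            # Queue instead of returning an error to the caller.
            self.queued.append((now, io))

    def fail_path(self, path):
        self.usable.discard(path)

    def reinstate_path(self, path):
        # In the model discussed, only userspace reinstates a path.
        self.usable.add(path)
        while self.queued:
            self.completed.append(self.queued.popleft()[1])

    def fail_queued(self):
        # Userspace gives up: error everything still queued.
        while self.queued:
            self.failed.append(self.queued.popleft()[1])


def userspace_tick(m, now, max_wait, max_queued):
    """Hypothetical userspace policy: time and queue-depth thresholds."""
    if not m.queued:
        return
    oldest = m.queued[0][0]
    if now - oldest > max_wait or len(m.queued) > max_queued:
        m.fail_queued()


m = QueueingMap(["A", "B"])

# Planned outage: all paths fail at once, then recover quickly.
m.fail_path("A")
m.fail_path("B")
m.submit("io1", now=0)
userspace_tick(m, now=5, max_wait=30, max_queued=100)   # within limits, keep queueing
m.reinstate_path("A")                                   # io1 is flushed

# Outage that lasts too long: userspace decides to fail the queued I/O.
m.fail_path("A")
m.submit("io2", now=10)
userspace_tick(m, now=50, max_wait=30, max_queued=100)
```

In this sketch the map itself needs no timer and no argument beyond the
queueing behaviour; the thresholds live entirely in the userspace policy.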

> > I don't quite see the benefits of PG disabling feature.
> > As far as I can see, all it brings is permitting kernel code to change
> > the maps, which seems like enabling policies in the kernel : from
> > userspace, we have the same effect by instantiating the PG at the tail
> > of the params string.
>
> Kernel multipath always chooses the first available path in PG order.
> The disabling/enabling of PGs copes with the case when switching
> PG incurs a serious performance penalty (maybe across a cluster).
> i.e. you don't want to call pg_init_fn more than necessary.
>
>   PG 1 - path A
>   PG 2 - path B
>   PG 3 - path C
>
> Path A fails; it starts using path B.
> Path A becomes available again.  If userspace reinstates it, then it
> will immediately start being used and PG 1's initialisation function
> will run.  But you'd prefer to continue using path B until it
> fails, and only then switch back to path A.  [Or revert to the
> preferred path at a time of your choosing when the system is not
> busy.]  So you set PG 1 & PG 3 to 'disabled' before reinstating path A.
>
> An alternative would have been to reload the table with PG 1 and PG2
> swapped over - and table reloading is also an expensive operation
> (and doesn't deal with queued I/O properly yet).
> Other interface options could have been to let you change the
> order of PGs dynamically (needs lots more code) or to just have a
> 'sticky' flag so it doesn't change PG until it has no choice (less
> flexible - maybe you want to switch to path C next rather than A).
>
Ok, from this description I still think the work would be better invested in
making table reloads less painful: that would benefit every other target and
keep the multipath one simple.
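
For readers following along, the PG 1 / PG 2 / PG 3 scenario quoted above can
be sketched as a small simulation. This is only an illustration of the
selection rule as described in the quote, not the kernel implementation. The
names PriorityGroup and choose_pg are hypothetical, and the fallback to
disabled PGs when nothing else is usable is an assumption drawn from the
"only then switch back to path A" wording.

```python
class PriorityGroup:
    def __init__(self, name, paths):
        self.name = name
        self.paths = dict.fromkeys(paths, True)   # path name -> usable?
        self.disabled = False

    def has_usable_path(self):
        return any(self.paths.values())


def choose_pg(pgs):
    # First available PG in table order, skipping disabled ones.
    for pg in pgs:
        if not pg.disabled and pg.has_usable_path():
            return pg
    # Assumption: a disabled PG is still used when nothing else is usable.
    for pg in pgs:
        if pg.has_usable_path():
            return pg
    return None


pgs = [
    PriorityGroup("PG1", ["A"]),
    PriorityGroup("PG2", ["B"]),
    PriorityGroup("PG3", ["C"]),
]
pg1, pg2, pg3 = pgs

# Path A fails; I/O moves to PG2.
pg1.paths["A"] = False
after_a_fails = choose_pg(pgs).name        # "PG2"

# Disable PG1 and PG3 before reinstating A, so PG2 stays in use.
pg1.disabled = pg3.disabled = True
pg1.paths["A"] = True
after_reinstate = choose_pg(pgs).name      # "PG2"

# Only when B fails does the selection return to PG1.
pg2.paths["B"] = False
after_b_fails = choose_pg(pgs).name        # "PG1"
```

Without the two disabled flags, reinstating A would make choose_pg return PG1
at once, which is the extra PG switch the quoted text wants to avoid.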

I really fear that making the API too rich (i.e. complex) will compromise the
reliability of userspace decisions.

regards,
cvaroqui



