[dm-devel] 2.6.10-rc1-udm1: multipath work in progress
christophe varoqui
christophe.varoqui at free.fr
Tue Nov 2 22:22:04 UTC 2004
On Tuesday 02 November 2004 at 20:19 +0000, Alasdair G Kergon wrote:
> Miscellaneous points:
>
> On Tue, Nov 02, 2004 at 08:46:04PM +0100, christophe varoqui wrote:
> > Let the kernel fail them ... as soon as the primary PG paths are
> > exhausted, it will switch to the secondary PG and an event will cause
> > multipathd to reconfigure the table. The secondary will become primary,
> > and failed paths will come back up, grouped in a low prio PG.
>
> Which may require rapid intervention by userspace, or the queue_if_no_paths
> pause to give userspace time to sort things out.
>
Does the following example illustrate what you have in mind?
| pg1 | pg2 |  pg1 maps paths to ctr1, pg2 - ctr2
====================================================================
| A A | A A |  paths in pg2 are marked A but are unusable
| F F | A A |  ctr1 shuts down, ctr2 takes over, now pg2 paths
               are really up, maybe with a little help from
               pg_init_fn. Event is caught by multipathd
|-A -A| A A |  now you want multipathd to disable pg1 and reinstate
               its paths
|-A -A| F F |  so that when ctr2 shuts, kernel can switch over to pg1
               and pray for its paths to be up
| A A |-A -A|  then for multipathd to regularize.
The current model is:
====================================================================
| A A | A A |  paths in pg2 are marked A but are unusable
| F F | A A |  ctr1 shuts down, ctr2 takes over, now pg2 paths
               are really up, maybe with a little help from
               pg_init_fn. Event is caught by multipathd
| A A | A A |  multipathd swaps pg1 and pg2, ctr1 paths are marked up
               by the table reload
| F F | A A |  so that when ctr2 (pg1) shuts, kernel can switch over
               to pg2 and pray for its paths to be up
| A A | A A |  then for multipathd to regularize.
====================================================================
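
For readers who prefer code to tables, the current model can be sketched
roughly as below. This is only a toy model under stated assumptions: the
names PathGroup, kernel_select, fail_controller and reload_table are
hypothetical and are not device-mapper or multipathd code. It also does
not model whether a path marked A really works, which is the "pray for
its paths to be up" part.

```python
# Toy model of the "current model" table; every name here is hypothetical
# and none of it is device-mapper or multipathd code.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PathGroup:
    name: str
    paths: List[str]  # "A" = active, "F" = failed, as in the table

    def usable(self) -> bool:
        return "A" in self.paths


def kernel_select(pgs: List[PathGroup]) -> Optional[PathGroup]:
    """Pick the first PG, in priority order, that still has an active path."""
    for pg in pgs:
        if pg.usable():
            return pg
    return None


def fail_controller(pg: PathGroup) -> None:
    """Model a controller shutdown: every path in its PG is failed."""
    pg.paths = ["F"] * len(pg.paths)


def reload_table(pgs: List[PathGroup]) -> List[PathGroup]:
    """Model the multipathd reload: the PG in use becomes the primary and,
    because a reload starts from fresh state, every path is marked A again."""
    current = kernel_select(pgs)
    ordered = ([current] if current else []) + [pg for pg in pgs if pg is not current]
    return [PathGroup(pg.name, ["A"] * len(pg.paths)) for pg in ordered]


pgs = [PathGroup("ctr1", ["A", "A"]), PathGroup("ctr2", ["A", "A"])]
fail_controller(pgs[0])        # | F F | A A |
in_use = kernel_select(pgs)    # kernel switches to the ctr2 group
pgs = reload_table(pgs)        # | A A | A A | with the ctr2 group now first
```

After the reload the ctr2 group sits in the primary slot, so a later ctr2
failure lets kernel_select fall through to the ctr1 group in the same way.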
> [Consider the primary pg_init_fn finds the paths would be OK but
> aren't current, so fails them all so the currently-preferred secondary can
> be used. But the secondary paths turn out to have genuinely failed so you
> *do* want to use the primary after all, but you can't now. How do you tell
> the primary to *forcibly* use the paths? This method has effectively
> transferred the pg_init_fn to userspace.
>
Note that I saw pg_init_fn as a best-effort function that tries to
activate the paths in a PG that is going to be used as soon as the
function returns, whatever the return value.
If that is not the case, I should reconsider the whole thing, but I
don't understand why you would want to make it any smarter.
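
The best-effort semantics I have in mind could be sketched as below. It
is only an illustration of the idea: switch_to_pg and its signature are
hypothetical, and this is not the device-mapper pg_init interface.

```python
# Hypothetical sketch; not the device-mapper pg_init interface.
from typing import Callable, Tuple


def switch_to_pg(pg: str, pg_init_fn: Callable[[str], bool]) -> Tuple[str, bool]:
    """Run pg_init_fn as a best-effort step, then use the PG regardless."""
    try:
        init_ok = bool(pg_init_fn(pg))
    except Exception:
        init_ok = False      # a failed attempt does not veto the switch
    return pg, init_ok       # the PG becomes current either way


current, init_ok = switch_to_pg("pg2", lambda pg: False)
```

Here the init function reports failure, yet pg2 still becomes the current
PG; the result is kept only as information for whoever wants it.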
> Or it requires giving the
> pg_init_fn complete knowledge of the configuration so it checks both primary
> and secondary PGs before deciding what to do - but then that has an
> equivalent effect to what's already implemented in these patches using PG
> enable/disable. Or you have a 3rd and 4th PG duplicating the 1st & 2nd ones
> but with a new 'force' flag.]
>
> [I see queue_if_no_paths very much as a last resort: it's there
> as an option for not-so-good hardware. In any decent system there should
> never be no paths without catastrophic hardware failure.]
>
So what is wrong with letting it be the default, if it never comes into
play on sane hardware? It seems harmless.
> > We can failback already, with the current design.
> > As I see it, all the "disable PG" feature brings is save some table
> > reloads. Is it worth the added complexity ?
> Performing tables reloads is the complex option IMHO.
> [Even ignoring the suspend/resume queueing issues that aren't
> resolved yet.]
>
I would guess they need resolving anyway.
> Table reloads wipe all knowledge of the existing state from the kernel and
> start afresh,
>
Hey, I actually use that property in the current design :)
> so pg_init_fn's have to be run again etc.
>
Don't they also run when a disabled PG is used as a last resort?
> They also cannot
> avoid allocating memory, which might not be available immediately.
> You can't assume a table reload will succeed and must always have a
> fallback plan in case it fails.
>
That I can't argue against.
But in a low-memory situation I feel your scheme won't bring many more
guarantees: it relies on userspace too, after all.
I guess I've reached the bottom of my chest of arguments. I'll stay a bit
more passive for a while and try to grasp the impact of all this on the
design of the tools.
regards,
--
christophe varoqui <christophe.varoqui at free.fr>