[dm-devel] 2.6.10-rc1-udm1: multipath work in progress

Lars Marowsky-Bree lmb at suse.de
Tue Nov 2 22:28:42 UTC 2004


On 2004-11-02T20:46:04, christophe varoqui <christophe.varoqui at free.fr> wrote:

> > The paths are _not_ failed. It's just that we have switched to the other
> > priority group.
> Let the kernel fail them ... as soon as the primary PG paths are
> exhausted, it will switch to the secondary PG and an event will cause
> multipathd to reconfigure the table. The secondary will become primary,
> and failed paths will come back up, grouped in a low prio PG.

But they are not failed! *whine* They'd very likely be usable again if
we sent them an initialization command.

And that is exactly what we'd have to do if we had no healthy paths left
in the other PG(s). But we couldn't do that if we had already failed them.

> > Failing the paths is _wrong_. We _could_ force a failback to the PG if
> > we found that we had no other path in the other PG (because all of them
> > have failed, for example.)
> We can failback already, with the current design.

No, because you just failed all paths.

> > A PG not being active is _distinct_ from a PG having no healthy paths.
> > 
> > If and when to coordinate the switch-back to the default priority group
> > is a user-space issue.
> > 
> Our disagreement here seems to come from your wanting to do predictive
> switch-over, and my arguing that if we have a good "reactive behaviour",
> we'll be safe with it in any case.

I'm not saying anything about "predictive" behaviour. I'm saying we need
to _react_ correctly. If we get a "unit not ready" response, we switch
to the other PG. Why we got it, I'm not trying to predict; it's just
that "something" switched the PG away from under us. And because we
don't know what did it, we follow its lead.
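
To make that reactive rule concrete, here is a rough sketch in C
(entirely made-up structures and names, not the actual dm-mpath code):
on a "unit not ready" condition we mark the current PG as bypassed and
move I/O to the next PG that still has a healthy path; the paths of the
bypassed PG keep their healthy state, and only if no alternative exists
do we force the bypassed group back into service.

  /* Illustrative sketch only -- not the kernel dm-mpath code. */
  #include <stdio.h>

  enum path_state { PATH_GOOD, PATH_FAILED };

  struct pg {
        int bypassed;                   /* group not used right now  */
        enum path_state paths[2];       /* health of the paths in it */
  };

  /* React to "unit not ready": something else moved the LU to another
   * PG, so bypass this group but leave its paths marked healthy. */
  static int react_unit_not_ready(struct pg *pgs, int npgs, int cur)
  {
        int i;

        pgs[cur].bypassed = 1;          /* do NOT fail the paths */

        for (i = 0; i < npgs; i++) {
                if (i == cur || pgs[i].bypassed)
                        continue;
                if (pgs[i].paths[0] == PATH_GOOD ||
                    pgs[i].paths[1] == PATH_GOOD)
                        return i;       /* route I/O via this PG */
        }

        /* No healthy alternative left: only now do we force the
         * bypassed group back (send the initialization command). */
        pgs[cur].bypassed = 0;
        return cur;
  }

  int main(void)
  {
        struct pg pgs[2] = {
                { 0, { PATH_GOOD, PATH_GOOD } },   /* primary   */
                { 0, { PATH_GOOD, PATH_GOOD } },   /* secondary */
        };

        printf("active PG after event: %d\n",
               react_unit_not_ready(pgs, 2, 0));
        return 0;
  }

The only point of the sketch is that the per-PG "bypassed" state is
completely separate from per-path health.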

Failing the paths would be wrong. They are not failed. They are healthy.
They are just not used right now. We _could_ force a switch-back if we
absolutely had to.

Whether this is expressed by switching the order of the PGs around or by
having a bypassed flag is something we could argue about, but I'm very
much convinced of the need for this distinction in principle.
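
As a strawman for how the two representations differ (again, invented
names, just to illustrate the argument):

  /* Two ways to say "this PG is not in use, but its paths are fine".
   * Types and names are made up for illustration. */

  struct prio_group;                     /* paths plus their health */

  /* (a) Reorder: the list order itself says which group to use;
   *     switching groups means moving the new group to the front,
   *     so the original priority has to be remembered elsewhere if
   *     you ever want to fail back. */
  struct mp_reorder {
        struct prio_group *pgs[4];       /* pgs[0] is the one in use */
        int npgs;
  };

  /* (b) Bypassed flag: the configured priority order never changes;
   *     each group carries a bit meaning "healthy, just skip me". */
  struct mp_flag {
        struct prio_group *pgs[4];
        int bypassed[4];
        int npgs;
        int active;                      /* group currently in use */
  };

Variant (b) is what the sketch above assumes; it keeps the default PG
identifiable, which matters for the switch-back below.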

The switch-back (to the default PG, if one is defined) should not be
initiated automatically by the kernel, but by user-space (ie,
multipathd), after the paths have been available again for a certain
time, and only if we were the node which originally switched the PG (ie,
not a node which merely followed someone else's lead). This would catch
most scenarios.
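
A minimal sketch of that user-space policy (made-up names and an assumed
settle time, not actual multipathd code):

  /* Sketch of the failback decision in the daemon, not real multipathd. */
  #include <stdbool.h>
  #include <time.h>

  #define FAILBACK_SETTLE_SECS 60          /* assumed settle time */

  struct lu_state {
        bool we_switched_the_pg;           /* this node initiated the switch */
        time_t default_pg_healthy_since;   /* 0 if the default PG has no
                                              healthy path at the moment */
  };

  bool should_fail_back(const struct lu_state *lu, time_t now)
  {
        if (!lu->we_switched_the_pg)
                return false;              /* we only followed a lead */
        if (lu->default_pg_healthy_since == 0)
                return false;              /* nothing to fail back to yet */
        return now - lu->default_pg_healthy_since >= FAILBACK_SETTLE_SECS;
  }

The "did we initiate it" check is what keeps a node that merely followed
someone else's switch from fighting over the group.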

In more complex scenarios, the switching of LUs from one PG to the other
might even be coordinated by smarter cluster software.

In a single node scenario, it's easier. You can switch back to the
default PG as soon as a path there is healthy again (but even then,
giving it some time to settle may be wise).


Sincerely,
    Lars Marowsky-Brée <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business



