[dm-devel] 2.6.10-udm1 (RFC for refactoring)

Fri Jan 21 12:01:18 UTC 2005

On 2005-01-12T20:56:19, Alasdair G Kergon <agk at redhat.com> wrote:

> ftp://sources.redhat.com/pub/dm/patches/2.6-unstable/2.6.10/2.6.10-udm1.tar.bz2
> 
> Rebases to 2.6.10.

Hi Alasdair,

We've been discussing whether the hardware handling should rather be at
the priority group (PG) level or even at the path level; apparently some
configurations (e.g. mixing iSCSI with FC-AL) might require it to be at
the path level.

Come to think of it, that way leads to madness: the table syntax gets
more and more complex, there is more state data to keep track of, et
cetera. And the more state and configuration data there is, the harder
it is to extend in a backwards-compatible manner, which is one of the
issues to keep in mind before submitting upstream.

I propose a different approach, a more radical simplification of the
design as it stands.

Keep the hardware-specific handler at the whole-map level. Abolish
priority groups. Instead, if someone wants priority groups, have them
stack multipath maps into a hierarchy. Introduce a "first-working-path"
scheduler in addition to the round-robin scheduler, for use with
active/passive arrays at the top level.

Instead of

0 280278656 multipath 0 1 emc 2 2 round-robin 0 2 1 66:128 1000 8:160 1000 round-robin 0 2 1 67:112 1000 65:144 1000

We'd have

0 280278656 multipath 0 1 emc 1 round-robin 0 2 1 8:0 1000 8:16 1000
0 280278656 multipath 0 1 emc 1 round-robin 0 2 1 8:32 1000 8:64 1000
0 280278656 multipath 0 0 1 first 0 2 0 253:0 253:1
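
To make the stacking concrete, here is a toy user-space model in Python.
It is only a sketch, not kernel code: the class names (Path, RoundRobin,
FirstWorking) and the usable()/select() methods are made up for
illustration, and the per-path 1000 argument from the tables is ignored.
It mirrors the three maps above, with two round-robin maps stacked under
one "first" map.

```python
# Toy model of stacked multipath maps. Hypothetical names throughout;
# this is not how the dm-multipath kernel code is structured.

class Path:
    """A leaf device such as 8:0 that is either usable or failed."""
    def __init__(self, dev):
        self.dev = dev
        self.failed = False

    def usable(self):
        return not self.failed

    def select(self):
        return self.dev if self.usable() else None


class RoundRobin:
    """Rotate over the children that are currently usable."""
    def __init__(self, children):
        self.children = children
        self.next = 0  # index of the child to try first on the next call

    def usable(self):
        return any(c.usable() for c in self.children)

    def select(self):
        n = len(self.children)
        for i in range(n):
            child = self.children[(self.next + i) % n]
            if child.usable():
                # Resume after the child we just picked.
                self.next = (self.next + i + 1) % n
                return child.select()
        return None


class FirstWorking:
    """Always use the first usable child; later children are standbys."""
    def __init__(self, children):
        self.children = children

    def usable(self):
        return any(c.usable() for c in self.children)

    def select(self):
        for child in self.children:
            if child.usable():
                return child.select()
        return None


# Two lower maps (253:0 and 253:1 in the tables) under a "first" map.
a, b, c, d = Path("8:0"), Path("8:16"), Path("8:32"), Path("8:64")
top = FirstWorking([RoundRobin([a, b]), RoundRobin([c, d])])
```

With all paths healthy, top.select() alternates between 8:0 and 8:16;
the second map is only consulted once both of those have failed.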

A stack mixing an active/passive array with active/active iSCSI could
look like this, for example. (Maybe it could be done even more
efficiently, I'm not sure. It's just an example.)

0 280278656 multipath 0 1 emc 1 round-robin 0 2 1 8:0 1000 8:16 1000
0 280278656 multipath 0 1 emc 1 round-robin 0 2 1 8:32 1000 8:64 1000
0 280278656 multipath 0 1 iscsi 1 round-robin 0 2 1 8:80 1000 8:96 1000
0 280278656 multipath 0 0 1 round-robin 0 2 1 253:0 1000 253:2 1000
0 280278656 multipath 0 0 1 round-robin 0 2 1 253:1 1000 253:2 1000
0 280278656 multipath 0 0 1 first 0 2 0 253:3 253:4

The benefits I see:

The code would become more compact again. The more complexity we keep
adding right now, the more overlap arises between how to select a path
_within_ a priority group and the one-level-up decision of selecting
which priority group to use. The desire to describe more and more
complex scenarios would make the code more and more complex,
potentially requiring more and more layers to be introduced, or a
special priority-group-selector etc, which I don't like at all. Someone
might even come up with a recursive parser for the multipath table
syntax! ;-)

Instead, the above factors this into a neat little building block which
can be stacked as high as the user needs and desires. It keeps the code
compact, actually reduces the data structures by one level again, and
uses stacking of DM tables instead of said recursive parser.

Or would that be too extreme?

The actual computations happening would be much the same; I don't think
there'd be any performance penalty at all.

Sincerely,
    Lars Marowsky-Brée <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business