[dm-devel] Adding/removing multi-pathed disk partitions

Lars Marowsky-Bree lmb at suse.de
Wed Jan 26 08:52:11 UTC 2005


On 2005-01-25T23:34:25, christophe varoqui <christophe.varoqui at free.fr> wrote:

Yeah, the issue of partitioning device mapper targets has come up here
too. It's definitely an important discussion to have right now.

> Maybe that's not a problem, as we don't need the kernel partitioning
> code. Though it's not nice to have the kernel not in sync with reality.

This is the first issue: The lower level devices not showing the "right"
partition table when the partition table is changed via the higher level
device (in this case, dm multipath).

I think this probably needs to be solved in-kernel: When DM bd_claim()s
the whole disk (i.e., /dev/sda), the kernel should unmap /dev/sda[0-9]+.
bd_claim() has declared this one owner (the multipath table) to be the
only way of accessing the device, and that's that.

When the multipath table is released, the kernel can reread the device's
partition table, and the partition entries would reappear.

(One issue: If any partition on the device was already bd_claim()ed at
the time when multipath tried to bd_claim() the whole disk, the full-disk
claim should error out.)
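
To make the proposed claim semantics concrete, here is a small user-space
model. It is only a sketch of the behaviour described above, not kernel
code: the Disk class, its method names and the holder strings are all
hypothetical and do not correspond to any real kernel interface.

```python
class ClaimError(Exception):
    """Raised when a device cannot be claimed exclusively."""


class Disk:
    """Toy model of a whole disk and its partition nodes (hypothetical)."""

    def __init__(self, name, partitions):
        self.name = name
        self.partitions = list(partitions)  # e.g. ["sda1", "sda2"]
        self.owner = None                   # holder of the whole-disk claim
        self.partition_owners = {}          # partition name -> holder

    def claim_partition(self, part, holder):
        # Partitions are unmapped while the whole disk is claimed.
        if self.owner is not None:
            raise ClaimError(f"{self.name} is claimed by {self.owner}")
        if part not in self.partitions:
            raise ClaimError(f"{part} is not a partition of {self.name}")
        if part in self.partition_owners:
            raise ClaimError(f"{part} is already claimed")
        self.partition_owners[part] = holder

    def claim_whole(self, holder):
        if self.owner is not None:
            raise ClaimError(f"{self.name} is claimed by {self.owner}")
        # The full-disk claim errors out if any partition is already held.
        if self.partition_owners:
            raise ClaimError(f"{self.name} has claimed partitions")
        self.owner = holder

    def release_whole(self, holder):
        if self.owner != holder:
            raise ClaimError(f"{self.name} is not claimed by {holder}")
        # Releasing stands in for the kernel rereading the partition table.
        self.owner = None

    def visible_nodes(self):
        # Only the whole-disk node stays visible while it is claimed.
        if self.owner is not None:
            return [self.name]
        return [self.name] + self.partitions
```

In this model, claiming "sda" for a multipath holder hides sda1 and sda2
until the claim is released, and the claim fails outright if sda1 is
already held by someone else.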

> If kpartx reads the partition table directly from disk, i.e. bypassing the
> cache, the correct layout will be mapped. Note it requires a manual
> execution, as no hotplug event is generated upon fdisk's sync. This also
> applies to "blockdev --rereadpt" anyway.

Now, second issue: Having the kpartx-generated mappings be updated when
someone uses fdisk, sfdisk or similar tools. This one is reasonably simple:
Generate a hotplug event for the rereadpt ioctl() and map it to kpartx
rescanning the table. Et voila.
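
The user-space half of that could look roughly like the sketch below. The
event does not exist yet, so its shape is an assumption: the SUBSYSTEM,
ACTION and DEVNAME keys and the "change" action are borrowed from
udev-style events, and the function name is hypothetical. The -u (update)
flag is taken from current kpartx; whether the kpartx discussed here has
it is also an assumption. The function only builds the command line and
leaves running it to the caller.

```python
def kpartx_command_for_event(env):
    """Return the kpartx argv for a hotplug-style event, or None to ignore it.

    `env` is a dict of event variables; its keys are assumed, not taken
    from an existing kernel event.
    """
    if env.get("SUBSYSTEM") != "block":
        return None
    # Assumed: a partition table reread is reported as a "change" event.
    if env.get("ACTION") != "change":
        return None
    devname = env.get("DEVNAME")
    if not devname:
        return None
    # Ask kpartx to update the partition mappings for this device.
    return ["kpartx", "-u", devname]
```

A hotplug agent would pass the returned list to something like
subprocess.run() when it is not None.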

There's a third issue, namely being able to figure out from user-space
that a /dev/498984348484304p1 is actually a partition of the ...304
device. (This might be needed for grouping the partition entries
correctly in various admin tools.) But, short of storing a list of
"parent" dev entries in sysfs, I think this would need to be solved by
them parsing the DM table if they really cared...
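
As a rough illustration of what "parsing the DM table" might involve, the
sketch below reads text in the format printed by "dmsetup table" and
reports, for every map that consists of a single linear target, the
major:minor of the device it sits on. The function name and the sample
table are made up for illustration, and treating "one linear target" as
"partition" is a heuristic assumption, not something DM itself guarantees.

```python
def parents_from_dm_table(table_text):
    """Map each single-target linear DM device to its backing 'major:minor'."""
    targets = {}
    for line in table_text.splitlines():
        # Each line looks like "<name>: <start> <length> <target> <args...>".
        name, sep, rest = line.partition(": ")
        if not sep:
            continue
        targets.setdefault(name, []).append(rest.split())

    parents = {}
    for name, rows in targets.items():
        if len(rows) != 1:
            continue  # multi-target maps are not simple partitions
        fields = rows[0]
        # A linear target has the form: start length "linear" device offset
        if len(fields) == 5 and fields[2] == "linear":
            parents[name] = fields[3]
    return parents


# Made-up sample: one multipath map and two partitions mapped on top of it.
SAMPLE_TABLE = """\
mpath0: 0 71132000 multipath 0 0 1 1 round-robin 0 2 1 8:16 1000 8:32 1000
mpath0p1: 0 1024000 linear 253:0 63
mpath0p2: 0 70107937 linear 253:0 1024063
"""

if __name__ == "__main__":
    for part, parent in sorted(parents_from_dm_table(SAMPLE_TABLE).items()):
        print(part, "->", parent)
```

Turning the major:minor back into a map name is left out here; a tool
would still need a second lookup for that.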

> 2) flames to me for suggesting we could get away altogether with the
> kernel partitioning code ?

This one has been suggested a number of times to me, i.e., use the
in-kernel partition code to partition the DM (multipath) device, much
like the md entries can be partitioned.

For a while, I actually thought this was a good idea ;-) But then I've
had to reconsider. It's much more powerful to have the partitions be
feature-complete DMs, because then they can be remapped, snapshotted and
so on; so eventually one of the volume managers would want to implement
them this way, and then all the issues would pop up again. I'd rather
solve them like this right now.


Sincerely,
    Lars Marowsky-Brée <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business