[dm-devel] RFC: multipath IO multiplex
Lars Marowsky-Bree
lmb at novell.com
Sat Nov 6 17:03:38 UTC 2010
On 2010-11-06T05:32:03, Neil Brown <neilb at suse.de> wrote:
> Hi Lars,
> the only issue that occurs to me is that if you want to report the first
> success, then you need to copy the data to a private buffer before
> submitting the write. Then wait for all writes to complete before freeing
> the buffer. If you just return the first write the page would be unlocked
> and so could be changed while another path was still writing it out.
Right. This is, in a way, a mix of MPIO / RAID1 handling. We'd indeed
need to keep several copies of the write block - thankfully, we write really
rarely and only one sector at a time, so the memory consumption is
trivial.
(However, we _really_ want to get those writes to disk. Right away.)
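(To check that we mean the same thing, here is a rough user-space sketch of
the scheme Neil describes: copy the data to a private buffer, submit it down
every path, report the first success, and only let go of the buffer once all
writes have finished. This is purely illustrative and not how the kernel side
would look; write_all_paths(), _write_one() and the 512-byte sector size are
names and values I made up, and the "paths" are simply file or device names.)

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

SECTOR = 512  # assumed sector size, for illustration only


def _write_one(path, offset, buf):
    """Synchronously write buf at offset down a single path."""
    # O_DSYNC asks for the data to be on stable storage before pwrite returns;
    # fall back to 0 on platforms that do not define it.
    fd = os.open(path, os.O_WRONLY | getattr(os, "O_DSYNC", 0))
    try:
        if os.pwrite(fd, buf, offset) != len(buf):
            raise OSError("short write on %s" % path)
    finally:
        os.close(fd)
    return path


def write_all_paths(paths, offset, data):
    """Submit data down every path and return after the first success.

    Returns (first_successful_path, futures). The caller waits on the
    futures before considering the private copy released.
    """
    if not paths:
        raise ValueError("need at least one path")
    buf = bytes(data)  # private copy: the caller may change `data` after we return
    pool = ThreadPoolExecutor(max_workers=len(paths))
    futures = [pool.submit(_write_one, p, offset, buf) for p in paths]
    first_ok, errors = None, []
    for fut in as_completed(futures):
        try:
            first_ok = fut.result()
            break  # first success: report it without waiting for the rest
        except OSError as exc:
            errors.append(exc)
    pool.shutdown(wait=False)  # the remaining writes keep running
    if first_ok is None:
        raise OSError("write failed on all paths: %r" % errors)
    return first_ok, futures
```

The caller can reuse its own buffer as soon as write_all_paths() returns, and
calls concurrent.futures.wait() on the returned futures before treating the
private copy as gone.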
> Finding a way to signal 'write all paths' sounds tricky. This flag needs to
> be state of the file descriptor, not the whole device, so it would need to be
> an fcntl rather than an ioctl. And defining new fcntls is a lot harder
> because they need to be more generic - you cannot really make them device
> specific...
> Might it make sense to configure a range of the device where writes always
> went down all paths? That would seem to fit with your problem description
> and might be easiest??
Technically, it'd be possible, because that section is contiguous on
the disk, yes.
(Note that we don't open a real file in a file system, but use a raw
block device; however, that could be a partition on top of MPIO.)
But I'm a bit unclear how we'd define that; clearly, we don't want to
bypass multipathd's management of the MPIO mapping, that being the whole
reason we don't just handle this in user-space ;-)
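(The routing decision itself would be the easy part. A minimal sketch,
assuming the region is given as a start sector and a length; the function
name and parameters are hypothetical, and how the region would actually get
configured is exactly the open question.)

```python
def paths_for_write(sector, nr_sectors, region_start, region_len):
    """Decide whether a write goes down all paths or just one.

    All arguments are in sectors; nr_sectors and region_len are assumed > 0.
    A write that touches the configured region at all is sent down all paths.
    """
    write_end = sector + nr_sectors          # exclusive end of the write
    region_end = region_start + region_len   # exclusive end of the region
    overlaps = sector < region_end and region_start < write_end
    return "all" if overlaps else "one"
```

Treating a partial overlap as "all" is a choice made here for simplicity;
with single-sector writes the distinction should not matter much.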
Hrm. I already have a dm-linear mapping (thanks to kpartx; otherwise
it's trivially introduced). I could modify that to include a special
flag that would mangle the bios that pass through - so I could set a bio
flag that multipath could then act on ...?
(There's precedent; the failfast bio flag.)
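(To make the idea concrete, here is a toy user-space model of that flow, not
kernel code. Bio, LinearTarget, MultipathTarget and BIO_ALL_PATHS are all
made up for illustration; no such flag exists today, and the round-robin
path selection is just a stand-in for whatever the path selector would do.)

```python
from dataclasses import dataclass

BIO_ALL_PATHS = 1 << 0  # hypothetical flag: "send this bio down every path"


@dataclass
class Bio:
    sector: int
    nr_sectors: int
    flags: int = 0


class LinearTarget:
    """Models a linear mapping that can optionally tag the bios it remaps."""

    def __init__(self, start, tag_all_paths=False):
        self.start = start
        self.tag_all_paths = tag_all_paths

    def map(self, bio):
        bio.sector += self.start  # remap onto the underlying device
        if self.tag_all_paths:
            bio.flags |= BIO_ALL_PATHS
        return bio


class MultipathTarget:
    """Models multipath acting on the flag set by the layer above."""

    def __init__(self, paths):
        self.paths = list(paths)
        self._next = 0

    def map(self, bio):
        if bio.flags & BIO_ALL_PATHS:
            return list(self.paths)  # flagged: use every path
        path = self.paths[self._next % len(self.paths)]
        self._next += 1
        return [path]  # unflagged: ordinary single-path selection
```

A bio passing through LinearTarget(2048, tag_all_paths=True) and then
MultipathTarget would be returned with every path, while one coming through
an untagged mapping gets a single path as usual.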
Regards,
Lars
--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde