
Re: [dm-devel] Re: [PATCH v2] dm: add topology support

>>>>> "Mike" == Mike Snitzer <snitzer redhat com> writes:

Mike> When a table is pushed to DM each target device in the table may
Mike> have different limits.  There is no one-size-fits-all default.

But incrementally finding the one size that does fit all (DM dev as well
as targets) is the raison d'être for blk_stack_limits().
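
To illustrate what "incrementally" means here, a rough sketch follows.
It is a simplified model, not the actual blk_stack_limits() code:
struct toy_limits, toy_max() and toy_stack_limits() are made-up names,
and only three of the limits are modelled.  Each target folded in can
raise the stacked values but never lower them.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a request queue's limits. */
struct toy_limits {
	unsigned int logical_block_size;   /* bytes */
	unsigned int physical_block_size;  /* bytes */
	unsigned int io_min;               /* bytes */
};

static unsigned int toy_max(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Fold the limits of one bottom device (b) into the top device (t).
 * Called once per target, so the top device converges on values
 * that satisfy every device folded in so far.
 */
static void toy_stack_limits(struct toy_limits *t, const struct toy_limits *b)
{
	assert(t && b);

	t->logical_block_size =
		toy_max(t->logical_block_size, b->logical_block_size);
	t->physical_block_size =
		toy_max(t->physical_block_size, b->physical_block_size);
	t->io_min = toy_max(t->io_min, b->io_min);

	/* Assumption of this sketch: io_min is at least one physical block. */
	t->io_min = toy_max(t->io_min, t->physical_block_size);
}
```

Starting from 512-byte defaults and folding in one 512/512 target and
one 512/4096 target, the sketch ends up with a 512-byte logical block
size and a 4096-byte physical block size and io_min.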

The default limits should all be set by the block layer when setting up
the request queue.  So my reason for inquiring was to figure out whether
check_for_valid_limits() actually makes any difference.

Mike> With DM, the underlying device that you're sending IO to is a
Mike> function of offset into the DM device.  Therefore, the associated
Mike> IO limits should really be a function of offset.

This was one of the discussion topics at the Storage and Filesystems
workshop a couple of months ago.  My original topology code actually
permitted a list of topologies for a block device, with all the
nightmares that entails of splicing and dicing lists in the stacking
function.  It was not pretty, and the general consensus at the workshop
was that this was way too complex.  I agreed and the code was gutted.

And even having a topology list failed to correctly describe some valid
configurations.  One pathological case is having a 512-byte
logical/physical drive in a mirror with a 512/4KB odd-aligned one.  How do you
define a topology for that?  The solution is to adjust the alignment and
scale up io_min to match the 4K drive.  I.e. to change things for the
drive that isn't the "problem".
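
A minimal sketch of that mirror case, under stated assumptions: struct
toy_topology and toy_mirror() are hypothetical names, and the sketch
assumes the stricter leg's block size and alignment offset are
multiples of the other leg's block size, so adopting the stricter
leg's values costs the other leg nothing.  It is not the kernel's
stacking logic.

```c
#include <assert.h>

/* Hypothetical per-device topology, all values in bytes. */
struct toy_topology {
	unsigned int physical_block_size;
	unsigned int io_min;
	unsigned int alignment_offset; /* first physical block boundary */
};

/*
 * Combine two mirror legs: take the alignment of the leg with the
 * larger physical block and scale io_min up to match it.
 */
static struct toy_topology toy_mirror(struct toy_topology a,
				      struct toy_topology b)
{
	const struct toy_topology *strict =
		(b.physical_block_size > a.physical_block_size) ? &b : &a;
	const struct toy_topology *loose = (strict == &a) ? &b : &a;
	struct toy_topology m;

	/* Assumptions of this sketch, checked rather than handled. */
	assert(loose->physical_block_size != 0);
	assert(strict->physical_block_size % loose->physical_block_size == 0);
	assert(strict->alignment_offset % loose->physical_block_size == 0);

	m.physical_block_size = strict->physical_block_size;
	m.io_min = a.io_min > b.io_min ? a.io_min : b.io_min;
	if (m.io_min < m.physical_block_size)
		m.io_min = m.physical_block_size;
	m.alignment_offset = strict->alignment_offset;
	return m;
}
```

With a 512/512 leg at offset 0 and a 512/4KB leg whose alignment
offset is, say, 3584 bytes (an example value), the mirror comes out
with io_min 4096 and alignment offset 3584.  The 512-byte drive is the
one whose parameters changed.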

Mike> That means we use a device's alignment_offset in userland LVM2 to
Mike> push down a data area whose start+size is aligned.  This gives us
Mike> the guarantee that each device in a given DM table is aligned.


Mike> But blk_stack_limits() leads to a situation where the combined
Mike> limits (io_min, logical_block_size) are not ideal for all offsets
Mike> into the resulting DM device (e.g. issuing larger IOs to some
Mike> target devices than would otherwise be needed).

Yup.  But not smaller.  That's the whole point: making sure we align to
the lowest common denominator and never incur a read-modify-write cycle
on any storage device.
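
To make the "larger is fine, smaller is not" point concrete, here is a
small sketch.  toy_needs_rmw() is a made-up helper, not a kernel
function; it assumes physical block boundaries sit at alignment_offset
plus multiples of the physical block size.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if writing [start, start + len) would touch a partial
 * physical block, forcing the device into a read-modify-write cycle.
 * All arguments are in bytes.
 */
static bool toy_needs_rmw(unsigned long long start, unsigned long long len,
			  unsigned int phys, unsigned int align_off)
{
	unsigned long long head;

	assert(phys != 0);

	/* Distance of start past the previous physical block boundary. */
	head = (start % phys + phys - align_off % phys) % phys;

	/* Partial block at the front, or a length that leaves one at the end. */
	return head != 0 || len % phys != 0;
}
```

An aligned 4096-byte write is whole-block on both a 512-byte and a
4KB drive, so issuing it to the 512-byte drive costs nothing.  A
512-byte write to the 4KB drive covers only part of a physical block,
which is exactly the case the stacked limits are meant to rule out.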

Martin K. Petersen	Oracle Linux Engineering
