[dm-devel] mirrored device with thousand of mapping table entries

Mike Snitzer snitzer at redhat.com
Mon Mar 7 20:10:27 UTC 2011


On Sun, Mar 06 2011 at  9:59pm -0500,
Martin K. Petersen <martin.petersen at oracle.com> wrote:

> >>>>> "Zdenek" == Zdenek Kabelac <zkabelac at redhat.com> writes:
> 
> Zdenek> My finding seems to show that the BIP-256 slabtop segment grows by
> Zdenek> ~73KB per device (while dm-io is about ~26KB)
> 
> Ok, I see it now that I tried with a bunch of DM devices.
> 
> DM allocates a bioset per volume. And since each bioset has an integrity
> mempool you'll end up with a bunch of memory locked down. It seems like
> a lot but it's actually the same amount as we reserve for the data path
> (bio-0 + biovec-256).
> 
> Since a bioset is not necessarily tied to a single block device we can't
> automatically decide whether to allocate the integrity pool or not. In
> the DM case, however, we just set up the integrity profile so the
> information is available.
> 
> Can you please try the following patch? This will change things so we
> only attach an integrity pool to the bioset if the logical volume is
> integrity-capable.

Hey Martin,

I just took the opportunity to review DM's blk_integrity code a bit more
closely -- with an eye towards stacking devices.  I found an issue that
I think we need to fix that has to do with a DM device's limits being
established during do_resume() and not during table_load().

Unfortunately, a DM device's blk_integrity gets preallocated during
table_load().  dm_table_prealloc_integrity()'s call to
blk_integrity_register() establishes the blk_integrity's block_size.

But a DM device's queue_limits aren't stacked until a DM device is
resumed -- via dm_calculate_queue_limits().

For some background please see the patch header of this commit:
http://git.kernel.org/linus/754c5fc7ebb417

The final blk_integrity for the DM device isn't fully established until
do_resume()'s eventual call to dm_table_set_integrity(), which passes a
template to blk_integrity_register().  dm_table_set_integrity() does
validate the 'block_size' of each DM device's blk_integrity to make sure
they all match.  So the code would catch the inconsistency should it
arise.
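
To illustrate the kind of consistency check meant here, a minimal
standalone sketch follows.  It is not the kernel code;
toy_block_sizes_match() is a hypothetical helper that only models
"every blk_integrity must report the same block size".

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel function: returns 1 when every
 * entry in sizes[] equals the first one, 0 on the first mismatch.
 * An empty or single-entry array trivially matches.
 */
static int toy_block_sizes_match(const unsigned int *sizes, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (sizes[i] != sizes[0])
            return 0;
    return 1;
}
```

With sizes of {512, 512} the check passes; with {512, 4096} it fails,
which is the inconsistency the validation is there to catch.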

All I'm saying is: it's possible for a table_load() to not have the
awareness that a newly added device's queue_limits will cause the DM
device's final queue_limits to be increased (say a 4K device was
added to dm_device2, and dm_device2 is now being added to another
dm_device1).
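
A toy model of that ordering problem, as a sketch only: the structs
and the toy_* functions below are hypothetical stand-ins, not the
kernel's.  The one assumption carried over from the block layer is
that stacking keeps the largest logical block size.

```c
#include <assert.h>

/* Toy stand-ins for queue_limits and blk_integrity (assumed names). */
struct toy_limits { unsigned int logical_block_size; };
struct toy_integrity { unsigned int block_size; };

/* Stacking keeps the larger of the two logical block sizes. */
static unsigned int toy_stack_lbs(unsigned int top, unsigned int bottom)
{
    return top > bottom ? top : bottom;
}

/* Models table_load(): block size is taken from the limits as they are now. */
static void toy_table_load(struct toy_integrity *bi,
                           const struct toy_limits *cur)
{
    bi->block_size = cur->logical_block_size;
}

/* Models do_resume(): limits are restacked over all underlying devices. */
static void toy_resume(struct toy_limits *lim,
                       const unsigned int *dev_lbs, int n)
{
    int i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        lim->logical_block_size =
            toy_stack_lbs(lim->logical_block_size, dev_lbs[i]);
}
```

Starting from 512-byte limits, calling toy_table_load() and then
toy_resume() over devices of 512 and 4096 bytes leaves the integrity
block size at 512 while the stacked limit has become 4096.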

So it seems we need to establish bi->sector_size during the final stage
of blk_integrity_register(), e.g. when a template is passed.  Not sure
if you'd agree with that change in general but it'll work for DM because
the queue_limits are established before dm_table_set_integrity() is
called.

Maybe revalidate/change the 'block_size' during the final stage in case
it changed?
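
Roughly what I have in mind, as a standalone sketch rather than a
patch: toy_register() and struct toy_bi are hypothetical and do not
reflect blk_integrity_register()'s real signature.  The point is only
that the template stage re-reads the block size from the limits that
are current at that time.

```c
#include <stddef.h>

/* Toy integrity profile (assumed layout, not struct blk_integrity). */
struct toy_bi {
    unsigned int sector_size;
    const char *name;    /* NULL until a template has been applied */
};

/*
 * Hypothetical two-stage register.  With template_bi == NULL it only
 * preallocates and records a provisional size.  With a template it
 * copies the profile name and records the size from current_lbs.
 * Returns 1 if the final stage changed a previously recorded size.
 */
static int toy_register(struct toy_bi *bi, const struct toy_bi *template_bi,
                        unsigned int current_lbs)
{
    int changed = 0;

    if (template_bi == NULL) {
        bi->sector_size = current_lbs;    /* provisional value */
        bi->name = NULL;
        return 0;
    }

    if (bi->sector_size != current_lbs)
        changed = 1;
    bi->sector_size = current_lbs;        /* final value */
    bi->name = template_bi->name;
    return changed;
}
```

The return value lets a caller notice that the size moved between the
two stages, should that need to be logged or rejected.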

Thanks,
Mike



