Re: [dm-devel] call for slideware ;)
- From: Joe Thornber <thornber@redhat.com>
- To: Mike Snitzer <snitzer@redhat.com>
- Cc: Heinz Mauelshagen <heinzm@redhat.com>
- Subject: Re: [dm-devel] call for slideware ;)
- Date: Wed, 23 Feb 2011 12:24:48 +0000
On Tue, 2011-02-22 at 20:22 -0500, Mike Snitzer wrote:
> I just had a look at the latest content and have some questions (way
> more than I'd imagine you'd like to see.. means I'm clearly missing a
Thanks a lot for taking the time to go through this. I'm updating the
document as I answer your questions, and I'll put the git commit hashes
in square brackets to make it easier for you to pick out the relevant
changes.
> 1) from "Solution" slide:
> "Space comes from a preallocated ‘pool’, which is itself just another
> logical volume, thus can be resized on demand."
> "Separate metadata device simplifies extension, this is hidden by the
> LVM system so sys admin unlikely to be aware of it."
> Q: Can you elaborate on the role of the metadata? It maps between
> physical "area" (allocated from pool) for all writes to the
> logical address space?
> Q: can thinp and snapshot metadata coexist in the same pool? -- ask
> similar question below.
I've added a new introduction section at the start of the document that
tries to explain that the thinp target is just a simple thin
provisioning solution, whereas multisnap will provide both thinp and
snapshot support.
> 2) from "Block size choice" slide:
> The larger the block size:
> - the less chance there is of fragmentation (describe this)
> Q: can you please "describe this"? :)
> - the less frequently we need the expensive mapping operation
> Q: "expensive" is all relative, seems you contradict the expense of
> the mapping operation in the "Performance" slide?
[938422d] You still want to minimise it. The performance at small
block sizes is better than I expected.
> - the smaller the metadata tables are, so more of them can be held in core
> at a time. Leading to faster access to the provisioned blocks by
> minimizing reading in mapping information
> Q: "more of them" -- "them" being metadata tables? So the take
> away is more thinp devices available on the same host?
No, fewer reads to load bits of the mapping table that aren't already
in core.
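To put some rough numbers on that (the bytes-per-mapping figure is a
guess, just to show the shape of the trade-off): assuming ~16 bytes per
mapping in the btree leaves, a 1T thin device needs:

  64k blocks:   2^40 / 2^16 = 16M mappings  => ~256M of metadata
  512k blocks:  2^40 / 2^19 =  2M mappings  =>  ~32M of metadata

So larger blocks make it much more likely that the working set of
mappings stays in core.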
> 3) from "Performance" slide:
> "Expensive operation is mapping in a new ‘area’"
> Q: is area the same as a block in the pool? Why not call block size:
> "area size"? "Block size" is familiar to people? Original snapshot
> had "chunk size".
I switched from 'chunk' to 'block' because we seem to be the only people
who use the term chunk (my fault) and I was reading lots of filesystem
papers in preparation for this work where block is more ubiquitous.
I've changed 'area' and 'region' to block [1c6a5352]. If you think it's
still confusing I'll change everything to 'chunk' (the LVM2 tools are
still going to use --chunksize etc.).
> 4) Q: what did you decide to run with for reads to logical address space
> that weren't previously mapped? Just return zeroes like was
> discussed on lvm-team?
I've added a 'target parameter' section [8332c43].
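Assuming we do go with zero-filling unprovisioned reads (as discussed
on lvm-team), it's easy to check; something like this (device name
illustrative) should return zeroes without triggering any allocation
from the pool:

  dd if=/dev/mapper/thin0 bs=4k count=1 2>/dev/null | hexdump -C
  00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  00001000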
> The "Metadata object" section is where you lose me:
I've added some more background stuff [c8e1685].
> 5) I'm not clear on the notion of "external" vs "internal" snapshots.
> Q: can you elaborate on their characteristics?
See above commit.
> 6) I'm not clear on how you're going to clone the metadata tree for
> userspace to walk (for snapshot merge, etc). Is that "clone" really
> a snapshot of the metadata device? -- seems unlikely as you'd need a
> metadata device for your metadata device's snapshots?
> - you said: "Userland will be given the location of an alternative
> superblock for the metadata device. This is the root of a tree of
> blocks referring to other blocks in a variety of data structures
> (btrees, space maps etc.). Blocks will be shared with the ‘live’
> version of the metadata, their reference counts may change as
> sharing is broken, but we know the blocks will never be updated."
> - Q: is this describing an "internal snapshot"?
No. I don't really want to go into how the persistent-data library
works. I should start a separate document for that. If you think I'm
just confusing people by adding these issues then I can take this
section out.
> 7) from the "thin' target section:
> "All devices stored within a metadata object are instanced with this
> target. Be they fully mapped devices, thin provisioned devices, internal
> snapshots or external snapshots."
> Q: what is a fully mapped device?
A thinp device that's fully mapped; I'll take the term out [831c136].
> 8) "The target line:
> thin <pool object> <internal device id>"
> Q: so by <pool object>, that is the _id_ of a pool object that was
> returned from the 'create virtual device' message?
Yep, or rather the id that was passed in to that call. Userland is in
charge of allocating these numbers.
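So end to end it would look something like this (device names, sector
count and the message name are illustrative, not final):

  # ask the pool to instance a new virtual device with id 0
  # (message name is a placeholder for the 'create virtual device' call)
  dmsetup message /dev/mapper/pool 0 "create_thin 0"

  # then activate it with the thin target; 2097152 sectors = 1G
  dmsetup create thin0 --table "0 2097152 thin /dev/mapper/pool 0"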
> In general my understanding of all this shared store infrastructure is a
> bit muddled. I need the audience to take away big concepts, not get tripped
> up (or trip me up!) on the minutia.
Agreed, let's try and restrict this document to high level stuff. I'll
do a separate persistent-data doc with the detail in.
> Subtle inconsistencies and/or opaque explanation aren't helping, e.g.:
> 1) the detail of "Configuration/Use" for thinp volume
> - "Allocate (empty) logical volume for the thin provisioning pool"
> Q: how can it be "empty"? Isn't it the data volume you hand to
> the pool target?
Changed to 'possibly empty' [3ce2226]. I think this scenario will occur
quite often, for example a VM hosting service might create a new VM for
a client with a bunch of thinp devices, but not want to commit any space
to the VM until the client actually starts using the devices.
> - "Allocate small logical volume for the thin provisioning metadata"
> Q: before in "Solution" slide you said "Separate metadata device
> simplifies extension", can the metadata volume be extended too?
That's the plan. A userland library will make the necessary tweaks to the
metadata while the device is suspended.
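Roughly the sequence I have in mind (the tool name is made up, it
doesn't exist yet):

  dmsetup suspend pool
  lvextend -L+128M vg0/pool_metadata
  # rewrite the metadata space maps to cover the new area
  thinp_grow_metadata /dev/vg0/pool_metadata
  dmsetup resume pool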
> - "Set up thin provisioning mapped device on aforementioned 2 LVs"
> Q: so there is no distinct step for creating a pool?
For the thinp target, the data device that you pass in to the target is
the 'pool'. I hope the 'target parameters' section I've added helps.
> Q: pool is implicitly created at the time the thinp device is
> created? (doubtful, but how you enumerated the steps makes it seem
> that way)
The LVM tools will implicitly create the data/backing device and the
metadata device. agk is envisioning a command line like:
lvcreate --target-type=thinp --chunksize=512k --low-water-mark=4 -L10G
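Under the covers that would boil down to something like (LV names made
up, and the exact argument order for the target is still open):

  lvcreate -L10G  -n pool_data vg0
  lvcreate -L128M -n pool_metadata vg0
  # 20971520 sectors = 10G, block size 1024 sectors = 512k,
  # low water mark 4
  echo "0 20971520 thinp /dev/vg0/pool_metadata /dev/vg0/pool_data 1024 4" | \
      dmsetup create pool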
> Q: can snapshot and thinp volumes share the same pool?
> (if possible I could see it being brittle?)
> (but expressing such capability will help the audience "get"
> the fact that the pool is nicely abstracted/sound design)
I'm not sure if you're talking about the thinp target or multisnap
here. Why do you think it would be brittle?
> p.s. I was going to hold off sending this and take another pass of your
> slides but decided your feedback to all my Q:s would likely be much more
> helpful than me trying to parse the slides again.
You definitely did right to send these, it gives me a kick to keep
improving it. Have a read through it now and see if it's any better.
I'm quite happy to keep revising it for you.