[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [dm-devel] dm thin: commit pool's metadata on last close of thin device

Dne 17.5.2012 06:00, Mike Snitzer napsal(a):
On Wed, May 16 2012 at  8:43pm -0400,
Mikulas Patocka<mpatocka redhat com>  wrote:

On Wed, 16 May 2012, Mike Snitzer wrote:

Reinstate dm_flush_all and dm_table_flush_all.  dm_blk_close will
now trigger the .flush method of all targets within a table on the last
close of a DM device.

In the case of the thin target, the thin_flush method will commit the
backing pool's metadata.

Doing so avoids a deadlock that has been observed with the following
sequence (as can be triggered via "dmsetup remove_all"):
- IO is issued to a thin device, thin device is closed
- pool's metadata device is suspended before the pool is
- because the pool still has outstanding IO we deadlock because the
   pool's metadata device is suspended

Signed-off-by: Mike Snitzer<snitzer redhat com>
Cc: stable vger kernel org

I'd say --- don't do this sequence.

Device mapper generally expects that devices are suspended top-down ---
i.e. you should first suspend the thin device and then suspend its
underlying data and metadata device. If you violate this sequence and
suspend bottom-up, you get deadlocks.

For example, if dm-mirror is resynchronizing and you suspend the
underlying leg or log volume and then suspend dm-mirror, you get a

If dm-snapshot is merging and you suspend the underlying snapshot or
origin volume and then suspend snapshot-merget target, you get a deadlock.

These are not bugs in dm-mirror or dm-snapshot, this is expected behavior.
Userspace shouldn't do any bottom-up suspend sequence.

In the same sense, if you suspend the underlying data or metadata pool and
then suspend dm-thin, you get a deadlock too. Fix userspace so that it
doesn't do it.

Yeah, I agree.  I told Zdenek it'd be best to just not do it.

'dmsetup remove_all' is a dumb command that invites these deadlocks when
more sophisticated stacking is being used.

But all said, in general I don't have a problem with triggering a target
specific "flush" on device close.

(Though the implementation of thin_flush could be made more
intelligent... as is we can see a pretty good storm of redundant
metadata commits if 100s of thin devices are closed simultaneously --
creating pmd->root_lock write lock contention).

I'd say it differently - the question here is whether we really want to support forced removal of devices or we keep them stuck in the system.

The system must expect that device become unavailable anytime (hw fault),
and if the device is not mounted and unused, it should not be a problem to
remove such device (even suspended).

However current code basically removes thinpool target (has open count 0),
but keeps data and metadata devices (the problem is worse, if the metadata device is replaced with error target in this moment).

Also not - there is not a problem in userspace code as such (except the case of discardless devices - where the change is simple quite unexpected new behavior of device table, thus older tools cannot work properly with it).
The error is technically in the 'teardown' code for the test case - which
used to assume that something with open count 0 could be easily removed - and unremovable targets might be replaced with error targets (so underlying devs could be detached as well) - however with current thinp target, we are in
some troubles and I'd like to see them behaving like all other dm targets.


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]