[dm-devel] Re: dm: bind new table before destroying old
Mike Snitzer
snitzer at redhat.com
Wed Nov 11 23:45:55 UTC 2009
On Wed, Nov 11 2009 at 8:20am -0500,
Mike Snitzer <snitzer at redhat.com> wrote:
> On Tue, Nov 10 2009 at 8:16pm -0500,
> Alasdair G Kergon <agk at redhat.com> wrote:
>
> > Questions:
> >
> > Do all the targets correctly flush or push back everything during a
> > suspend (including workqueues)?
> >
> > Do all the targets correctly sync to disk all internal state that
> > needs to be preserved during a suspend?
> >
> > In other words, in the case of an already-suspended target, the target
> > 'dtr' functions should only be freeing memory and other resources and
> > not causing I/O to any of the table's devices.
> >
> > All targets are supposed to behave this way already, but please
> > would you check the targets with which you are familiar anyway?
> >
> > Alasdair
> >
> >
> > From: Alasdair G Kergon <agk at redhat.com>
> >
> > When replacing a mapped device's table during a 'resume', delay the
> > destruction of the old table until the new one is successfully in place.
> >
> > This will make it easier for a later patch to transfer internal state
> > information from the old table to the new one (something we do not currently
> > support) while giving us more options for reversion if a later part
> > of the operation fails.
>
> I have confirmed that this patch allows handover to work within a single
> device.
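For readers following along, the reordering the patch describes (bind the new table before destroying the old one) can be sketched as a toy model; `struct mapped_device`, `md_swap_table` and the fields here are illustrative stand-ins, not the kernel's actual definitions:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of a mapped device holding one live table. */
struct dm_table { int id; };
struct mapped_device { struct dm_table *live; };

/* Bind the new table first and hand the old one back to the caller,
 * so it is only destroyed after the swap has already succeeded.
 * If a later step fails, the caller still holds the old table and
 * can revert instead of having already freed it. */
static struct dm_table *md_swap_table(struct mapped_device *md,
                                      struct dm_table *new_table)
{
    struct dm_table *old = md->live;
    md->live = new_table;   /* new table is in place... */
    return old;             /* ...before the old one is torn down */
}
```

The point of the ordering is in the comment: the caller keeps a valid handle to the old table for state handover or reversion, rather than destroying it up front.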
Alasdair,
After further testing I've hit a lockdep warning. My test was a
handover on the same device: I had the snapshot (of an ext3 FS)
mounted and was running a sequential direct-I/O write to a file in
that FS. While the write was in flight I triggered a handover with:
echo "0 50331648 snapshot 253:2 253:3 P 8" | dmsetup reload test-testlv_snap
dmsetup resume test-testlv_snap
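For context, the snapshot table line above decodes as follows (field meanings per the dm-snapshot target; the size arithmetic can be checked directly):

```shell
# "0 50331648 snapshot 253:2 253:3 P 8"
#   0         logical start sector
#   50331648  length in 512-byte sectors
#   snapshot  target type
#   253:2     origin device (major:minor)
#   253:3     COW device (major:minor)
#   P         persistent exception store
#   8         chunk size in 512-byte sectors (4 KiB chunks)
sectors=50331648
echo "snapshot size: $(( sectors * 512 / 1024 / 1024 / 1024 )) GiB"
```

That works out to a 24 GiB snapshot with 4 KiB chunks.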
The handover itself worked fine (no I/O errors), but the following
lockdep warning resulted (some "snapshot_*" tracing was added for context):
snapshot_ctr
snapshot_ctr: found snap_src
snapshot_presuspend
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-rc6-snitm #8
-------------------------------------------------------
dmsetup/1827 is trying to acquire lock:
(&md->suspend_lock){+.+...}, at: [<ffffffffa00678d8>] dm_swap_table+0x2d/0x249 [dm_mod]
but task is already holding lock:
(&journal->j_barrier){+.+...}, at: [<ffffffff8119192d>] journal_lock_updates+0xe1/0xf0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&journal->j_barrier){+.+...}:
[<ffffffff810857b3>] __lock_acquire+0xb6b/0xd13
[<ffffffff81086396>] lock_release_non_nested+0x1dc/0x23b
[<ffffffff8108656f>] lock_release+0x17a/0x1a5
[<ffffffff8139214b>] __mutex_unlock_slowpath+0xce/0x132
[<ffffffff813921bd>] mutex_unlock+0xe/0x10
[<ffffffff81147329>] freeze_bdev+0x104/0x110
[<ffffffffa0069038>] dm_suspend+0x119/0x2a1 [dm_mod]
[<ffffffffa006db3a>] dev_suspend+0x11d/0x1de [dm_mod]
[<ffffffffa006e30c>] ctl_ioctl+0x1c6/0x213 [dm_mod]
[<ffffffffa006e36c>] dm_ctl_ioctl+0x13/0x17 [dm_mod]
[<ffffffff8112a959>] vfs_ioctl+0x22/0x87
[<ffffffff8112aec2>] do_vfs_ioctl+0x488/0x4ce
[<ffffffff8112af5e>] sys_ioctl+0x56/0x79
[<ffffffff8100bb82>] system_call_fastpath+0x16/0x1b
-> #0 (&md->suspend_lock){+.+...}:
[<ffffffff8108565d>] __lock_acquire+0xa15/0xd13
[<ffffffff81085a37>] lock_acquire+0xdc/0x102
[<ffffffff81392372>] __mutex_lock_common+0x4b/0x37b
[<ffffffff81392766>] mutex_lock_nested+0x3e/0x43
[<ffffffffa00678d8>] dm_swap_table+0x2d/0x249 [dm_mod]
[<ffffffffa006db45>] dev_suspend+0x128/0x1de [dm_mod]
[<ffffffffa006e30c>] ctl_ioctl+0x1c6/0x213 [dm_mod]
[<ffffffffa006e36c>] dm_ctl_ioctl+0x13/0x17 [dm_mod]
[<ffffffff8112a959>] vfs_ioctl+0x22/0x87
[<ffffffff8112aec2>] do_vfs_ioctl+0x488/0x4ce
[<ffffffff8112af5e>] sys_ioctl+0x56/0x79
[<ffffffff8100bb82>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
1 lock held by dmsetup/1827:
#0: (&journal->j_barrier){+.+...}, at: [<ffffffff8119192d>] journal_lock_updates+0xe1/0xf0
stack backtrace:
Pid: 1827, comm: dmsetup Not tainted 2.6.32-rc6-snitm #8
Call Trace:
[<ffffffff81084825>] print_circular_bug+0xa8/0xb7
[<ffffffff8108565d>] __lock_acquire+0xa15/0xd13
[<ffffffff81085a37>] lock_acquire+0xdc/0x102
[<ffffffffa00678d8>] ? dm_swap_table+0x2d/0x249 [dm_mod]
[<ffffffffa00678d8>] ? dm_swap_table+0x2d/0x249 [dm_mod]
[<ffffffffa006da1d>] ? dev_suspend+0x0/0x1de [dm_mod]
[<ffffffff81392372>] __mutex_lock_common+0x4b/0x37b
[<ffffffffa00678d8>] ? dm_swap_table+0x2d/0x249 [dm_mod]
[<ffffffff81083933>] ? mark_lock+0x2d/0x22d
[<ffffffff81083b85>] ? mark_held_locks+0x52/0x70
[<ffffffff8139219d>] ? __mutex_unlock_slowpath+0x120/0x132
[<ffffffffa006da1d>] ? dev_suspend+0x0/0x1de [dm_mod]
[<ffffffff81392766>] mutex_lock_nested+0x3e/0x43
[<ffffffffa00678d8>] dm_swap_table+0x2d/0x249 [dm_mod]
[<ffffffff813921bd>] ? mutex_unlock+0xe/0x10
[<ffffffffa00691ae>] ? dm_suspend+0x28f/0x2a1 [dm_mod]
[<ffffffffa006da1d>] ? dev_suspend+0x0/0x1de [dm_mod]
[<ffffffffa006db45>] dev_suspend+0x128/0x1de [dm_mod]
[<ffffffffa006e30c>] ctl_ioctl+0x1c6/0x213 [dm_mod]
[<ffffffff81077d7f>] ? cpu_clock+0x43/0x5e
[<ffffffffa006e36c>] dm_ctl_ioctl+0x13/0x17 [dm_mod]
[<ffffffff8112a959>] vfs_ioctl+0x22/0x87
[<ffffffff81083e41>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff8112aec2>] do_vfs_ioctl+0x488/0x4ce
[<ffffffff811f3e5a>] ? __up_read+0x76/0x7f
[<ffffffff81076746>] ? up_read+0x2b/0x2f
[<ffffffff8100c635>] ? retint_swapgs+0x13/0x1b
[<ffffffff8112af5e>] sys_ioctl+0x56/0x79
[<ffffffff8100bb82>] system_call_fastpath+0x16/0x1b
snapshot_preresume
snapshot_preresume: snap_src is_handover_source
snapshot_preresume: resuming handover-destination
snapshot_resume
snapshot_resume: handing over exceptions
snapshot_dtr
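Reading the two chains together: chain #1 shows dm_suspend() taking md->suspend_lock and then, via freeze_bdev(), journal->j_barrier; chain #0 shows the resume path entering dm_swap_table() and wanting suspend_lock while j_barrier (taken in journal_lock_updates) is still held. That reversed order is the classic AB-BA pattern lockdep flags. A toy two-lock version of the check (names invented; real lockdep tracks full dependency chains, this only catches a direct two-lock inversion):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { SUSPEND_LOCK, J_BARRIER, NLOCKS };

/* dep[a][b] == true means "b has been taken while a was held". */
static bool dep[NLOCKS][NLOCKS];

/* Record taking 'want' while 'held' is held.  Returns true if the
 * reverse edge was already seen, i.e. an AB-BA inversion exists. */
static bool acquire(int held, int want)
{
    if (dep[want][held])
        return true;            /* held->want plus want->held: cycle */
    dep[held][want] = true;
    return false;
}
```

Feeding it the two paths from the trace, the first call records suspend_lock -> j_barrier (the dm_suspend/freeze_bdev path) and the second, j_barrier -> suspend_lock (journal_lock_updates then dm_swap_table), completes the cycle.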