[dm-devel] [PATCH] deadlock with suspend and quotas

Mikulas Patocka mpatocka at redhat.com
Wed Nov 30 17:09:03 UTC 2011



On Wed, 30 Nov 2011, Jan Kara wrote:

> On Wed 30-11-11 13:05:14, Alasdair G Kergon wrote:
> > On Wed, Nov 30, 2011 at 07:14:18AM -0500, Mikulas Patocka wrote:
> > > On Wed, 30 Nov 2011, Jan Kara wrote:
> > > > > So if you skip sync of frozen filesystems, you introduce a data
> > > > > corruption if someone takes a snapshot of ext2.
> > > >   Yes, because ext2 cannot really be frozen, it is (erroneously) marked
> > > > as such but it is not frozen...
> > 
> > This is just getting into semantics.  AFAIK (and it was before my involvement)
> > LVM used the term 'lockfs' for this operation when it was introduced to ext2.
> > It later got renamed in-kernel to 'frozen' to bring it into line with newer
> > filesystems.  But userspace and the interface still retain the original
> > 'lockfs' name.
> > 
> > There is no further I/O sent to the filesystem during the 'lockfs' operation:
> > LVM uses dm to block that.
>   OK, so can we (at least in this discussion) distinguish two
> things?
> a) Filesystem is frozen/locked - means filesystem is in a consistent state
>   and disallows new dirty data to be created until fs is thawed/unlocked.

Agreed. Note that if you observed any sync-related deadlocks when 
suspended, it means that the filesystem has some code path that allows 
creating dirty data on a frozen filesystem.

This was observed on ext4 on RHEL-6 ... and maybe on upstream too. (I 
couldn't reproduce it on upstream, but maybe other people who started 
these sync-related patches could?)

> b) Device is frozen/locked - device does not process incoming writes, they
>   are held in the queue until the device is thawed/unlocked.
> 
>   They are two different things and we seem to conflate them in the
> discussion. In particular you can freeze a device under any filesystem
> while you cannot freeze every filesystem. Freezing the device is enough for
> LVM operations (e.g. snapshot) but if filesystem is not frozen, you have
> to run fsck / journal replay to make result usable. Do we agree here?
> 
> 								Honza

True. You have to run fsck on non-journaled filesystems when taking a 
snapshot.
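
To make the distinction between (a) and (b) concrete, here is a toy model
in Python. It is only a sketch and not kernel code: the names Device,
Filesystem and take_snapshot are invented for the illustration, a write to
a frozen filesystem raises an error where a real one would block, and
sync() returns False where a real sync would block on the suspended device.

```python
class Device:
    """Toy block device: while suspended, writes are held in a queue."""

    def __init__(self):
        self.suspended = False
        self.queue = []    # writes held while the device is suspended
        self.stored = []   # writes that actually reached the media

    def submit(self, block):
        if self.suspended:
            self.queue.append(block)
        else:
            self.stored.append(block)

    def resume(self):
        # Process everything that was held while suspended.
        self.suspended = False
        self.stored.extend(self.queue)
        self.queue.clear()


class Filesystem:
    """Toy filesystem: while frozen, no new dirty data may be created."""

    def __init__(self, device, can_freeze):
        self.device = device
        self.can_freeze = can_freeze   # assumption: False models ext2 here
        self.frozen = False
        self.dirty = []

    def write(self, block):
        if self.frozen:
            # A real filesystem would block the writer instead of failing.
            raise BlockingIOError("filesystem is frozen")
        self.dirty.append(block)

    def sync(self):
        """Push dirty data to the device.

        Returns False if some of it is stuck in the queue of a suspended
        device, which is where a real sync would block.
        """
        for block in self.dirty:
            self.device.submit(block)
        self.dirty.clear()
        return not self.device.queue

    def freeze(self):
        if not self.can_freeze:
            return False
        self.sync()          # reach a consistent state first
        self.frozen = True
        return True


def take_snapshot(fs):
    """Freeze the filesystem if possible, suspend the device, copy it.

    Returns (snapshot, needs_check); needs_check is True when the
    filesystem could not be frozen, so the copy needs fsck or replay.
    """
    froze = fs.freeze()
    fs.device.suspended = True
    snapshot = list(fs.device.stored)
    fs.device.resume()
    if froze:
        fs.frozen = False
    return snapshot, not froze
```

In this model, a filesystem that can be frozen yields a snapshot holding
all written data, while one that cannot be frozen yields a snapshot that
misses its dirty data and is flagged as needing a check. Calling sync()
with dirty data while the device is suspended returns False, which
corresponds to the deadlock case above.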

Mikulas



