
Re: How safe is journalling?



Hi,

On Thu, Nov 01, 2001 at 09:52:02AM +0000, Jeremy Sanders wrote:

> > The journal guarantees that the filesystem metadata is always consistent
> > after a crash.  You may lose some operations from just before the crash,
> > but you probably would have anyways (probably not quite as many as with
> > a journal), because of buffering.
> 
> That's what I thought, but can the drive do strange things like
> re-ordering writes and caching data which may screw up journalling
> attempts?

Some disks enable writeback caching even when you disable it.  Some
disks lie about flushing the cache to disk, because that hurts
benchmarks.  If your hardware lies to the OS about when the data is
safe, there's basically very little you can do about it other than
using a UPS to ensure that the disk never loses power even if the OS
crashes.
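
To make concrete what the OS side of that bargain looks like, here is a
minimal sketch in Python. The helper name durable_write is hypothetical;
os.fsync is the standard call that asks the kernel to flush a file to
the storage device. Everything past that point depends on the drive
being honest, which is exactly the assumption discussed above.

```python
import os

def durable_write(path, data):
    """Write bytes and ask the OS to push them to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to flush the file to the device
    # Assumption: the drive honours the flush request. If its firmware
    # acknowledges before the data is on the platters, nothing in this
    # function can detect that.
```

A caller would use it as durable_write("/some/file", b"payload"); once it
returns, the software has done all it can.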

This is not just theoretical: there's a recent erratum against some IBM
laptop disks that implies that they were not flushing the cache
contents to the platters on powerdown (Windows and ext2 were affected
too).

But as long as the disk isn't actively lying to defeat the OS,
journaling is 100% atomic.  The entire point of the journal is that we
store both the old and new version of the disk contents on disk when
changes are done: the old version is kept on the main filesystem
contents, and the new version is kept in the journal.  Only once the
new version is entirely written to the journal, and the disk has
acknowledged that those writes have completed, do we write a single
item on the disk marking the journal copy as being up to date.  If you
crash any time before or during that write, the old copy is maintained
afterwards; if you crash once the commit write is complete, the entire
new copy is honoured after the crash.  You never see an inconsistent
state.
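
The ordering above can be sketched as a toy simulation. This is only an
illustration under stated assumptions, not the real journal code: the
names Disk, apply_transaction, recover and crash_test are all
hypothetical, every write is assumed to be acknowledged honestly, and
the checkpoint step that copies the journal back into the main area is
my own simplification rather than something described above.

```python
class CrashError(Exception):
    """Simulated power loss."""

class Disk:
    """Toy disk: main area, journal area, and a single commit marker."""
    def __init__(self, blocks):
        self.main = dict(blocks)   # the old version lives here
        self.journal = {}          # the new version is staged here
        self.committed = False     # the single commit record
        self.writes_left = None    # None means "never crash"

    def tick(self):
        # Called before each write; crash once the write budget is spent.
        if self.writes_left is not None:
            if self.writes_left == 0:
                raise CrashError()
            self.writes_left -= 1

def apply_transaction(disk, updates):
    # 1. Write the complete new version into the journal.
    for block, data in updates.items():
        disk.tick()
        disk.journal[block] = data
    # 2. Only after every journal write has completed, write the commit record.
    disk.tick()
    disk.committed = True
    # 3. Checkpoint (assumed step): copy the journal into the main area,
    #    then retire the journal.
    for block, data in disk.journal.items():
        disk.tick()
        disk.main[block] = data
    disk.tick()
    disk.journal = {}
    disk.committed = False

def recover(disk):
    # After a crash: replay a committed journal, discard an uncommitted one.
    if disk.committed:
        disk.main.update(disk.journal)
    disk.journal = {}
    disk.committed = False
    return disk.main

def crash_test(old, updates, crash_after):
    """Run one transaction, losing power after crash_after completed writes."""
    disk = Disk(old)
    disk.writes_left = crash_after
    try:
        apply_transaction(disk, updates)
    except CrashError:
        pass
    return recover(disk)
```

Calling crash_test({"a": 1, "b": 2}, {"a": 10, "b": 20}, n) for each
possible crash point n should yield either the complete old contents or
the complete new contents, never a mix of the two.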

Cheers,
 Stephen




