
Re: LVM negates benefits of journaling filesystems? [was RFE: autofsck]



Callum Lerwick wrote:

> I would like to put in my +1 for this. Performance is pointless if
> you can not trust that your data is safe. I have on many occasions run
> fscks on my supposedly clean ext3 filesystems, only to find some mild
> corruption. How can this happen? Isn't journaling supposed to prevent
> this? One day I ran a fsck before doing some filesystem resizing, only
> to find one of my irreplaceable personal photos had become corrupted. I
> had no way to know when or why this file got corrupted, it had been
> written to disk some time ago and never touched since. I trusted
> journaling, and it failed me. 

Filesystem corruption can happen for many reasons, and journaling cannot
save you from them all.  Think about bad cables, memory, kernel bugs,
bad hardware, rogue writes to the block device, etc.  Journaling doesn't
help you in the face of any of these problems.  If you are talking about
data corruption, it could have been an application bug, for example (did
a photo editor corrupt it when you wrote the edited version?).  There is
a long list of things which can go wrong, unfortunately.  Trusting
journaling to keep all your data safe now and forevermore is misguided.

> (Yes, I have a backup. I think...) After
> this, I now turn on autofsck on all my machines, so that corruption at
> least can't go undetected for years. Which means after a power failure it
> takes my primary desktop with a pretty full 250GB drive 20-30 minutes to
> come back up, which is incredibly irritating, but I have to know my data
> is safe. I've even picked up a habit of obsessively checksumming all my
> really important files. I wish the filesystem would help do this for me.
> (ZFS...)
> 
> Knowing is half the battle. See, what can happen here, is a file can get
> corrupted, and I may not notice until years later. By then I may have
> cycled through several full backups, and long since lost the backup I
> did have of the file...
> 
> This must be fixed. Only through a long painful process of losing faith
> have I learned to not trust my filesystems. I suspect there are many
> others out there who have been bitten by filesystem corruption and just
> don't know it yet.
> 
> Only now do I learn the likely reason for this corruption. How would I
> have reported this? I just assumed it was hardware glitches.

True, corruption from out-of-order writes due to lack of barriers is
hard to identify as such.  But unfortunately there are a few other
things that could have gone wrong too.  There are things in the works to
help on the integrity front, though (see
http://oss.oracle.com/projects/data-integrity/ for example).
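
In the meantime, the checksumming habit mentioned above can be scripted.
What follows is only a minimal sketch, not a feature of any filesystem:
the function names (file_sha256, build_manifest, verify_manifest) are
made up for illustration, and it assumes the files in question are not
supposed to change after the manifest is recorded.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or whose digest changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

Record the manifest once and keep it somewhere other than the disk being
checked; then re-run verify_manifest periodically and before old backups
are rotated out.  A mismatch only says that the bytes changed, not why:
it cannot tell a legitimate edit from corruption, and it says nothing
about filesystem metadata, which is what fsck checks.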

-Eric

