
Re: File system checking on ext3 after a system crash




We have backups, etc., but restoring from them is very time-consuming in an operational environment.
So I thought, well, we could run e2image every night,
and if the file system is totally shot (i.e., sometimes after e2fsck we don't have much of a file system left),
to the point where we have to restore from backup,
then hey, we could give e2image a shot and just lose a limited amount of data.
Is that too naive?
I got the impression from your reply below that creating an image may be too time-consuming?
I'm talking about filesystems of about 500 GB that don't change a whole lot.

Thanks.
-Sev


Theodore Tso wrote:
On Mon, Apr 09, 2007 at 02:29:57PM -0400, Sev Binello wrote:
3) Periodically, and at a non-peak time, use the e2image program to
save a backup copy of the filesystem metadata.  Do this *especially*
if you don't have space to do a real backup.  This will give you at
least some measure of a saving throw against a single bad disk write
(caused by malfunctioning storage hardware, or the aforementioned
buggy binary-only graphical driver written in C++ with the pointer
error) from destroying a huge number of files.
   I noted this response with interest.
   I was unaware of this tool.

It's been around since e2fsprogs 1.20 (May 20, 2001), but it hasn't
gotten a lot of play outside of my "Recovering From Hard Drive
Disasters" Usenix tutorial.  Anyone feel like writing a HOWTO
document?  :-)

   I did a quick test and it looks simple to use. Are there any caveats or hidden gotchas?
   I understand it will only restore to the state it was in when the image was taken,
   but in a pinch that may be an alternative we could use.

In general I'd recommend against using the e2image -I option.  As I've
stated in the man page, it is rarely the right answer.  It's there
primarily so I can do a demonstration of recovering from a mke2fs (and
it is quite the impressive demo), but unless the e2image is very
fresh, it is very likely that it will do more harm than good.

The main use of the e2image file is that you can use it with debugfs:

	debugfs -d /dev/sda2 -i sda2.e2i

Now you can use the dump and rdump commands to copy out critical files
from the damaged filesystem.
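
As a rough sketch of what that can look like non-interactively: the helper below only prints debugfs command lines for review rather than running them. The function name rescue_cmd, the device, the image name, and the paths are all hypothetical. It assumes the -R option (run a single request and exit) and places the image file last, since -i marks the positional device argument as an e2image file.

```shell
#!/bin/sh
# Sketch: build debugfs command lines for copying files out of a damaged
# filesystem with the help of a saved e2image file.
# rescue_cmd, the device, the image and the paths are hypothetical examples.

# Print one debugfs command line.
#   -d DEVICE   read data blocks not stored in the image from DEVICE
#   -i          treat the final argument as an e2image file
#   -R REQUEST  run a single request, then exit
rescue_cmd() {
    device=$1 image=$2 request=$3
    printf 'debugfs -d %s -i -R "%s" %s\n' "$device" "$request" "$image"
}

# rdump copies a directory tree into an existing directory on a healthy
# filesystem; dump -p copies one file and keeps owner, group and permissions.
rescue_cmd /dev/sda2 sda2.e2i "rdump /home/sev /mnt/rescue"
rescue_cmd /dev/sda2 sda2.e2i "dump -p /etc/fstab /mnt/rescue/fstab"
```

The destination (/mnt/rescue here) should live on a different, healthy filesystem, and the printed commands would need to be run as root.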

   Any idea how long it takes to create/restore?

The main cost is the time to read the entire inode table from the
filesystem and write it back out to the e2image file, so it really
depends on the size of the filesystem.  On my
when-I-have-time-for-a-quick-hack list, I have adding a new option to
e2image which assumes that the filesystem bitmap blocks are
trustworthy and will only back up the portion of the inode table which
is actually in use.  That will almost certainly be in the next version
of e2fsprogs, since that's a pretty simple change.

   Would it make sense to run it on a daily basis?

If you have sufficient amounts of off-peak time, yes!
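
A minimal sketch of such a daily job, assuming a single filesystem and a dated image file name. The names save_metadata and image_name, the device, and the directory are hypothetical, and the image directory is assumed to sit on a different filesystem from the one being imaged. With DRY_RUN=1 the command is only printed.

```shell
#!/bin/sh
# Sketch: save a dated e2image metadata copy of one filesystem.
# DEVICE and IMAGE_DIR are hypothetical example values; IMAGE_DIR should be
# on a different filesystem from DEVICE.
DEVICE=${DEVICE:-/dev/sda2}
IMAGE_DIR=${IMAGE_DIR:-/var/backups/e2image}

# Build the image file name from a device path and a date stamp.
image_name() {
    printf '%s/%s-%s.e2i\n' "$IMAGE_DIR" "$(basename "$1")" "$2"
}

# Write the metadata image; with DRY_RUN=1, only print the command.
save_metadata() {
    target=$(image_name "$1" "$(date +%Y%m%d)")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "e2image $1 $target"
    else
        mkdir -p "$IMAGE_DIR" && e2image "$1" "$target"
    fi
}

# Example (as root, from an off-peak cron job):
#   save_metadata "$DEVICE"
```

Keeping several dated images rather than overwriting one means a bad write that goes unnoticed for a day does not replace the last good copy.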

   Also, I was wondering if you could point me to documentation explaining how to
   respond to e2fsck questions when it finds problems in the file system.

Hmm, there really isn't any.  In general the right answer is almost
always 'yes', but sometimes I'll take a quick look at the filesystem
using debugfs before answering yes just in case manual intervention
could do a better job.

The big thing is that if e2fsck wants to relocate an inode table, you
almost always want to stop and back up the metadata blocks using e2image
first.  In fact, I'm thinking about revamping that logic, since right
now the potential for doing great harm to the filesystem is far too
high.  So the fact that you might want to say 'n' there is really more
of a sign of an e2fsck bug, or at least a misdesign, than anything
else.
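
One way to put that advice into practice, as a sketch only: look at the filesystem read-only first, save the metadata, and only then start the interactive repair. The helper below prints the three steps for review instead of running them. The name pre_repair_plan, the device, and the image path are hypothetical, and it assumes the filesystem is still readable enough for e2image to work.

```shell
#!/bin/sh
# Sketch: print the steps to take before letting e2fsck modify a filesystem.
# pre_repair_plan, the device and the image path are hypothetical examples.

pre_repair_plan() {
    device=$1 image=$2
    # -f forces a check; -n opens read-only and answers 'no' to every question
    echo "e2fsck -fn $device"
    # save the metadata to another filesystem before anything is rewritten
    echo "e2image $device $image"
    # interactive repair, answering each question by hand
    echo "e2fsck -f $device"
}

pre_repair_plan /dev/sda2 /mnt/other/sda2-pre-fsck.e2i
```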

Regards,

						- Ted


--

Sev Binello
Brookhaven National Laboratory
Upton, New York
631-344-5647
sev bnl gov

