RAID gotchas!

Roberto Ragusa mail at robertoragusa.it
Mon Jul 2 20:29:16 UTC 2007


Jeffrey Ross wrote:
> Roberto Ragusa wrote:
>>
>> Dump is considered a bad choice by Linus himself; read this:
>>
>>   http://lwn.net/2001/0503/a/lt-dump.php3
>>
>> (a few years ago, but the words are quite strong)
>>
>> Best regards.
>>   
> 
> I've read the arguments here's the rebuttal to the 2001 message:
> http://dump.sourceforge.net/isdumpdeprecated.html

Thank you for this link, very interesting.
Basically they say that there was a bug in 2.4, now fixed.

They claim three advantages of dump, but I have to say they
are rather weak (IMHO).

1) dump can back up an unmounted filesystem; but why not just mount
it read-only and use a normal file copy tool? They talk about trying
to dump corrupted, unmountable filesystems for rescue purposes, but
that looks like a very stretched motivation, especially when trying
to prove that dump is preferable for normal, uncorrupted filesystems.

2) dump doesn't modify the access time of the files on the
filesystem; well, you can mount read-only or with noatime.

3) dump is faster; no benchmark is available and they themselves
doubt this is still valid today, given how large filesystem caches
have become.
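To make points 1) and 2) concrete, here is a rough sketch (the device
name, mount points and backup path are made up; this needs root):

```shell
# Hypothetical device and paths -- adjust to your system.
# Point 1: instead of dumping the unmounted device, mount it
# read-only and use an ordinary file copy tool:
mount -o ro /dev/sdb1 /mnt/data
cp -a /mnt/data/. /backup/data/

# Point 2: if the filesystem has to stay mounted read-write,
# remount it with noatime so the copy doesn't touch access times:
mount -o remount,noatime /mnt/data
rsync -a /mnt/data/ /backup/data/
```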

They, honestly, list one disadvantage:

1) dump needs knowledge of filesystem internals, so no reiser or
other kinds of filesystem; a normal file copy works on everything
(including NFS)

But they forget to say that:

1) parsing the internals of the filesystem duplicates the
filesystem code; as dump is certainly less tested than the actual
filesystem code, I'd trust the latter for my backups.

2) accessing the device twice at the same time (filesystem + dump)
is a recipe for inconsistency because of the filesystem cache;
they suggest syncing the filesystem and keeping activity low
or, better, unmounting or mounting read-only before running dump.
They also acknowledge that this is a fundamental issue that
cannot be fixed (paragraph "Is that problem peculiar to Linux?").

3) dump has to be constantly kept aligned with filesystem
improvements: many things were added to ext3 during its life
(e.g. EA/ACL), and ext4 will be available in a few months.

4) it is true that backing up "live" data is a fundamentally
undefined problem, but with file copy tools I'm quite sure
that working in one directory while copying another works well;
with dump I'm not so sure.

5) file copy tools can descend into other filesystems (or not,
as they often have a -x option).

6) dump cannot create accessible backups; I want to be able
to use the files in my backup (find, grep, ...), not just
restore them.
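Points 5) and 6) in practice (the /tmp paths below are just throwaway
examples for illustration):

```shell
# Make a tiny source tree and an empty backup directory.
mkdir -p /tmp/srcdir/sub /tmp/backupdir
echo "hello" > /tmp/srcdir/sub/note.txt

# Point 5: tar's --one-file-system flag (rsync spells it -x)
# stops the copy from descending into other mounted filesystems:
(cd /tmp/srcdir && tar --one-file-system -cf - .) \
  | (cd /tmp/backupdir && tar -xf -)

# Point 6: the result is a plain directory tree, so the backup
# itself is usable with ordinary tools, no restore step needed:
find /tmp/backupdir -name 'note.txt'
grep -r hello /tmp/backupdir
```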

Finally they say that by using snapshots you can have a stable
read-only image of the filesystem to run dump on. But the same
is true for other tools too.
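A sketch of that snapshot approach with LVM (the volume group and
volume names vg0/data are hypothetical, and this needs root); note
that the frozen image serves rsync or tar just as well as dump:

```shell
# Take a snapshot of the live volume; 1G of copy-on-write space
# is an arbitrary example size.
lvcreate --size 1G --snapshot --name data-snap /dev/vg0/data

# Mount the frozen image read-only and back it up with any tool.
mount -o ro /dev/vg0/data-snap /mnt/snap
rsync -a /mnt/snap/ /backup/data/    # or tar, or dump

# Clean up the snapshot when done.
umount /mnt/snap
lvremove -f /dev/vg0/data-snap
```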

Certainly there is no single right or wrong way to do things.
If dump gives you reliable backups and you are used to it,
it's a valid choice.

File copy tools will remain my preferred choice.
At this very moment I have two backups running across the
LAN; they involve a couple of million files; one is
using tar|tar, the other rsync. (I'm not kidding)
All filesystems here are reiser, so I couldn't try dump even
if I wanted to; and even if I could, I think I would not. :-)
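The tar|tar idiom I mean is the one below, shown locally with
made-up /tmp paths; across the LAN the second tar would typically
run behind ssh (tar ... | ssh host 'tar ...'):

```shell
# A tiny source tree to copy.
mkdir -p /tmp/tardemo/src /tmp/tardemo/dst
echo "payload" > /tmp/tardemo/src/file.txt

# Write the archive to stdout and unpack it from stdin;
# nothing ever touches the disk in between.
tar -C /tmp/tardemo/src -cf - . | tar -C /tmp/tardemo/dst -xf -

# The rsync equivalent (remote form: rsync -a src/ host:dst/):
# rsync -a /tmp/tardemo/src/ /tmp/tardemo/dst/
```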

You gave me an opportunity to understand dump better.
From what I've seen, it should be called e2dump and
should be part of e2fsprogs, together with e2fsck,
e2label, resize2fs and dumpe2fs (which is something else).
It is a filesystem tool, not a file tool.
Linux is not always ext2/ext3.

Maybe the summary of all this is just that dump is a
tool to back up a filesystem, whereas I want to back up the files.

Best regards.
-- 
   Roberto Ragusa    mail at robertoragusa.it



