[linux-lvm] probably just another xfs and lvm snapshot problem?
Timo Veith
tv at rz.fh-kl.de
Wed Aug 16 15:43:44 UTC 2006
Hi list readers,
the archive has various messages about this topic, but I am not sure
whether my problem is caused by xfs and the lvm snapshot or not. It has
been a long road to get here, so let me elaborate a little.
First of all my system config:
I am using Gentoo Linux, kernel 2.6.16-gentoo-r7, device-mapper-1.02.07,
lvm2-2.02.06, xfsprogs-2.7.11
and on the test box: 2.6.17-gentoo-r4, device-mapper-1.02.08,
lvm2-2.02.07, xfsprogs-2.7.11
My intention is to make full system backups (just everything) of a
reasonably busy mailserver. I am using cyrus imap as pop/imap server,
postfix as smtpd and amavisd-new as malware scanner. Pretty common setup.
I wrote a backup script that uses the hard linking technique of rsync
(described e.g. here
http://www.mikerubel.org/computers/rsync_snapshots/). Before starting
rsync I made lvm snapshots of /var and /var/spool/imap. These are the
busiest partitions.
The output of rsync showed errors and everybody wants his backup to be the
same as the original data, right? Thus I wrote another script that
verifies source against destination data. I can provide the code for it.
The script fails when doing the md5sum check. Here is the output:
md5sum: WARNING: 8 of 70769 computed checksums did NOT match
md5sum check failed
./amavis/db/__db.001: FAILED
./amavis/db/__db.002: FAILED
./amavis/db/__db.003: FAILED
./imap/db/__db.001: FAILED
./imap/db/__db.002: FAILED
./imap/db/__db.003: FAILED
./imap/db/__db.004: FAILED
./imap/db/__db.005: FAILED
or like so:
md5sum: WARNING: 9 of 72522 computed checksums did NOT match
md5sum check failed
./amavis/db/__db.001: FAILED
./amavis/db/__db.002: FAILED
./amavis/db/__db.003: FAILED
./amavis/quarantine/spam-74v2QeUbeNRR.gz: FAILED
./imap/db/__db.001: FAILED
./imap/db/__db.002: FAILED
./imap/db/__db.003: FAILED
./imap/db/__db.004: FAILED
./imap/db/__db.005: FAILED
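A verification pass of this kind can be sketched roughly as follows.
This is not the script mentioned above, only a minimal illustration. It
assumes GNU md5sum (for --quiet) and find; the function name verify_tree
is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: checksum every regular file under SRC and re-check
# the same relative paths under DST. Prints only the files that fail and
# returns non-zero if any checksum differs or a file is missing.
verify_tree() {
    src=$1
    dst=$2
    sums=$(mktemp) || return 1

    # Paths are recorded relative to SRC so they also resolve under DST.
    (cd "$src" && find . -type f -exec md5sum {} +) > "$sums" \
        || { rm -f "$sums"; return 1; }

    # --quiet (GNU md5sum) suppresses the "OK" lines.
    (cd "$dst" && md5sum -c --quiet "$sums")
    rc=$?

    rm -f "$sums"
    return "$rc"
}
```

Called as "verify_tree /path/to/source /path/to/backup" (both paths are
placeholders), it prints lines in the same "./file: FAILED" form as the
output above.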
My first attempt to solve this was to stop the daemons that write these
files. I stopped amavisd-new and cyrus, took the lvm snapshot and
restarted the services. Then I did the backup. But the verification
still failed.
I asked myself why only so few files out of more than 70000 are failing
the test. I took a closer look and compared the first file manually. The
size was equal but "cmp" said that the files differ starting at the
first byte.
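A manual comparison like this can be sketched as a small helper. It only
illustrates the kind of check described; the name compare_files is
hypothetical, and it assumes a cmp that supports -s and -l (as GNU
diffutils does).

```shell
#!/bin/sh
# Hypothetical helper: report where and how much two files differ.
compare_files() {
    a=$1
    b=$2

    # -s: no output, only the exit status says whether the files match.
    if cmp -s "$a" "$b"; then
        echo "identical"
        return 0
    fi

    # Without options, cmp reports the first differing byte.
    cmp "$a" "$b"

    # cmp -l prints one line per differing byte position; count them.
    n=$(cmp -l "$a" "$b" 2>/dev/null | wc -l | tr -d ' ')
    echo "differing bytes: $n"
    return 1
}
```

The count helps to tell a file that differs in a few bytes from one that
was rewritten completely.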
Then I thought it was some kind of file system caching issue and I put
some "sync" commands after stopping the daemons and before taking the lvm
snapshot. The result was that only two files were failing the md5sum
check. However, two failing files are still two too many.
Somehow I found the xfs_freeze command and I thought that it must be the
solution to my problem. Unfortunately I locked up the mailserver ;) and
had to hard reset it, because lvcreate didn't give me the prompt back.
Hitting CTRL-C didn't help either.
From the archive I have learned that xfs_freeze shouldn't be necessary
with lvm2 anymore. Also I have read something about versioning problems.
I posted mine above, so maybe it's a version issue?
If not, I also read about the dmsetup command. I played around a little
with dmsetup and found a working combination of commands. But I am not
sure whether this is the proper way to do it.
dmsetup suspend vg0-var
xfs_freeze -f /var
dmsetup resume vg0-var
lvcreate -s -L 1G -n snapvar /dev/vg0/var
xfs_freeze -u /var
# do backup from the snapshot
lvremove -f /dev/vg0/snapvar
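The same sequence can be wrapped so that the thaw and the snapshot
removal are still attempted when an intermediate step fails. This is only
a sketch of the commands as listed above, not a statement that the
sequence is the right one, and it cannot help if a command hangs. The
names run, snapshot_backup and DRY_RUN are hypothetical; by default the
commands are only printed. The device-mapper name is assumed to be
"<vg>-<lv>", which holds for names without hyphens such as vg0-var.

```shell
#!/bin/sh
# Sketch only: DRY_RUN=1 (the default here) prints commands instead of
# running them. Set DRY_RUN=0 to execute them for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: snapshot_backup VG LV MOUNTPOINT SNAPNAME BACKUP_COMMAND...
snapshot_backup() {
    vg=$1; lv=$2; mnt=$3; snap=$4
    shift 4                       # the rest is the backup command

    # Same order as in the message: suspend, freeze, resume, snapshot, thaw.
    run dmsetup suspend "$vg-$lv" || return 1
    run xfs_freeze -f "$mnt"
    frozen=$?
    run dmsetup resume "$vg-$lv"  # resume even if the freeze failed
    [ "$frozen" -eq 0 ] || return 1

    run lvcreate -s -L 1G -n "$snap" "/dev/$vg/$lv"
    created=$?
    run xfs_freeze -u "$mnt"      # thaw even if lvcreate failed
    [ "$created" -eq 0 ] || return 1

    run "$@"                      # do the backup from the snapshot
    rc=$?
    run lvremove -f "/dev/$vg/$snap"
    return "$rc"
}
```

For example, "snapshot_backup vg0 var /var snapvar echo backup" prints
the seven commands in order without touching the system.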
Issuing these commands doesn't feel right to me. Just a feeling ;).
Is there any other possible source of the md5sum errors above? I believe
not, because the other 70,000 files were checked and proved to be right.
Any hints?
TIA and kind regards,
Timo