I am seeing severe data corruption when using a logical volume larger
than 2 TB. I have finally been able to narrow the remaining suspects
down to device-mapper or LVM.
My first guess was a filesystem problem, but I recently tried
md / RAID0 instead and did not get errors of any kind. I would prefer
to use LVM, since we want to use snapshots to simplify backups, but I
have no idea how to debug this further.
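For context, the intended snapshot-based backup flow is roughly the
following; the volume group, logical volume, snapshot size and
mount/backup paths are placeholders, not our real names:

    # create a read-only snapshot, back it up, then drop it
    lvcreate -s -L 10G -n backup_snap /dev/vg00/lv00
    mount -o ro /dev/vg00/backup_snap /mnt/snap
    tar cf /backup/lv00.tar -C /mnt/snap .
    umount /mnt/snap
    lvremove -f /dev/vg00/backup_snap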
On a system with three devices, each larger than 1 TB, and a logical
volume striped over all of them, some data gets corrupted while being
written to (or read from?) disk. This shows up as md5 or CRC sums
changing between sequential reads of the same files once the file
cache is no longer involved (i.e. after reading a lot of other data).
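A minimal sketch of how I check for this, assuming a set of test
files plus enough unrelated data to push them out of the page cache;
all paths are placeholders:

    # checksum the test files while they may still be cached
    md5sum /data/test/* > /tmp/sums.before
    # evict the page cache by streaming a large amount of other data (> RAM size)
    cat /data/flush/* > /dev/null
    # re-read the same files from disk and compare
    md5sum /data/test/* > /tmp/sums.after
    diff /tmp/sums.before /tmp/sums.after   # any difference means the on-disk data changed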
On ext2 there are errors while writing data (kernel: EXT2-fs error
(device dm-0): ext2_new_block: Allocating block in system zone -
block = 722239884); on other filesystems successive fsck/repair runs
show corrupted metadata.
The system setup is:
- Three Adaptec 29160B SCSI controllers, each with one 1240 GB
  ATA-disk RAID attached (dual PIII, HP DL360 G2, 2 GB RAM)
- Volume group over all three devices, logical volume striped across
  them at full size (3.7 TB); see the command sketch after this list
- Filesystem either ext2fs/ext3fs (1.34), reiserfs (3.6.13) or
- host:~ # lvm version
LVM version: 2.00.33 (2005-01-07)
Library version: 1.00.21-ioctl (2005-01-07)
Driver version: 4.3.0
- Kernel 2.6.10 vanilla + 2.6.10-udm1 patches
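The striped setup was created roughly like this; device names, VG/LV
names and the stripe size are placeholders, and the extent count is
the "Free PE" value reported by vgdisplay:

    pvcreate /dev/sda /dev/sdb /dev/sdc
    vgcreate vg00 /dev/sda /dev/sdb /dev/sdc
    # stripe over all three PVs, 64 KB stripe size
    lvcreate -i 3 -I 64 -n lv00 -l <extents> vg00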
The problems were initially discovered on 2.6.8 and tracked on
2.6.9-udm, and they also occur if only two devices (2.4 TB in total)
are used.
For a limited time I will be able to debug the system further, though
it takes a while to generate more than 2 TB of data (the maximum
sequential read/write rate is ~80 MB/s, i.e. roughly 7 hours per
2 TB pass).
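A sketch of a fill workload that generates checksummable data at that
scale; file size, count and paths are placeholders:

    # write ~2 TB of 1 GB files and record their checksums for later comparison
    for i in $(seq 1 2000); do
        dd if=/dev/urandom of=/data/test/file.$i bs=1M count=1024
    done
    md5sum /data/test/file.* > /data/test/MD5SUMS

(/dev/urandom is CPU-bound and slow on this box; generating one random
source file once and copying it repeatedly is faster, at the cost of
identical file contents.)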
Only dead fish swim with the current.