file-copy corruption

T. Horsnell tsh at mrc-lmb.cam.ac.uk
Wed Jun 28 12:40:15 UTC 2006


I'm in the process of moving stuff from our Alpha fileserver
onto a Linux replacement. I've been using GNU tar to copy filesystems
from the Alpha to the Linux NFS-exported disks over a 1 Gbit LAN,
followed by diff -r to check that they have copied correctly (I wish
diff had an option to not follow symlinks..). I've so far transferred
about 3 TiB of data (spread over several weeks) and am concerned
that during this process, 3 files were mis-copied without any
apparent hardware errors being flagged. There was nothing unusual
about these files, and re-copying them (with cp) fixed the problem.
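
The verification step described above (compare the copy against the source, without following symlinks) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the procedure actually used here: the names file_digest and compare_trees are hypothetical, SHA-256 is an arbitrary choice of hash, and the walk only checks that everything under the source exists and matches in the copy (extra files in the copy are not reported).

```python
import hashlib
import os
import stat

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a regular file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_trees(src, dst):
    """Return sorted relative paths under src that are missing or differ in dst.

    Hypothetical helper: symlinks are compared by link target and are
    never followed; regular files are compared by content hash.
    """
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(src, followlinks=False):
        for name in dirnames + filenames:
            s = os.path.join(dirpath, name)
            rel = os.path.relpath(s, src)
            d = os.path.join(dst, rel)
            try:
                s_mode = os.lstat(s).st_mode  # lstat does not follow symlinks
                d_mode = os.lstat(d).st_mode
            except FileNotFoundError:
                mismatches.append(rel)
                continue
            if stat.S_IFMT(s_mode) != stat.S_IFMT(d_mode):
                mismatches.append(rel)  # e.g. file on one side, symlink on the other
            elif stat.S_ISLNK(s_mode):
                if os.readlink(s) != os.readlink(d):
                    mismatches.append(rel)
            elif stat.S_ISREG(s_mode):
                if file_digest(s) != file_digest(d):
                    mismatches.append(rel)
    return sorted(mismatches)
```

Calling compare_trees("/alpha/export", "/linux/export") with two mounted trees (both paths are placeholders) would return a list of relative paths to re-check. Note that a file read back shortly after being written may be served from the client's cache rather than from the disk.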

Are occasional undetected errors like this to be expected?
I thought there were sufficient stages of checksumming/parity
(both boxes have ECC memory) etc. to make the probability
of this vanishingly small.

On all 3 files, multiple retries of the diff still resulted
in a compare error, which was then fixed by a re-copy. This
suggests that the problem occurs during the 'gtar' phase, rather
than the 'diff -r' phase.

Does anyone know of a network-exercise utility I can use
to check the LAN component of the data-path?
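
As a rough idea of what such an exerciser might look like, here is a minimal Python sketch that pushes random data over a TCP connection and compares an end-to-end SHA-256 digest computed independently at each end. It is an assumption-laden illustration, not an existing utility: serve_once and exercise are hypothetical names, and the test covers only the TCP path between the two hosts, not NFS, tar or the disks. The end-to-end hash is there because the TCP checksum is only 16 bits and the Ethernet CRC is checked per hop, so corruption inside a NIC or switch could in principle go unnoticed by both.

```python
import hashlib
import os
import socket

CHUNK = 1 << 16  # bytes per send/recv call

def serve_once(listener):
    """Accept one connection, hash everything received, reply with the hex digest."""
    conn, _ = listener.accept()
    with conn:
        h = hashlib.sha256()
        while True:
            data = conn.recv(CHUNK)
            if not data:  # sender has shut down its write side
                break
            h.update(data)
        conn.sendall(h.hexdigest().encode("ascii"))

def exercise(host, port, total_bytes):
    """Send total_bytes of random data; return True if the receiver's digest matches."""
    h = hashlib.sha256()
    with socket.create_connection((host, port)) as sock:
        remaining = total_bytes
        while remaining > 0:
            block = os.urandom(min(CHUNK, remaining))
            h.update(block)
            sock.sendall(block)
            remaining -= len(block)
        sock.shutdown(socket.SHUT_WR)  # signal end of data, keep the read side open
        reply = b""
        while len(reply) < 64:  # a SHA-256 hex digest is 64 characters
            part = sock.recv(64 - len(reply))
            if not part:
                break
            reply += part
    return reply.decode("ascii") == h.hexdigest()
```

One box would run serve_once on a listening socket and the other would call exercise with that box's address, repeating with large byte counts; any False result would point at the network path rather than at tar or the filesystems.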

Cheers,
Terry.




More information about the fedora-list mailing list