SCSI tape [more info]

T. Horsnell tsh at mrc-lmb.cam.ac.uk
Tue May 3 11:54:21 UTC 2005


>T. Horsnell wrote:
>> I found a PC running FC3 (kernel 2.6.10-1.741_FC3) which had
>> an Adaptec 29160 SCSI adapter in a 32-bit PCI slot.
>> I connected my SDLT2 library to it and repeated my tests.
>> Everything worked!
>> 
>> Adaptec 39320 uses aic79xx driver
>> Adaptec 29160 uses aic7xxx driver
>> 
>> I took the 29160 out of the PC, installed it in the Opteron
>> box (into a 64-bit PCI-X slot) and repeated the tests. Failure.
>> The Opteron has 4 kernels available:
>>  2.6.10-1.770_FC3smp
>>  2.6.10-1.770_FC3smp
>>  2.6.10-1.770_FC3 
>>  2.6.10-1.770_FC3 
>> I tried them all. Failure.
>> I even booted from a Knoppix CD with 2.4.27.
>> (This presumably means I'm running a 32-bit kernel on a 64-bit box). Failure.
>> 
>> I remembered I had a desktop Compaq SDLT1 tapedrive on one of my systems.
>> I tried that on the 29160 adapter. Success.
>> I tried it on the 39320 adapter. Success. 
>> Is it some sort of datarate problem, I ask myself (the
>> SDLT1 is about half the speed of the SDLT2), and the SDLT2
>> worked on the PC.
>> I moved the 29160 out of the PCI-X slot into a PCI slot
>> and tried again with the 5 kernels listed above. Failure.
>> 
>> I'm now running out of ideas/energy/hardware.
>> With my original configuration (Opteron, 2.6.10-1.770_FC3smp,
>> Adaptec 39320, dual SDLT2), I can get verified tar dumps
>> of at least 4GB (I haven't tried anything bigger) provided I
>> use a record size > 32768 (64*512). 65536 (128*512) works fine,
>> as does 131072 (256*512). 32768 fails.
>> 
>> Is it time to file a bug report? Who with?
>> 
>> Thanks for any and all suggestions,
>
>This is sounding a lot like a problem I ran into with a DDS-2 tape
>drive (Archive IBM4326NP/RP).  When the drive's internal counter of
>bytes written to the tape (compressed data) reached 4 GiB the drive
>would (a) skip writing one block of data, and (b) fail to log any
>further errors (e.g., write retries).  The byte counter would remain
>stuck at the 4 GiB mark.  No error was ever reported by the drive.
>This was using an Adaptec 2940 SCSI adapter and the aic7xxx driver.
>I'm reasonably sure that neither the SCSI adapter nor the driver was
>at fault because they see only the uncompressed data stream and have
>no way to know when the drive's internal counter hits 4 GiB.  In
>addition, the same adapter and driver work just fine on a DDS-3 tape
>drive.
>
>I first realized there was a problem when I noticed that the drive's
>block counter (shown by "mt tell") sometimes incremented by one less
>than the number of blocks reported by 'dd'.  After much testing and
>looking at the counters reported by "smartctl -l error" I was able
>to identify the problem.  Though the advertised capacity of a DDS-2
>tape is 4 GB compressed, the counters do not get reset when changing
>tapes, so it's possible to encounter the error at any point on the
>tape.

Thanks for this info, Bob.
I don't think this applies in my case, as my problems occur when
writing files with a small-ish record size (e.g. 10240 bytes), and they
even happen on files only 500 Kbytes long. I can, however, successfully
write files, even large ones (a tar of 4.7GB), if I use a record size of
65536 or more. The symptom is the same as yours: records are lost with
no errors logged.

e.g.
[root@ls1 ~]$ dd if=tapetest500K of=/dev/st1 bs=10240 ; dd if=/dev/st1 of=junk bs=10240 ; cmp tapetest500K junk
48+1 records in
48+1 records out
36+1 records in
36+1 records out
tapetest500K junk differ: byte 215041, line 854
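
As a quick check on the numbers above: cmp counts bytes from 1, so
byte 215041 is offset 215040, which is exactly 21 records of 10240
bytes. The first mismatch therefore sits at the very start of the
22nd record. A minimal sketch of that arithmetic (the function name
locate is just an illustration, not part of any tool used here):

```python
def locate(cmp_byte, record_size):
    """Map a 1-based byte number, as printed by cmp, to a
    (0-based record index, offset within that record) pair."""
    return divmod(cmp_byte - 1, record_size)


record, offset = locate(215041, 10240)
print(record, offset)  # 21 0
```

An offset of 0 means the mismatch begins exactly on a record boundary.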

tapetest500K is full of random numbers, and in this and all other tests
the mismatches always start on a record boundary, which makes me think
that whole records are being dropped. If I read the tapes
on my Alpha box (the SDLT2 drives work OK in this box) I get the same
read mismatches, so I assume it's a write problem.
I guess I should write something to try to find out which records
are missing.
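
A rough sketch of such a tool follows. It assumes that the file read
back from tape is the original with whole records removed and the
remaining records still in order, and that records are distinct
(reasonable for random data). The function names and the command-line
form are hypothetical, chosen only for this example.

```python
import sys


def read_records(path, record_size):
    """Read a file and split it into records of record_size bytes
    (the last record may be shorter)."""
    with open(path, "rb") as f:
        data = f.read()
    return [data[i:i + record_size] for i in range(0, len(data), record_size)]


def find_dropped(original, readback):
    """Return the 0-based indices of records in original that are
    missing from readback.

    Assumes readback is original with whole records removed and the
    order preserved; raises ValueError if that does not hold."""
    dropped = []
    j = 0  # position of the next unmatched record in readback
    for i, rec in enumerate(original):
        if j < len(readback) and readback[j] == rec:
            j += 1
        else:
            dropped.append(i)
    if j != len(readback):
        raise ValueError("readback record %d does not match the original" % j)
    return dropped


if __name__ == "__main__" and len(sys.argv) == 4:
    # Arguments: original file, file read back from tape, record size
    size = int(sys.argv[3])
    missing = find_dropped(read_records(sys.argv[1], size),
                           read_records(sys.argv[2], size))
    print("dropped records (0-based):", missing)
```

Saved under a name such as find_dropped.py, it could be run as
"python find_dropped.py tapetest500K junk 10240" after the dd test
above.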

My current suspicion is that it's something to do with the use of
the buffer on the tape drive itself, and the SCSI driver. Some sort
of timing issue, maybe. I'm about to contact Quantum to try to get
some more info about this.

Cheers,
Terry.

>
>-- 
>Bob Nichols         rnichols42 at comcast.net
>



