input/output errors

Mon Nov 29 18:18:26 UTC 2004

Benjamin Hornberger wrote:
> At 01:41 PM 11/24/2004 -0800, you wrote:
> 
>> Benjamin Hornberger wrote:
>>
>>> Thanks for your help so far! Unfortunately I'm still a little lost... 
>>> see below.
>>> At 11:36 AM 11/23/2004 -0800, Rick Stevens wrote:
>>>
>>>> Benjamin Hornberger wrote:
>>>>
>>>>> Hi all,
>>>>> we have a machine (RHEL AS 3) on which last night suddenly a lot of 
>>>>> input/output errors occurred. Basically, two partitions are unusable.
>>>>> Both of these partitions are RAID 1 devices which share the same 
>>>>> two IDE
>>>>> hard disks (/dev/hda and /dev/hdc are two 250 GB drives, and 
>>>>> /dev/md0 and
>>>>> /dev/md1 are two RAID 1 devices which take 70 and 180 GB from each 
>>>>> drive,
>>>>> respectively).
>>>>> Any hints? I looked into fsck, but I am not sure what is the right 
>>>>> thing to
>>>>> do.
>>>>
>>>>
>>>>
>>>> Ben,
>>>>
>>>> fsck is a tool that will (hopefully) fix filesystem inconsistencies.
>>>> You should boot up in single user mode and run fsck against the two
>>>> filesystems that have issues.  Note that you may lose some data when
>>>> you do that.  Data that can't be reattached to their files will end up
>>>> in the "lost+found" directory of the filesystem being fsck'd and given
>>>> filenames that refer to their inode number.  You may be able to rebuild
>>>> the file by looking at those files, but it's a tedious, error-fraught
>>>> process.
>>>
>>>
>>> What can I do with these files? I can't cat or more or tail them. 
>>> Some of them
>>> seem to be directories (starting with a "d" on ls -l), but when I try 
>>> to cd into
>>> them, I end up at /root.
>>
>>
>> That's the danger of them.  If they're directories, they don't have any
>> parents anymore and their "back link" will probably take you back to /
>> or your home directory.  You'd need to "ls" them to see which files are
>> contained in them--you may then sort out where they belong.
>>
>> As far as the regular files are concerned, you need to look at their 
>> contents to see if maybe you can concatenate them together to
>> reconstruct the original file.  As I said before, it's tedious and very
>> error-prone.
>>
>>>> Since you set up RAID 1, you should first split the RAIDs into two 
>>>> disks
>>>> and see if either disk has clean versions of the data.  If so, you may
>>>> be able to purge the bad drive and recreate the RAID.
>>>
>>>
>>> In the meantime I had done an fsck -cy already on /dev/md0 and /dev/md1.
>>
>>
>> Uh, ok.
>>
>>> If I mount the partitions by themselves (/dev/hda1,2 and /dev/hdc1,2 
>>> rather than
>>> /dev/md0,1), it looks like /dev/hda1,2 are missing data compared to 
>>> /dev/hdc1,2.
>>> But from what I list below, it seems clear that /dev/hdc has 
>>> problems. Did fsck
>>> remove (corrupted) data from /dev/hda1,2?
>>
>>
>> If you did the fsck on md0 and md1 before splitting the RAID1, yes, it's
>> very possible.
>>
>>>> The most important thing to figure out is why you started getting I/O
>>>> errors in the first place.  Is one of the drives dying?  Did you have
>>>> a power glitch?  Did a RAM stick start acting weird?  You must fix the
>>>> underlying issue or you're just going to get a repeat of this event.
>>>
>>>
>>> I am trying to figure that out. The machine is connected to a UPS, so 
>>> no power
>>> glitch. How can I check my RAM?
>>
>>
>> You can run memtest86 on it.  If you are running Fedora Core, boot the
>> first CD and at the "boot:" prompt, enter "memtest86".  If not, you can
>> download a floppy image of it from "http://www.memtest86.com", put it
>> on a floppy and boot that.  You can also get a couple of CDs that I keep
>> handy:
>>
>> The Ultimate Boot CD
>>     http://www.ultimatebootcd.com
>>
>> RIP (Recovery Is Possible)
>>     http://www.tux.org/pub/people/kent-robotti/looplinux/rip/
>>
>> They're both bootable and have lots of diagnostics and such on them.  I
>> keep current copies in my laptop case at all times--just in case I have
>> to bail out a buddy.
>>
>>> What is the best way to check the hard drives (besides fsck -c)? 
>>> Following the
>>> Software RAID How-to, I did the following:
>>> # cat /var/log/messages | grep hda
>>> [tons of blocks like:]
>>> kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
>>> kernel: hda: dma_intr: error=0x40 { UncorrectableError }, ...
>>> kernel: end_request: I/O error, dev 03:01 (hda), sector ...
>>> kernel: raid1: hda1: rescheduling block ...
>>> kernel: raid1: hda1: unrecoverable I/O read error for block ...
>>> # cat /var/log/messages | grep hdc
>>> ...
>>> kernel: md: kicking non-fresh hdc2 from array!
>>> ...
>>> kernel: md: kicking non-fresh hdc2 from array!
>>> ...
>>> kernel: md: md1 already running, cannot run hdc2
>>> ...
>>> kernel: md: md0 already running, cannot run hdc1
>>> # more /proc/mdstat
>>> Personalities : [raid1]
>>> read_ahead 1024 sectors
>>> Event: 2
>>> md1: active raid1 hda2[0]
>>>         173429632 blocks [2/1] [U_]
>>> md0: active raid1 hda1[0]
>>>          71581920 blocks [2/1] [U_]
>>> unused devices: <none>
>>> # lsraid -a /dev/md0
>>> [dev   9,   0] /dev/md0 (...cryptic numbers...) online
>>> [dev   3,   1] /dev/hda1 (... cryptic numbers...) good
>>> [dev   ?,   ?] (unknown) (zeroes) missing
>>> same for /dev/md1
>>> # mdadm --detail /dev/md0
>>> ...
>>> Raid Devices: 2
>>> Total Devices: 1
>>> ...
>>> State: dirty, no-errors
>>> Active devices: 1
>>> Working devices: 1
>>> Failed devices: 0
>>> Spare devices: 0
>>> Number Major Minor RaidDevice State
>>>    0           3        1        0       active sync /dev/hda1
>>>    1           0        0        1       faulty removed
>>> ...
>>>
>>> same for /dev/md1
>>>
>>> I don't really understand what's going on. Part of it looks to me as 
>>> if /dev/hda has
>>> a problem, (the greater) part of it looks to me as if /dev/hdc has a 
>>> problem.
>>> So if I pop in a replacement drive for /dev/hdc and do raidhotadd (is 
>>> that the
>>> way to go?), you think the RAID device might be reconstructed 
>>> completely?
>>> But why did I get I/O errors in the first place then -- isn't RAID 
>>> supposed to avoid
>>> that? I mean, I thought even if one disk fails, the RAID array should 
>>> still work
>>> ok, and I just have to replace the broken drive??
>>
>>
>> At this point, you may very well be sunk.  Had you run the fsck on the
>> drives as individuals, you may have had a chance.  Once you ran it on
>> the RAID volumes, all bets are off.
>>
>> This is the inherent danger in using software RAID--you're depending on
>> the computer to be healthy to keep the RAID going.  If the computer is
>> healthy and one of the drives fails, the system will keep running.  If,
>> however, the computer gets sick (and this seems to be what happened),
>> the RAID is compromised.  Who knows what evil things it did?
>>
>> This is why I NEVER recommend software RAID.  If you must have
>> redundancy or high-availability, spend the extra $200 or so and use
>> hardware RAID.  It really is cheap insurance (as you have unfortunately
>> found out).
> 
> 
> I actually tried hardware RAID, but I couldn't get RHEL AS 3 to 
> recognize the Promise FastTrak TX 2000 RAID controller. Now that one is 
> collecting dust on a shelf.

AS3 doesn't recognize most IDE RAIDs, but things like the Adaptec RAID
(SCSI) are supported (actually, most i2o stuff is supported).

> So say I want to wipe out the complete RAID device and install it from 
> scratch. It's only home and data partitions, and I have a backup (which 
> hopefully didn't get corrupted), so shouldn't be too much work. How do I 
> really make sure my hard drives are ok? I run fsck with a bad-block 
> check, and then I can believe it will be ok?

I'd be more confident if you were to download the Ultimate Boot CD and
run a couple of the hard disk diagnostics on that CD.  A bad block check
may find some issues, but it may not find all errors (such as a flakey
seek).
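
Before swapping hardware, it may also help to tally what the kernel has
already logged for each drive. The following is only a rough sketch: the
function name count_io_errors is made up for illustration, and it assumes
the log lines look like the "end_request: I/O error, dev 03:01 (hda)"
entries quoted earlier in this thread.

```shell
#!/bin/sh
# Count kernel log lines that report an I/O error for one IDE drive.
# $1 = log file (assumed to be in /var/log/messages format)
# $2 = drive name as it appears in parentheses in the log, e.g. hda
# count_io_errors is a hypothetical helper, not a standard tool.
count_io_errors() {
    logfile=$1
    drive=$2
    # Matches lines such as:
    #   kernel: end_request: I/O error, dev 03:01 (hda), sector ...
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so "|| true" keeps that case from looking like a failure.
    grep -c "I/O error, dev .*(${drive})" "$logfile" || true
}

# Example (assumes the two drives from this thread):
#   for d in hda hdc; do
#       echo "$d: $(count_io_errors /var/log/messages "$d")"
#   done
```

Errors piling up on one drive point toward that drive or its cable. Errors
on both may instead suggest the controller, memory or power supply, which
is why the other checks are still worth doing.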

> Say I do the memcheck, what else is there to check the system? Should I 
> run fsck on all partitions? Anyway, does fsck leave a report somewhere? 
> In the man pages, I read that the exit code tells me something. How do I 
> read the exit code?

memtest86 simply tests the memory exhaustively.  You may still have
other issues.  Most common of those are dirty PCI slot connectors, a
loose CPU or a flakey power supply.  The first two are easy to fix up--
simply unplug and replug all of your PCI cards and CPUs.  The power
supply is harder to diagnose without a recording DVM.

fsck will report the i-node number of the questionable block and what it
thinks the problem is.  If you don't use the "-a" or "-y" options, you
can interactively control what gets fixed and what doesn't.  You can
also write down what fsck is telling you.  No, fsck doesn't leave a
log.  You can't trust the filesystems to hold a log or you wouldn't be
running fsck in the first place!  ;-)
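
As for the exit code: the shell keeps the status of the last command in
"$?", and the fsck man page documents it as a sum of bit flags. Here is a
minimal sketch of decoding it. The function name decode_fsck_status is
made up for illustration, and the example paths and the "-n" option
(answer "no" to everything, as e2fsck does on ext2/ext3) are assumptions
about your setup.

```shell
#!/bin/sh
# Decode an fsck exit status into the meanings listed in fsck(8).
# The status is a sum of bit flags, so each bit is tested on its own.
# decode_fsck_status is a hypothetical helper, not part of fsck.
decode_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for entry in \
        "1:filesystem errors corrected" \
        "2:system should be rebooted" \
        "4:filesystem errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:fsck canceled by user request" \
        "128:shared library error"
    do
        bit=${entry%%:*}     # number before the colon
        text=${entry#*:}     # description after the colon
        if [ $((status & bit)) -ne 0 ]; then
            echo "$text"
        fi
    done
}

# Example (as root, filesystem unmounted; the log path is an assumption
# and should be on a filesystem you still trust):
#   fsck -n /dev/md0 > /tmp/fsck-md0.log 2>&1
#   decode_fsck_status $?
```

A status of 5, for example, decodes to "errors corrected" plus "errors left
uncorrected". Redirecting the output as in the example is also the closest
thing to a report you will get.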
----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens at vitalstream.com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-  Time: Nature's way of keeping everything from happening at once.  -
----------------------------------------------------------------------