Errors with Soft RAID 5 Array, no disks off line

Bruno Wolff III bruno at wolff.to
Mon Jan 1 19:04:00 UTC 2007


On Mon, Jan 01, 2007 at 12:16:01 -0500,
  vamythguy <vamythguy at gmail.com> wrote:
> 
> I was really hoping I hadn't lost any data.  Is there any use in the
> short-term to failing out, removing and then re-adding either drive and see
> if it gets rebuilt?  I have one replacement drive on hand I could then use
> to replace the other.  I *really* don't want to spend too much time figuring
> out which blocks are bad, unless I can map them easily to specific files.

I think I misread your original post as saying that one drive had already
been failed out of the array. If it hasn't, that may give you some more options.
If all of the drives are still in the array, all of the information to
recover the data exists. The issue is that the normal rebuild process probably
won't work because there are errors on two drives.

Before trying anything, back up anything you don't want to lose, if you
haven't already.

What you might do is find out which blocks are bad (using SMART self tests)
on the drive with 3 unreadable blocks and then save copies of the
corresponding blocks on the other drives so that you can manually recover
the data later. The self tests will stop on the first bad block they find,
so after each test you will need to get that block remapped. Try doing a
couple of reads with dd with iflag=direct and then, if that doesn't work,
do a direct write to the block.
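
A rough sketch of the self test / dd cycle described above. The device name
and block number are placeholders, not values from this thread, and the
helper function names are made up for illustration. It assumes 512-byte
sectors and that the LBA reported in the self test log is relative to the
whole disk, so the whole-disk device should be used, not a partition. The
functions only print the commands so they can be reviewed before anything is
run as root.

```shell
#!/bin/sh
# Sketch only: device names and LBAs are placeholders supplied by the caller.

# Start a long self test, then (after it finishes) read the self test log,
# which lists the LBA of the first error it hit.
selftest_cmds() {
    printf 'smartctl -t long %s\n' "$1"
    printf 'smartctl -l selftest %s\n' "$1"
}

# Direct (uncached) read of one 512-byte sector; worth trying a few times.
read_sector_cmd() {
    printf 'dd if=%s of=/dev/null bs=512 count=1 skip=%s iflag=direct\n' "$1" "$2"
}

# Direct write of zeros to the same sector. This destroys whatever was in
# that sector, so only use it after the matching blocks on the other
# drives have been saved.
write_sector_cmd() {
    printf 'dd if=/dev/zero of=%s bs=512 count=1 seek=%s oflag=direct\n' "$1" "$2"
}
```

For example, read_sector_cmd /dev/sdX 12345 prints the dd read command for
sector 12345 of /dev/sdX. After the block has been read or rewritten, run the
self test again to find the next bad block.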

Once the three blocks are cleared then you can fail out and remove the drive
with 48 unreadable sectors. Then add in the new drive and let the array
rebuild.
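
The swap described above might look roughly like this with mdadm. The array
and partition names in the example are placeholders, not the ones from this
thread, and swap_member_cmds is a made-up helper name. As a precaution the
function only prints the commands.

```shell
#!/bin/sh
# Sketch only: array and member device names are supplied by the caller.
# Usage: swap_member_cmds ARRAY OLD_MEMBER NEW_MEMBER
swap_member_cmds() {
    # Mark the old member as failed, then take it out of the array.
    printf 'mdadm %s --fail %s\n' "$1" "$2"
    printf 'mdadm %s --remove %s\n' "$1" "$2"
    # Add the replacement; the rebuild starts once it has been added.
    printf 'mdadm %s --add %s\n' "$1" "$3"
    # Rebuild progress shows up here.
    printf 'cat /proc/mdstat\n'
}
```

For example, swap_member_cmds /dev/md0 /dev/sdb1 /dev/sdc1 prints the four
commands in order. The new drive needs a suitably sized partition before the
add step if the array is built on partitions.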

Then you should run badblocks on the drive with 48 unreadable sectors
and the SMART self tests to see if the drive might be worth keeping.
(If you have already decided to toss the drive, skip that part.)
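
A sketch of those checks for a drive that is already out of the array. The
device name is a placeholder and drive_check_cmds is a made-up helper name.
By default badblocks does a read-only scan, which is what is assumed here.
The function only prints the commands.

```shell
#!/bin/sh
# Sketch only: the device name is supplied by the caller.
drive_check_cmds() {
    # Read-only surface scan, showing progress (-s) and verbose output (-v).
    printf 'badblocks -sv %s\n' "$1"
    # Start a long SMART self test; the command returns right away.
    printf 'smartctl -t long %s\n' "$1"
    # Once the test has finished, print everything SMART reports,
    # including the self test log and the attribute table.
    printf 'smartctl -a %s\n' "$1"
}
```

For example, drive_check_cmds /dev/sdX prints the three commands for /dev/sdX.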

Then you should probably fail out and remove the drive which previously had
3 unreadable sectors and run badblocks on it and run some SMART self tests
to see if it looks worth keeping. (Again, skip this if you have already
decided to toss the drive.)

At this point you are running in degraded mode and you either want to add
one of the drives above back into the array, or go buy another replacement
drive.



