[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [linux-lvm] HDD failure - please help!



On 09/02/2010 07:50 AM, Patterson, James wrote:
If you wanted RAID5, your best bet on linux is the md driver.
Or else a hardware RAID controller.
I don't/didn't want RAID.

Based on your expectations, I think you *do* want at least RAID1. RAID1 is simple to administer
and understand.
your first step is getting a copy of the metadata.
There should be a copy at the beginning of each drive.
Yes. How do I access it? None of the drives will mount. I am thinking here that I should create a special boot disk with the LVM tools on it (they are not present on the FC11 boot iso, afaik).
You don't mount the PVs. Use the metadata extraction tool; I don't remember its name at the moment. If this was your boot filesystem, you will need a LiveCD or a new install. Since you will need a new disk anyway, I suggest you get one that is *bigger* than the failing drive, install to it (*not* overwriting the others), and leave a partition big enough to contain the PV from the failing drive. Remove the failing drive and access it via USB, even if you have another drive slot. By keeping it as cold as possible during recovery, you can coax a few more sectors out of it.
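As a rough illustration of getting at the metadata without mounting anything: LVM2 keeps its metadata as plain text in a metadata area near the start of each PV, so one way to look at it is to read the first part of the device read-only and pull out the printable text. The sketch below is only that, a sketch. The 1 MiB read size, the `seqno` filter, the helper names, and the device path in the usage comment are all assumptions for illustration, not part of the LVM tools.

```python
import re


def printable_runs(data, min_len=16):
    """Return runs of printable ASCII (plus tab/newline) of at least min_len bytes."""
    pattern = rb"[\t\n\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def read_head(path, size=1024 * 1024):
    """Read the first `size` bytes of a device or image file, read-only."""
    with open(path, "rb") as f:
        return f.read(size)


def metadata_candidates(path):
    """Yield text runs that look like LVM2 metadata (they carry a 'seqno' field)."""
    for run in printable_runs(read_head(path)):
        if "seqno" in run:
            yield run


# Usage (hypothetical device path; needs root, and only ever reads):
#   for text in metadata_candidates("/dev/sdb1"):
#       print(text)
```

The metadata area can hold several older versions of the metadata, so expect more than one hit; the one with the highest `seqno` should be the most recent.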
then you should look back a month or so in the archives
I looked...could you please be a bit more specific? I didn't see anything.
This should get you started: https://www.redhat.com/archives/linux-lvm/2010-July/msg00057.html
Well, truly, the only thing I've learned is never to use LVM if it's going to cause me to lose data on all 5 drives when one goes down. The logic behind its use appears to be to just make life "easier".
With jbod (which you likely have), the failure scenario is exactly the same whether you have 1 disk or 5. Part of your filesystem gets trashed, and you have to use low-level tools to recover what remains if you don't have backups.

What having 5 disks *does* do is make failure more likely. Suppose the probability of 1 disk *not* failing in a given year is .999 (three nines). With jbod, the LV fails if *any* of the disks fails. The probability that none of them fail in a given year would then be .999^5 ~= .995, i.e. about a 0.5% chance of losing the LV per year instead of 0.1%. Your array is less reliable.
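The arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes the disks fail independently of each other; the function name is made up for illustration and the .999 figure is just the example value from above.

```python
def prob_all_survive(p_single, n_disks):
    """Probability that none of n independent disks fails in the period."""
    return p_single ** n_disks


p = 0.999                        # example per-disk survival probability
jbod = prob_all_survive(p, 5)    # jbod LV survives only if all 5 disks survive

print(f"one disk survives:     {p:.4f}")
print(f"all five survive:      {jbod:.4f}")
print(f"jbod LV fails (5 disk): {1 - jbod:.4f}")
```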

By using RAID, you can make the array more reliable. RAID works by storing redundant data (mirror copies or parity) so that you don't lose anything on a single drive failure.
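For comparison, here is the same kind of estimate for a two-disk RAID1 mirror. It is a simplified sketch under stated assumptions: disks fail independently, the per-disk survival probability is the example value .999, and a failed disk is not replaced during the year (replacing it promptly only improves the odds). The function name is hypothetical.

```python
def prob_mirror_survives(p_single, copies=2):
    """Probability that at least one of `copies` independent mirrors survives."""
    return 1 - (1 - p_single) ** copies


p = 0.999
mirror = prob_mirror_survives(p)   # data is lost only if both disks fail

print(f"single disk survives: {p:.6f}")
print(f"RAID1 pair survives:  {mirror:.6f}")
```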

