
[linux-lvm] FC6+LVM2 over RAID: drive failed and LVM hung



Hopefully someone can shed some light on how to proceed with solving an LVM hang problem.

Yesterday I got an email saying that one of the drives had failed its SMART self-check.

In /var/log/messages I see these lines related to the drive issue:

================================================================================
Apr 7 17:57:03 grp-01-10-01 smartd[2444]: Device: /dev/hdm, FAILED SMART self-check. BACK UP DATA NOW!
Apr 7 17:57:03 grp-01-10-01 smartd[2444]: Sending warning via mail to root ...
Apr 7 17:57:03 grp-01-10-01 smartd[2444]: Warning via mail to root: successful
Apr 7 18:05:51 grp-01-10-01 kernel: hdm: task_out_intr: status=0x51 { DriveReady SeekComplete Error }
Apr 7 18:05:52 grp-01-10-01 kernel: hdm: task_out_intr: error=0x10 { SectorIdNotFound }, LBAsect=238863867, high=14, low=3982843, sector=238863903
Apr  7 18:05:52 grp-01-10-01 kernel: ide: failed opcode was: unknown
Apr 7 18:05:57 grp-01-10-01 kernel: hdm: task_out_intr: status=0x51 { DriveReady SeekComplete Error }
Apr 7 18:06:00 grp-01-10-01 kernel: hdm: task_out_intr: error=0x10 { SectorIdNotFound }, LBAsect=238814880, high=14, low=3933856, sector=238814887
Apr  7 18:06:00 grp-01-10-01 kernel: ide: failed opcode was: unknown

^^^^^^^^ LOTS OF THESE LINES IN LOG ^^^^^^^^^

Apr 8 02:05:10 grp-01-10-01 kernel: raid1: hdm2: rescheduling sector 54264480
...
Apr 8 02:05:21 grp-01-10-01 kernel: raid1:md0: read error corrected (8 sectors at 54264480 on hdm2)
Apr 8 02:05:22 grp-01-10-01 kernel: raid1: hdc2: redirecting sector 54264480 to another mirror <<===== I DO NOT UNDERSTAND THIS MESSAGE. THE FAILING DRIVE hdm IS THE OTHER MIRROR FOR hdc2 ????
...
Apr 8 03:02:35 grp-01-10-01 kernel: raid1: hdm2: rescheduling sector 30555792
...
Apr 8 03:02:36 grp-01-10-01 kernel: raid1: Disk failure on hdm2, disabling device.
Apr  8 03:02:37 grp-01-10-01 kernel:    Operation continuing on 1 devices
Apr 8 03:02:37 grp-01-10-01 kernel: raid1: hdc2: redirecting sector 30555792 to another mirror <<===== AND NOW THERE IS NO OTHER MIRROR !
Apr  8 03:02:37 grp-01-10-01 kernel: RAID1 conf printout:
Apr  8 03:02:37 grp-01-10-01 kernel:  --- wd:1 rd:2
Apr  8 03:02:37 grp-01-10-01 kernel:  disk 0, wo:0, o:1, dev:hdc2
Apr  8 03:02:37 grp-01-10-01 kernel:  disk 1, wo:1, o:0, dev:hdm2
Apr  8 03:02:37 grp-01-10-01 kernel: RAID1 conf printout:
Apr  8 03:02:37 grp-01-10-01 kernel:  --- wd:1 rd:2
Apr  8 03:02:37 grp-01-10-01 kernel:  disk 0, wo:0, o:1, dev:hdc2
Apr 8 03:27:03 grp-01-10-01 smartd[2444]: Device: /dev/hdm, FAILED SMART self-check. BACK UP DATA NOW!
Apr 8 03:27:03 grp-01-10-01 smartd[2444]: Device: /dev/hdm, 1 Currently unreadable (pending) sectors
Apr 8 03:27:03 grp-01-10-01 smartd[2444]: Sending warning via mail to root ...

^^^^^^^^ LOTS OF THESE LINES IN LOG ^^^^^^^^^
================================================================================


I checked /proc/mdstat and, sure enough, the md0 RAID1 array shows only one active drive, hdc2.
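
For anyone who wants to script the same check, here is a minimal Python sketch that reports arrays with missing members. It assumes the usual /proc/mdstat layout, where each array's status line carries an `[expected/working]` pair such as `[2/1]`. The sample text is illustrative only, not the actual output from this machine, and the md1 member names in it are hypothetical.

```python
import re

# Illustrative sample only; NOT the real /proc/mdstat from this system.
# The md1 member names (hde1, hdg1, hdi1) and block counts are hypothetical.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid5]
md1 : active raid5 hde1[0] hdg1[1] hdi1[2]
      1000 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid1 hdc2[0]
      500 blocks [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: (working, expected)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line like "md0 : active raid1 ...".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The following status line contains "[expected/working]".
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            expected, working = int(counts.group(1)), int(counts.group(2))
            if working < expected:
                result[current] = (working, expected)
            current = None
    return result


print(degraded_arrays(SAMPLE_MDSTAT))  # → {'md0': (1, 2)}
```

On a live system the same function could be given the contents of /proc/mdstat instead of the sample string.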

So I took a backup and then shut down the system. I pulled the bad drive, put in a new one, and rebooted. The system boots until it reaches the LVM step and then just hangs at this message:
================================================================================
...
Setting Hostname
Setting up Logical Volume Management (boot hangs right here, icon stops spinning, cursor is locked)
================================================================================

My setup consists of two Linux software RAID arrays: a RAID5 (md1) and a RAID1 (md0). The partition that went bad (hdm2) is part of md0, and another partition on the same drive (hdm1) acts as a spare for md1.

There is an LVM volume group on top of each array: VolumeGroup00 and VolumeGroup01.


How should I tackle this problem? I tried rescue mode, but there no VGs are found and I see only one of the arrays, md0.


????

Thanks,
Gerry


