[linux-lvm] Frozen root volume

Ray Morris support at bettercgi.com
Thu Apr 5 18:07:38 UTC 2012


Depending on syslog.conf, there may well be messages in the logs that never appear on the console.

I have experienced similar issues with software RAID 5 that were not present with hardware RAID, and they showed up around the same kernel version as in your case.

There is a fix headed to the mainline kernel for a deadlock in raid1.c which caused very similar symptoms under very similar loads. I suspect another variation of a similar bug exists in raid5.c, but I'm not a kernel programmer, so I'm speculating. You may wish to have a close look at raid5.c, perhaps adding some printk calls to see if you can narrow it down. Putting some of your storage on hardware RAID and seeing whether the problem persists on those devices could also be informative. That would tell you whether or not to move this to the RAID list.
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Larkin Lowrey <llowrey at nuclearwinter.com> wrote:

I have the serial console output going to a logging terminal server so
I'm able to capture everything that is sent to the console and I've seen
no errors or any other unusual output prior to these freezes. Would
rsyslog produce different results?

My VG sits atop 4 md RAID devices: a tiny raid6 for the boot fs, an 8
drive raid5 for the root fs, and two 6 drive raid5s for a data fs.

The 8 drives of the root raid5 are connected to a 6 port AHCI controller
(AMD SB850) and a 2 port AHCI controller (Marvell 88SE9128).

Is there any way to determine which of these (md device, AHCI
controller, disk) is the culprit? I have been able to read from each of
the constituent drives so I know that I/O at that level can take place.
I can't think of a safe way to test writes non-destructively.
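
One read-only way to approach the question above is to compare two samples of /proc/diskstats and look for devices that have I/O in flight but complete nothing between the samples. The sketch below is only an illustration under assumptions: the function names and the 5 second interval are hypothetical, it needs a shell and a Python interpreter that are still responsive (for example already loaded in memory), and bio-based devices such as md may not report in-flight counts on every kernel version.

```python
import time


def parse_diskstats(text):
    """Map device name -> (completed I/Os, I/Os in flight) from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # skip short or malformed lines
        completed = int(f[3]) + int(f[7])  # reads completed + writes completed
        in_flight = int(f[11])             # I/Os currently in progress
        stats[f[2]] = (completed, in_flight)
    return stats


def stalled(before, after):
    """Names of devices with I/O in flight in both samples and no completions between them."""
    names = []
    for name, (done_after, inflight_after) in after.items():
        if name not in before:
            continue
        done_before, inflight_before = before[name]
        if inflight_before > 0 and inflight_after > 0 and done_after == done_before:
            names.append(name)
    return sorted(names)


def sample_and_report(path="/proc/diskstats", interval=5.0):
    """Take two samples `interval` seconds apart and return the stalled device names."""
    with open(path) as fh:
        first = parse_diskstats(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        second = parse_diskstats(fh.read())
    return stalled(first, second)
```

Calling `sample_and_report()` during a freeze returns a list of device names. If only one member disk (or only the disks behind one controller) shows up, that points below the md layer; an empty list for the member disks would be more consistent with a problem in md or above. This reads counters only and writes nothing to the devices.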

--Larkin

On 4/5/2012 11:02 AM, Ray Morris wrote:
> What does your storage stack look like? Something in the stack froze
> up. It could be your SAN storage device, if you're using one, the switch
> connecting to the SAN, if you're using one, the RAID card, if you're
> using one, the software RAID, if you're using software RAID ...
>
> A reasonable next step, if there is nothing in the logs because they
> are on the root device, might be to use rsyslog to send the log data
> to a neighboring machine. That way you can see the messages in the log
> no matter which local storage has a problem.
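
To illustrate the receiving side of the remote-logging idea quoted above: rsyslog on the neighboring machine would normally do this job itself, so the following is not rsyslog configuration, only a minimal sketch of what arrives over the wire when syslog messages are forwarded by UDP. The port 5514, the file name remote.log, and the function names are assumptions made for illustration.

```python
import socket

# Syslog severity names, indexed by the low three bits of the PRI value.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(message):
    """Split '<PRI>text' into (facility, severity name, text); None if there is no valid PRI."""
    if not message.startswith("<"):
        return None
    end = message.find(">")
    if end < 2 or not message[1:end].isdigit():
        return None
    pri = int(message[1:end])
    return pri // 8, SEVERITIES[pri % 8], message[end + 1:]


def receive_forever(bind_host="0.0.0.0", port=5514, logfile="remote.log"):
    """Append every UDP syslog datagram received on `port` to `logfile` (hypothetical names)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_host, port))
    with open(logfile, "a", buffering=1) as out:  # line-buffered so lines reach the file promptly
        while True:
            data, (sender, _) = sock.recvfrom(8192)
            text = data.decode("utf-8", errors="replace").rstrip("\x00\n")
            decoded = decode_pri(text)
            if decoded is None:
                out.write(f"{sender} ?.? {text}\n")
            else:
                facility, severity, body = decoded
                out.write(f"{sender} {facility}.{severity} {body}\n")
```

Running `receive_forever()` on the neighboring machine and pointing the frozen host's log forwarding at that address and port would keep the last messages before a freeze on storage that is unaffected by the problem. Kernel messages carry facility 0, so a line such as `0.err` in the output would be a kernel error.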

_____________________________________________

linux-lvm mailing list
linux-lvm at redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
