[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

[linux-lvm] Re: Reproducable OOPS with MD RAID-5 on 2.6.0-test11



On Wed, Dec 03, 2003 at 05:23:02PM -0800, Linus Torvalds wrote:

> On Wed, 3 Dec 2003, Simon Kirby wrote:
> >
> > In any event, this patch against 2.6.0-test11 compiles without warnings,
> > boots, and (bonus) actually works:
> 
> Really? This actually makes a difference for you? I don't see why it
> should matter: even if the sector offsets would overflow, why would that
> cause _oopses_?
> 
> [ Insert theme to "The Twilight Zone" ]

Without the patches, the box gets as far as assembling the array and
activating it, but dies on "mke2fs".  Running mke2fs through strace shows
that it stops during the early stages, before it even tries to write
anything.  mke2fs appears to seek through the whole device and do a bunch
of small reads at various points, and as soon as it tries to read from an
offset > 2 TB, it hangs.

When I first tried this, something with the configuration caused it to
hang so that even nmi_watchdog didn't work.  I first assumed it was the
result of some sort of current spike from all of the drives working at
once, but after gettng it to work with an array size < 2 TB and after
seeing different strange Oopses with different total sizes (by removing
some drives), the problem appeared to be software-related.  I added some
printk()s and found the problem occurred shortly after an overflow in
linear.c:which_dev().

As soon as I saw the overflow I made the connection and corrected the
variable types, but I didn't bother to figure out why it decided to
blow up before.

I can put an unpatched kernel back on the box and do some more testing
if it would be helpful.

Simon-



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]