[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: kupdated, bdflush and kjournald stuck in D state on RAID1 device (deadlock?)



On Wed, Aug 29, 2001 at 03:26:01PM -0700, Andrew Morton wrote:
> David Rees wrote:
> > 
> > On Wed, Aug 29, 2001 at 02:38:23PM -0700, Andrew Morton wrote:
> > >
> > > OK, thanks.  bdflush is stuck in raid1_alloc_r1bh() and
> > > everything else is blocked by it.  I thought we fixed
> > > that a couple of months ago :(
> > >
> > > Could you send the output of `cat /proc/meminfo'?
> > >
> > > > 18239 -bash            wait4
> > > > 18274 umount /opt      rwsem_down_write_failed
> > >
> > > What are we trying to do here?  Is /opt the deadlocked
> > > filesytem?
> > 
> > Yep, /dev/md0 is mounted on /opt.
> > 
> 
> OK, and according to your /proc/meminfo:
> 
>         total:    used:    free:  shared: buffers:  cached:
> Mem:  525422592 497364992 28057600        0 133500928 335839232
> Swap: 1052794880  4710400 1048084480
> MemTotal:       513108 kB
> MemFree:         27400 kB
> MemShared:           0 kB
> Buffers:        130372 kB
> Cached:         323368 kB
> SwapCached:       4600 kB
> Active:         293704 kB
> Inact_dirty:    161536 kB
> Inact_clean:      3100 kB
> Inact_target:       16 kB
> HighTotal:           0 kB
> HighFree:            0 kB
> LowTotal:       513108 kB
> LowFree:         27400 kB
> SwapTotal:     1028120 kB
> SwapFree:      1023520 kB
> 
> it's not an out-of-memory deadlock.
> 
> The RAID1 buffer allocation is pretty simple - unless the
> disk controller has decided to stop delivering interrupts,
> everything shold just come back to life as physical writes
> complete.  I assume the hardware is still working OK?
> 
> It's a uniprocessor machine, yes?

It is a uniprocessor machine, yes.  The machine is a 1.1GHz Athlon on a Soyo
motherboard.  This is the first problem we've seen with the machine.

The machine was and is still working mostly ok.  Since I typed "umount /opt",
the opt directory doesn't show any contents any more, but before that things
appear OK.  I can still use fdisk to look at the partition layout of the
drives in /opt raid1 array.  /proc/mdstat is normal.

There are no other software raid devices on the machine (/ is also ext3 on a
separate drive/controller).  There are no suspicious messages printed from
dmesg, either.

-Dave





[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]