Re: ext3 assertion failure.



On Tue, 30 Apr 2002, Andreas Dilger wrote:

> On Apr 30, 2002  21:54 -0400, Tom Diehl wrote:
> 
> It looks like you were getting garbage from the disk before the journal
> assertion happened (i.e. the ext3 error), and the journal assertion is
> just there to save your filesystem from getting corrupted with further
> bad operations.
> 
> > This is a stock 7.2 system with all relevant updates.
> > Not sure what other info to provide so if I missed something please let me
> > know.
> 
> I would really recommend upgrading to the latest RH errata kernel.  The
> ext3 code has had a number of bugs fixed since 2.4.9.  It might also be
> related to IDE stuff, don't know.

AFAIK 2.4.9-31 is their latest errata kernel. I just checked the ftp site and
that is the latest one there, although the rumor mill suggests this might
change shortly. I could upgrade to their beta kernel, I suppose.

> > ide1: reset: success
> 
> When did that reset happen?  It wasn't in the syslog that you sent.

It looks like it happened just before the logs were rotated. I missed it, sorry.

Here it is:
Apr 25 04:35:18 kanga kernel: hdc: timeout waiting for DMA
Apr 25 04:35:18 kanga kernel: ide_dmaproc: chipset supported ide_dma_timeout func only: 14
Apr 25 04:35:18 kanga kernel: hdc: status timeout: status=0xd0 { Busy }
Apr 25 04:35:18 kanga kernel: hdd: DMA disabled
Apr 25 04:35:18 kanga kernel: hdc: drive not ready for command
Apr 25 04:35:27 kanga kernel: ide1: reset: success

The next entry, shown below, is where the syslog output I provided in the
previous message starts, so the reset happened about 26 seconds before it.

Apr 25 04:35:53 kanga syslogd 1.4.1: restart.
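
For anyone trying to read the status=0xd0 value in the hdc timeout line above,
here is a rough sketch of how the byte breaks down. The bit masks follow the
status definitions in the 2.4 <linux/hdreg.h> as far as I know; the function
names are made up for illustration, and the "report only Busy" behaviour is my
understanding of how the IDE driver prints the status, not something I have
checked against this exact kernel.

```python
# IDE status register bits, highest bit first (masks as in Linux 2.4 hdreg.h).
IDE_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ide_status(status):
    """Return the names of every bit set in an IDE status byte."""
    return [name for mask, name in IDE_STATUS_BITS if status & mask]


def kernel_style(status):
    """Hypothetical helper: mimic the driver's habit of printing only Busy.

    While the Busy bit is set the remaining bits are not meaningful,
    so only "Busy" is reported (assumption about the driver's output).
    """
    if status & 0x80:
        return ["Busy"]
    return decode_ide_status(status)


if __name__ == "__main__":
    print(decode_ide_status(0xD0))
    print(kernel_style(0xD0))
```

So 0xd0 has the Busy, DriveReady and SeekComplete bits set, which would explain
why the log only says { Busy }: the drive never finished the command.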

> > EXT3-fs error (device ide1(22,65)): ext3_readdir: bad entry in directory #2665467: rec_len % 4 != 0 - offset=0, inode=762621470, rec_len=44574, name_len=110
> 
> The rec_len is way out.  The inode number is probably also bad, but not
> sure...
> 
> > Assertion failure in journal_bmap_Rbbdc8009() at journal.c:602: "ret != 0"
> > kernel BUG at journal.c:602!
> 
> Just a symptom of bad data, not the real cause.  Note that I wanted to
> look at this bit of code, but that assertion is not even there anymore
> (the kernel turns the filesystem read only and just returns now).

So am I understanding you correctly that there is still no good way to tell
whether this was a hardware or a software failure?
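
For reference, here is a minimal sketch of the kind of sanity check that
produces the "rec_len % 4 != 0" message quoted above. It is modelled on my
reading of the ext3 directory-entry checks, not copied from the kernel; the
function name, the 4096-byte block size and the optional inodes_count
argument are assumptions for illustration only.

```python
def dir_rec_len(name_len):
    """Minimum record length for a name: 8-byte header + name, rounded up to 4."""
    return (name_len + 8 + 3) & ~3


def check_dir_entry(inode, rec_len, name_len, offset,
                    block_size=4096, inodes_count=None):
    """Return an error string for a bad directory entry, or None if it looks sane.

    block_size and inodes_count are assumed values; the real ones come
    from the filesystem superblock.
    """
    if rec_len < dir_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < dir_rec_len(name_len):
        return "rec_len is too small for name_len"
    if offset + rec_len > block_size:
        return "directory entry across blocks"
    if inodes_count is not None and inode > inodes_count:
        return "inode out of bounds"
    return None


if __name__ == "__main__":
    # Values taken from the EXT3-fs error line in the log.
    print(check_dir_entry(inode=762621470, rec_len=44574, name_len=110, offset=0))
```

With the values from the log, 44574 % 4 is 2, so the entry is rejected on the
alignment test before the inode number is ever examined. That fits the comment
that the inode number is "probably also bad": the check simply never got far
enough to say so.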

-- 
.............Tom	"Nothing would please me more than being able to 
tdiehl rogueind com	hire ten programmers and deluge the hobby market 
			with good software." -- Bill Gates 1976

   			We are still waiting ....





