
Assert in jbd-kernel.c



Hello. I have installed the ext3 file system on a test system, and sometimes I have a problem: I get an assert from within jbd-kernel.c, and whatever program was writing to the disk when this happens is unable to continue.

The system is a server I built, which I named "dax". It is running Debian unstable, and I updated it to all the latest packages in Debian unstable as of today. It is running a Linux 2.4.10 kernel. On the Linux Weekly News page I saw someone offering 2.4.10 sources pre-patched for both ext3 and kernel preemption, so I built from those sources.

http://lwn.net/2001/0927/a/ext3-preempt.php3
http://lameter.com/kernel/linux-2.4.10-ext3-preempt.tar.gz


I enabled both ext3 and kernel preemption. The server is running Linux software RAID, using RAID 1 (mirroring) on both ext3 filesystems. I have seen the same problem and assert several times now. Whenever I see this happen, I always reboot the system, since I am not sure how serious the problem is.




The assert text is as follows:

-- cut here -- cut here -- cut here -- cut here -- cut here --
Message from syslogd dax at Tue Oct 9 12:07:47 2001 ...
dax kernel: Assertion failure in jbd_preclean_buffer_check() at jbd-kernel.c:80:
"(((bh)->b_state & (1UL << BH_Dirty)) != 0)"
-- cut here -- cut here -- cut here -- cut here -- cut here --




Here is line 80 from jbd-kernel.c:

-- cut here -- cut here -- cut here -- cut here -- cut here --
                      J_ASSERT_JH(jh, buffer_dirty(bh));
-- cut here -- cut here -- cut here -- cut here -- cut here --
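
For readers decoding the assert text: the quoted expression is a single-bit test on the buffer's b_state word, and the assert fires when the dirty bit is clear. Below is a minimal user-space sketch of that test, not the kernel code. The names prefixed with model_ or MODEL_ are hypothetical stand-ins invented for this illustration, and the bit positions are assumptions rather than values taken from the kernel headers.

```c
#include <assert.h>  /* for callers that want to assert on the results */
#include <stdio.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's definitions. */
enum model_bh_state_bits {
    MODEL_BH_UPTODATE = 0,  /* assumed bit position */
    MODEL_BH_DIRTY = 1      /* assumed bit position */
};

struct model_buffer_head {
    unsigned long b_state;  /* bitmap of state flags, as named in the assert text */
};

/* Same shape as the expression printed in the assert message:
 * (((bh)->b_state & (1UL << BH_Dirty)) != 0) */
static int model_buffer_dirty(const struct model_buffer_head *bh)
{
    return (bh->b_state & (1UL << MODEL_BH_DIRTY)) != 0;
}

/* Returns 1 when the dirty check passes, 0 when it would trip.
 * Unlike a real assertion, this only reports and lets the caller continue. */
static int model_preclean_check(const struct model_buffer_head *bh)
{
    if (!model_buffer_dirty(bh)) {
        fprintf(stderr,
                "model assertion failure: buffer not dirty (b_state=0x%lx)\n",
                bh->b_state);
        return 0;
    }
    return 1;
}
```

In this model, a buffer whose b_state has only the up-to-date bit set fails the check, which corresponds to the condition reported in the assert above: the code at line 80 expected a dirty buffer and found one whose dirty bit was clear.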



Then, on the next reboot, the boot messages included this:

-- cut here -- cut here -- cut here -- cut here -- cut here --
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
(recovery.c, 253): journal_recover: JBD: recovery, exit status 0, recovered transactions 20383 to 20655
(recovery.c, 255): journal_recover: JBD: Replayed 5021 and revoked 37/134 blocks
kjournald starting. Commit interval 5 seconds
EXT3-fs: md(9,0): orphan cleanup on readonly fs
ext3_orphan_cleanup: truncating inode 100113 to 894 bytes
EXT3-fs: md(9,0): 1 truncate cleaned up
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 204k freed
Unable to find swap-space signature
Adding Swap: 249440k swap-space (priority -1)
EXT3 FS 2.4-0.9.9, 5 Sep 2001 on md(9,0), internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.9, 5 Sep 2001 on md(9,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
-- cut here -- cut here -- cut here -- cut here -- cut here --



The problem happened again today, as I was using aptitude(1) to update the packages on the system. aptitude downloaded all the packages successfully, but as it began to unpack them, the error occurred. Unpacking many packages causes a lot of disk activity, so perhaps the problem is related to heavy disk activity. I rebooted and ran aptitude again; it managed to unpack a few more packages before the error occurred again. I rebooted, tried again, hit the problem again, rebooted, tried again, and finished the package unpacking and installation without further errors. (It made progress on unpacking the packages each time, and the last time it had only a few packages left.)


I would like to help in finding and fixing the problem, if I can. If there is some sort of extra logging or debugging that I can enable, I am very willing to do it. I have no experience with kernel or file system debugging, but I am an experienced software engineer, and I can spare some time to work on this.

--
Steve R. Hastings		"Vita est"
steve hastings org		http://www.blarg.net/~steveha





