Questions regarding journal replay

Wed Feb 25 17:34:59 UTC 2009

On Wed, Feb 25, 2009 at 10:31:42AM -0600, Eric Sandeen wrote:
> 
> It'd be better to get to the bottom of the problem ... maybe iostat
> while it's happening to see if IO is actually happening; run blktrace to
> see where IO is going, do a few sysrq-t's to see where threads are at, etc.
> 
> Can you find a way to reproduce this at will?
> 
> Journal replay should *never* take this long, AFAIK.

Indeed.  The journal is 128 megs, as I recall.  So even if the journal
was completely full, if it's taking 800 seconds, that's a write rate
of 0.16 MB/s (164 KB/second).  That is indeed way too slow.
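
For anyone who wants to check the arithmetic, here is a quick sketch.
It assumes the journal really is 128 MiB and was completely full, which
is the worst case; the function and constant names are made up for
illustration.

```python
# Back-of-the-envelope check of the journal replay write rate.
JOURNAL_MIB = 128      # journal size as recalled above (assumed full)
REPLAY_SECONDS = 800   # reported replay time


def replay_rate_mib_per_s(journal_mib, seconds):
    """Average write rate needed to replay the whole journal."""
    return journal_mib / seconds


rate = replay_rate_mib_per_s(JOURNAL_MIB, REPLAY_SECONDS)
print(f"{rate:.2f} MB/s ({rate * 1024:.0f} KB/s)")
```

A partly full journal would make the effective rate even lower.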

I assume this wasn't your boot partition, so the journal replay was
being done by e2fsck, right?  Or are you guys skipping e2fsck and the
journal replay was happening when you mounted the partition?  If the
journal replay is happening via e2fsck, is fsck running any other
filesystem checks in parallel? 

Also, what is the geometry of your raid?  How many disks, what RAID
level, and what is the chunk size?  The journal replay is done a
filesystem block at a time, so it could be that it's turning into a
large number of read-modify-writes, which is trashing your performance
if the chunk size is really large.
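
To get a feel for how bad this could be, here is a crude worst-case
model.  It assumes a parity RAID whose controller handles each small
write by reading the old data chunk and old parity chunk in full and
writing both back; real arrays may do much better than this.  The 4 KiB
block size and the chunk sizes are illustrative guesses, not values
from this report.

```python
# Crude worst-case model of read-modify-write cost on parity RAID.
# ASSUMPTION: every small write touches a whole data chunk and a whole
# parity chunk, each read once and written once (4 chunk-sized I/Os).

def rmw_bytes_per_block_write(chunk_kib):
    """Bytes moved on disk for one filesystem-block write."""
    return 4 * chunk_kib * 1024


def amplification(block_kib, chunk_kib):
    """Bytes moved on disk per byte of journal data replayed."""
    return rmw_bytes_per_block_write(chunk_kib) / (block_kib * 1024)


# Hypothetical geometries: 4 KiB filesystem block, varying chunk size.
for chunk_kib in (64, 256, 1024):
    print(f"chunk {chunk_kib:4d} KiB -> {amplification(4, chunk_kib):.0f}x")
```

Under these assumptions the cost scales linearly with the chunk size,
which is why the actual geometry matters.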

The other thing that might explain the performance problem is if
somehow the number of outstanding requests allowed by the hard drive
has been clamped down to a very small number, and so a large number
of small read/write requests is really killing performance.  The
system dmesg log might have some hidden clues about that.
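
Besides dmesg, the per-device queue depth can be read from sysfs on
Linux.  A minimal sketch, assuming SCSI or libata disks that expose
/sys/block/<dev>/device/queue_depth; the helper names are made up, and
devices without that attribute are reported as None.

```python
from pathlib import Path


def read_queue_depth(dev, sysfs_root="/sys/block"):
    """Return the device's queue depth, or None if it is not exposed."""
    path = Path(sysfs_root) / dev / "device" / "queue_depth"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def report(sysfs_root="/sys/block"):
    """Map each block device name to its queue depth (or None)."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}
    return {p.name: read_queue_depth(p.name, sysfs_root)
            for p in sorted(root.iterdir())}


if __name__ == "__main__":
    for dev, depth in report().items():
        print(dev, depth)
```

A depth of 1 on the RAID member disks would be worth a closer look,
since it usually means command queueing is effectively off.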

						- Ted