
Re: ext3-2.4-0.9.0

Neil Brown wrote:
> On Saturday July 7, andrewm uow edu au wrote:
> > An update of the ext3 journalling filesystem for 2.4 kernels
> > is available at
> >
> >       http://www.uow.edu.au/~andrewm/linux/ext3/
> >
> > Patches are against 2.4.6-ac1 and 2.4.6.
> I thought it was time to try out ext3 between nfsd and raid5, so I
> built 2.4.6 plus this patch, and an ext3 filesystem on a largish
> raid5 volume, exported it (with the "sync" flag), mounted it from
> another machine with NFSv2, and ran "dbench 4".
> This produces a live-lock (I think that is the right term).
> Throughput would drop to zero (determined by watching the counts in
> /proc/net/rpc/nfsd), but could be coaxed along by generating other
> filesystem activity.
> I tried nfs over ext3 on a plain ide disc and it worked fine.
> I tried dbench directly on ext3/raid5 and it worked fine.
> I tried dbench/nfs/ext2/raid5 and it worked fine.
> So I think it is some interaction between ext3fs and raid5 triggered
> by the high rate of "fsync" calls made by nfsd.  Naturally I blame
> ext3 because I know more about raid5 and nfsd :-)

fsync will cause ext3 to commit the current transaction once all
handles against it close - so that will produce rapid bursts
of small numbers of writes.
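
As a rough illustration of that rule (the commit waits until every
handle has closed), here is a toy user-space model in C. All of the
names (toy_transaction, toy_fsync and so on) are hypothetical and do
not correspond to the real JBD structures or functions; it is a sketch
of the behaviour described above, not of the ext3 implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the real JBD structures. */
struct toy_transaction {
    int open_handles;      /* handles currently open against the transaction */
    int dirty_buffers;     /* buffers that the next commit will write */
    bool commit_requested; /* set by toy_fsync() */
    int commits;           /* number of commits performed so far */
    int last_commit_size;  /* buffers written by the most recent commit */
};

/* Commit only if one was requested and no handle is still open. */
static void toy_try_commit(struct toy_transaction *t)
{
    if (t->commit_requested && t->open_handles == 0) {
        t->last_commit_size = t->dirty_buffers;
        t->dirty_buffers = 0;
        t->commit_requested = false;
        t->commits++;
    }
}

/* Open a handle against the transaction. */
void toy_handle_start(struct toy_transaction *t)
{
    t->open_handles++;
}

/* Close a handle; the last close triggers any pending commit. */
void toy_handle_stop(struct toy_transaction *t, int buffers_dirtied)
{
    assert(t->open_handles > 0);
    t->dirty_buffers += buffers_dirtied;
    t->open_handles--;
    toy_try_commit(t);
}

/* Request a commit; it happens now only if no handles are open. */
void toy_fsync(struct toy_transaction *t)
{
    t->commit_requested = true;
    toy_try_commit(t);
}
```

In this model, calling toy_fsync() after every small operation gives
many commits, each writing only a handful of buffers, which is the
bursty pattern described above.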

> One particular aspect of raid5 that *could* be related is that it is
> very reticent to schedule write requests. It tries to hang on to them
> as long as possible in the hope of getting more write requests in the
> same stripe.  My guess as to what is happening is that a write
> request is submitted and then waited-for without an intervening
>                 run_task_queue(&tq_disk);

Could well be.  ext3 will happily feed 2,000 buffers into submit_bh()
prior to running tq_disk.  Everything else is happy with this, so I blame
nfsd and raid5 :)  Rapid fsyncs will break this up, however.

Does this patch help?

--- fs/jbd/commit.c	2001/07/01 04:24:42	1.40
+++ fs/jbd/commit.c	2001/07/08 00:53:42
@@ -202,6 +202,7 @@
 			ll_rw_block(WRITE, bufs, wbuf);
+			run_task_queue(&tq_disk);
 			journal_brelse_array(wbuf, bufs);
 			goto write_out_data;
@@ -410,6 +411,7 @@
 				bh->b_end_io = end_buffer_io_sync;
 				submit_bh(WRITE, bh);
+			run_task_queue(&tq_disk);
 			/* Force a new descriptor to be generated next
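
For anyone unfamiliar with queue plugging, the hypothesis being tested
can be sketched with a toy user-space model in C. The names here
(toy_queue, toy_submit, toy_unplug, toy_is_done) are hypothetical and
are not the 2.4 block layer API; toy_unplug() merely stands in for
run_task_queue(&tq_disk). The model assumes a waiter that does not
unplug the queue itself, which is Neil's guess rather than an
established fact about the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_QUEUE_MAX 64

/* Toy model only: hypothetical names, not the 2.4 block layer. */
struct toy_queue {
    int pending[TOY_QUEUE_MAX]; /* request ids held back while plugged */
    int npending;               /* number of valid entries in pending[] */
    bool done[TOY_QUEUE_MAX];   /* done[id] is set once id reaches the "disk" */
};

/* Queue a request; nothing reaches the disk until the queue is unplugged. */
void toy_submit(struct toy_queue *q, int id)
{
    assert(id >= 0 && id < TOY_QUEUE_MAX);
    assert(q->npending < TOY_QUEUE_MAX);
    q->pending[q->npending++] = id;
}

/* Stand-in for run_task_queue(&tq_disk): flush every pending request.
 * Returns the number of requests that were pushed to the disk. */
int toy_unplug(struct toy_queue *q)
{
    int n = q->npending;

    for (int i = 0; i < n; i++)
        q->done[q->pending[i]] = true;
    q->npending = 0;
    return n;
}

/* Non-blocking check: would a waiter on this request make progress? */
bool toy_is_done(const struct toy_queue *q, int id)
{
    assert(id >= 0 && id < TOY_QUEUE_MAX);
    return q->done[id];
}
```

In this model a request that is submitted and then polled with
toy_is_done() never completes until something calls toy_unplug(), which
is the stall the extra run_task_queue() calls in the patch are meant to
rule out.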

> When the system is livelocked, all I can tell at the moment (I am at
> home and the console is at work so I cannot use alt-sysrq) is that
> kjournald is waiting in wait_on_buffer and an nfsd thread is waiting on
> the journal.

That sounds like Something Weird is going on.  wait_on_buffer will
unplug and the disks should be going hell-for-leather.

> I will try to explore it more deeply next time I am at work, but if
> there are any suggestions as to what it might be, or how I might more
> easily find out what is going on, I am all ears.

I'll see if I can get it to happen here.  Thanks.

