
Re: [Cluster-devel] [PATCH 26/58] [GFS2] Revise gfs2_logd and flush thresholds



On Mon, 2008-01-21 at 09:21 +0000, swhiteho redhat com wrote:
From: Steven Whitehouse <swhiteho redhat com>

This patch introduces two new log thresholds:

 o thresh1 is the point at which we wake up gfs2_logd due to the pinned
   block count. It is initialised at mount time to 2/5ths of the size
   of the journal. Currently it does not change during the course of
   the mount, but the intention is to adjust it automatically based
   upon various conditions. This automatic adjustment will be the subject
   of later patches.

 o thresh2 is the point at which we wake up gfs2_logd due to the total
   of pinned blocks and AIL blocks. It is initialised at mount time
   to 4/5ths of the size of the journal. The reason for not making it
   equal to 100% of journal size is to give gfs2_logd time to start up
   and do something before the processes filling the journal end up
   stalling while they wait on journal flushes.
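
To make the arithmetic concrete, here is a minimal sketch of how the two
thresholds could be derived from the journal size and checked. The struct,
the function names and the use of ">=" are assumptions made for
illustration; they are not taken from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical container for the two thresholds described above. */
struct log_thresholds {
    unsigned int thresh1; /* wake gfs2_logd on pinned block count */
    unsigned int thresh2; /* wake gfs2_logd on pinned + AIL block count */
};

/* Set at mount time: 2/5ths and 4/5ths of the journal size in blocks. */
static struct log_thresholds init_thresholds(unsigned int journal_blocks)
{
    struct log_thresholds t;

    t.thresh1 = 2 * journal_blocks / 5;
    t.thresh2 = 4 * journal_blocks / 5;
    return t;
}

/* Assumed wake-up test: either threshold being reached is enough. */
static bool should_wake_logd(const struct log_thresholds *t,
                             unsigned int pinned, unsigned int ail)
{
    return pinned >= t->thresh1 || pinned + ail >= t->thresh2;
}
```

With a 1000-block journal this gives thresh1 = 400 and thresh2 = 800, so
the last 200 blocks are the headroom gfs2_logd has to start flushing.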

At the same time, the incore_log_blocks tunable is removed since it
was wrong (just a basic fixed threshold set to a number plucked out
of the air) and it was being compared against the wrong thing (the
amount of metadata in the journal) rather than the total number of
blocks.

Also, since the free blocks count is now an atomic variable, a
number of these comparisons no longer need locking, so the log
lock has been removed around some operations.

This patch also ensures that there are no races when gfs2_logd is
woken up. It also changes the behaviour of the periodic sync
so that instead of occurring every 60 secs, it will now
occur every 30 secs (which can still be set via sysfs) if
there have been no other log flushes in the meantime.
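
The periodic sync rule can be sketched as a simple elapsed-time check
against the most recent flush of any kind. The function name and the use
of plain seconds are assumptions for illustration only.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: a periodic flush is due only if no flush of any
 * kind has happened within the last `interval` seconds.
 */
static bool periodic_flush_due(unsigned long now, unsigned long last_flush,
                               unsigned long interval)
{
    return now - last_flush >= interval;
}
```

Any other log flush updates last_flush, which pushes the next periodic
sync back by a full interval.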

When we reserve blocks at the start of a transaction, we now
use a waitqueue too. This means we can remove the old mutex
and the fast path through that code is just a couple of atomic
operations now. Also we no longer do log flushing at this point
in the code. Instead we wake up gfs2_logd to do it for us (this
shouldn't happen if the log is large enough and if gfs2_logd is
properly tuned) and do an exclusive wait.
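
A reservation fast path that is "just a couple of atomic operations" might
look roughly like the user-space sketch below, written with C11 atomics
rather than the kernel's atomic_t API. The names are hypothetical, and the
slow path (waking gfs2_logd and sleeping on the waitqueue) is left to the
caller.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical fast path: try to take `wanted` blocks from the atomic
 * free-block counter without holding any lock.
 */
static bool try_reserve_blocks(atomic_int *free_blocks, int wanted)
{
    int avail = atomic_load(free_blocks);

    while (avail >= wanted) {
        /* On failure, avail is reloaded with the current value; retry. */
        if (atomic_compare_exchange_weak(free_blocks, &avail,
                                         avail - wanted))
            return true;
    }
    /* Not enough space: the caller would wake the flusher and wait. */
    return false;
}

/* Give blocks back, for example after a flush frees journal space. */
static void release_blocks(atomic_int *free_blocks, int count)
{
    atomic_fetch_add(free_blocks, count);
}
```

A failed reservation leaves the counter untouched, so a waiter can simply
retry once it has been woken.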

As a result of these changes, postmark on my test machine runs about
20% faster, mainly due to increased efficiency in flushing the
journal.


Steve,

I still think this one is a bad idea. Postmark might be 20% faster, but every other benchmark we have run shows that this causes significant I/O delays and poor performance. It also changes the interface between user-space utilities and the filesystem.

Kevin

