
Re: how to counteract slowdown



On Tue, 13 Nov 2001, Andrew Morton wrote:
> Daniel Pittman wrote:

[...]

>> Yes, it would. Now, this latency would happen only if the journal got
>> to be so full that we couldn't fit more data, right? The code looks
>> that way (unless the journal is destroyed or the FS remounted, of
>> course.)
> 
> Pretty much, yes. I think the problem we're seeing is to do with the
> fact that once the journal is 1/4 used, we force checkpointing of the
> in-memory data into the main fs, 

When it's 25% full? That seems ... low to me. I would have expected
that to happen closer to 75%, because then the standard asynchronous
write-out would be able to close transactions for you for "free" without
blocking the filesystem.

> and this effectively blocks the fs. For something like ext2, we start
> async writeout of dirty data when it reaches 40% of all memory. We
> start sync writeout (to throttle writers) at 60%. So on a 512 megabyte
> machine, the writer can pump an additional 100 megs of data into the
> fs after IO has started. With ext3 in journalled data mode, or with
> metadata-intensive loads we don't have that extra buffer.

That would explain exactly the situation I was seeing here, once you
scale the numbers down some. 
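
Just to check I'm reading the numbers right, here's a throwaway
back-of-envelope sketch (the 40%/60% thresholds and the 512MB figure
are yours from above; everything else is made up for illustration):

/*
 * Back-of-envelope version of the ext2 writeout headroom described
 * above: dirty data a writer can still queue after async writeout
 * has started but before it gets throttled.
 */
#include <stdio.h>

int main(void)
{
        double mem_mb      = 512.0;   /* example machine size         */
        double async_start = 0.40;    /* async writeout kicks in here */
        double sync_start  = 0.60;    /* writers get throttled here   */

        /* (60% - 40%) of 512MB ~= 100MB of extra buffering */
        printf("writer headroom: ~%.0f MB\n",
               (sync_start - async_start) * mem_mb);
        return 0;
}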

> I suspect that for long-term workloads it doesn't make a lot of
> difference. With ext2, writers will still end up getting blocked. But
> later, and for longer.

I can see that. OTOH, shorter workloads tend to perform better because
they see the bigger buffers.

>> I suspect that increasing the flush delay will help smooth the load
>> /I/ see, but that it's relevant only because I have a specific case
>> that it helps: enough ram that it's better to buffer until the 30
>> second write burst is done before forcing some (lazy) writeback...
> 
> mm..  So you'd need a monstrous journal, and we need to start
> async checkpointing at 25% journal occupancy (wakeup_bdflush()?)
> and synchronous checkpointing at 75%....

I suspect that 100MB counts as monstrous in this context. :)

Anyway, delaying the synchronous checkpoint until the 75% point, or
possibly even until the journal actually fills, would make sense to me.

If you don't block the writer and it's a short workload (mail
fetching), it's not going to actually /need/ that synchronous write. If
it's a long-term write load, it's going to hit the sync point...

How about a model like this:

When the journal hits 25% used, start async write-out.
  -- this should also be the case for the kjournald old data. :)
When the journal hits 75% used, start sync write-out.
When the usage drops to 50% (or so), stop the sync write-out but
   continue the async write-out.

That way you would hopefully see reasonable performance for short
workloads up to 75% of the journal size.

If the write load is higher than that, the big writers get blocked
waiting for the load to drop back down below 50%, and then are allowed
to continue.

You would probably see a fluctuation between 50% and 75% used on
long-term heavy write loads, but short loads would be a lot smoother.
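
Very roughly, in code (with completely made-up helper names, so this is
only the shape of the thresholds, not a real jbd patch):

#include <stdio.h>
#include <stdbool.h>

#define ASYNC_START_PCT 25   /* begin lazy background checkpointing     */
#define SYNC_START_PCT  75   /* block writers, checkpoint synchronously */
#define SYNC_STOP_PCT   50   /* let writers continue again below this   */

static bool writers_blocked;

/* Stand-ins for whatever the journalling layer would really do. */
static void start_async_checkpoint(void) { printf("  async checkpoint running\n"); }
static void start_sync_checkpoint(void)  { printf("  sync checkpoint, writers blocked\n"); }
static void stop_sync_checkpoint(void)   { printf("  writers released\n"); }

/* Call whenever journal occupancy changes; used_pct is the percentage
 * of the journal currently in use. */
static void journal_space_check(int used_pct)
{
        printf("journal %d%% full:\n", used_pct);

        if (used_pct >= ASYNC_START_PCT)
                start_async_checkpoint();

        if (!writers_blocked && used_pct >= SYNC_START_PCT) {
                writers_blocked = true;
                start_sync_checkpoint();
        } else if (writers_blocked && used_pct <= SYNC_STOP_PCT) {
                writers_blocked = false;
                stop_sync_checkpoint();
        }
}

int main(void)
{
        /* A sustained heavy writer: occupancy climbs past 75%, sync
         * writeout pushes it back toward 50%, and it oscillates there. */
        int samples[] = { 10, 30, 60, 80, 65, 50, 70, 80, 55, 45 };

        for (int i = 0; i < (int)(sizeof samples / sizeof *samples); i++)
                journal_space_check(samples[i]);
        return 0;
}

The point of the gap between the 75% trigger and the 50% release is
that a throttled writer gets a reasonable run once it is let go again,
instead of bouncing straight back into the sync path.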

        Daniel

-- 
Art is the soul of a people.
        -- Romare Bearden




