Re: [dm-devel] [Lsf-pc] [LSF/MM TOPIC] a few storage topics
- From: Jeff Moyer <jmoyer redhat com>
- To: Jan Kara <jack suse cz>
- Cc: Andreas Dilger <adilger dilger ca>, Andrea Arcangeli <aarcange redhat com>, "linux-scsi vger kernel org" <linux-scsi vger kernel org>, Mike Snitzer <snitzer redhat com>, Christoph Hellwig <hch infradead org>, "dm-devel redhat com" <dm-devel redhat com>, fengguang wu gmail com, Boaz Harrosh <bharrosh panasas com>, "linux-fsdevel vger kernel org" <linux-fsdevel vger kernel org>, "lsf-pc lists linux-foundation org" <lsf-pc lists linux-foundation org>, Chris Mason <chris mason oracle com>
- Subject: Re: [dm-devel] [Lsf-pc] [LSF/MM TOPIC] a few storage topics
- Date: Tue, 24 Jan 2012 15:13:40 -0500
Jan Kara <jack suse cz> writes:
> On Tue 24-01-12 14:14:14, Jeff Moyer wrote:
>> Chris Mason <chris mason oracle com> writes:
>>
>> >> All three filesystems use the generic mpages code for reads, so they
>> >> all get the same (bad) I/O patterns. Looks like we need to fix this up
>> >> ASAP.
>> >
>> > Can you easily run btrfs through the same rig? We don't use mpages and
>> > I'm curious.
>>
>> The readahead code was to blame, here. I wonder if we can change the
>> logic there to not break larger I/Os down into smaller sized ones.
>> Fengguang, doing a dd if=file of=/dev/null bs=1M results in 128K I/Os,
>> when 128KB is the read_ahead_kb value. Is there any heuristic you could
>> apply to not break larger I/Os up like this? Does that make sense?
> Well, not breaking up I/Os would be fairly simple as ondemand_readahead()
> already knows how much we want to read. We just trim the submitted I/O to
> read_ahead_kb artificially. And that is done so that you don't trash page
> cache (possibly evicting pages you have not yet copied to userspace) when
> there are several processes doing large reads.
Do you really think applications issue large reads and then don't use
the data? I mean, I've seen some bad programming, so I can believe that
would be the case. Still, I'd like to think it doesn't happen. ;-)
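To make the trimming concrete, here is a rough sketch of the arithmetic only. It is a simplified model, not the kernel's ondemand_readahead() logic: it assumes a sequential read is simply cut into pieces no larger than read_ahead_kb, and the function name split_read is made up for illustration.

```python
from typing import List


def split_read(request_bytes: int, read_ahead_kb: int) -> List[int]:
    """Return the sizes (in bytes) of the I/Os a request is cut into.

    Simplified model: every submitted I/O is capped at read_ahead_kb.
    Real readahead also ramps its window up and down; that is ignored here.
    """
    if request_bytes < 0 or read_ahead_kb <= 0:
        raise ValueError("request_bytes must be >= 0 and read_ahead_kb > 0")
    window = read_ahead_kb * 1024  # cap per submitted I/O, in bytes
    full, rest = divmod(request_bytes, window)
    # 'full' I/Os of exactly one window, plus a shorter tail if needed
    return [window] * full + ([rest] if rest else [])


if __name__ == "__main__":
    one_mib = 1024 * 1024
    # bs=1M with read_ahead_kb=128: eight 128 KB I/Os under this model
    print(len(split_read(one_mib, 128)), "I/Os at read_ahead_kb=128")
    # bs=1M with read_ahead_kb=1024: a single 1 MB I/O
    print(len(split_read(one_mib, 1024)), "I/O at read_ahead_kb=1024")
```

Under this model the dd bs=1M case turns into eight 128 KB requests, while a 1 MB window would let it through as one.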
> Maybe 128 KB is too small a default these days, but OTOH no one prevents you
> from raising it (e.g. SLES uses 1 MB as a default).
For some reason, I thought it had been bumped to 512KB by default. Must
be that overactive imagination I have... Anyway, if all of the distros
start bumping the default, don't you think it's time to consider bumping
it upstream, too? I thought there was a lot of work put into not being
too aggressive on readahead, so the downside of having a larger
read_ahead_kb setting was fairly small.
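For anyone who wants to check or raise the value on their own box, a minimal sketch follows. It assumes the per-device tunable lives at /sys/block/<dev>/queue/read_ahead_kb; the helper names and the sysfs_root parameter are hypothetical, added only so the code can be pointed at a scratch directory. Writing the real file needs root.

```python
from pathlib import Path


def read_ahead_path(device: str, sysfs_root: str = "/sys/block") -> Path:
    """Build the path of the read_ahead_kb tunable for a block device."""
    return Path(sysfs_root) / device / "queue" / "read_ahead_kb"


def get_read_ahead_kb(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read the current readahead window, in KB, for the device."""
    return int(read_ahead_path(device, sysfs_root).read_text().strip())


def set_read_ahead_kb(device: str, kb: int, sysfs_root: str = "/sys/block") -> None:
    """Write a new readahead window, in KB (requires root on real sysfs)."""
    if kb < 0:
        raise ValueError("kb must be non-negative")
    read_ahead_path(device, sysfs_root).write_text(f"{kb}\n")


if __name__ == "__main__":
    dev = "sda"  # assumption: replace with a device that exists on your system
    try:
        print(dev, "read_ahead_kb =", get_read_ahead_kb(dev))
    except OSError as err:
        print("could not read tunable:", err)
```

Something like set_read_ahead_kb("sda", 1024) would match the 1 MB default mentioned for SLES; the setting does not persist across reboots.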
Cheers,
Jeff