Re: Very slow directory traversal
- From: Andreas Dilger <adilger clusterfs com>
- To: Ross Boylan <ross biostat ucsf edu>
- Cc: ext3-users redhat com
- Subject: Re: Very slow directory traversal
- Date: Wed, 10 Oct 2007 09:59:20 -0600
On Oct 06, 2007 00:10 -0700, Ross Boylan wrote:
> My last full backup of my Cyrus mail spool had 1,393,569 files and
> consumed about 4G after compression. It took over 13 hours. Some
> investigation led to the following test:
> time tar cf /dev/null /var/spool/cyrus/mail/r/user/ross/debian/user/
FYI - "tar cf /dev/null" actually skips reading any file data. The
code special cases /dev/null and skips the read entirely.
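If the goal is to time a traversal that really does read file data, one
option is a small script that opens and reads every regular file itself.
The following is only a rough sketch: the function name read_tree and its
chunk_size parameter are made up for illustration, and it measures plain
reads rather than anything tar-specific.

```python
import os
import stat


def read_tree(root, chunk_size=1 << 20):
    """Walk root, read every regular file in full, return (files, bytes)."""
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks, FIFOs and devices are skipped, not opened.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            files += 1
    return files, total
```

Wrapping a call such as read_tree("/var/spool/cyrus/mail/r/user/ross/debian/user/")
in two time.monotonic() readings gives a number that includes the data
reads, unlike the tar-to-/dev/null test.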
> That took 15 minutes the first time it ran, and 32 seconds when run
> immediately thereafter. There were 355,746 files. This is typical of
> what I've been seeing: initial run is slow; later runs are much faster.
I'd expect this is because on the initial run the on-disk inode ordering
causes a lot of seeks, and later runs come straight from memory. There is
probably not a lot you can do about that directly, but pre-reading the
inode table, for example, would be a good start.
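One way to cut down on those seeks from user space is to stat() the
entries of a directory in inode-number order instead of the order the
directory returns them. Here is a minimal sketch, assuming that inode
numbers roughly track position in the on-disk inode table (as they do on
ext3); the function name stat_in_inode_order is hypothetical, and how much
this helps will depend on the filesystem and on how the files were created.

```python
import os


def stat_in_inode_order(directory):
    """Return [(name, stat_result)], stat()ing entries in ascending inode order."""
    with os.scandir(directory) as it:
        # entry.inode() comes from the directory entry itself, so building
        # this list does not yet touch the inode table.
        entries = [(entry.inode(), entry.name) for entry in it]

    # Sort by inode number so the following lstat() calls walk the inode
    # table mostly sequentially instead of jumping around.
    entries.sort()

    results = []
    for _ino, name in entries:
        try:
            st = os.lstat(os.path.join(directory, name))
        except OSError:
            # Entry disappeared between scandir() and lstat(); skip it.
            continue
        results.append((name, st))
    return results
```

A backup or indexing tool could apply the same ordering per directory
before opening files. The benefit only shows up on a cold cache; once the
inodes are in memory, the order no longer matters much.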
> I found some earlier posts on similar issues, although they mostly
> concerned apparently empty directories that took a long time. Theodore
> Tso had a comment that seemed to indicate that hashing conflicts with
> Unix requirements. I think the implication was that you could end up
> with linearized, or partly linearized searches under some scenarios.
> Since this is a mail spool, I think it gets lots of sync()'s.
There was an LD_PRELOAD library that Ted wrote that may also help:
http://marc.info/?l=mutt-dev&m=107226330912347&w=2
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.