Re: ext2/ext3 directory handling
- From: Andreas Gietl <a gietl e-admin de>
- To: mike igconcepts com, Andrew Morton <akpm digeo com>
- Cc: "Alan R.Becker" <beckera mail-now com>, ext3-users redhat com
- Subject: Re: ext2/ext3 directory handling
- Date: Sun, 18 May 2003 18:20:02 +0200
On Sunday 18 May 2003 18:01, Michael Harris wrote:
> Just to add, I can attest that moving the files from the old dir to the new
> as described improves performance on my machines dramatically. In our
> service we end up with directories of 150k+ files which are generally
> touched only as they are added, though every file will be touched several
> times over a month. The files are each around 50kB. When the directory
> entry gets to be about 4MB it begins to take a long time for remote
> machines to copy files into the directory, maybe 4 seconds for a 50kB file
> on a switched 100 base network. The performance hit is worst with remote
> machines using SMB. Compressing the directory entry with
>     mkdir new
>     cp old/* new/
>     rm -rf old
>     mv new old
> definitely improves things, but generally when there gets to be more than
> 200k files we have to roll over to a new directory to keep things moving. I
> suspect the remote machines are effectively downloading the directory entry
> with each copy to the server, though I also see the smbd tasks pegging on the
> server; I never really investigated it. We see this with ext2 and
> ext3. Not really looking for a solution here but just offering the info,
> but if anyone has a quick fix please share it. I may try reiserfs someday
> but for now we just use thousands of directories for the files.
Which way do you normally use to push the files when you don't use SMB?
> > "Alan R.Becker" <beckera mail-now com> wrote:
> > > (1) Is the assumption that directories don't compress when deleting
> > > files correct? How is this handled (in general terms)?
> > That is correct. A deleted file leaves a "hole" in the directory
> > which a new addition can fill (if it fits).
> > > (2) Is there any difference between ext2 and ext3?
> > No.
> > > (3) Does the htree code change the picture any (even
> > > though I don't use it, and won't until it is in production)?
> > No, htree will not release directory blocks.
> > > (4) Is it possible that the directories themselves
> > > were fragmented?
> > Yes, very probable.
> > However to understand why things slowed down a bit more info is needed.
> > It is probable that the many little files in one typical directory are
> > splattered all over the disk. Does your workload regularly touch all the
> > files in these directories? If so then it may be suffering from this lack
> > of inter-file locality.
> > If not then yes, perhaps the problem is due to large, fragmented
> > directories.
> > How many bytes does a typical directory consume? If you have the disk
> > space, and are confident that (say) 64k is "enough" then perhaps you
> > could grow each user's mail directory to (say) 64k when that user is
> > created. This way they will have a nice unfragmented directory for all
> > time.
> > > (5) After doing a "mkdir" to create a new directory, how many
> > > file entries can it hold before it would be expanded to accept
> > > another file?
> > 4 kilobytes' worth. Each directory entry consumes eight bytes, plus the length
> > of the name rounded up to a multiple of 4 bytes.
> > > When a directory is expanded, how many additional
> > > file entries can be stored before needing another expansion?
> > Another 4 kilobytes.
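To make the arithmetic in the answer above concrete, here is a rough shell sketch (not from the original posters). The function names dirent_size and entries_per_block are made up for illustration, and the 4096-byte default assumes the 4k directory block size mentioned in this thread. The count is an upper bound that ignores holes and the "." and ".." entries.

```shell
#!/bin/sh
# Size in bytes of one directory entry under the rule quoted above:
# 8 bytes plus the name length rounded up to a multiple of 4.
dirent_size() {
    name_len=$1
    echo $(( 8 + (name_len + 3) / 4 * 4 ))
}

# Upper bound on how many names of a given length fit in one directory
# block (4096 bytes unless a second argument says otherwise).
entries_per_block() {
    echo $(( ${2:-4096} / $(dirent_size "$1") ))
}

dirent_size 10          # prints 20  (8 + 12)
entries_per_block 10    # prints 204 (4096 / 20, rounded down)
```

So a freshly created directory holding 10-character names would be expected to take roughly two hundred files before it grows by another block.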
> > > (6) Say I have a directory containing some files, then I delete
> > > some files, and finally I start adding files. Will new file
> > > entries use empty or vacated directory slots before expanding
> > > the directory?
> > Deletion causes holes. Holes are coalesced within a 4k block. Holes are
> > allocated on a first-fit basis.
> > > (7) I am aware of e2defrag (latest version I have found is 0.73).
> > > Does this program (or any other any tool) perform any
> > > directory optimization that would affect this problem?
> > It's obsolete.
> > For your purposes, all you'd need to do to defrag a directory is
> > mkdir new
> > ln old/* new/
> > rm -rf old
> > mv new old
> > If you use `cp' instead of `ln' then you'll defrag the files themselves,
> > and lay them out close to each other, which is only important if your app
> > regularly touches lots of files in a single directory. It probably does
> > not.
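A more defensive version of the four commands above might look like the following sketch. It is not from the original posters: the helper name rebuild_dir and the temporary directory suffix are hypothetical, and it assumes a flat directory of regular files on a single filesystem (hard links cannot cross filesystems). Looping over the entries avoids the "argument list too long" error that a single `ln old/* new/` can hit with very many files.

```shell
#!/bin/sh
# Rebuild a directory so that its entries are rewritten into a fresh,
# compact directory. Hypothetical helper for illustration only.
rebuild_dir() {
    old=$1
    new="$old.rebuild.$$"          # temporary name, an assumption of this sketch
    mkdir "$new" || return 1
    # Cover ordinary names and dotfiles; unmatched patterns stay literal
    # and are skipped by the -f test below.
    for f in "$old"/* "$old"/.[!.]* "$old"/..?*; do
        [ -f "$f" ] || continue    # skip non-regular files (e.g. subdirectories)
        ln "$f" "$new/" || return 1
    done
    # Only discard the old directory if every entry was linked.
    if [ "$(ls -A "$old" | wc -l)" -ne "$(ls -A "$new" | wc -l)" ]; then
        echo "entry count mismatch; leaving $old untouched" >&2
        return 1
    fi
    rm -rf "$old" && mv "$new" "$old"
}
```

It would be called as `rebuild_dir old`. Like the original commands, it should only be run while nothing else is adding files to the directory, since a file created between the link loop and the `rm -rf` would be lost.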
> > > (8) If e2defrag would be helpful, has it/is it being brought
> > > forward to operate correctly with current (RH 8/9) systems?
> > > I see some warnings about blocksize restrictions, etc.
> > I haven't heard of anyone using it in ages.
> > > (9) In designing new systems, are there some useful guidelines
> > > about the maximum number of files that can exist in a single
> > > directory without significant performance loss?
> > > I am interested in ext2, ext3, and htree.
> > Non-htree gets awkward at a few thousand. htree appears to be OK up to
> > hundreds of thousands. Its practical scalability is unknown, really.
> > _______________________________________________
> > Ext3-users mailing list
> > Ext3-users redhat com
> > https://www.redhat.com/mailman/listinfo/ext3-users
e-admin internet gmbh
Andreas Gietl tel +49 941 3810884
Ludwig-Thoma-Strasse 35 fax +49 89 244329104
93051 Regensburg mobil +49 171 6070008
PGP/GPG-Key unter http://www.e-admin.de/gpg.html