status of dir_index in 2.4 kernels ?

Theodore Ts'o tytso at mit.edu
Fri Sep 24 19:59:05 UTC 2004


On Fri, Sep 24, 2004 at 09:13:34AM +0200, Jakob Curdes wrote:
> d) The performance of the htree indexed filesystem depends on the usage 
> by the userspace programs; if they open all files in a directory after 
> gaining directory information with readdir() the performance is worse 
> than with a vanilla ext3 fs, at least if we have many files in that 
> directory [as it is the case with maildir structures].  

Correct.  The performance of the htree indexed filesystem can be worse
than that of a vanilla ext3 filesystem if the application opens all of
the files in readdir() order.  This is because readdir() has to return
the directory entries in hash sort order.  In contrast, in a vanilla
ext3 filesystem, normally directory entries are added in the order
that they were created, and inodes are created in sequential order.

So on a normal ext3 filesystem w/o htree, opening the files in readdir
order is roughly equivalent to reading them in inode number sort
order, which is a big win since it avoids the disk seeking all over
the place.  This difference can be diminished if the directory has a
lot of file creates and deletes, such that over time, readdir() order
!= inode number sort order.  This is particularly true in maildir
directories, if mail messages are deleted, refiled, etc.  So if the
directory is badly out of order, the spd_readdir.so preload library
can make a big difference to performance in this scenario as well.
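
To make the idea concrete, here is a rough sketch of the same trick
done by hand in an application: read all of the directory entries
first, sort them by inode number, and only then open the files.  This
is not the preload library itself, just an illustration in Python; the
function names entries_in_inode_order() and read_all() are made up for
this example.

```python
import os

def entries_in_inode_order(path):
    """Return the names in `path`, sorted by inode number."""
    with os.scandir(path) as it:
        # Collect (inode, name) pairs for the whole directory up front.
        # This is the memory cost of the trick: every entry is held
        # in memory at once so that the list can be sorted.
        pairs = [(entry.inode(), entry.name) for entry in it]
    pairs.sort()
    return [name for _, name in pairs]

def read_all(path):
    """Read every regular file in `path`, visiting them in inode order.

    Returns the total number of bytes read.
    """
    total = 0
    for name in entries_in_inode_order(path):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            with open(full, "rb") as f:
                total += len(f.read())
    return total
```

Whether this helps depends on how far readdir() order has drifted from
inode order in the directory in question; on a small directory the
sort is unlikely to matter much either way.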

Why can't we do this spd_readdir trick in the kernel?  Because it
means holding all of a directory's entries in memory in order to sort
them, and directories can be very large; we don't want to be
allocating that much memory in the kernel.

						- Ted




