
Re: Ext3 indexed directory extension.

On Fri, Aug 23, 2002 at 03:15:19AM -0600, Andreas Dilger wrote:
> > > Use lsattr <dir> with a newer e2fsprogs, and it will have a flag set,
> > > I believe it is "h" (hash) or "i" (index).
> > 
> > Thank you very much for the information.

Actually, lsattr isn't printing that information right now.  ('i' is
the immutable flag.)  

We could print this information, but arguably users never need to know
whether a directory is hashed, since the only visible difference is in
performance, not in behaviour.

I could go either way on this one, since I know that users might be
interested.  However, we don't have a way of authoritatively
determining that a symlink is a "fast" symlink either, which is a
similar "performance-only" feature.  

You can check to see if the size is < 60 bytes, yes, but in the case
of htree directories, if the directory is > 3 blocks and was created
on a filesystem with htree enabled, it will be a hashed directory.
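
A rough sketch of the two rules of thumb above, in Python.  The
function names and the block-size parameter are made up for
illustration; neither check is authoritative, which is exactly the
point being made here.

```python
import os
import stat

FAST_SYMLINK_MAX = 60   # bytes; the threshold quoted in this message
HTREE_MIN_BLOCKS = 3    # directories larger than this may be hashed

def looks_like_fast_symlink(path):
    """Guess whether a symlink target is short enough to be stored inline.

    Heuristic only: it looks at the target length reported by lstat().
    """
    st = os.lstat(path)
    if not stat.S_ISLNK(st.st_mode):
        return False
    return st.st_size < FAST_SYMLINK_MAX

def may_be_hashed(dir_size_bytes, block_size, fs_has_htree):
    """Guess whether a directory is an htree (hashed) directory.

    Heuristic only: a large directory created before htree was enabled
    stays linear until it is re-indexed, so True means "possibly".
    """
    blocks = -(-dir_size_bytes // block_size)  # ceiling division
    return fs_has_htree and blocks > HTREE_MIN_BLOCKS
```

For example, may_be_hashed(os.stat(d).st_size, 4096, True) applies the
rule to a directory d on a filesystem assumed to use 4096-byte blocks.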

BTW, if you're concerned about older directories not being hashed when
converting over to using htree directories, version 1.28 of e2fsck
will have a new flag, -D, which will optimize all directories.  If the
filesystem has enabled htree indexing, it will convert (or re-index)
the directories that are larger than 3 blocks to be indexed-hash
trees.  If the filesystem has not enabled htree indexing, this flag
will "compress" directories, by recopying them to eliminate unused
space in the directories.
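
The behaviour described above can be modelled roughly as follows.
This is a sketch of the decision, not e2fsck's code: the function
names are hypothetical, and treating small directories on an
htree-enabled filesystem as "compress" is an assumption that goes
beyond what this message states.  The -f flag (force a check even if
the filesystem looks clean) is added on the assumption that you want
the pass to actually run.

```python
HTREE_MIN_BLOCKS = 3  # threshold quoted in this message

def plan_for_directory(fs_has_htree, dir_blocks):
    """Return what an "optimize directories" pass would do to one directory."""
    if fs_has_htree and dir_blocks > HTREE_MIN_BLOCKS:
        return "index"      # convert to (or rebuild) an indexed hash tree
    # Assumption: everything else is recopied to squeeze out unused space.
    return "compress"

def build_e2fsck_optimize_cmd(device):
    """Build the argument list for an e2fsck run that optimizes directories."""
    return ["e2fsck", "-f", "-D", device]
```

The list from build_e2fsck_optimize_cmd() can be handed to
subprocess.run() against an unmounted filesystem.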

> e2fsprogs.sf.net - version 1.28 is very recent and has support for
> this flag.  I'm not sure whether the patch you have has a supported
> hash function though...  This was in flux up to a week ago or so.

1.28 is not fully released yet.  The latest 1.28-WIP (work in
progress) was released a week or so ago, and so e2fsprogs is currently
in bug-fix-only mode.  I am looking for testers for 1.28-WIP-0817, since
I want to make sure we shake out the most embarrassing bugs before 1.28
goes final.

As far as the hash function is concerned, as long as it's Christopher
Li's port of the htree patches to 2.4, which is still using Daniel
Phillips' "dx_hack_hash", you'll be fine.  If the patch you have is
based off of the CVS "features" branch which uses a half-MD4 hash, it
won't be compatible.  Fortunately, as far as I know the CVS "features"
branch never escaped as a stand-alone patch, so I'm pretty sure you'll
be all right.  

> Hopefully soon.  I don't think there are any serious known problems
> right now, but there still needs to be some cleanup done (e.g. htree and
> e2fsck to live happily together).

The e2fsck/htree support is done, modulo bug fixing.  There are some
test cases that I still want to add into the e2fsck regression test
suite, and that may turn up some bugs, but I'm pretty confident about
the e2fsck htree support at this point.  Being able to convert
arbitrary large directories to use htree was the last major piece of

On the kernel side, the major tasks that need to be done are:

	* Support for multiple hash functions, and using the
		superblock field for the default hash function for new
		directories.  (Andreas, you were working on code for 
		that, right?  Is that ready for release yet?)
	* Adding support for the modified half-MD4 hash which is endian
		independent, and the TEA hash, which should be the 
		preferred hash going forward.
	* Port to 2.5
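
To make the first item concrete, here is a minimal sketch of how a
per-directory hash choice with a superblock default could work.  It
assumes each indexed directory records which hash it was built with;
the identifiers below are placeholders invented for this example, not
the on-disk values.

```python
# Placeholder names for the three hashes discussed in this message.
HASH_LEGACY = "dx_hack_hash"
HASH_HALF_MD4 = "half_md4"
HASH_TEA = "tea"

SUPPORTED_HASHES = {HASH_LEGACY, HASH_HALF_MD4, HASH_TEA}

def hash_for_directory(dir_hash, sb_default_hash):
    """Pick the hash to use for lookups and inserts in one directory.

    dir_hash is the hash recorded in an existing indexed directory, or
    None for a directory that is about to be indexed for the first time.
    """
    chosen = dir_hash if dir_hash is not None else sb_default_hash
    if chosen not in SUPPORTED_HASHES:
        # A kernel that does not know the hash cannot use the index safely.
        raise ValueError("unsupported directory hash: %r" % (chosen,))
    return chosen
```

With this arrangement, changing the superblock default affects only
newly indexed directories; existing ones keep working with the hash
they were built with.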

On the userland side, there's one minor change which still needs to be
done, and that's mke2fs/tune2fs support for enabling the htree support
so that you don't have to use debugfs to turn on the feature flag.
The reason why I haven't done this yet is this raises the ugly
question of how the default hash field in the superblock should be
filled in, and until we have a canonical kernel patch which supports
the new hash functions, I didn't want to release code that made this
decision in either direction.  I really don't want to encourage people
to use the hack hash, since it is significantly worse at spreading
files, and right now, the main patch folks are likely to use supports
only the old legacy dx_hack_hash.

						- Ted
