[Linux-cluster] Ext3/ext4 in a clustered environment

Alan Brown ajb2 at mssl.ucl.ac.uk
Wed Nov 9 16:00:39 UTC 2011


Steven Whitehouse wrote:

>> We see appreciable knee points in GFS directory performance at 512, 4096 
>> and 16384 files/directory, with progressively worse deterioration 
>> between each knee pair. (It's a 2^n-type problem.)
>>
> That is a bit strange. The GFS2 directory entries are sized according to
> (length of file name + length of fixed size info) which means that
> generally the number of blocks required to store a specific number of
> files is not constant unless the file names are all the same length.

Generally they are, as are file sizes.

> Also, once a directory has been unstuffed, the hash table will grow
> until it is 128k in size, which is 16k pointers. So with 16384 directory
> entries, you should be a long way from having a full hash table, since
> each leaf block should contain around 80 entries (again depending on
> filename length), so that's not too far off 1m entries.

Should be, but performance becomes unusable long before that happens.
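
For reference, the quoted figures do work out to over a million
entries. A quick sketch of the arithmetic, using only the numbers
quoted above (128k table, 16k pointers, ~80 entries per leaf) rather
than anything taken from the GFS2 source:

#include <stdio.h>

int main(void)
{
    long hash_ptrs = 128 * 1024 / 8; /* 128k table, 8-byte pointers -> 16384 */
    long per_leaf  = 80;             /* quoted estimate; varies with name length */

    /* 16384 * 80 = 1310720, i.e. roughly 1.3m entries */
    printf("~%ld entries\n", hash_ptrs * per_leaf);
    return 0;
}

Which makes the point sharper: the slowdown sets in well below that
theoretical capacity.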

> So for all unstuffed directories with fewer than about 1m entries, I'd
> expect to see all accesses resulting in the following I/O pattern:
>  1. Look up hash table block
>  2. Look up dir leaf block
>  3. Look up inode (if this is a ->lookup rather than readdir)
> 
> What test are you using to generate the performance figures in this
> case?

"ls -l" - which is what the clients are using as they import data for 
number crunching work. Rsync uses a raw directory read but the stat() 
calls on individual files are pretty similar.
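
For anyone reproducing the numbers: per directory, "ls -l" effectively
amounts to one readdir() stream plus an lstat() per entry, and each
lstat() is a separate inode lookup (step 3 in the pattern above). A
minimal sketch of that access pattern - not the coreutils code, just
an illustration:

#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    const char *dir = argc > 1 ? argv[1] : ".";
    DIR *d = opendir(dir);
    struct dirent *de;

    if (!d)
        return 1;

    /* One pass over the directory: read each name, then stat it. */
    while ((de = readdir(d)) != NULL) {
        char path[4096];
        struct stat st;

        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (lstat(path, &st) == 0)   /* the per-file inode lookup */
            printf("%lld\t%s\n", (long long)st.st_size, de->d_name);
    }
    closedir(d);
    return 0;
}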

Once the information is cached, accessing the directory is fast until 
the cache expires (3-10 minutes).

There is definite and very measurable performance degradation as more 
files are added to a directory. Even on something as simple as an 
incremental backup, the number of files opened per second falls away 
rapidly as directories get larger.
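
The effect is easy to reproduce with a micro-benchmark along these
lines. This is a rough sketch, where the file count, the name format
and the single timed pass are all my own assumptions; run it from
inside a test directory on the GFS mount, and drop or wait out the
caches between runs to see the uncached case described above:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    long n = argc > 1 ? atol(argv[1]) : 16384;
    char name[64];
    long i;

    /* Populate the current directory with n small files. */
    for (i = 0; i < n; i++) {
        int fd;
        snprintf(name, sizeof(name), "f%08ld", i);
        fd = open(name, O_CREAT | O_WRONLY, 0644);
        if (fd >= 0)
            close(fd);
    }

    /* Time a stat() pass over every entry. */
    struct timespec t0, t1;
    struct stat st;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < n; i++) {
        snprintf(name, sizeof(name), "f%08ld", i);
        stat(name, &st);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%ld files: %.0f stat()/s\n", n, n / secs);
    return 0;
}

Comparing the stat()/s figure at 512, 4096 and 16384 files should show
the knee points described above.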
