optimising filesystem for many small files

Matija Nalis mnalis-ml at voyager.hr
Sun Oct 18 11:41:00 UTC 2009


On Sun, Oct 18, 2009 at 03:01:46PM +0530, Viji V Nair wrote:
> The application which we are using are modified versions of mapnik and
> tilecache, these are single threaded so we are running 4 process at a

How does it scale if you reduce the number of processes, especially if you
run just one of them? As this is just a single disk, 4 simultaneous
readers/writers would probably *totally* kill it with seeks.

I suspect it might even run faster with just 1 process than with 4 of
them...
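
One rough way to check this is to time the same number of small synced
writes with 1, 2 and 4 concurrent writers. The sketch below is only an
approximation: it uses threads as a stand-in for your separate processes, it
measures writes only (no rendering), and the names write_files and
run_benchmark are made up for illustration.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

FILE_SIZE = 10 * 1024  # ~10 KB, roughly the small-image size from the thread


def write_files(directory, worker_id, count):
    """Write `count` small files, fsync-ing each so the disk is really hit."""
    payload = os.urandom(FILE_SIZE)
    for i in range(count):
        path = os.path.join(directory, f"w{worker_id}_{i}.bin")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
    return count


def run_benchmark(base_dir, workers, total_files):
    """Return (elapsed_seconds, files_written) for `workers` parallel writers.

    Files go into a temporary directory under `base_dir`, which should live
    on the filesystem being tested; the directory is removed afterwards.
    """
    per_worker = total_files // workers
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                pool.submit(write_files, tmp, w, per_worker)
                for w in range(workers)
            ]
            written = sum(f.result() for f in futures)
        elapsed = time.perf_counter() - start
    return elapsed, written
```

Calling run_benchmark("/path/on/the/test/fs", n, 400) for n = 1, 2, 4 and
comparing the elapsed times should show whether concurrency helps or hurts
on that disk.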

> time. We can say only four images are created at a single point of
> time. Some times a single image is taking around 20 sec to create. I

Is that 20 secs just the write time for a precomputed file of 10k?
Or does it also include reading and processing and writing?

> can see lots of system resources are free, memory, processors etc
> (these are 4G, 2 x 5420 XEON)

I do not see how "lots of memory" could be free, especially with such a
large number of inodes. The dentry and inode caches alone should consume it
pretty fast as the number of files grows, not to mention (dirty and
otherwise) buffers...
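
To see where the memory actually goes, you can look at a few fields of
/proc/meminfo. The dentry and inode caches are accounted under Slab (mostly
SReclaimable). A minimal sketch, assuming a Linux /proc; the helper names
are made up for illustration:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_*) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def cache_summary(info):
    """Pick out the fields relevant to page cache, dirty data and slab use."""
    fields = ("MemFree", "Cached", "Dirty", "Writeback", "Slab", "SReclaimable")
    return {k: info.get(k) for k in fields}


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

If MemFree stays high while Cached and Slab stay small during a run, then
the memory really is unused and the bottleneck is elsewhere.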

You may want to tune the following sysctls to allow more data to remain in
the write-back cache (but then again, you will probably need more memory):

vm.vfs_cache_pressure
vm.dirty_writeback_centisecs
vm.dirty_expire_centisecs
vm.dirty_background_ratio
vm.dirty_ratio
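
Before changing anything it helps to record the current values. The sketch
below just reads them from /proc/sys/vm and prints them in sysctl.conf
syntax; it does not suggest any particular numbers, and the function names
are made up for illustration.

```python
import os

VM_SYSCTLS = (
    "vfs_cache_pressure",
    "dirty_writeback_centisecs",
    "dirty_expire_centisecs",
    "dirty_background_ratio",
    "dirty_ratio",
)


def read_vm_sysctls(base="/proc/sys/vm"):
    """Return {"vm.<name>": int or None} for the sysctls listed above.

    A value is None when the file is missing or unreadable (e.g. not Linux).
    """
    values = {}
    for name in VM_SYSCTLS:
        try:
            with open(os.path.join(base, name)) as f:
                values["vm." + name] = int(f.read().strip())
        except (OSError, ValueError):
            values["vm." + name] = None
    return values


def format_sysctl_conf(values):
    """Render the known values as 'key = value' lines, sysctl.conf style."""
    return "\n".join(
        f"{key} = {val}" for key, val in values.items() if val is not None
    )


if __name__ == "__main__":
    print(format_sysctl_conf(read_vm_sysctls()))
```

Saving that output before tuning gives you something to go back to if a
change makes things worse.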


> The file system is crated with "-i 1024 -b 1024" for larger inode
> number, 50% of the total images are less than 10KB. I have disabled
> access time and given a large value to the commit also. Do you have
> any other recommendation of the file system creation?

For ext3, a larger journal on an external journal device (if that is an
option) should probably help, as it would reduce some of the seeks which are
most probably slowing this down immensely.
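
As a rough sketch of what that involves: the usual e2fsprogs sequence is to
format the journal device with mke2fs -O journal_dev (using the same block
size as the filesystem), drop the internal journal, and attach the external
one with tune2fs -J device=. The code below only builds and prints those
commands, it runs nothing. The device paths are placeholders, the function
name is made up, and the filesystem has to be unmounted for the real thing,
so check the man pages before trying it.

```python
def external_journal_commands(fs_dev, journal_dev, block_size=1024):
    """Build (but do not run) the commands to move an ext3 journal.

    fs_dev and journal_dev are placeholder device paths supplied by the
    caller; block_size must match the block size of the existing filesystem.
    """
    return [
        # Format the separate device as a dedicated journal device.
        ["mke2fs", "-O", "journal_dev", "-b", str(block_size), journal_dev],
        # Remove the internal journal from the (unmounted) filesystem.
        ["tune2fs", "-O", "^has_journal", fs_dev],
        # Attach the external journal to the filesystem.
        ["tune2fs", "-J", f"device={journal_dev}", fs_dev],
    ]


if __name__ == "__main__":
    # Placeholder device names, for illustration only.
    for cmd in external_journal_commands("/dev/FSDEV", "/dev/JOURNALDEV"):
        print(" ".join(cmd))
```

The journal device takes the whole journal, so its size is what sets the
journal size in this layout.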


If you can modify the hardware setup, RAID10 (better with many smaller disks
than with fewer bigger ones) should help *very* much. Flash-disk-thingies of
appropriate size are an even better option (as seeks are a problem a few
orders of magnitude smaller there). More RAM would probably help too (unless
your full dataset is much smaller than 2 GB, which I doubt).

On the other hand, have you tried testing some other filesystems?
I've had much better performance with lots of small files on XFS (but that
was on a big RAID5, so YMMV), for example.

-- 
Opinions above are GNU-copylefted.




More information about the Ext3-users mailing list