[Linux-cluster] GFS file size/performance question

Danny Wall Danny.Wall at health-first.org
Tue Apr 14 01:39:08 UTC 2009


I realize that GFS is not the most optimized filesystem for a lot of small files, but at what point does that become a concern? Is it when you have a lot of files less than 1MB? Less than 10MB?

We have five Red Hat clusters. The original system is RHEL 4.4 with RHCS and GFS. Most filesystems are 2TB LUNs on a Fibre Channel SAN, with hundreds of thousands of folders (50,000 at the root) and millions of files. The files range from 60KB to about 5MB each, with 20-100 files in each folder.

We are having a problem with the new cluster. After users have been on a node for a while, performance becomes horrible until we migrate them to a different node. There are no issues with RAM, CPU, disk or NIC I/O.

The new cluster servers are RHEL 5.1 with RHCS and GFS. There are currently 4 folders off the root, with 5,000-7,000 sub-folders each. The folders generally hold approx. 10-100 files, ranging in size from 1KB to 5MB. Several of these folders have 100 files, all less than 1MB. The three servers in one cluster have a single 2TB FC LUN attached using GFS. The files are generally served from only one node at a time, except off hours during backups, so there should not be a lot of locking issues.

Both clusters are running the Samba that ships with their respective versions of RHEL, serving WinXP and Win2003 workstations in a Win2003 AD domain.

Is it possible that GFS performance is worse on the newer, more powerful cluster nodes because there are so many files under 100KB? At what point does GFS performance really start taking a hit due to smaller file sizes?

The newer servers all have 32GB RAM and 8 CPU cores.
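
One way to put numbers on the file-size question above is to tally how many files fall under each of the thresholds mentioned (100KB, 1MB, 10MB) and how many files the fullest folder holds. The following is only a minimal sketch, assuming a Python 3 interpreter is available on a host that mounts the filesystem (this would not be the stock interpreter on RHEL 4 or 5). The function names, the bucket boundaries and the mount path in the usage comment are hypothetical and chosen for illustration.

```python
import os
from collections import Counter

# Upper bounds in bytes; labels mirror the thresholds asked about above.
BUCKETS = [
    (100 * 1024, "<100KB"),
    (1024 * 1024, "100KB-1MB"),
    (10 * 1024 * 1024, "1MB-10MB"),
]


def bucket_for(size):
    """Return the label of the first bucket whose upper bound exceeds size."""
    for limit, label in BUCKETS:
        if size < limit:
            return label
    return ">=10MB"


def survey(root):
    """Walk root; return (file count per size bucket, most files in one folder)."""
    sizes = Counter()
    max_files_in_dir = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        counted = 0
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            sizes[bucket_for(st.st_size)] += 1
            counted += 1
        max_files_in_dir = max(max_files_in_dir, counted)
    return sizes, max_files_in_dir


# Usage (hypothetical mount point):
#   sizes, widest = survey("/mnt/gfs01")
#   print(dict(sizes), widest)
```

Because the walk stats every file, it is probably best run off hours, alongside the backups, so it does not add to the load being measured.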

Thanks
Danny


