Problems under Redhat EL3 and ext3

Andreas Dilger adilger at clusterfs.com
Thu Jul 20 07:17:44 UTC 2006


On Jul 19, 2006  17:00 -0700, Ulf Zimmermann wrote:
> I am running into performance issues with ext3. Historically we had our
> image files (pictures of cars, currently 5.3 million) subdivided into a
> directory structure [0-9]/[0-9]/[0-9]/[0-9], where we would take the
> first 4 letters/numbers of the file name and use them to place the file
> in this structure. Letters [a-cA-C] would become a 0, [d-fD-F] a 1, etc.
> As the file names used to be based on the VINs of vehicles, that wasn't
> a problem. But then our developers changed the image file names to use a
> vehicle ID from the database. As we rolled over 1,000,000 in vehicle
> IDs, large numbers of files ended up in the same directories, and files
> were no longer well distributed.
> 
> So we changed the method to use [0-9a-f]/[0-9a-f]/[0-9a-f] and an md5 of
> the file name, taking the first 3 characters of the hash to file it away.
> In initial testing this worked well: files were distributed evenly across
> the directories, so we could split them across separate file systems or
> disks.
> 
> When we actually got to do this, a decision was made to use hard links
> from the old structure to the new structure for backward compatibility.
> And this turned into a disaster. Rsync or find on the new structure takes
> dramatically longer: about 5 minutes for a find on the old structure
> versus hours on the new structure. Using strace I tracked it down to
> lstat64. On the old structure lstat64 takes on average 37 usecs/call,
> while on the new structure it is over 2,400 usecs/call.
> 
> EL4 does not seem to have this problem, but unfortunately I can't just
> upgrade, for other reasons. Does anyone have ideas why lstat64 would be
> so much slower on the new structure? Any help, hints, or suggestions
> would be great.
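
For readers following along, the md5-based layout described in the quoted
message can be sketched roughly as below. This is only an illustration, not
the poster's actual code: the function name bucket_path, the depth parameter,
and the UTF-8 encoding of the file name are all assumptions.

```python
import hashlib


def bucket_path(file_name, depth=3):
    """Return a relative path such as '5/d/4/<file_name>'.

    Hypothetical helper: the first `depth` hex characters of the md5 of
    the file name each become one directory level ([0-9a-f]/[0-9a-f]/...).
    """
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    # Each leading hex character is one directory level.
    levels = list(digest[:depth])
    return "/".join(levels + [file_name])


if __name__ == "__main__":
    print(bucket_path("hello"))
```

With three hex levels there are 16 * 16 * 16 = 4096 leaf directories, so 5.3
million files would average roughly 1,300 entries per directory if the hash
spreads them evenly.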

Do you have directories with more than, say, 10,000-15,000 entries?
Do you have the dir_index (directory indexing) feature enabled on your
filesystem?  This is done with "tune2fs -O dir_index {dev}" (even while
mounted) but only affects new directories.  I believe the RHEL3 code
has this functionality, but it isn't enabled by default as I
suspect it is on FC4.
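
One way to answer the first question is to walk the tree and report any
directory whose entry count exceeds a threshold. The sketch below is only
an illustration: the name large_directories is hypothetical, and the default
threshold of 10,000 is simply the lower end of the rough figure above.

```python
import os


def large_directories(root, threshold=10000):
    """Yield (path, entry_count) for directories with more than `threshold` entries.

    Hypothetical helper. Counts the immediate entries (files and
    subdirectories) of each directory under `root`; os.walk does not
    follow symbolic links to directories by default.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        count = len(dirnames) + len(filenames)
        if count > threshold:
            yield dirpath, count


if __name__ == "__main__":
    import sys

    for path, count in large_directories(sys.argv[1]):
        print(count, path)
```

Note that walking 5.3 million files will itself be slow on an unindexed
filesystem, so it may be worth running this on one subtree first.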

Once you have enabled this, an OFFLINE run (filesystem unmounted) of
"e2fsck -fD {dev}" will rebuild the directory indexes for existing
directories.
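
To confirm whether dir_index is set before and after these steps, the
"Filesystem features:" line printed by "tune2fs -l {dev}" can be checked.
The sketch below is a minimal illustration: the function names are
hypothetical, reading the device normally needs root, and the device path
is whatever you would pass as {dev}.

```python
import subprocess


def features_from_listing(listing):
    """Return the set of feature names from `tune2fs -l` output text."""
    for line in listing.splitlines():
        if line.startswith("Filesystem features:"):
            # Everything after the first colon is a space-separated list.
            return set(line.split(":", 1)[1].split())
    return set()


def has_dir_index(device):
    """Run `tune2fs -l <device>` and report whether dir_index is listed."""
    result = subprocess.run(
        ["tune2fs", "-l", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return "dir_index" in features_from_listing(result.stdout)
```

Keep in mind that the flag only says indexing is enabled; directories
created before it was set stay unindexed until the e2fsck -fD run.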

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



