[rhelv6-list] RHEL6.2 XFS brutal performance with lots of files
Daryl Herzmann
akrherz at iastate.edu
Fri Apr 12 13:42:16 UTC 2013
Hi,
I figured I'd try to solicit the kind help of the mailing list again on
this as I continue to have issues with XFS and RHEL6. For example, I have
a 12 TB software RAID5 filesystem on an LSI 92118 and the drives are 3 TB
Seagate Barracuda ST3000DM001. This filesystem currently has around 140
million files with many of them smaller than 50 KB. This system is running
a fully patched RHEL6.4.
Within this filesystem, I have one particular tree of files I need to
remove. There are ~170 folders with around 4-10 sub-folders each and about
1,000 files in each of those sub-folders. Most files are less than 40KB.
Attempting to list out one of those top level folders like so:
ls -R * | wc -l
takes 50 seconds and wc reports 3825 lines (~files). Watching iostat during
this operation, the tps value pokes along around 100 to 150 tps. This
filesystem is doing other things at the time as well. Just running iostat
without args currently reports:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          11.12    0.03    2.70    3.60    0.00   82.56

Device:            tps   Blk_read/s   Blk_wrtn/s     Blk_read     Blk_wrtn
md127           134.36     10336.87     11381.45  19674692141  21662893316
If I go into one of these folders with 1,000 files or so in it and attempt
to list out the directory cold, it takes 10-15 seconds. Attempting to
remove one of the top level folders takes a long time and the other
filesystem operations at the time feel very sluggish as well.
$ time rm -rf myfolder (this is around 4,000 files total within 6
subfolders of myfolder)
real 2m36.925s
user 0m0.018s
sys 0m0.657s
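To put the two timings above on the same footing, here is a rough calculation (a sketch only: the variable names are made up for illustration, and the file counts are the approximate figures quoted above, so the rates are approximate too):

```shell
#!/bin/sh
# Rough per-entry rates from the timings quoted above.
# The counts are approximate ("around 4,000 files"), so treat the
# results as ballpark figures only.
LC_ALL=C; export LC_ALL   # make awk print a decimal point, not a comma

ls_entries=3825      # lines reported by wc for "ls -R"
ls_seconds=50        # wall-clock time of the listing
rm_files=4000        # approximate number of files under myfolder
rm_seconds=156.925   # real 2m36.925s

ls_rate=$(awk -v n="$ls_entries" -v t="$ls_seconds" 'BEGIN { printf "%.1f", n / t }')
rm_rate=$(awk -v n="$rm_files" -v t="$rm_seconds" 'BEGIN { printf "%.1f", n / t }')
rm_ms=$(awk -v n="$rm_files" -v t="$rm_seconds" 'BEGIN { printf "%.1f", 1000 * t / n }')

echo "ls -R:  ${ls_rate} entries/s"
echo "rm -rf: ${rm_rate} files/s (${rm_ms} ms per file)"
```

That works out to roughly 76 entries per second for the listing and about 25 files per second for the removal, on a filesystem that was busy with other work at the time.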
Running hdparm on one of the software raid5 drives reports decent numbers.
/dev/sdb:
Timing cached reads: 12882 MB in 2.00 seconds = 6448.13 MB/sec
Timing buffered disk reads: 396 MB in 3.06 seconds = 129.39 MB/sec
Running some crude dd tests shows reasonable numbers, I think.
# dd bs=1M count=1280 if=/dev/zero of=test conv=fdatasync
1342177280 bytes (1.3 GB) copied, 29.389 s, 45.7 MB/s
I have other similar filesystems on ext4 with similar hardware and
millions of small files as well. I don't see such sluggishness with small
files and directories there. I guess I picked XFS for this filesystem
initially because of its fast fsck times.
Here are some more details on the filesystem
# xfs_info /dev/md127
meta-data=/dev/md127             isize=256    agcount=32, agsize=91570816 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=2930265088, imaxpct=5
         =                       sunit=128    swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
# grep md127 /proc/mounts
/dev/md127 /mesonet xfs rw,noatime,attr2,delaylog,sunit=1024,swidth=4096,noquota 0 0
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdf[3] sde[0] sdd[5] sdc[1] sdb[2]
      11721060352 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
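As a sanity check on stripe alignment, the numbers above can be compared in common units. This is only a sketch: the variable names are invented for illustration, and it assumes the usual conventions that xfs_info reports sunit/swidth in filesystem blocks, the mount options report them in 512-byte sectors, and /proc/mdstat reports the array size in 1 KiB blocks.

```shell
#!/bin/sh
# Compare XFS stripe geometry with the md RAID5 layout.
# All values are copied from the xfs_info, /proc/mounts and
# /proc/mdstat output above; the unit conventions are assumptions
# stated in the accompanying text.
bsize=4096               # XFS data block size, bytes
fs_blocks=2930265088     # xfs_info "blocks"
sunit_blks=128           # xfs_info sunit, in filesystem blocks
swidth_blks=512          # xfs_info swidth, in filesystem blocks
sunit_sectors=1024       # mount option sunit, in 512-byte sectors
swidth_sectors=4096      # mount option swidth, in 512-byte sectors
chunk_kib=512            # md chunk size ("512k chunk")
data_disks=4             # 5-disk RAID5: 4 data + 1 parity per stripe
md_kib=11721060352       # md array size in 1 KiB blocks

sunit_kib=$(( sunit_blks * bsize / 1024 ))
swidth_kib=$(( swidth_blks * bsize / 1024 ))
mount_sunit_kib=$(( sunit_sectors * 512 / 1024 ))
mount_swidth_kib=$(( swidth_sectors * 512 / 1024 ))
expected_swidth_kib=$(( chunk_kib * data_disks ))
fs_kib=$(( fs_blocks * bsize / 1024 ))

echo "xfs_info: sunit=${sunit_kib} KiB swidth=${swidth_kib} KiB"
echo "mount:    sunit=${mount_sunit_kib} KiB swidth=${mount_swidth_kib} KiB"
echo "md:       chunk=${chunk_kib} KiB full stripe=${expected_swidth_kib} KiB"
echo "size:     fs=${fs_kib} KiB md=${md_kib} KiB"
```

Under those assumptions everything lines up: a 512 KiB stripe unit matching the md chunk, a 2048 KiB stripe width matching four data disks, and a filesystem that fills the array. So the slowness does not look like a simple stripe misalignment.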
Anybody have ideas? Or are these still known issues with XFS on RHEL as
noted here:
http://www.redhat.com/summit/2011/presentations/summit/decoding_the_code/thursday/wheeler_t_0310_billion_files_2011.pdf
thanks
daryl
On Fri, Jun 22, 2012 at 7:45 PM, Daryl Herzmann <akrherz at iastate.edu> wrote:
> On Tue, Jun 5, 2012 at 3:10 PM, Jussi Silvennoinen
> <jussi_rhel6 at silvennoinen.net> wrote:
> >> I've been noticing lots of annoying problems with XFS performance with
> >> RHEL6.2 on 64bit. I typically have 20-30 TB file systems with data
> >> structured in directories based on day of year, product type, for
> >> example,
> >>
> >> /data/2012/06/05/product/blah.gif
> >>
> >> Doing operations like tar or rm over these directories brings the
> >> system to a grinding halt. Load average goes vertical and eventually
> >> the power button needs to be pressed in many cases :( A hack
> >> workaround is to break apart the task into smaller chunks and let the
> >> system breathe in between operations...
> >>
> >> Anyway, I read Ric Wheeler's "Billion Files" with great interest
> >>
> >> http://www.redhat.com/summit/2011/presentations/summit/decoding_the_code/thursday/wheeler_t_0310_billion_files_2011.pdf
> >>
> >> It appears there are 'known issues' with XFS and RHEL6.1. It does not
> >> appear these issues were addressed in RHEL 6.2?
> >>
> >> Does anybody know if these issues were addressed in the upcoming RHEL
> >> 6.3? My impression is that upstream fixes for this only recently (last
> >> 6 months?) appeared in the mainline kernel.
> >>
> >> Perhaps I am missing some tuning that could be done to help with this?
> >
> > Enabling lazy-count does wonders for workloads that involve massive
> > amounts of metadata. Unfortunately it's a mkfs-time option only AFAIK.
>
> Thanks, but it was already enabled...
>
> daryl
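For reference, the "smaller chunks" workaround mentioned in the quoted message could look something like the following. This is a minimal sketch, not the script actually used: the function name remove_in_chunks and the default pause length are hypothetical, and it simply removes one sub-folder at a time with a sleep in between.

```shell
#!/bin/sh
# remove_in_chunks TOP [PAUSE_SECONDS]
# Hypothetical helper: delete TOP one sub-folder at a time, sleeping
# between sub-folders so other I/O on the filesystem can catch up.
remove_in_chunks() {
    top=$1
    pause=${2:-5}    # seconds to wait between sub-folders (arbitrary default)

    for sub in "$top"/*; do
        # Only real directories are handled here; symlinks, plain files
        # and an unmatched glob are left for the final rm below.
        if [ -d "$sub" ] && [ ! -L "$sub" ]; then
            rm -rf -- "$sub"
            sleep "$pause"
        fi
    done

    # Remove whatever is left (plain files, hidden entries, TOP itself).
    rm -rf -- "$top"
}
```

Calling it as `remove_in_chunks myfolder 10` would remove each sub-folder of myfolder with a 10 second pause after each one. Whether the pauses help enough depends on what else the filesystem is doing at the time.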