
Re: [Linux-cluster] GFS Tunables

Brandon Young wrote:
Hi all,

I currently have a GFS deployment consisting of eight servers and several GFS volumes. One of my GFS servers is a dedicated backup server with a second replica SAN attached to it through a second HBA. My approach to backups has been to run tools such as rsync and rdiff-backup on a nightly basis. I am having a particular problem with one or two of my filesystems taking a *very* long time to back up. For example, I have /home living on GFS. Day-to-day performance is acceptable, but backups are hideously slow. Every night, I kick off an rdiff-backup of /home from my backup server, which dumps the backup onto an XFS filesystem on the replica SAN. This backup can take days in some cases.

It's not only GFS: getdents() has been more than annoying on many
filesystems when the entry count within a directory is high. But yes,
GFS is particularly bloody slow with its directory reads. There have
been efforts by Red Hat POSIX and libc folks to introduce new
standardized lightweight directory operations. Unfortunately I lost
track of their progress. On the other hand, integrating these new
calls into GFS would take time anyway (if they are available), so
they are unlikely to meet your need. There were also a few experimental
GFS patches, but none of them made it into the production code.

Unless other GFS folks can give you more ideas, I think your best bet at
this moment is to think "outside" the box. That is, don't do
file-to-file backups if at all possible. Check out block-level backup
strategies instead. Are Linux LVM mirroring and/or snapshots workable
for you? Does your SAN vendor provide embedded features (e.g. a NetApp
SAN box offers snapshot, snapmirror, syncmirror, etc.)?

-- Wendy

We have done some investigating and found that getdents(2) calls (which return the list of filenames present in a directory) appear to be spectacularly slow on GFS, irrespective of the size of the directory in question. In particular, with 'strace -r', I'm seeing a rate below 100 filenames per second. The filesystem /home has at least 10 million files in it, which, doing the math, means 29.5 hours just for the getdents calls to scan them. That is more than a third of the backup's wall-clock time, and it's before we even start stat'ing.
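
As a rough cross-check of the arithmetic above: at exactly 100 names per second, 10 million files would take about 27.8 hours to list, and 29.5 hours corresponds to roughly 94 names per second, which fits "below 100". The Python sketch below is not from the original thread; the function names are hypothetical. It times a names-only listing of one directory with os.scandir (which, on Linux, reads entries through the getdents family of system calls and does not stat each file) and extrapolates to a given file count.

```python
import os
import sys
import time


def estimate_scan_hours(n_files, names_per_second):
    """Hours needed to list n_files at a sustained listing rate."""
    return n_files / names_per_second / 3600.0


def measure_listing_rate(path):
    """Return (entry_count, entries_per_second) for one directory listing.

    Only names are read; no stat() is requested, so this isolates the
    directory-read cost from the later per-file stat cost.
    """
    start = time.perf_counter()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


if __name__ == "__main__":
    # Directory to time; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    count, rate = measure_listing_rate(target)
    print(f"{count} entries at {rate:.0f} names/s")
    if count > 0:
        hours = estimate_scan_hours(10_000_000, rate)
        print(f"10 million files at this rate: {hours:.1f} h")
```

Running it against the same directory on GFS and on a local filesystem gives a like-for-like comparison of listing rates. A single directory is only a sample, and a second run may be served from cache, so treat the extrapolation as an order-of-magnitude estimate.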

I googled around a bit and can't find any discussion of slow getdents calls under GFS. Is there any chance we have some tunable turned on or off that might be causing this? I'm not even sure which tunables to consider tweaking. This seems awfully slow, even with sub-optimal locking. Any insights would be much appreciated.


Linux-cluster mailing list
Linux-cluster redhat com
