[Linux-cluster] GFS performance

Fri Jan 4 16:16:47 UTC 2008

Ah ha!  I think this is starting to make sense now, Wendy.

And thank you for the explanation of why we should be using DLM rather than GULM.

So without the patch, which we do not have, it might be good to increase demote_secs [per GFS mount] to 600 seconds or even more, and scand_secs to...what's a reasonable/safe value for that?  It sounds like without the patch all we're doing -- to paraphrase you -- is reducing the frequency of operations which do no good and cause harm in the form of CPU and I/O resource usage.

The patch is built into RHEL 4.6 and 5.1, right?  When are those expected to be available (we only care about 4.6 right now) and/or how do we get the standalone patch?

Thanks again to everyone for the feedback and information.

- K

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Wendy Cheng
Sent: Friday, January 04, 2008 11:04 AM
To: linux clustering
Subject: Re: [Linux-cluster] GFS performance

Kamal Jain wrote:
> Feri,
>
> Thanks for the information.  A number of people have emailed me expressing some level of interest in the outcome of this, so hopefully I will soon be able to do some tuning and performance experiments and report back our results.
>
> On the demote_secs tuning parameter, I see you're suggesting 600 seconds, which appears to be longer than the default 300 seconds as stated by Wendy Cheng at http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4 -- we're running RHEL4.5.  Wouldn't a SHORTER demote period be better for lots of files, whereas perhaps a longer demote period might be more efficient for a smaller number of files being locked for long periods of time?
>

This demote_secs tunable is a little bit tricky :) ... What happens here
is that GFS caches glocks, and they can accumulate to a huge count.
Unless the VM releases the inodes (files) associated with these glocks,
the current GFS internal daemons will do *fruitless* scans trying to
remove them (and never succeed). If you set demote_secs to a large
number, it will *reduce* the wake-up frequency of the daemons doing this
fruitless work, which in turn leaves more CPU cycles for real work.
Without the glock trimming patch in place, that is a way to tune a
system that is constantly touching a large number of files (such as
with rsync). Ditto for the "scand" wake-up interval: making it larger
will help performance in this situation.
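
As a rough illustration of the tuning described above, here is a minimal
Python sketch that only builds (and prints) the "gfs_tool settune"
command lines for one GFS mount; it does not run anything. The mount
point /mnt/gfs1 and the helper name build_settune_commands are
hypothetical, and 600 is simply the demote_secs value discussed in this
thread, not a tested recommendation. A scand_secs entry could be passed
the same way once a suitable value is known.

```python
import shlex


def build_settune_commands(mount_point, tunables):
    """Return one `gfs_tool settune` argv list per tunable.

    Nothing is executed here; the caller decides whether to run the
    commands (for example via subprocess) or just review them.
    """
    commands = []
    for name, value in tunables.items():
        seconds = int(value)
        if seconds <= 0:
            raise ValueError(f"{name} must be a positive number of seconds")
        commands.append(["gfs_tool", "settune", mount_point, name, str(seconds)])
    return commands


# Hypothetical mount point; 600 is the demote_secs value from this thread.
commands = build_settune_commands("/mnt/gfs1", {"demote_secs": 600})
for argv in commands:
    print(shlex.join(argv))
```

Printing the commands first makes it easy to review them per mount
before applying anything to a production cluster.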

With the *new* glock trimming patch, we actually remove the memory
reference count so a glock can be "demoted" and subsequently removed
from the system if it is idle. To demote the glock, we need the
gfs_scand daemon to wake up often - this implies we need a smaller
demote_secs for it to be effective.

> On a related note, I converted a couple of the clusters in our lab from GULM to DLM and while performance is not necessarily noticeably improved (though more detailed testing was done after the conversion), we did notice that both clusters became more stable in the DLM configuration.
>
This is mostly because DLM is the current default lock manager (with
ongoing development efforts), while GULM is not actively maintained.

-- Wendy