
Re: [Cluster-devel] GFS2: glock statistics gathering (RFC)



Hi,

On Fri, 2011-11-04 at 12:31 -0400, David Teigland wrote:
> On Fri, Nov 04, 2011 at 03:19:49PM +0000, Steven Whitehouse wrote:
> > The three pairs of mean/variance measure the following
> > things:
> > 
> >  1. DLM lock time (non-blocking requests)
> 
> You don't need to track and save this value, because all results will be
> one of three values which can be gathered once:
> 
> short: the dir node and master node are local: 0 network round trip
> medium: one is local, one is remote: 1 network round trip
> long: both are remote: 2 network round trips
> 
> Once you've measured values for short/med/long, then you're done.
> The distribution will depend on the usage pattern.
> 
The reason for tracking this is to be able to compare it with the
blocking request value to (I hope) get a rough idea of the difference
between the two, which may indicate contention on the lock. So this
is really a "baseline" measurement.

Plus we do need to measure it, since it will vary according to a
number of things, such as what hardware is in use.
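
As a rough illustration of the kind of thing I have in mind (this is not
the actual glock code; the names and gains below are made up for the
example), a per-lock mean/variance pair can be kept without storing any
samples by using an exponentially weighted estimator, much like the
classic TCP RTT estimator:

/* Illustrative only: running mean and mean absolute deviation of
 * lock request round trip times.  Names and gains are invented for
 * this sketch, not taken from the glock code.
 */
struct lock_time_stats {
	long long mean_ns;	/* smoothed mean of request time */
	long long var_ns;	/* smoothed mean absolute deviation */
};

static void lock_time_sample(struct lock_time_stats *st, long long sample_ns)
{
	long long delta = sample_ns - st->mean_ns;

	/* gain of 1/8 for the mean, 1/4 for the deviation */
	st->mean_ns += delta / 8;
	if (delta < 0)
		delta = -delta;
	st->var_ns += (delta - st->var_ns) / 4;
}

Keeping one such pair for non-blocking requests and another for blocking
requests then lets us compare the two means to get the rough contention
indication described above.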

> >  2. DLM lock time (blocking requests)
> 
> I think what you want to quantify is how much contention a given lock is
> under.  A time measurement is probably not a great way to get that since
> it's a combination of: the value above, how long gfs2 takes to release the
> lock (itself a combination of things, including the tunable itself),
> and how many nodes are competing for the lock (depends on workload).
> 
> >  3. Inter-request time (again to the DLM)
> 
> Time between gfs2 requesting the same lock?  That sounds like it might
> work ok for measuring contention.
> 
> > 1. To be able to better set the glock "min hold time"
> 
> Less for a lock with high contention?
> 
Generally we want a longer hold time (i.e. to reduce the number of times
the lock gets passed around and to increase the time each node holds it),
as otherwise we make no progress, or very slow progress. This does
penalise interactive loads, but it increases throughput dramatically on
batch loads.

Also, the point is that if we know the average time it takes to get a
lock, and we know how many locks per second we are requesting, then we
can work out what percentage of the time we are waiting for locks
overall. If there are N nodes in the cluster and this node is holding the
lock for less than 1/N of the total time, then it could increase its
minimum hold time.

That kind of thing is useful in order to ensure that we continue to make
progress under all circumstances.
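
Purely as a sketch of that 1/N rule (the names, constants and the
double/back-off policy below are invented for the example, not a proposal
for the actual code):

#define GL_MIN_HOLD_MS   10UL	/* illustrative floor */
#define GL_MAX_HOLD_MS 1000UL	/* illustrative ceiling */

static unsigned long adjust_min_hold(unsigned long min_hold_ms,
				     unsigned long held_ms,
				     unsigned long interval_ms,
				     unsigned int nr_nodes)
{
	unsigned long fair_share = interval_ms / nr_nodes;

	if (held_ms < fair_share) {
		/* starved relative to a 1/N share: hold on longer */
		min_hold_ms *= 2;
		if (min_hold_ms > GL_MAX_HOLD_MS)
			min_hold_ms = GL_MAX_HOLD_MS;
	} else if (min_hold_ms > GL_MIN_HOLD_MS) {
		/* getting at least our share: back off gently */
		min_hold_ms -= min_hold_ms / 4;
		if (min_hold_ms < GL_MIN_HOLD_MS)
			min_hold_ms = GL_MIN_HOLD_MS;
	}
	return min_hold_ms;
}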

> > 2. To spot performance issues more easily
> 
> Apart from contention, I'm not sure there are many perf issues that dlm
> measurements would help with.
> 
Contention is the #1 cause of reported performance issues, so it is top
of our list to work on. The goal is to make it easier to track down the
source of these kinds of problems.

> > 3. To improve the algorithm for selecting resource groups for
> > allocation (to base it on lock wait time, rather than blindly
> > using a "try lock")
> 
> Don't you grab an rg lock and keep it cached?  How would lock times help?
> 
It's only cached while other nodes are not trying to use the same rgrp.
The same contention issue applies to rgrp glocks as to inode glocks.
When allocating we have a choice to use another rgrp if the contention
is causing a problem (hence the need for some info to base that choice
upon). Unfortunately with deallocation there is no choice, and we have
to use the specific rgrp(s) however contended they might be.
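
For illustration only, reusing the stats structure from the earlier
sketch (the structure names and threshold here are invented for the
example), allocation could then prefer the rgrp whose blocking request
time stays closest to the uncontended baseline, rather than probing with
try-locks:

struct rgrp_cand {
	struct lock_time_stats blocking;	/* blocking request stats */
	int nr;					/* rgrp index, for the example */
};

static int pick_rgrp(struct rgrp_cand *cands, int n, long long baseline_ns)
{
	int i, best = -1;
	long long best_wait = 0;

	for (i = 0; i < n; i++) {
		long long wait = cands[i].blocking.mean_ns;

		/* skip rgrps whose blocking time is far above the
		 * uncontended baseline: likely heavily contended */
		if (wait > 4 * baseline_ns)
			continue;
		if (best < 0 || wait < best_wait) {
			best_wait = wait;
			best = i;
		}
	}
	/* fall back to the first candidate if all look contended */
	return best >= 0 ? best : 0;
}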

> Also, ocfs2 keeps quite a lot of locking stats you might look at.
> 
> Dave

That might be useful, depending on what they are gathering.

Steve.


