[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns



----- Original Message -----
| Hi,
| 
| On Thu, 2012-04-05 at 12:11 -0400, Bob Peterson wrote:
| > Hi,
| > 
| > Here's another patch (explanation below). This patch replies upon
| > a DLM patch that hasn't fully gone upstream yet, so perhaps it
| > shouldn't be added to the nmw tree until it is. This greatly
| > improves the performance of gfs2_grow in a clustered gfs2 file
| > system.
| > 
| > Regards,
| > 
| I'm still not very keen on dragging in this bit of dlm. If it is
| really
| needed, then we should use the copy in the dlm itself and not add our
| own copy of it. Also, why can we not use the GLF_BLOCKING flag for
| this
| purpose? Is there some issue with try locks in that case?

To answer your questions:
(1) Yes, this bit is really needed.
(2) We cannot use the actual dlm flags because they're part of
    dlm/lock.c and not in a header. I can request they be moved
    to a header file, but that's at the whim of upstream dlm.
(3) We can't use GLF_BLOCKING because it's a different issue.
    The issue is this: Unless instructed otherwise, dlm will
    grant GFS2 (or any caller) locks as it sees fit, and when there
    is a hotly contested file (such as rindex, once it's been
    updated by gfs2_grow and needed immediately by all nodes in
    a cluster) dlm often finds it more convenient to grant the lock
    to a process on the current owner node rather than bouncing the
    lock to a different node.

    That can result in one node "hogging" the lock for long periods of
    time. That means a gfs2_grow can take 30 minutes to complete
    rather than 0.3 seconds. To use a farming analogy: your animals
    are starving because the hogs at the trough are eagerly fighting
    over the morsel at the bottom, and not giving the farmer enough
    of a break to put their food in. To avoid this fighting behavior
    dlm has the DLM_LKF_QUECVT bit which basically says lock upgrades
    (the farmer) over the convenience of keeping the lock on the same
    node (the hogs). Employing this bit does wonders for gfs2_grow.
    However, we can only use the bit on dlm lock upgrades. If we
    use it in other cases, dlm goes through an error path that
    results in a hang.

| When you say that this relies upon this dlm patch, what does that
| mean?

(4) It means that there's a DLM bug whereby using the DLM_LKF_QUECVT
    bit causes DLM (and therefore GFS2) to hang in certain situations.
    There is a patch to fix this bad DLM behavior but it hasn't gone
    upstream yet.

| What are the consequences of having this patch but not the dlm one?

(5) The consequences of having this patch but not the DLM one are
    disastrous: gfs2 is likely to hang.

| I'm
| wondering whether I should hold off on this at least until the dlm
| one
| has been finalised and applied.

(6) Yes, that's what we need to do.

| Aside from those points though, this is a really big step forward and
| should make a measurable difference to performance across the board
| when
| contention between multiple nodes occurs. So I'm very happy to see
| such
| significant progress in this area,
| 
| Steve.
| 
| 
| > Bob Peterson
| > Red Hat File Systems
| > 
| > Signed-off-by: Bob Peterson <rpeterso redhat com>
| > ---
| > Author: Bob Peterson <rpeterso redhat com>
| > Date:   Wed Apr 4 15:14:51 2012 -0500
| > 
| >     GFS2: Instruct DLM to avoid queue convert slowdowns
| >     
| >     This patch instructs DLM to prevent an "in place" conversion,
| >     where the
| >     lock just stays on the granted queue, and instead forces the
| >     conversion to
| >     the back of the convert queue. This is done on upward
| >     conversions only.
| >     
| >     This is useful in cases where, for example, a lock is
| >     frequently needed in
| >     PR on one node, but another node needs it temporarily in EX to
| >     update it.
| >     This may happen, for example, when the rindex is being updated
| >     by gfs2_grow.
| >     The gfs2_grow needs to have the lock in EX, but the other nodes
| >     need to
| >     re-read it to retrieve the updates. The glock is already
| >     granted in PR on
| >     the non-growing nodes, so this prevents them from continually
| >     re-granting
| >     the lock in PR, and forces the EX from gfs2_grow to go through.
| > 
| > diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
| > index bd5af03..43466d6 100644
| > --- a/fs/gfs2/lock_dlm.c
| > +++ b/fs/gfs2/lock_dlm.c
| > @@ -16,6 +16,25 @@
| >  #include "glock.h"
| >  #include "util.h"
| >  
| > +/*
| > + * Borrowed from dlm's lock.c: We need to only pass the QUECVT bit
| > with
| > + * requests that are compatible, so we use dlm's check:
| > + *
| > + * Compatibility matrix for conversions with QUECVT set.
| > + * Granted mode is the row; requested mode is the column.
| > + * Usage: matrix[grmode+1][rqmode+1]
| > + */
| > +static const int quecvt_compat_matrix[8][8] = {
| > +      /* UN NL CR CW PR PW EX PD */
| > +        {0, 0, 0, 0, 0, 0, 0, 0},       /* UN */
| > +        {0, 0, 1, 1, 1, 1, 1, 0},       /* NL */
| > +        {0, 0, 0, 1, 1, 1, 1, 0},       /* CR */
| > +        {0, 0, 0, 0, 1, 1, 1, 0},       /* CW */
| > +        {0, 0, 0, 1, 0, 1, 1, 0},       /* PR */
| > +        {0, 0, 0, 0, 0, 0, 1, 0},       /* PW */
| > +        {0, 0, 0, 0, 0, 0, 0, 0},       /* EX */
| > +        {0, 0, 0, 0, 0, 0, 0, 0}        /* PD */
| > +};
| >  
| >  static void gdlm_ast(void *arg)
| >  {
| > @@ -104,10 +123,13 @@ static int make_mode(const unsigned int
| > lmstate)
| >  	return -1;
| >  }
| >  
| > -static u32 make_flags(const u32 lkid, const unsigned int
| > gfs_flags,
| > +static u32 make_flags(struct gfs2_glock *gl, const unsigned int
| > gfs_flags,
| >  		      const int req)
| >  {
| >  	u32 lkf = 0;
| > +	u32 lkid = gl->gl_lksb.sb_lkid;
| > +	int is_upconvert =
| > +		quecvt_compat_matrix[make_mode(gl->gl_state) + 1][req + 1];
| >  
| >  	if (gfs_flags & LM_FLAG_TRY)
| >  		lkf |= DLM_LKF_NOQUEUE;
| > @@ -131,8 +153,11 @@ static u32 make_flags(const u32 lkid, const
| > unsigned int gfs_flags,
| >  			BUG();
| >  	}
| >  
| > -	if (lkid != 0)
| > +	if (lkid != 0) {
| >  		lkf |= DLM_LKF_CONVERT;
| > +		if (is_upconvert)
| > +			lkf |= DLM_LKF_QUECVT;
| > +	}
| >  
| >  	lkf |= DLM_LKF_VALBLK;
| >  
| > @@ -149,7 +174,7 @@ static unsigned int gdlm_lock(struct gfs2_glock
| > *gl,
| >  
| >  	gl->gl_req = req_state;
| >  	req = make_mode(req_state);
| > -	lkf = make_flags(gl->gl_lksb.sb_lkid, flags, req);
| > +	lkf = make_flags(gl, flags, req);
| >  
| >  	/*
| >  	 * Submit the actual lock request.
| > 
| 
| 
| 


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]