[Cluster-devel] fencing conditions: what should trigger a fencing operation?

David Teigland teigland at redhat.com
Thu Nov 19 17:28:09 UTC 2009


On Thu, Nov 19, 2009 at 04:15:58PM +0000, Steven Whitehouse wrote:
> Hi,
> 
> On Thu, 2009-11-19 at 11:04 -0600, David Teigland wrote:
> > On Thu, Nov 19, 2009 at 12:35:05PM +0100, Fabio M. Di Nitto wrote:
> > 
> > > - what are the current fencing policies?
> > 
> > node failure
> > 
> I think what Fabio is asking is: what event is considered to be a node
> failure? It sounds from your description as though it means a failure
> of corosync communications.

corosync's main job is to define node up/down states and notify everyone
when they change, i.e. "cluster membership".
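
A minimal sketch of how a daemon sees those events through the cpg API
(illustrative only, not taken from our daemons; the group name is a
placeholder and error handling is bare-bones; build with -lcpg):

#include <string.h>
#include <stdio.h>
#include <corosync/cpg.h>

/* Membership change callback: a node listed in left_list with
 * reason CPG_REASON_NODEDOWN failed rather than leaving cleanly,
 * and that is the kind of event that leads to fencing. */
static void confchg_cb(cpg_handle_t handle,
                       const struct cpg_name *group_name,
                       const struct cpg_address *member_list,
                       size_t member_list_entries,
                       const struct cpg_address *left_list,
                       size_t left_list_entries,
                       const struct cpg_address *joined_list,
                       size_t joined_list_entries)
{
        size_t i;

        for (i = 0; i < left_list_entries; i++) {
                if (left_list[i].reason == CPG_REASON_NODEDOWN)
                        printf("node %u is down\n", left_list[i].nodeid);
        }
}

static cpg_callbacks_t callbacks = {
        .cpg_confchg_fn = confchg_cb,
};

int main(void)
{
        cpg_handle_t handle;
        struct cpg_name name;

        strcpy(name.value, "example");
        name.length = strlen(name.value);

        if (cpg_initialize(&handle, &callbacks) != CS_OK)
                return 1;
        if (cpg_join(handle, &name) != CS_OK)
                return 1;

        /* block and run callbacks as membership events arrive */
        cpg_dispatch(handle, CS_DISPATCH_BLOCKING);
        return 0;
}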

> Are there other things which can feed into this, though? For example,
> dlm seems to have some kind of timeout mechanism which sends a message
> to userspace, and I wonder whether that contributes to the decision too?

lock timeouts?  Lock timeouts are just a normal lock manager feature,
although we don't use them.  (The dlm also has a variation on lock
timeouts where it doesn't cancel the timed-out lock, but instead sends a
notice to the deadlock detection code that there may be a deadlock, so a
new deadlock detection cycle is started.)
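
In case the distinction is unclear, a hypothetical sketch of the two
behaviors (none of these names are real dlm symbols; it only shows the
control flow):

#include <stdio.h>

enum timeout_action {
        TIMEOUT_CANCEL,         /* plain lock timeout: cancel the request */
        TIMEOUT_DEADLOCK_NOTICE /* dlm variation: leave the lock alone and
                                   start a new deadlock detection cycle */
};

static void cancel_request(int lkid)        /* hypothetical */
{
        printf("lock %d: timed out, request cancelled\n", lkid);
}

static void start_deadlock_cycle(int lkid)  /* hypothetical */
{
        printf("lock %d: timed out, checking for a deadlock\n", lkid);
}

static void handle_timeout(int lkid, enum timeout_action action)
{
        if (action == TIMEOUT_CANCEL)
                cancel_request(lkid);
        else
                start_deadlock_cycle(lkid);
}

int main(void)
{
        handle_timeout(1, TIMEOUT_CANCEL);
        handle_timeout(2, TIMEOUT_DEADLOCK_NOTICE);
        return 0;
}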

> It certainly isn't desirable for all types of filesystem failure to
> result in fencing & automatic recovery. I think we've got that wrong in
> the past. I posted a patch a few days back to try to address some of
> that. In the case where we find an invalid block in a journal during
> recovery, we certainly don't want to try to recover the journal on
> another node, nor even kill the recovering node, since that will only
> result in another node trying to recover the same journal and hitting
> the same error. Eventually that would bring down the whole cluster.
> 
> The aim of the patch was to return a suitable status indicating why
> journal recovery failed, so that the failure can then be handled
> appropriately.
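
To make that concrete, a hypothetical sketch of the kind of status
handling being proposed (the names are illustrative, not gfs2's actual
interface):

#include <stdio.h>

/* Hypothetical status codes for journal recovery. */
enum recovery_status {
        RECOVERY_OK,          /* journal replayed, fs usable */
        RECOVERY_BUSY,        /* someone else holds the journal, retry */
        RECOVERY_BAD_JOURNAL  /* invalid block in the journal itself:
                                 another node would hit the same error */
};

static void handle_recovery_result(int jid, enum recovery_status st)
{
        switch (st) {
        case RECOVERY_OK:
                printf("jid %d: recovered\n", jid);
                break;
        case RECOVERY_BUSY:
                printf("jid %d: busy, retrying later\n", jid);
                break;
        case RECOVERY_BAD_JOURNAL:
                /* Don't fence, and don't hand the journal to another
                 * node; flag the fs as needing manual repair instead. */
                printf("jid %d: journal corrupt, fsck needed\n", jid);
                break;
        }
}

int main(void)
{
        handle_recovery_result(0, RECOVERY_BAD_JOURNAL);
        return 0;
}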

Historically, gfs would panic if it found an error that kept it from
making progress or handling further fs access.  This, of course, was in
the interest of HA, since you don't want one bad fs on one node to
prevent all the *other* nodes from working too.

Dave