
Re: [Linux-cluster] qdiskd not properly failing nodes??



Thanks for the clarification.

Does the state change from quorate -> inquorate get logged anywhere?  I have the log level set to 7, and all that shows up in messages is "downgrading".  Also, the "Current disk state" in my status_file always seems to be "None".

Thanks Again.
 - Dan
 
 -------------- Original message ----------------------
From: Lon Hohberger <lhh redhat com>
> On Wed, 2006-09-13 at 15:40 -0400, Andrea Westervelt wrote:
> > Lon,
> >  
> > fenced is running and based on the manpage it seems like dropping
> > below a score of ½ should cause a reboot? 
> 
> It currently expects the quorate partition (remember, this node is no
> longer quorate) to fence the node rather than taking action itself.
> 
> >  I guess I am a little confused on what the heuristics/scoring are
> > meant to do.  Can you explain the role of the master partition and
> > what the expected outcome of an insufficient score should be?
> 
> The master node is a node with sufficient score to declare itself online
> according to the heuristics that you supply in the qdisk configuration.
> Assuming it maintains its score, it arbitrates what other nodes join the
> "master" partition.  If a node becomes part of the master partition, the
> node advertises quorum device votes to CMAN.
> 
> Insufficient scores should cause a node to remove itself from the master
> partition and tell CMAN that the quorum device is offline.  This should
> cause CMAN on a node in the qdisk master partition to fence the node
> (assuming that this causes the node to transition from
> quorate->inquorate).
> 
> I'm guessing what is happening here in your case is that CMAN is still
> seeing the node - even though it's inquorate - and it's not fencing it
> -- is that right?  A transition from quorate->inquorate should cause the
> node to get fenced.
> 
> That sounds like a bug (pretty easy to fix, too).
> 
> -- Lon
> 
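
The scoring and fencing behaviour described in the quoted message can be sketched roughly as follows. This is only an illustration in Python, not qdiskd's or CMAN's actual code. It assumes the default threshold is floor((n+1)/2), where n is the sum of all heuristic scores (my reading of the qdisk man page), and that a node counts as alive when its score is at least that threshold. The names Heuristic, default_min_score, node_is_alive and expected_action are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Heuristic:
    """One qdisk-style heuristic (hypothetical model, not qdiskd's own type)."""
    score: int     # weight contributed when the heuristic passes
    passed: bool   # result of the most recent check


def default_min_score(heuristics):
    # Assumption: default threshold is floor((n + 1) / 2),
    # where n is the sum of all configured heuristic scores.
    total = sum(h.score for h in heuristics)
    return (total + 1) // 2


def current_score(heuristics):
    # Only heuristics that currently pass contribute their score.
    return sum(h.score for h in heuristics if h.passed)


def node_is_alive(heuristics, min_score=0):
    # Assumption: min_score of 0 means "use the default threshold".
    needed = min_score or default_min_score(heuristics)
    return current_score(heuristics) >= needed


def expected_action(was_quorate, is_quorate):
    # Per the quoted explanation: a quorate -> inquorate transition should
    # get the node fenced by a node in the quorate (master) partition;
    # the failing node does not reboot itself.
    if was_quorate and not is_quorate:
        return "fence (by a node in the master partition)"
    return "none"


if __name__ == "__main__":
    checks = [Heuristic(1, True), Heuristic(1, False), Heuristic(1, False)]
    print("alive:", node_is_alive(checks))
    print("action:", expected_action(was_quorate=True, is_quorate=False))
```

With three heuristics of score 1 each, the assumed default threshold is 2, so a node passing only one of them would drop out of the master partition and report the quorum device as offline.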


