[Linux-cluster] GFS 6.0 node without quorum tries to fence

Schumacher, Bernd bernd.schumacher at hp.com
Tue Aug 3 16:44:06 UTC 2004


Before I tried manual fencing, I tried this with automatic fencing
(fence_rib). mitte was always faster and fenced oben and unten. This
means that one faulty node can reboot all the other nodes. I think this
is not OK. And even after the reboot the problem is not solved, because
the faulty node is still faulty.

A node should only be allowed to fence if it is Master and has quorum,
and never while it is in arbitrating mode.
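
To make the rule I am proposing concrete, here is a minimal sketch in
Python. It is not GFS or lock_gulmd code: the names Role, NodeView,
has_quorum and may_fence are hypothetical, and I am assuming that quorum
means a strict majority of the configured nodes.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    # Hypothetical role names for illustration only.
    MASTER = "master"
    SLAVE = "slave"
    ARBITRATING = "arbitrating"


@dataclass
class NodeView:
    """What a single node believes about the cluster (illustrative only)."""
    name: str
    role: Role
    reachable: int     # nodes this node still hears from, itself included
    cluster_size: int  # number of configured nodes

    def has_quorum(self) -> bool:
        # Assumption: quorum is a strict majority of the configured nodes.
        return self.reachable > self.cluster_size // 2

    def may_fence(self) -> bool:
        # Proposed rule: fence only as Master and only with quorum.
        # An arbitrating node is not Master, so it can never fence.
        return self.role is Role.MASTER and self.has_quorum()


# The three-node netsplit from this thread: mitte is cut off from the others.
mitte = NodeView("mitte", Role.ARBITRATING, reachable=1, cluster_size=3)
oben = NodeView("oben", Role.MASTER, reachable=2, cluster_size=3)
unten = NodeView("unten", Role.SLAVE, reachable=2, cluster_size=3)

for node in (mitte, oben, unten):
    print(node.name, "may fence:", node.may_fence())
```

With this rule only the side that holds the majority could fence. mitte
would be refused even if it still considered itself Master, because it
sees only 1 of 3 nodes.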

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Steve Landherr
> Sent: Tuesday, 3 August 2004 18:23
> To: Discussion of clustering software components including GFS
> Subject: RE: [Linux-cluster] GFS 6.0 node without quorum 
> tries to fence
> 
> 
> In a netsplit, what does fencing achieve when done by a node 
> that doesn't have quorum?  It still won't have quorum.  It 
> should probably just clean up as best it can and leave the 
> rest of the cluster alone.
> 
> -steve
> --
> Steve Landherr -- landherr at kazeon.com
> Kazeon Systems, Inc.
> Mountain View, California
> 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of 
> Michael Conrad Tadpol Tilstra
> Sent: Tuesday, August 03, 2004 9:13 AM
> To: Discussion of clustering software components including GFS
> Subject: Re: [Linux-cluster] GFS 6.0 node without quorum 
> tries to fence
> 
> So looking at what you gave below, mitte was master (I am guessing 
> this from the "Core lost slave quorum" part of the message below). 
> It knows that it doesn't have quorum, but it is still going to try 
> to be the Master. It does not know "that it can not build a 
> cluster."  The only thing it knows right now about the other nodes 
> is that they failed to send heartbeats.  Therefore they must have 
> left the cluster abnormally. Therefore it must fence them.
> 
> The other two nodes see that mitte has failed to reply to 
> heartbeats. Therefore it must have left the cluster 
> abnormally.  Therefore it must be fenced.
> 
> Both sides of the netsplit are trying to resolve things to 
> regain the cluster.  From an outsider's viewpoint (which you 
> and I have, but the nodes do not), we can see that mitte's 
> attempts are futile: oben and unten will get control of the 
> cluster.  But the node cannot see this.
> 
> This is what makes netsplits kind of ugly.  
> 
> (Using ifdown to test cluster stuff causes extra confusion, in 
> my opinion, because you are actually creating a netsplit 
> case, not a simpler node-down case.  The power switch is 
> nice for this.)
> 
> 
> I hope that made some sense.
> 
> -- 
> Michael Conrad Tadpol Tilstra
> Blood is thicker than water, and much tastier.
> 
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com 
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 
