[Cluster-devel] unfencing

Thu Feb 26 14:33:26 UTC 2009

On Thu, Feb 26, 2009 at 07:51:57AM +0100, Fabio M. Di Nitto wrote:
> On Mon, 2009-02-23 at 13:09 -0600, David Teigland wrote:
> > On Mon, Feb 23, 2009 at 07:52:55PM +0100, Fabio M. Di Nitto wrote:
> > > > A node unfences *itself* when it boots up.  As such, power-unfencing doesn't
> > > > make sense; unfencing is only meant to reverse storage fencing.
> > > 
> > > What would stop a user from running fence_node -U on another node to do
> > > remote (un)fencing?
> > 
> > It would work.  Users can do anything they like, that's beside the point.
> 
> I was thinking about two small points.
> 
> Given the time at which fence_node -U will fire, you probably want to
> add a cman_init + cman_is_active + cman_finish loop in fence_node to
> make sure cman is ready to reply to our ccs queries, otherwise we might
> have a race condition at boot time (the check might already be there; I
> didn't really check the code). All our daemons do that to give cman time to
> bootstrap.

Yes, good point.  I wonder if we'd be better off having cman_tool join
effectively do an is_active wait before exiting?  Then we could probably
avoid doing it in many other places.  (It's also annoying when corosync crashes
after is_active completes, but before I've read what I need from cman/ccs.)
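
The wait being discussed (cman_init + cman_is_active + cman_finish in a loop)
could look roughly like the sketch below.  It is not the fence_node code: the
probe/delay callbacks, the fake_cman test double and the retry limit are all
made up for illustration so that it builds without libcman.  In real code the
probe would wrap the three cman calls and the delay would be a short sleep.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: returns true once the cluster manager answers.
 * In fence_node this would wrap cman_init + cman_is_active + cman_finish;
 * it is a callback here so the sketch does not depend on libcman. */
typedef bool (*probe_fn)(void *ctx);

/* Hypothetical delay hook, e.g. a wrapper around sleep(1). */
typedef void (*delay_fn)(void *ctx);

/* Poll until the probe succeeds or max_tries attempts have been used.
 * Returns the number of attempts made on success (>= 1), or -1 if the
 * probe never succeeded. */
int wait_until_active(probe_fn probe, delay_fn delay, void *ctx, int max_tries)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (probe(ctx))
            return attempt;
        /* No point in sleeping after the final failed attempt. */
        if (delay != NULL && attempt < max_tries)
            delay(ctx);
    }
    return -1;
}

/* Test double (assumption): reports "active" after a fixed number of probes. */
struct fake_cman {
    int probes_until_active;
    int probes_seen;
    int delays_seen;
};

bool fake_probe(void *ctx)
{
    struct fake_cman *f = ctx;
    f->probes_seen++;
    return f->probes_seen >= f->probes_until_active;
}

void fake_delay(void *ctx)
{
    struct fake_cman *f = ctx;
    f->delays_seen++;
}
```

A bounded retry count is a choice made for the sketch; the thread does not say
whether the wait should ever give up.  Note that a loop like this does nothing
for the crash-after-is_active case mentioned above.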

> The second thing would be to set a minimal protection mechanism by
> allowing fence_node -U to be fired only for the node that is invoking
> it. So if we run on node A, fence_node -U can only execute unfencing
> operations for node A. For testing purposes then we could add a manual
> override such as "--i-understand-this-operation-can-destroy-the-world".

I plan to use "fence_node -U" (no name) to unfence self.  I'm inclined to
just allow any node name after that, but not advertise it.
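
The two policies in this exchange (self only unless overridden, versus any
name allowed) differ by a single decision, which can be sketched as below.
The function name and the allow_remote flag are hypothetical and are not
fence_node options; the sketch only shows how "no name means self" and an
optional restriction on other names would fit together.

```c
#include <stddef.h>
#include <string.h>

/* Decide which node an unfence request applies to.
 *   requested:    node name from the command line, or NULL if none was given
 *   local:        this node's own name
 *   allow_remote: nonzero to accept a name other than our own
 *                 (hypothetical switch, stands in for the manual override)
 * Returns the node to unfence, or NULL if the request is refused. */
const char *unfence_target(const char *requested, const char *local,
                           int allow_remote)
{
    /* "fence_node -U" with no name: unfence ourselves. */
    if (requested == NULL || requested[0] == '\0')
        return local;

    /* Naming ourselves explicitly is the same thing. */
    if (strcmp(requested, local) == 0)
        return local;

    /* Another node: accepted only if the caller opted in. */
    return allow_remote ? requested : NULL;
}
```

Calling it with allow_remote always set gives the "allow any node name, but
don't advertise it" behaviour; tying allow_remote to an explicit override
gives the more protective variant.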