[Linux-cluster] Fencing Methods

Mon Mar 20 20:32:17 UTC 2006

> On Mon, Mar 20, 2006 at 01:16:37PM -0500, Hendershot, Zach wrote:
> > Hello,
> >     I was wondering about various fencing methods. We don't have any
> > "supported" hardware available to do proper fencing via the Red Hat
> > fencing agents. Other clustered filesystems like the Veritas CFS and
> > Oracle's ocfs2 solve the fencing problem by simply panic'ing the
> > machine to keep the IO from hitting the disk.

> This assumes the machine knows it should fence itself, which isn't
> always the case.  If the machine is hung somewhere and comes
> back to life after it's been recovered, it could write and corrupt the
> fs, a panic doesn't solve this.
> 
> Dave

That's a good point, I wasn't thinking. Oracle and (I assume) Veritas do
this by relying on a kernel thread that writes out timestamps; if the
thread fails to write an expected timestamp (and other nodes see the
node as dead), the node panics to self-fence. How does RHCS decide
whether a node is dead? My understanding was that if the other nodes
don't receive a heartbeat from the node within a timeout period, they
execute the fence command against it. I'm interested in why that choice
was made: was it a technical problem with the above method, or a design
decision? Have a good one.
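
To make the two approaches concrete, here is a rough sketch of how I
picture them. This is only an illustration under my own assumptions,
not how RHCS, ocfs2, or Veritas actually implement it; the Node class,
both function names, and the timeout values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float   # when peers last heard from this node (seconds)
    last_disk_stamp: float  # when the node last wrote its own timestamp


def should_self_fence(node: Node, now: float, stamp_timeout: float) -> bool:
    """Self-fencing check, evaluated ON the node itself.

    Hypothetical: a hung node never gets to run this, which is the
    weakness Dave points out above.
    """
    return now - node.last_disk_stamp > stamp_timeout


def nodes_to_fence(nodes: list[Node], now: float, heartbeat_timeout: float) -> list[str]:
    """External fencing check, evaluated by the surviving members.

    Hypothetical: any node whose heartbeat is older than the timeout is
    selected for the fence command, whether or not it can still run code.
    """
    return [n.name for n in nodes if now - n.last_heartbeat > heartbeat_timeout]


nodes = [Node("a", 100.0, 100.0), Node("b", 85.0, 85.0)]
print(nodes_to_fence(nodes, now=100.0, heartbeat_timeout=10.0))   # ['b']
print(should_self_fence(nodes[1], now=100.0, stamp_timeout=10.0))  # True
```

Both checks flag node "b", but the second one only helps if "b" is
alive enough to evaluate it and panic before any stale write goes out.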

Zach