[Linux-cluster] Few queries about fence working

Thu Jan 26 13:29:01 UTC 2012

On 01/26/2012 07:43 AM, jayesh.shinde at netcore.co.in wrote:
> Dear Digimer & Kaloyan Kovachev,
> 
> Do you think this server shutdown problem (while fencing simultaneously
> from both nodes via drbd.conf) can be completely avoided if I use a SAN
> disk instead of a DRBD disk?
> 
> i.e., in the case of a SAN disk, will the fence config defined in
> cluster.conf take care of the network failure and the related fencing
> of the node?
> 
> What would you suggest, SAN or DRBD disk?
> Please guide me.
> 
> Regards
> Jayesh Shinde

It won't fundamentally remove the issue. Any time there is a breakdown
in communication between nodes in a two-node cluster, both nodes are
going to make a fence call against each other at the same time. Ideally,
you would have a fence device that does not buffer calls, but that may
not be feasible in your case.

This is why fence delays exist - specifically to allow one node to
always complete a fence operation before another. If you really want to
avoid having the same node survive a fence call in a split like this,
then your best bet is to add a 3rd node for quorum. However, once you
do, the obliterate fence handler will no longer work, as it is restricted
to two-node clusters only (one of the things rhcs_fence resolves, but it
isn't tested on EL5).
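
As a rough sketch of the fence delay idea, a two-node cluster.conf might
look something like the fragment below. The node names, device names, IP
addresses and credentials are all made up for illustration, and it
assumes IPMI fencing with a fence agent that understands the "delay"
attribute; please check the man page of your installed agent (fence_ipmilan
here) before relying on it, particularly on EL5.

```xml
<?xml version="1.0"?>
<!-- Hypothetical example; all names, addresses and passwords are placeholders. -->
<cluster name="example-cluster" config_version="1">
  <!-- two_node/expected_votes allow a two-node cluster to run without a third vote -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1">
      <fence>
        <method name="ipmi">
          <!-- Anyone trying to fence node1 waits 15 seconds first, so node1
               gets a head start and should win a simultaneous fence race. -->
          <device name="ipmi_node1" action="reboot" delay="15"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2">
      <fence>
        <method name="ipmi">
          <!-- No delay: node2 is fenced immediately when node1 calls for it. -->
          <device name="ipmi_node2" action="reboot"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="ipmi_node1" agent="fence_ipmilan" ipaddr="10.0.0.1" login="admin" passwd="secret"/>
    <fencedevice name="ipmi_node2" agent="fence_ipmilan" ipaddr="10.0.0.2" login="admin" passwd="secret"/>
  </fencedevices>
</cluster>
```

With a layout like this, the delay sits on the device entry of the node
you want to survive, because the delay applies to fence calls made
against that node rather than to calls it makes itself.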

To be honest though, is there really a problem with having one node
pre-defined to win a dual-fence call?

-- 
Digimer
E-Mail:              digimer at alteeve.com
Papers and Projects: https://alteeve.com