[Linux-cluster] Halt nodes in cluster with cable disconnect

Miguel Angel Guerrero kortux at gmail.com
Tue Jan 24 21:34:42 UTC 2012


Digimer, I use your manual ;)

https://alteeve.com/w/Red_Hat_Cluster_Service_2_Tutorial

In a test environment I deactivated the DRBD daemon for testing, but with or
without the DRBD daemon running, the problem persists.
I use the following handler and fencing policy in DRBD:

fencing resource-and-stonith;
outdate-peer "/sbin/obliterate-peer.sh";

Digimer, when you suggest adding 'sleep 10', do you mean in drbd.conf?
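
One way to read the suggestion is that the delay belongs in the fence handler script rather than in drbd.conf itself. Below is a minimal sketch of that reading, not a tested fix: it assumes a hypothetical wrapper script installed on one node only, and the names FENCE_DELAY, FENCE_HANDLER, and delayed_fence are illustrative, not DRBD options.

```shell
#!/bin/sh
# Hypothetical wrapper around the existing fence handler.
# Install on ONE node only, so that node consistently loses the fence race.
# FENCE_DELAY and FENCE_HANDLER are illustrative names, not DRBD settings.
FENCE_DELAY="${FENCE_DELAY:-10}"
FENCE_HANDLER="${FENCE_HANDLER:-/sbin/obliterate-peer.sh}"

delayed_fence() {
    # Wait first, giving the peer time to get its fence call off.
    sleep "$FENCE_DELAY"
    # Run the real handler; its exit status is returned unchanged,
    # since DRBD interprets the handler's exit code.
    "$FENCE_HANDLER" "$@"
}
```

In a real script the last line would be `delayed_fence "$@"`, and the handler line in drbd.conf on that node would point at the wrapper (wherever it is saved) instead of /sbin/obliterate-peer.sh.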

On Tue, Jan 24, 2012 at 4:09 PM, Digimer <linux at alteeve.com> wrote:

> On 01/24/2012 03:57 PM, Miguel Angel Guerrero wrote:
> > Hi, I'm trying to set up a CentOS cluster with two nodes using cman, drbd,
> > and gfs2, and I'm using IPMI for fencing. DRBD is set up between the nodes
> > using a dedicated interface. When I unplug the DRBD network cable,
> > both nodes power off immediately (I tried both a crossover cable and
> > connecting both nodes to a switch, and both scenarios fail), and the logs
> > don't seem to show anything useful. A previous thread on this
> > list recommended deactivating the acpid daemon, even at the BIOS level,
> > but I'm still having trouble.
> >
> > If I simulate a physical disconnection with the ifdown command on one node,
> > that node reboots with no hassle, but unplugging the cable kills both
> > nodes. I think the first behavior is correct, but the second one is not
> > what I expect.
> >
> > Thanks for your help. The following is my cluster.conf:
>
> This is likely caused by both nodes getting their fence calls off before
> one of them dies.
>
> How do you have DRBD configured? Specifically, what fence handler are
> you using? If you're interested in testing, I have rewritten lon's
> obliterate-peer.sh and added explicit delays to help resolve this exact
> issue.
>
> https://github.com/digimer/rhcs_fence
>
> Alternatively, add a 'sleep 10' or similar to one of your existing fence
> handlers and you should find that the node with the delay consistently
> loses while the other node remains up.
>
> --
> Digimer
> E-Mail:              digimer at alteeve.com
> Papers and Projects: https://alteeve.com
>



-- 
Regards,
------------------------------------
Miguel Angel Guerrero
Registered GNU/Linux User #353531
------------------------------------