[Linux-cluster] Network hiccup + power-fencing = both nodes go down (redhat cluster 4)

Tue Jan 17 13:49:55 UTC 2006

Thanks, Patrick.  I have raised deadnode_timeout to 120 on each node.
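
For anyone following along, here is a minimal sketch of how the value can be checked and changed on one node. It assumes the running cman accepts writes to the same /proc files quoted later in this thread, which should be verified against your own version. The helper names show_tunable and set_tunable and the CMAN_DIR override are made up for illustration and are not part of cman.

```shell
#!/bin/sh
# Sketch only: show and optionally change a cman tunable on one node.
# CMAN_DIR defaults to the /proc path quoted in this thread; point it at an
# ordinary directory to dry-run the helpers without touching a live cluster.
CMAN_DIR="${CMAN_DIR:-/proc/cluster/config/cman}"

# Print "<name> = <value>" for one tunable (hypothetical helper).
show_tunable() {
    printf '%s = %s\n' "$1" "$(cat "$CMAN_DIR/$1")"
}

# Write a new value, refusing anything that is not all digits
# (hypothetical helper). Assumes the file is writable by the caller.
set_tunable() {
    case "$2" in
        ''|*[!0-9]*) echo "not a number: $2" >&2; return 1 ;;
    esac
    echo "$2" > "$CMAN_DIR/$1"
}
```

Usage would be something like `set_tunable deadnode_timeout 120` followed by `show_tunable deadnode_timeout`, run as root on each node. Whether a value written this way survives a restart of the cluster software is not something this sketch addresses.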

My worry, though, is that a crashed box might reboot and rejoin before
the other node has waited out its 120 seconds and taken over the
cluster.  Is there another timeout value that I can tweak to keep the
original, crashed node from rebooting and rejoining too quickly?
Unfortunately, when the boxes crash they come right back up instead of
staying dead.  I think this might be iLO behavior, but I'm not sure.
When I run shutdown -hy now they stay down, and when power-fencing takes
place they stay down too, but not after a crash.

Thanks again,
Jeff

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Patrick Caulfield
Sent: Tuesday, January 17, 2006 8:33 AM
To: linux clustering
Subject: Re: [Linux-cluster] Network hiccup + power-fencing = both nodes
go down (redhat cluster 4)

Jeff Harr wrote:
> Patrick, this is awesome.  Here are my numbers:
> 
> [root@server1 ~]# cat /proc/cluster/config/cman/hello_timer
> 5
> [root@server1 ~]# cat /proc/cluster/config/cman/deadnode_timeout
> 21
> [root@server1 ~]#
> 
> I'm assuming these are seconds.  I think if I increase the
> deadnode_timeout to maybe 120, then a network hiccup or major glitch
> (like a reboot of a switch) could be ignored.

Yes, those are seconds. Sorry, I should have said so.
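
To put the numbers above in perspective, here is a rough sketch of how many hello intervals fit inside a given deadnode_timeout. It assumes a node is declared dead once nothing has been heard from it for deadnode_timeout seconds, which is a simplified reading of the two values, and the function name hello_intervals is made up for illustration.

```shell
# Sketch only: whole hello intervals that fit inside a deadnode_timeout.
# $1 = deadnode_timeout in seconds, $2 = hello_timer in seconds.
hello_intervals() {
    echo $(( $1 / $2 ))   # integer division, remainder dropped
}

hello_intervals 21 5     # current values: prints 4
hello_intervals 120 5    # proposed timeout: prints 24
```

Under that assumption, the current settings tolerate about four missed hellos, and a 120 second timeout would tolerate about twenty-four.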

-- 

patrick
