[Linux-cluster] Node Failure Detection Problems

Patrick Caulfield pcaulfie at redhat.com
Mon Mar 20 08:21:43 UTC 2006


James Firth wrote:
> Hi,
> 
> I have some questions on configuring and tuning heartbeats and
> node-failure detection.
> 
> I have a 2-node cluster.  Whenever a node fails it seems to take a while
> to detect node failure.
> 
> First question: I have reduced heartbeat hello_timer to 1 second, and
> deadnode_timeout to 5 seconds.  Is there an elegant way to do this with
> cluster.conf?  Currently I'm setting
> /proc/cluster/config/cman/hello_timer with an init script hack.
> 

The latest code (RHEL4 U3, or CVS STABLE/RHEL4) allows you to put these into
cluster.conf as

<cluster>
<cman deadnode_timeout="5" hello_timer="1" />

...

</cluster>
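
If you want to double-check what ended up in the file, something like the sketch below could read the two attributes back. It is only an illustration: the helper name read_cman_timers and the inline SAMPLE string are made up for the example, and it parses a string rather than your real cluster.conf.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the fragment above; a real cluster.conf
# contains many more elements.
SAMPLE = """<cluster>
<cman deadnode_timeout="5" hello_timer="1" />
</cluster>"""


def read_cman_timers(xml_text):
    """Return the cman timer attributes found in xml_text as integers."""
    root = ET.fromstring(xml_text)
    cman = root.find("cman")
    if cman is None:
        # No <cman> element, so nothing was set in the file.
        return {}
    timers = {}
    for name in ("hello_timer", "deadnode_timeout"):
        value = cman.get(name)
        if value is not None:
            timers[name] = int(value)
    return timers


timers = read_cman_timers(SAMPLE)
print(timers)
```

To run it against a real file, pass the file's contents to read_cman_timers instead of SAMPLE.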
-- 

patrick



