[Linux-cluster] node failing


I am trying to track down a problem I’ve been having with the clustering software on Red Hat 3.0 (supplied RPMs).

I am running a two-node cluster using multicast heartbeat and a network tiebreaker IP address, with bonded Ethernet interfaces connected to different switches.

The problem is that the cluster starts and everything works fine, and then suddenly one node (always the same one) thinks the other node has become inactive. It gets into a state where one node thinks both nodes are active and the other node thinks only it is active.

There are no networking problems that I can see. On the bad node I can ping the other node by its address and by the multicast address. I have full debug mode on, but the log files don’t show anything.
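
Ping only exercises ICMP, so it does not prove that UDP datagrams sent to the heartbeat group actually reach this node after it joins the group. A minimal sketch of a further check, assuming a hypothetical group address and port (GROUP and PORT below are placeholders, not the values the cluster software uses; substitute the ones from the cluster configuration):

```python
import socket
import struct

# Hypothetical placeholders: replace with the cluster's real heartbeat
# multicast group and UDP port.
GROUP = "239.1.1.1"
PORT = 5000


def build_mreq(group, iface="0.0.0.0"):
    """Pack the ip_mreq structure used by IP_ADD_MEMBERSHIP.

    iface is the local IPv4 address of the interface to join on
    (for example the bonded interface); 0.0.0.0 lets the kernel choose.
    """
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def listen(group=GROUP, port=PORT, iface="0.0.0.0", timeout=10.0):
    """Join the group and wait for one datagram.

    Returns (sender_ip, payload_length), or None if nothing arrives
    before the timeout expires.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Allow binding even if another process already uses the port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        build_mreq(group, iface))
        sock.settimeout(timeout)
        try:
            data, sender = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return sender[0], len(data)
    finally:
        sock.close()
```

Calling listen() on the bad node while the cluster is running, with iface set to the bonded interface's address, would show whether datagrams from the other node arrive; a None result while ping still succeeds would point at multicast delivery through the switches rather than at basic connectivity.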


Has anyone else seen this problem, or can anyone give me some tips on what to look at next?




