[Linux-cluster] Problems with Cluster

Maciej Bogucki maciej.bogucki at artegence.com
Mon Jun 11 14:40:06 UTC 2007


Manish Kathuria wrote:
> Hello,
> 
> We have configured an active-active two node cluster accessing a
> shared storage. We are using Red Hat Cluster Suite on RHEL 4 Update 4.
> The nodes are HP Servers having ILOs as the fence devices. I am facing
> certain problems in the setup and would seek suggestions.
> 
> While booting up the systems, one node often forces the other one
> to reboot for no apparent reason. How do I prevent that?

Try setting post_join_delay to a high value, e.g. 600.
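
As a rough sketch, assuming the usual /etc/cluster/cluster.conf layout,
the setting lives on the fence_daemon element. The cluster name and
config_version below are placeholders, not taken from your setup, and
the value is in seconds (how long fenced waits after a node joins the
fence domain before it fences anyone):

```xml
<!-- Placeholder cluster name and version; increment config_version when editing -->
<cluster name="example_cluster" config_version="2">
  <!-- post_join_delay: seconds fenced waits after a node joins the fence domain -->
  <fence_daemon post_join_delay="600" post_fail_delay="0"/>
  <!-- clusternodes, fencedevices and rm sections stay as they are -->
</cluster>
```

Keep in mind that a value this large also means the cluster waits that
long before fencing a node that really is missing at startup.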

> 
> We want the failover to happen when the power supply fails to either
> of the nodes. In order to test the scenario, we removed the power
> cables from one of the nodes. However the failover did not happen and
> upon observing the logs we found that the alive node could not connect
> to the fence device (ILO in this case) of the dead node since it was
> powered off and the fencing could not take place. Does this mean that
> we would not be able to have a failover in case of power failure for
> one of the nodes. Is there a way we can do it ? How is the cluster
> supposed to react when the ILO itself is powered off ?

You need to perform manual fencing (administrator intervention) when that happens.
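
One possible way to set this up, assuming you are willing to add
fence_manual as a second fence method behind the iLO, is sketched
below. The node name, device names, iLO hostname and credentials are
all made up for illustration:

```xml
<!-- Hypothetical names throughout; adapt to your own cluster.conf -->
<clusternode name="node1" votes="1">
  <fence>
    <!-- Tried first: power fencing through the node's iLO -->
    <method name="1">
      <device name="ilo-node1"/>
    </method>
    <!-- Tried if the iLO is unreachable: wait for an administrator -->
    <method name="2">
      <device name="manual" nodename="node1"/>
    </method>
  </fence>
</clusternode>

<fencedevices>
  <fencedevice agent="fence_ilo" name="ilo-node1"
               hostname="ilo-node1.example.com" login="admin" passwd="secret"/>
  <fencedevice agent="fence_manual" name="manual"/>
</fencedevices>
```

With something like this in place, the surviving node should block
until the administrator has checked that the failed node is really
powered off and then acknowledged it with fence_ack_manual (see its man
page for the exact arguments on your release). Services do not fail
over until that acknowledgement is given.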

> 
> Another failover test was conducted by removing the network
> connectivity on one of the nodes. Failover did happen smoothly and
> the dead node was rebooted by the fence device. However when the dead
> node was reconnected to the network, it fenced and rebooted the alive
> node. How to deal with this ?

Try to increase post_join_delay.

Best Regards
Maciej Bogucki



