[Linux-cluster] Cluster service restarting


We have been running web and database clusters successfully for several years on RHEL 3 and 4 and we now have one of each on RHEL 5.

The setup is very straightforward: two nodes, active/active, with one running the webserver and the other the databases.

We have found that the services restart in place regularly, sometimes two or three times a day. The log entries show that the cause is a failure to ping one or another of the clustered service IP addresses. This happens less frequently on the database server, which has one clustered interface, than on the webserver, which has five. The ping failure reported in the logs for the webserver is not always on the same IP address; both the timing and the IP address reported at fault seem quite random. There are no load-related issues, as this is still in the testing stage.
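To back up the impression that the failures are spread across the addresses, a tally along the following lines could be run over the logs. This is only a rough sketch: the function name, the "ping" keyword match and the assumption that each failure line contains the affected IPv4 address are illustrative guesses, not the actual log format.

```python
import re
from collections import Counter

# Loose IPv4 matcher; it does not validate octet ranges.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def tally_ping_failures(lines, keyword="ping"):
    """Count IPv4 addresses on log lines that mention `keyword`.

    Assumption: a ping failure is logged on a single line that
    contains both the keyword and the affected service address.
    """
    counts = Counter()
    for line in lines:
        if keyword.lower() in line.lower():
            counts.update(IPV4.findall(line))
    return counts
```

Passing an open log file (wherever the cluster messages are written on the node) returns a per-address count, which should show whether one address fails noticeably more often than the others.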

I have turned the "Monitor Link" setting off and it still happens.

Are there any settings that will increase the timeout? I'm sure the interface does not actually go down.

Any other pointers or suggestions?

David Schroeder
Server Support
Information Services Division
Flinders University
Adelaide, Australia
Ph: +61 8 8201 2689

