Re: [Linux-cluster] [UPDATE] IP monitor failing periodically

I'm not sure about the segfaults, but we are facing the same issue on RHEL5 and FC6, i386: random failovers due to ip-check failures. This workaround seems to help, for now at least:


I'll check if it is indeed the ping segfaulting and report back when I get some time.


chris cmiware com wrote:
We reinstalled our machines with RHEL 5 x86_64 (we were running i386) a few weeks ago and the mysterious IP monitoring failures have disappeared. I believe it was postulated that a compiler bug regarding -fpie might be causing segfaults in i386 binaries, so this would support that theory to some degree, although I did not really attempt to confirm it further. I thought the architecture change fixing the random failovers was noteworthy.

### previous thread below

Hi Chris,

I am experiencing the same problem on RHEL 5 and I have a support request in with Red Hat.

I was asked to increase the debug level by changing the <rm> line in the cluster configuration to:

<rm log_facility="local4" log_level="7">

I then needed to add "local4.* /var/log/cluster" to /etc/syslog.conf and run "service syslog restart".

I then needed to propagate the updated cluster configuration to both nodes:

# ccs_tool update /etc/cluster/cluster.conf
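For anyone repeating the syslog step on more than one node, here is a minimal sketch of how the rule could be added without duplicating it on a second run. The function name add_cluster_log_rule is made up for illustration and is not part of any cluster tool; the rule text and the default path are the ones quoted above.

```shell
# Hypothetical helper (not part of rgmanager or syslog): append the
# local4 rule to a syslog config file only if it is not already there.
# Usage: add_cluster_log_rule [conf_file]   (defaults to /etc/syslog.conf)
add_cluster_log_rule() {
    conf="${1:-/etc/syslog.conf}"
    rule='local4.* /var/log/cluster'
    # grep -F: fixed string, -x: match the whole line, -q: no output.
    # A missing file counts as "rule not present", so it gets created.
    if ! grep -Fxq "$rule" "$conf" 2>/dev/null; then
        printf '%s\n' "$rule" >> "$conf"
    fi
}

# Typical use as root, followed by the commands from the steps above:
#   add_cluster_log_rule
#   service syslog restart
#   ccs_tool update /etc/cluster/cluster.conf
```

The check matches the exact line only, so a differently spaced or tab-separated variant of the same rule would not be detected and a second line would be appended.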

After a week with the increased logging I have not had the problem, even though it was occurring regularly before that, 2 to 3 times a day. One day last week, out of curiosity, I reverted to the default settings, and within a few hours I had the failure-to-ping error on one of the clustered IP addresses and the service was restarted.

I now have the logging back at 7 and the support request is pending.


