[Linux-cluster] CS4 problem "Missed too many heartbeats"

Alain Moulle Alain.Moulle at bull.net
Fri Mar 31 06:56:56 UTC 2006


Hi

I have a configuration where the following test leaves a node
stalled when it tries to rejoin the cluster:

1. start CS4 on both nodes of an HA pair
2. start the service (which just echoes into a file) on node1
3. poweroff -f node1
   ==> the service migrates to node2 successfully
4. when node1 is up again, and after verifying that ping works on
   the heartbeat interface, I start CS4 again on node1, and
   it stalls at "Starting fence domain:"

In the syslog of node2, I can see:

Mar 31 08:45:26 s_kernel at yack1 kernel: CMAN: node yack0 rejoining
Mar 31 08:45:56 s_kernel at yack1 kernel: CMAN: removing node yack0 from the cluster : Missed too many heartbeats
Mar 31 08:45:56 s_sys at yack1 fenced[8855]: yack0 not a cluster member after 0 sec post_fail_delay
Mar 31 08:45:56 s_sys at yack1 fenced[8855]: fencing node "yack0"
Mar 31 08:46:21 s_sys at yack1 fenced[8855]: fence "yack0" success
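
The timing in this log can be checked with a short script. This is only a
sketch: the helper names (stamp, first_time) are made up for illustration,
and it assumes all timestamps fall on the same day and that the log lines
are pasted in unwrapped.

```python
from datetime import datetime

# Log lines copied from the node2 syslog excerpt (wrapped lines rejoined).
LOG = """\
Mar 31 08:45:26 s_kernel at yack1 kernel: CMAN: node yack0 rejoining
Mar 31 08:45:56 s_kernel at yack1 kernel: CMAN: removing node yack0 from the cluster : Missed too many heartbeats
Mar 31 08:45:56 s_sys at yack1 fenced[8855]: yack0 not a cluster member after 0 sec post_fail_delay
Mar 31 08:45:56 s_sys at yack1 fenced[8855]: fencing node "yack0"
Mar 31 08:46:21 s_sys at yack1 fenced[8855]: fence "yack0" success
"""


def stamp(line):
    # The syslog timestamp is the first 15 characters, e.g. "Mar 31 08:45:26".
    return datetime.strptime(line[:15], "%b %d %H:%M:%S")


def first_time(lines, marker):
    # Timestamp of the first line that contains the marker text.
    return next(stamp(l) for l in lines if marker in l)


lines = LOG.splitlines()
rejoin = first_time(lines, "rejoining")
removed = first_time(lines, "Missed too many heartbeats")
fence_start = first_time(lines, "fencing node")
fence_done = first_time(lines, "success")

# Elapsed seconds between the events of interest.
rejoin_to_removal = int((removed - rejoin).total_seconds())
fence_duration = int((fence_done - fence_start).total_seconds())

print("rejoin -> removal:", rejoin_to_removal, "s")
print("fencing took:", fence_duration, "s")
```

So node2 drops yack0 30 seconds after it starts rejoining, and the fence
operation itself takes 25 seconds.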

Any idea what could cause this problem?

Thanks
Alain
