
Re: [Linux-cluster] "Missed too many heartbeats" messages and hung cluster



Fabrizio Lippolis wrote:
> I have configured two machines in a cluster domain to run mysql and ldap
> services. Everything works correctly except that from time to time,
> seemingly at random, the two machines hang. Recently this is what I see in
> the log of the second machine:
> 
> Jun 23 23:37:17 AICLSRV02 kernel: CMAN: removing node AICLSRV01 from the
> cluster : Missed too many heartbeats


That message means that the heartbeat messages are getting lost somehow,
either through an unreliable network link or through something else odd
happening on the machine that prevents the heartbeat packets from reaching
the network.
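
As a rough illustration of the general idea (this is not CMAN's actual code),
a node can be declared dead once more than some number of expected heartbeats
have gone by without one arriving. The class name, the 5-second interval and
the limit of 5 missed heartbeats below are all assumptions made up for the
sketch, not CMAN defaults:

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical tracker for one peer node; not taken from CMAN."""
    interval: float = 5.0   # assumed seconds between expected heartbeats
    max_missed: int = 5     # assumed number of misses tolerated
    last_seen: float = 0.0  # timestamp of the last heartbeat received

    def record(self, now: float) -> None:
        # Call this whenever a heartbeat packet arrives from the peer.
        self.last_seen = now

    def missed(self, now: float) -> int:
        # Number of whole heartbeat intervals elapsed since the last one seen.
        return int((now - self.last_seen) // self.interval)

    def is_dead(self, now: float) -> bool:
        # The peer is removed once it exceeds the tolerated number of misses.
        return self.missed(now) > self.max_missed


mon = HeartbeatMonitor()
mon.record(100.0)
print(mon.missed(112.0), mon.is_dead(130.0))
```

The point of the sketch is that the detector cannot tell why the packets
stopped: a flaky link and a stalled sender look the same from the other node.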

> 
> The two machines had to be reset to get them working again. Could anybody
> please explain what happened to cause this problem? I would also
> need a suggestion on how to configure a fence device so that the
> services can continue to work. As you can see, I currently configured
> manual fencing, but that is not very useful. Thank you in advance.
> 


-- 

patrick

