
Re: [Linux-cluster] "Missed too many heartbeats" messages and hung cluster



Patrick Caulfield wrote:

Jun 23 23:37:17 AICLSRV02 kernel: CMAN: removing node AICLSRV01 from the
cluster : Missed too many heartbeats


That message means that the heartbeat messages are getting lost somehow,
either through an unreliable network link or through something else odd
happening on the machine that prevents the heartbeat packets from reaching
the network.

This is very strange, since the two machines are connected directly by a gigabit crossover cable, with no other device in between. Also, no firewall rules are configured on either machine.

By the way, I am currently using the manual fencing method, but it isn't very helpful, and I would like to switch to a method that ensures a reliable service. Does that mean I have to buy a device that sits between the machines and connects the network and power cables? I am rather new to this, so any suggestion is welcome.

--
Fabrizio Lippolis                fabrizio lippolis aurigainformatica it
Auriga Informatica s.r.l.            Via Don Guanella 15/B - 70124 Bari
Tel.: 080/5025414 - Fax: 080/5027448 - http://www.aurigainformatica.it/

