R: R: [Linux-cluster] "Missed too many heartbeats" messages and hung cluster
- From: "Leandro Dardini" <l dardini comune prato it>
- To: "linux clustering" <linux-cluster redhat com>
- Subject: R: R: [Linux-cluster] "Missed too many heartbeats" messages and hung cluster
- Date: Tue, 27 Jun 2006 12:04:36 +0200
> -----Original Message-----
> From: linux-cluster-bounces redhat com
> [mailto:linux-cluster-bounces redhat com] On behalf of
> Fabrizio Lippolis
> Sent: Tuesday, 27 June 2006 11:52
> To: linux clustering
> Subject: Re: R: [Linux-cluster] "Missed too many heartbeats"
> messages and hung cluster
>
> Leandro Dardini wrote:
>
> > If something happens between the two machines, they fence each other.
>
> I have configured manual fencing but, as I wrote, it's not very
> useful since, I think, it requires manual intervention, which
> may not be possible immediately. Therefore I am looking for
> a way to keep the services running even if such a thing
> happens. This is not the first time the problem has arisen,
> apparently without a reason, though the last time was
> a long time ago.
>
> > You can try to "ping" each other and check the connectivity
> > state when the problem arises.
>
> Sometimes the machines are completely locked up and it's not
> even possible to log in. A brute-force power-off is
> necessary in this case. Sometimes it looks like only the cluster
> service is locked up: I can ping the other machine normally
> even though the cluster is not working.
This is really bad. It smells like a hardware problem or a buggy kernel driver. Try stress testing the machines individually, without cluster support. I usually start with a memtest from a Knoppix CD and then build a kernel to stress the CPU. Then transfer huge chunks of data between the machines to test the LAN.
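
For the LAN part, a minimal sketch of a bulk-transfer test, assuming Python is available on both nodes. It is only an illustration and not part of the cluster software: the function names (`serve_once`, `send_bulk`, `loopback_self_check`) and the chunk size are made up for this example, and any address or port you use on the real interconnect is your own choice. One side counts the bytes it receives, the other pushes a fixed amount of data and times it.

```python
import socket
import threading
import time

CHUNK = 1024 * 1024  # bytes per send call (arbitrary choice: 1 MiB)


def serve_once(listener, result):
    """Accept one connection and count received bytes until the peer closes."""
    conn, _ = listener.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:  # empty read means the sender closed the connection
                break
            received += len(data)
    result["received"] = received


def send_bulk(host, port, total_bytes):
    """Send total_bytes of zeros to host:port; return (bytes_sent, seconds)."""
    payload = bytes(CHUNK)
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    return sent, time.monotonic() - start


def loopback_self_check(total_bytes=8 * CHUNK):
    """Run receiver and sender in one process over 127.0.0.1 as a sanity check."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    receiver = threading.Thread(target=serve_once, args=(listener, result))
    receiver.start()
    sent, elapsed = send_bulk("127.0.0.1", port, total_bytes)
    receiver.join()
    listener.close()
    return sent, result["received"], elapsed


if __name__ == "__main__":
    sent, received, elapsed = loopback_self_check()
    print(f"sent={sent} received={received} seconds={elapsed:.3f}")
```

The loopback check only proves the script itself works. For the real test, bind the listener to the interconnect address on one node, call `send_bulk` from the other node with several gigabytes, and look for stalls, resets, or a byte count that does not match what was sent.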
Leandro
>
> > Maybe an overly "intelligent" switch is handling the
> > traffic and applying some sort of "traffic shaping and control".
>
> There is nothing like that; the two machines are connected by
> a 1 Gb crossover cable, not even a long one, supplied by HP with
> the two machines.
>
> --
> Fabrizio Lippolis
> fabrizio lippolis aurigainformatica it
> Auriga Informatica s.r.l. Via Don Guanella 15/B -
> 70124 Bari
> Tel.: 080/5025414 - Fax: 080/5027448 -
> http://www.aurigainformatica.it/