[Linux-cluster] Re: occasional cluster crashes
Fabrizio Lippolis
fabrizio.lippolis at aurigainformatica.it
Fri Nov 17 16:10:43 UTC 2006
Hi Lon,
Lon Hohberger wrote:
> Do they crash (panic), or do they just become totally unresponsive?
One server suddenly becomes unresponsive, as if frozen. The second server
then starts to miss heartbeats from the first. At the moment I have
configured manual fencing, so the service is not relocated (explained
further below). If I remember correctly, restarting the locked machine is
not enough; I have to reboot the working one too.
> Have you tried getting a stack trace from the console using sysrq? (echo
> 1 > /proc/sys/kernel/sysrq; then hit alt-sysrq-t from the console).
No, I haven't. I will try that too.
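For reference, a minimal sketch of the sysrq steps quoted above, assuming a
root shell on a node that is still responsive. The function name dump_tasks
and its two optional path arguments are hypothetical, added only so the steps
can be dry-run against scratch files. Writing "t" to /proc/sysrq-trigger
requests the same task dump as alt-sysrq-t on the console.

```shell
# Hypothetical helper: enable sysrq, then request a dump of all tasks.
# Both paths default to the real kernel interfaces; pass other paths
# to rehearse the steps without touching the running kernel.
dump_tasks() {
    sysrq_ctl="${1:-/proc/sys/kernel/sysrq}"
    sysrq_trigger="${2:-/proc/sysrq-trigger}"

    # Enable all sysrq functions (same as: echo 1 > /proc/sys/kernel/sysrq).
    echo 1 > "$sysrq_ctl" || return 1

    # Ask the kernel to write the state of every task to the kernel log.
    echo t > "$sysrq_trigger" || return 1
}
```

Run as root with no arguments, then read the result with dmesg. On a machine
that is already frozen no shell will be available, so sysrq has to be enabled
before the hang (for example with kernel.sysrq = 1 in /etc/sysctl.conf) and
the dump requested from the console keyboard instead.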
> One thing that's peculiar is that - if they are locking up, they have to
> be locking up at about the same time -- otherwise, one would fence the
> other, and life would go on.
As I wrote, only one of them locks up. The fencing configuration is another
problem for me, and one I am aware of. I have not understood very
well how it works; it looks like I need an external device that manages
power. In that case, which device, and consequently which fencing method, is
most suitable? I am rather confused about this topic.
Fabrizio