
Re: [Linux-cluster] Tracing gfs problems



Röthlein Michael (RI-Solution) wrote:
Hello,

In the past, hangs have occurred that resulted in reboots of our 4-node cluster. The real problem is that there aren't any traces in the log files of the nodes.

Is there a way to raise the verbosity of gfs?

Thanks

Michael

Hi Michael,

Right now there is no way to increase the verbosity or logging level in the GFS kernel code, but I'm not sure that would help you anyway. The lockup could be in any part of the kernel: GFS, the DLM/Gulm locking infrastructure, or anything else for that matter. It could also be hardware related, or the node could be running out of memory, etc.

Your best bet may be to temporarily disable fencing, for example by switching to manual fencing, so that the hung node(s) don't get fenced as soon as the hang happens. Then, when a node hangs, check for kernel messages on the console and syslog messages in /var/log/messages. If you can't get a command prompt, use the "magic sysrq" key to dump out what each module, thread and process is doing.

If that doesn't tell you where the problem is, you can send the info to this list, or create a bugzilla for the problem and attach the sysrq output along with details on which release of the code you're using, your cluster.conf, etc.

Here are simple instructions for using the "magic sysrq" in case you're unfamiliar:

1. Turn it on by doing:
   echo "1" > /proc/sys/kernel/sysrq
2. Recreate your kernel hang.
3. Request a task list ("t"):
   - If you're at the system console with a keyboard, press alt-sysrq-t.
   - If you have a telnet console instead, press ctrl-] to get the
     telnet> prompt, then:
       telnet> send brk   (sends a break character)
       t                  (task list)
   - If you don't have a keyboard or telnet, but do have a shell:
       echo "t" > /proc/sysrq-trigger
   - If you're on minicom, use <ctrl-a>f followed by t.
   - For other types of serial consoles, get it to send a break,
     then the letter t.
4. The task info will be dumped to the console, so hopefully you have
   a way to save that off.
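
For the shell-only case, steps 1, 3 and 4 could be wrapped in a few small shell functions. This is only a sketch: the function names and the SYSRQ_CTL / SYSRQ_TRIGGER override variables are made up for illustration, and it assumes you are root and that a shell on the node still responds. The two /proc paths are the ones from the steps above.

```shell
#!/bin/sh
# Sketch only: helper names and the override variables are hypothetical.
# The defaults are the /proc paths used in the instructions above.
: "${SYSRQ_CTL:=/proc/sys/kernel/sysrq}"
: "${SYSRQ_TRIGGER:=/proc/sysrq-trigger}"

enable_sysrq() {
    # Step 1: turn the magic sysrq facility on (needs root).
    echo "1" > "$SYSRQ_CTL"
}

dump_tasks() {
    # Step 3, shell variant: ask the kernel to dump the task list.
    echo "t" > "$SYSRQ_TRIGGER"
}

save_kernel_log() {
    # Step 4: copy the kernel ring buffer to the file named in $1.
    # The task dump can be larger than the ring buffer, so a serial
    # or console capture is still the more reliable way to save it.
    dmesg > "$1"
}
```

After sourcing this as root, you would run enable_sysrq before recreating the hang, then dump_tasks followed by something like save_kernel_log /tmp/sysrq-tasks.txt (the output path is just an example) once the node misbehaves.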

Regards,

Bob Peterson
Red Hat Cluster Suite

