[Linux-cluster] Tracing gfs problems

Robert Peterson rpeterso at redhat.com
Thu Aug 3 18:44:31 UTC 2006


Röthlein Michael (RI-Solution) wrote:
> Hello,
>
> In the past, hangs occurred that resulted in reboots of our 4-node cluster. The real problem is: there aren't any traces in the log files of the nodes.
>
> Is there a possibility to raise the verbosity of gfs?
>
> Thanks
>
> Michael
>   
Hi Michael,

Right now, there's no way to increase the level of verbosity or logging
in the gfs kernel code, but I'm not sure that would help you anyway.
The lockup could be in any part of the kernel: GFS, the DLM/Gulm locking
infrastructure, or any other part for that matter.  It could also be
hardware related, or the node could be running out of memory, etc.

Your best bet may be to temporarily disable fencing so that the hung
node(s) don't get fenced as soon as the hang happens, for example by
changing it to manual fencing.  Then, when a node hangs, check for
kernel messages (dmesg) on the console and syslog messages in
/var/log/messages.  If you can't get a command prompt, use the
"magic sysrq" key to dump out what each module, thread and process
is doing.

If that doesn't tell you where the problem is, you can send the info to
this list or create a bugzilla for the problem and attach the output
from the sysrq, along with details on what release of code you're using,
your cluster.conf, etc.

Here are simple instructions for using the "magic sysrq" in case you're
unfamiliar:

1. Turn it on by doing:
    echo "1" > /proc/sys/kernel/sysrq
2. Recreate your kernel hang.
3. Trigger the task list dump:
   If you're at the system console with a keyboard, do alt-sysrq t (task list).
   If you have a telnet console instead, do ctrl-] to get the telnet> prompt, then:
       telnet> send brk  (send a break char)
       t (task list)
   If you don't have a keyboard or telnet, but do have a shell:
       echo "t" > /proc/sysrq-trigger
   If you're doing it from minicom, use <ctrl-a>f followed by t.
   For other types of serial consoles, you have to get it to send a break,
   then the letter t.
4. The task info will be dumped to the console, so hopefully you have
   a way to save that off.
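
If you still have a working shell on the node, steps 1, 3 and 4 can be
rolled into one small helper.  This is only a rough sketch, not part of
GFS or the cluster suite: the function name sysrq_task_dump and the
variables SYSRQ_ENABLE, SYSRQ_TRIGGER and LOG_CMD are made up for
illustration.  It assumes you are root and that the dump is still in the
kernel ring buffer when dmesg reads it (a long task list may not fit).

```shell
#!/bin/sh
# Sketch only: enable magic sysrq, request a task dump, save the kernel log.
# The variable and function names below are illustrative assumptions.
SYSRQ_ENABLE="${SYSRQ_ENABLE:-/proc/sys/kernel/sysrq}"
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
LOG_CMD="${LOG_CMD:-dmesg}"

sysrq_task_dump() {
    out="$1"
    if [ -z "$out" ]; then
        echo "usage: sysrq_task_dump OUTPUT_FILE" >&2
        return 2
    fi
    # Step 1: turn the magic sysrq feature on.
    echo 1 > "$SYSRQ_ENABLE" || return 1
    # Step 3 (shell variant): ask the kernel to dump the task list.
    echo t > "$SYSRQ_TRIGGER" || return 1
    # Step 4: the dump lands in the kernel log; copy it to a file.
    $LOG_CMD > "$out"
}
```

After sourcing the file, run it as root with something like
"sysrq_task_dump /tmp/sysrq-tasks.txt" and attach the resulting file to
your mail or bugzilla.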

Regards,

Bob Peterson
Red Hat Cluster Suite



