
Re: [Linux-cluster] Re: rgmanger stuck, hung on futex



On Mon, 2006-12-11 at 10:22 -0800, aberoham gmail com wrote:
> Another clue -- haldaemon crashed on this node, perhaps at the same
> time clurgmgrd started to hang? 
> 
> latest dmesg entry --
> hal[3509]: segfault at 0000000000000000 rip 0000000000400ec7 rsp
> 0000007fbfffd7e0 error 4 
> 
> grep clurgmgrd /var/log/messages --
> [snip]
> Dec 11 06:39:43 bamf01 clurgmgrd: [7983]: <info>
> Executing /etc/init.d/rsyncd-tiger status
> Dec 11 06:39:44 bamf01 clurgmgrd: [7983]: <info>
> Executing /etc/init.d/httpd.cluster status 
> Dec 11 06:39:44 bamf01 clurgmgrd: [7983]: <info>
> Executing /etc/init.d/rsyncd-hartigan status
> Dec 11 06:41:11 bamf01 clurgmgrd[7983]: <err> #48: Unable to obtain
> cluster lock: Connection timed out
> Dec 11 06:41:56 bamf01 clurgmgrd[7983]: <err> #50: Unable to obtain
> cluster lock: Connection timed out 
> [snip]

Could you check /proc/slabinfo and post it from all nodes?  I think I
know what this is.

-- Lon
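
For anyone gathering the /proc/slabinfo output requested above, here is a minimal sketch of one way to summarize it per node before posting. It assumes the usual slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...). The names parse_slabinfo and report are hypothetical helpers, not part of any cluster tool, and the thread does not say which caches are of interest, so the script only ranks caches by approximate size.

```python
# Sketch only: assumes the slabinfo 2.x layout, where each data line starts with
#   <name> <active_objs> <num_objs> <objsize> ...
# parse_slabinfo and report are hypothetical helper names for illustration.

def parse_slabinfo(text):
    """Return (name, active_objs, num_objs, objsize, approx_bytes) tuples,
    largest approximate footprint first."""
    caches = []
    for line in text.splitlines():
        # Skip blank lines, the version banner, and the column header.
        if not line.strip() or line.startswith("slabinfo") or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            active, total, size = int(fields[1]), int(fields[2]), int(fields[3])
        except ValueError:
            continue  # unexpected layout; ignore the line rather than guess
        # num_objs * objsize is only a rough estimate of memory held by the cache.
        caches.append((fields[0], active, total, size, total * size))
    return sorted(caches, key=lambda c: c[4], reverse=True)


def report(path="/proc/slabinfo", top=10):
    """Print the largest caches; reading the file may require root."""
    with open(path) as f:
        rows = parse_slabinfo(f.read())
    for name, active, total, size, approx in rows[:top]:
        print(f"{name:<24} active={active:>8} total={total:>8} "
              f"objsize={size:>6} approx_bytes={approx}")
```

Calling report() on each node prints the ten largest caches. The raw file is still worth posting in full, since the summary drops most columns.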


