Re: [Linux-cluster] Corosync goes cpu to 95-99%

On 06/02/2011 01:27 AM, Nicolas Ross wrote:

cman_tool join is called in /etc/rc.d/init.d/cman I believe. Add a -P
option to it.


Where is the "-P" option in the cman_tool manpage? I don't see it. It
lists "-S", "-X", "-A", "-D" ... but not -P ...

Is it correct to put this option in the /etc/sysconfig/cman config file
on RHEL6?

I had to modify my /etc/rc.d/init.d/cman script on each node and add -P
(undocumented) at line 500, after $cman_join_opts

It did not solve the problem, but it helps a very little bit to
alleviate it. While a node is experiencing it, the node is still not
usable over ssh, but the response time of services seems very slightly
better, barely.

GSS asked me today to produce a core dump of corosync while it's eating
up CPU.


Oops .. Bad, bad, very bad news, at least for me. Nicolas, I have found a way to pass "-p" to corosync without modifying the cman startup script. In the /etc/sysconfig/cman config file, I have put a line with this:


 .. and works ok.

[root rhelnode01 sysconfig]# ps xa |grep corosync
 1033 ?        SLsl   0:00 corosync -f -p
 1494 pts/1    S+     0:00 grep corosync
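
To check this on every node without reading the ps output by eye, something like the following could work. It is only a sketch: has_p_flag and check_corosync are names made up here, and it assumes the procps version of ps (for -C and -o args=).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of cman or corosync.

# Return 0 if the given command line contains a standalone "-p" argument.
has_p_flag() {
    for arg in $1; do
        if [ "$arg" = "-p" ]; then
            return 0
        fi
    done
    return 1
}

# Report, for each running corosync process, whether it was started with -p.
# Assumes procps ps; prints nothing if corosync is not running.
check_corosync() {
    ps -o args= -C corosync | while read -r line; do
        if has_p_flag "$line"; then
            echo "has -p: $line"
        else
            echo "missing -p: $line"
        fi
    done
}
```

Running check_corosync on a node started as above should report the "corosync -f -p" line as having the flag.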

I will do some tests with two nodes, but I think RHEL6.x is not yet ready for production environments, at least not RHCS.

CL Martinez
carlopmart {at} gmail {d0t} com

