[Linux-cluster] Corosync goes cpu to 95-99%

Nicolas Ross rossnick-lists at cybercat.ca
Fri Nov 4 18:05:47 UTC 2011


>> get a support signoff.  Also the corosync updates have not finished
>> through our validation process.  Only hot fixes (from support) are 
>> available
>>
>> Regards
>> -steve
>>
>
> Sorry to re-open this thread, but is there any news about this problem?

In fact, there is!

It appears that the problem lies in the microcode of some specific Xeon 
"Nehalem" processors. It has to do with C-state switching: RHEL 6.1 now 
switches C-states in a way that 6.0 did not.

You can look at Bugzilla #710265 and KB doc #61105.

Our temporary fix for the moment was to disable C-state transitions by adding:

intel_idle.max_cstate=0 processor.max_cstate=1

to the kernel line in grub.conf, update and reboot. We haven't had any CPU 
spikes on any of the 5 nodes we've updated so far. The 3 remaining nodes 
haven't been updated yet because doing so requires production downtime.
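
For anyone with many nodes to touch, here is a rough Python sketch of the 
grub.conf edit described above, plus a check to run after the reboot. It is 
only an illustration: the function names are made up, and it assumes a legacy 
grub.conf where each boot entry has a line whose first word is "kernel". Work 
on a copy of the file and keep a backup.

```python
# The two parameters quoted above.
CSTATE_PARAMS = ["intel_idle.max_cstate=0", "processor.max_cstate=1"]


def add_kernel_params(grub_text, params=CSTATE_PARAMS):
    """Append any missing params to every 'kernel' line of grub.conf text.

    Hypothetical helper: it only transforms a string, it does not touch
    any file on disk.
    """
    out = []
    for line in grub_text.splitlines():
        tokens = line.split()
        # Only lines whose first word is 'kernel' carry the boot parameters.
        if tokens and tokens[0] == "kernel":
            missing = [p for p in params if p not in tokens]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    return "\n".join(out) + "\n"


def missing_from_cmdline(cmdline, params=CSTATE_PARAMS):
    """Return the params that are absent from a kernel command line string."""
    present = cmdline.split()
    return [p for p in params if p not in present]
```

After rebooting, passing the contents of /proc/cmdline to 
missing_from_cmdline() should return an empty list if the running kernel was 
booted with both parameters.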

Get a support signoff before doing this. I'm in no way endorsing this 
solution, as I can't know whether you're in the same situation as I am.

Have fun!



