
Re: [Linux-cluster] cman-2.0.95-1.el5 / question about a problem when launching cman



Yes, I suspect the problem is that the node is 'bouncing' as it joins
the cluster.

This is usually caused by either a) startup scripts (e.g. some Xen
ones) taking the interface down and then up again after openais has
started, or b) "intelligent" switches taking too long to recognise the
multicast join. Either way, both cluster nodes already have "state"
(the dirty flag) by the time they see each other.
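
As a rough illustration of that sequence, here is a toy Python model. It
is only a sketch under stated assumptions: Node, form_cluster_alone and
meet are hypothetical names made up for this example, not cman or
openais code, and the choice that the already-running node is the one
killed simply mirrors the log message quoted further down in this
thread.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a cluster member (hypothetical, not cman's structure)."""
    name: str
    dirty: bool = False  # True once the node holds cluster state
    alive: bool = True


def form_cluster_alone(node: Node) -> None:
    """Assumed behaviour: a node that sees no peers forms a cluster by
    itself and carries state (is "dirty") from then on."""
    node.dirty = True


def meet(existing: Node, joiner: Node) -> str:
    """Assumed rule for the moment two nodes first see each other."""
    if existing.dirty and joiner.dirty:
        # Both sides already hold state, so this is not a clean join.
        # Which node loses in real cman is not modelled here; killing
        # `existing` just mirrors the log in this thread.
        existing.alive = False
        return "killed"
    # A clean node joins normally and picks up state from the cluster.
    joiner.dirty = True
    return "joined"


if __name__ == "__main__":
    # Clean start: node2 sees node1 straight away and joins with no state.
    n1, n2 = Node("node1"), Node("node2")
    form_cluster_alone(n1)
    print(meet(n1, n2))

    # Bounce: the interface flaps or the switch delays the multicast
    # join, so node2 forms its own cluster before it can see node1.
    n1, n2 = Node("node1"), Node("node2")
    form_cluster_alone(n1)
    form_cluster_alone(n2)
    print(meet(n1, n2))
```

The second scenario is the one that matters: neither node did anything
wrong on its own, but the delay let both acquire state before they
could see each other.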

Chrissie

Alain.Moulle wrote:
> Hi,
> 
> Thanks for your response, Marc. It seems that we are the only ones
> facing this problem ... ?
> I saw a fix in the changelog:
> - A dirty node is now prevented from joining the cman cluster.
> It could be related to our problem, because when launching cman
> on the second node, the node is labeled as "dirty" ...
> So could someone explain all the possible causes that
> could flag a node as "dirty" and lead to our problem?
> 
> PS: note that this problem does not happen on RHEL 5.2 with cman 2-0-73-1
> 
> Thanks
> Regards,
> Alain
> 
>> Hi,
>> I'm having "exactly" the same problem with some clusters (Version: 
>> cman-2.0.84-2.el5_2.2,..) 
>>
>> Is it so that if you reboot the node that was killed, it will rejoin the 
>> cluster without being killed? And does it only happen if you start the whole 
>> cluster from scratch?
>>
>> I haven't figured out the whole picture behind it, but I think it is
>> related to IGMP, openais and cman. At least it feels like the same
>> behaviour I'm experiencing.
>>
>> Somehow it seems to be related to the network switches and the IGMP
>> version being used (I don't see it on all RHEL5 clusters, but on the
>> majority running RHEL5U2+). I'm still investigating this issue.
>>
>> Strange thing.
>>
>> Marc.
>>
>> On Friday 09 January 2009 11:47:02 Alain.Moulle wrote:
>>   
>>> > Hi
>>> >
>>> > Release : cman-2.0.95-1.el5
>>> > (but same problem with 2.0.98)
>>> >
>>> > I face a problem when launching cman on a two-node cluster :
>>> >
>>> > 1. Launching cman on node 1 : OK
>>> > 2. When launching cman on node 2, the log on node1 gives :
>>> >     cman killed by node 2 because we rejoined the cluster without a full restart
>>> >
>>> > Any idea ?
> 
> 

