
Re: [Linux-cluster] Strange error returned by openais



carlopmart wrote:
Christine Caulfield wrote:
On 03/03/10 09:02, carlopmart wrote:
martijn tenheuvel net wrote:
Hi all,

I am trying to set up a RHEL 5.4 cluster with only two nodes, but I can't.
In /var/log/messages I can see a lot of errors like these:

Each node has two network interfaces: one on a shared network for
cluster operation and another on a different subnet. Like this:

Node01: 172.16.1.1 (eth0) and 192.168.35.1 (eth1)
Node02: 172.16.1.2 (eth0) and 172.26.50.1 (eth1)

The default gateway is 192.168.35.20 on node01 and 172.26.50.30 on
node02 ... maybe this is the problem?

I have added IP routing rules on both nodes, but the problem continues.
How can I fix this?

I've had exactly the same errors, and eventually found what was wrong.
The problem seems to be the VLANs: the switches block the multicast
traffic. For now I'm using a crossover cable.

So check with your network engineers; they should be able to assist you.
If the cluster works over a crossover cable, that should convince them
that the switches are blocking the traffic.
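
One way to check from the hosts whether multicast traffic really passes between the nodes is a small send/receive probe. What follows is only a minimal Python sketch and is not part of cman or openais. The function names are made up for illustration, and the port used in the usage comments (50000) is an arbitrary test port, not the port the cluster stack uses. The group address is the one that appears in the cluster.conf posted later in this thread.

```python
import socket


def membership_request(group: str, iface_ip: str) -> bytes:
    """Build the 8-byte ip_mreq structure: group address + local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def open_receiver(group: str, port: int, iface_ip: str, timeout: float = 10.0) -> socket.socket:
    """Join the multicast group on the interface that owns iface_ip and return the socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, iface_ip))
    sock.settimeout(timeout)  # recvfrom() raises socket.timeout if nothing arrives
    return sock


def send_probe(group: str, port: int, iface_ip: str, payload: bytes = b"probe") -> None:
    """Send one multicast datagram out of the interface that owns iface_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(iface_ip))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)  # stay on the local segment
    sock.sendto(payload, (group, port))
    sock.close()


# Usage (addresses taken from this thread, port is an arbitrary assumption):
#   on node01:  rx = open_receiver("239.192.11.25", 50000, "172.16.1.1"); print(rx.recvfrom(1024))
#   on node02:  send_probe("239.192.11.25", 50000, "172.16.1.2")
```

If the receiver times out across the switch but gets the datagram over a crossover cable, that would point towards the switch or VLAN configuration filtering multicast.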

regards,
Martijn




Maybe you are right, Martijn. I manually copied cluster.conf from
node02 to node01 and everything works (node01 joins the cluster). But if
multicast is the problem, why does node01 join the cluster once its
cluster.conf is at the same version as on node02?

My problem only occurs when the cluster.conf version differs between
nodes ...


Well, that's exactly your problem! cman expects the cluster.conf to be the same version on all nodes. ccsd is meant to synchronise these in RHEL5, but it has problems with a two-node cluster where quorum cannot be established.

What you need to do is either use two_node="1" mode in cluster.conf or use a quorum disk to maintain quorum while a single node is up.
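
To see at a glance whether the copies of cluster.conf have drifted apart, the config_version attribute of each copy can be compared. The following is a rough Python sketch under assumptions: the helper names are hypothetical, the copies are passed in as strings (how they are fetched from each node is left out), and the version 11 in the example is invented purely to show a mismatch.

```python
import xml.etree.ElementTree as ET


def config_version(xml_text: str) -> int:
    """Return the config_version attribute of the <cluster> root element."""
    root = ET.fromstring(xml_text)
    return int(root.attrib["config_version"])


def versions_match(copies: dict) -> bool:
    """True if every node's cluster.conf copy carries the same config_version."""
    return len({config_version(text) for text in copies.values()}) == 1


# Example with two trimmed-down copies; version 11 on node01 is a made-up value.
copies = {
    "node01": '<cluster name="MiddleEarth" config_version="11"/>',
    "node02": '<cluster name="MiddleEarth" config_version="12"/>',
}
mismatch = not versions_match(copies)
```

Here `mismatch` is true because the two copies carry different versions, which is the situation described above.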

Chrissie


But I am using two_node="1" in my cluster.conf. Here it is:

<?xml version="1.0"?>
<cluster alias="MiddleEarth" config_version="12" name="MiddleEarth">
        <fence_daemon post_fail_delay="0" post_join_delay="3" clean_start="1"/>
        <clusternodes>
                <clusternode name="mgmtnode01.hpulabs.org" nodeid="1" votes="1">
                        <multicast addr="239.192.11.25" interface="eth1"/>
                        <fence>
                                <method name="1">
                                        <device name="last-resort" nodename="mgmtnode01.hpulabs.org"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="mgmtnode02.hpulabs.org" nodeid="2" votes="1">
                        <multicast addr="239.192.11.25" interface="eth1"/>
                        <fence>
                                <method name="1">
                                        <device name="last-resort" nodename="mgmtnode02.hpulabs.org"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1">
                <multicast addr="239.192.11.25"/>
        </cman>
        <fencedevices>
                <fencedevice agent="fence_manual" name="last-resort"/>
        </fencedevices>
        <rm log_facility="local4" log_level="7"/>
</cluster>
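
For comparing this file with the one on the working cluster, it may help to pull out just the settings discussed in this thread. The sketch below is an illustration in Python with a hypothetical `summarize` helper; it only reads attributes that appear in the file above (config_version, two_node, expected_votes and each node's multicast interface), and the sample text is a trimmed copy of that file.

```python
import xml.etree.ElementTree as ET


def summarize(xml_text: str) -> dict:
    """Collect the quorum and multicast settings from a cluster.conf document."""
    root = ET.fromstring(xml_text)
    cman = root.find("cman")
    interfaces = {}
    for node in root.iter("clusternode"):
        mcast = node.find("multicast")
        # A node without a <multicast> child is recorded as None.
        interfaces[node.get("name")] = mcast.get("interface") if mcast is not None else None
    return {
        "config_version": root.get("config_version"),
        "two_node": cman.get("two_node") if cman is not None else None,
        "expected_votes": cman.get("expected_votes") if cman is not None else None,
        "node_interfaces": interfaces,
    }


# Trimmed copy of the cluster.conf above (fencing sections left out).
SAMPLE = """
<cluster alias="MiddleEarth" config_version="12" name="MiddleEarth">
  <clusternodes>
    <clusternode name="mgmtnode01.hpulabs.org" nodeid="1" votes="1">
      <multicast addr="239.192.11.25" interface="eth1"/>
    </clusternode>
    <clusternode name="mgmtnode02.hpulabs.org" nodeid="2" votes="1">
      <multicast addr="239.192.11.25" interface="eth1"/>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1">
    <multicast addr="239.192.11.25"/>
  </cman>
</cluster>
"""

summary = summarize(SAMPLE)
```

Running the same helper against the cluster.conf of the working cluster and comparing the two dictionaries would show where the configurations differ.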

I have another two-node cluster configured like this (except that those nodes have only one interface each) and everything works. When I change cluster.conf on one node, the change is replicated automatically to the other. Why doesn't the same happen on this two-node cluster?

Thanks.


Any ideas please??

--
CL Martinez
carlopmart {at} gmail {d0t} com

