[Linux-cluster] Re: Fencing test
Paras pradhan
pradhanparas at gmail.com
Tue Feb 24 23:41:31 UTC 2009
On Tue, Feb 24, 2009 at 3:15 PM, Paras pradhan <pradhanparas at gmail.com> wrote:
> Hi,
>
> I was busy with some other things.
>
> On Fri, Jan 16, 2009 at 5:16 AM, Rajagopal Swaminathan
> <raju.rajsand at gmail.com> wrote:
>> Greetings,
>>
>> On Thu, Jan 15, 2009 at 1:18 AM, Paras pradhan <pradhanparas at gmail.com> wrote:
>>>> On Fri, Jan 9, 2009 at 12:09 AM, Paras pradhan <pradhanparas at gmail.com> wrote:
>>>>>
>>>>>
>>>>> In an attempt to solve the fencing issue in my 2-node cluster, I tried
>>>>> running fence_ipmilan to check whether fencing is working or not. I need
>>>>> to figure out what my problem is.
>>>>>
>>>>> -
>>>>> [root@ha1lx ~]# fence_ipmilan -a 10.42.21.28 -o off -l admin -p admin
>>> Yes, as you said, I am able to power down node4 using node3, so it
>>> seems IPMI is working fine. But I don't know what is going on with my
>>> two-node cluster. Can a Red Hat cluster operate fine in two-node mode?
>>
>> Yes. I have configured a few clusters on RHEL 4 and 5. They do work.
>>
>>> Do I need qdisk, or is it optional? Which area do I need to focus on
>>> to run my 2-node Red Hat cluster using IPMI as the fencing device?
>>>
>> But I have done it on HP, Sun and IBM servers. All of them have their
>> own technology, like HP iLO, Sun ALOM, etc.
>>
>> I never had a chance to try plain IPMI.
>>
>> BTW, this is a wild guess. I am just curious:
>>
>>> <clusternode name="10.42.21.27" nodeid="2" votes="1">
>>
>> Why is the nodeid here 2,
>>
>>> <method name="1">
>>> <device name="fence1"/>
>>> </method>
>>>
>>> <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.28" login="admin"
>>> name="fence1" passwd="admin"/>
>>
>> and
>>
>>> <clusternode name="10.42.21.29" nodeid="1" votes="1">
>>
>> here it is 1
>
> Changing node ids did not solve my problem.
>
>
>>
>>
>>> <method name="1">
>>> <device name="fence2"/>
>>> </method>
>>
>>> <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.30" login="admin" name="fence2" passwd="admin"/>
>>
>> <All the disclaimers ever invented apply>
>> Have you tried exchanging the numbers? Say, the one with IP .27 to 1 and .29 to 2.
>> </All the disclaimers ever invented apply>
>>
>> No warranties offered. Just a friendly suggestion....
>> Never try it on Production cluster.
>>
>> Also, we will all get a clearer picture if you use separate switches
>> for the heartbeat and data networks.
>>
>> HTH
>>
>> With warm regards
>>
>> Rajagopal
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
> I will try qdisk in my 2 node cluster and post here how it goes.
>
> Thanks
> Paras.
>
OK, here is my result using qdisk in a 2-node cluster.
Output of mkqdisk (which I hope looks fine):
[root@ha2lx ~]# mkqdisk -L
mkqdisk v0.5.2
/dev/sdc1:
Magic: eb7a62c2
Label: rac_qdisk
Created: Tue Feb 24 17:29:10 2009
Host: ha1lx.xx.xxxx.com
Kernel Sector Size: 512
Recorded Sector Size: 512
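For reference, a qdisk label like the one shown above would typically have been created with mkqdisk's -c/-l options (the device path here just mirrors the output above):

```shell
mkqdisk -c /dev/sdc1 -l rac_qdisk
```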
Here is my cluster.conf:
[root@ha2lx cluster]# more cluster.conf
<?xml version="1.0"?>
<cluster alias="xencluster" config_version="24" name="xencluster">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="10.42.21.29" nodeid="2" votes="1">
                        <fence>
                                <method name="2">
                                        <device name="fence2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="10.42.21.27" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence1"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="3" two_node="0"/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.28"
                        login="admin" name="fence1" passwd="admin"/>
                <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.30"
                        login="admin" name="fence2" passwd="admin"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="myfd" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="10.42.21.29" priority="2"/>
                                <failoverdomainnode name="10.42.21.27" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources/>
                <vm autostart="1" domain="myfd" exclusive="0" migrate="live"
                        name="linux" path="/guest_roots" recovery="restart"/>
        </rm>
        <totem consensus="4799" join="60" token="10000"
                token_retransmits_before_loss_const="20"/>
        <quorumd interval="1" label="rac_qdisk" min_score="1" tko="10" votes="1"/>
</cluster>
[root@ha2lx cluster]#
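One thing that stands out in the config above: the <quorumd> element defines no <heuristic>, so qdiskd has nothing to score against when the network drops. A sketch of what a heuristic block might look like (the gateway address 10.42.21.1 is a placeholder assumption, not taken from this cluster):

```xml
<quorumd interval="1" label="rac_qdisk" min_score="1" tko="10" votes="1">
        <!-- placeholder gateway IP; pinged every 2s, worth 1 point -->
        <heuristic program="ping -c1 -w1 10.42.21.1" score="1" interval="2"/>
</quorumd>
```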
Now, when I stop the network service on node 1 (i.e. ha1lx), on node 2 I see:
[root@ha2lx ~]#
Message from syslogd@ at Tue Feb 24 16:10:29 2009 ...
ha2lx clurgmgrd[6071]: <emerg> #1: Quorum Dissolved
[root@ha2lx ~]#
Log says:
Feb 24 16:12:03 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
Feb 24 16:12:03 ha2lx ccsd[5039]: Error while processing connect: Connection refused
Feb 24 16:12:14 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
Feb 24 16:12:14 ha2lx ccsd[5039]: Error while processing connect: Connection refused
Feb 24 16:12:24 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
Feb 24 16:12:24 ha2lx ccsd[5039]: Error while processing connect: Connection refused
Feb 24 16:12:34 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
What is wrong here? I have tried changing the votes in the quorumd line
to 2 as well; same message. Also, how do I know whether my qdisk is
working properly?
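For reference, a few commands commonly used on RHEL 5 to sanity-check qdisk (shown as a sketch, not verified on this cluster):

```shell
clustat                  # the quorum disk should be listed with an Online status
service qdiskd status    # qdiskd must be running on both nodes
cman_tool status         # check the "Quorum" and "Expected votes" values
```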
Thanks
Paras.