[Linux-cluster] Re: Fencing test
Paras pradhan
pradhanparas at gmail.com
Tue Feb 24 23:41:31 UTC 2009
On Tue, Feb 24, 2009 at 3:15 PM, Paras pradhan <pradhanparas at gmail.com> wrote:
> Hi,
>
> I was busy with some other things.
>
> On Fri, Jan 16, 2009 at 5:16 AM, Rajagopal Swaminathan
> <raju.rajsand at gmail.com> wrote:
>> Greetings,
>>
>> On Thu, Jan 15, 2009 at 1:18 AM, Paras pradhan <pradhanparas at gmail.com> wrote:
>>>> On Fri, Jan 9, 2009 at 12:09 AM, Paras pradhan <pradhanparas at gmail.com> wrote:
>>>>>
>>>>>
>>>>> In an attempt to solve the fencing issue in my 2-node cluster, I tried
>>>>> running fence_ipmilan to check whether fencing is working or not. I need
>>>>> to figure out what my problem is.
>>>>>
>>>>> -
>>>>> [root@ha1lx ~]# fence_ipmilan -a 10.42.21.28 -o off -l admin -p admin
>>> Yes, as you said, I am able to power down node4 using node3, so it
>>> seems IPMI is working fine. But I don't know what is going on with my
>>> two-node cluster. Can a Red Hat cluster operate fine in two-node mode?
>>
>> Yes. I have configured a few clusters on RHEL 4 and 5. They do work.
>>
>>> Do I need qdisk, or is it optional? Which area do I need to focus on
>>> to run my 2-node Red Hat cluster using IPMI as the fencing device?
>>>
>> But I have done it on HP, Sun and IBM servers. All of them have their
>> own technology, like HP iLO, Sun ALOM, etc.
>>
>> I never had a chance to try plain IPMI.
>>
>> BTW, this is a wild guess. I am just curious:
>>
>>> <clusternode name="10.42.21.27" nodeid="2" votes="1">
>>
>> Why is the nodeid here 2,
>>
>>> <method name="1">
>>> <device name="fence1"/>
>>> </method>
>>>
>>> <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.28" login="admin"
>>> name="fence1" passwd="admin"/>
>>
>> and
>>
>>> <clusternode name="10.42.21.29" nodeid="1" votes="1">
>>
>> here it is 1
>
> Changing node ids did not solve my problem.
>
>
>>
>>
>>> <method name="1">
>>> <device name="fence2"/>
>>> </method>
>>
>>> <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.30" login="admin" name="fence2" passwd="admin"/>
>>
>> <All the disclaimers ever invented apply>
>> Have you tried exchanging the numbers? Say, the one with IP .27 to 1 and .29 to 2.
>> </All the disclaimers ever invented apply>
>>
>> No warranties offered. Just a friendly suggestion....
>> Never try it on Production cluster.
>>
>> Also, we will all get a clearer picture if you use separate switches
>> for the heartbeat and data networks.
>>
>> HTH
>>
>> With warm regards
>>
>> Rajagopal
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
> I will try qdisk in my 2 node cluster and post here how it goes.
>
> Thanks
> Paras.
>
OK, here is my result using qdisk in a 2-node cluster.
Output of mkqdisk (which I hope looks fine):
[root@ha2lx ~]# mkqdisk -L
mkqdisk v0.5.2
/dev/sdc1:
Magic: eb7a62c2
Label: rac_qdisk
Created: Tue Feb 24 17:29:10 2009
Host: ha1lx.xx.xxxx.com
Kernel Sector Size: 512
Recorded Sector Size: 512
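For reference, a qdisk label like the one shown above would typically have been created with mkqdisk's -c/-l options (the device path here just mirrors the output above):

```shell
mkqdisk -c /dev/sdc1 -l rac_qdisk
```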
Here is my cluster.conf:
[root@ha2lx cluster]# more cluster.conf
<?xml version="1.0"?>
<cluster alias="xencluster" config_version="24" name="xencluster">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="10.42.21.29" nodeid="2" votes="1">
                        <fence>
                                <method name="2">
                                        <device name="fence2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="10.42.21.27" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence1"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="3" two_node="0"/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.28"
                        login="admin" name="fence1" passwd="admin"/>
                <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.30"
                        login="admin" name="fence2" passwd="admin"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="myfd" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="10.42.21.29" priority="2"/>
                                <failoverdomainnode name="10.42.21.27" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources/>
                <vm autostart="1" domain="myfd" exclusive="0" migrate="live"
                        name="linux" path="/guest_roots" recovery="restart"/>
        </rm>
        <totem consensus="4799" join="60" token="10000"
                token_retransmits_before_loss_const="20"/>
        <quorumd interval="1" label="rac_qdisk" min_score="1" tko="10" votes="1"/>
</cluster>
[root@ha2lx cluster]#
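One thing that stands out in the config above: the <quorumd> element defines no <heuristic>, so qdiskd has nothing to score against when the network drops. A sketch of what a heuristic block might look like (the gateway address 10.42.21.1 is a placeholder assumption, not taken from this cluster):

```xml
<quorumd interval="1" label="rac_qdisk" min_score="1" tko="10" votes="1">
        <!-- placeholder gateway IP; pinged every 2s, worth 1 point -->
        <heuristic program="ping -c1 -w1 10.42.21.1" score="1" interval="2"/>
</quorumd>
```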
Now, when I stop the network service on node 1 (i.e. ha1lx), on node 2 I see:
[root@ha2lx ~]#
Message from syslogd@ at Tue Feb 24 16:10:29 2009 ...
ha2lx clurgmgrd[6071]: <emerg> #1: Quorum Dissolved
[root@ha2lx ~]#
Log says:
Feb 24 16:12:03 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
Feb 24 16:12:03 ha2lx ccsd[5039]: Error while processing connect: Connection refused
Feb 24 16:12:14 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
Feb 24 16:12:14 ha2lx ccsd[5039]: Error while processing connect: Connection refused
Feb 24 16:12:24 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
Feb 24 16:12:24 ha2lx ccsd[5039]: Error while processing connect: Connection refused
Feb 24 16:12:34 ha2lx ccsd[5039]: Cluster is not quorate. Refusing connection.
What is wrong here? I have tried changing the votes in the quorumd line
to 2 as well; same message. Also, how do I know whether my qdisk is
working properly?
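For reference, a few commands commonly used on RHEL 5 to sanity-check qdisk (shown as a sketch, not verified on this cluster):

```shell
clustat                  # the quorum disk should be listed with an Online status
service qdiskd status    # qdiskd must be running on both nodes
cman_tool status         # check the "Quorum" and "Expected votes" values
```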
Thanks
Paras.