[Linux-cluster] Testing Failover - Failing in few cases

Hagmann, Michael Michael.Hagmann at hilti.com
Wed May 30 14:55:41 UTC 2007


Hi,
 
First of all, if you really are running RHEL4 Update 4, you should update
to RHEL4 Update 5 before you do any more testing.
 
There are a lot of bugs in RHEL4 Cluster Suite Update 4!
 
Mike
 

________________________________

From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Satya Daragani
Sent: Monday, 28 May 2007 15:12
To: Linux-cluster at redhat.com
Subject: [Linux-cluster] Testing Failover - Failing in few cases


Hi Linux-Cluster Team,
 
Please help me test failover with RHEL Cluster Suite 4 Update 4. The
details of the cluster nodes and configuration are below. Kindly suggest
how I should proceed.
 

Two nodes, each an IBM Lenovo ThinkCentre with a 64-bit AMD Opteron
processor, 256 MB RAM and one NIC.

 

1. Installed RHEL AS 4 Update 4 on both nodes.
2. Configured the NIC in the 192.168.1.x range (node1 - 192.168.1.1,
   node2 - 192.168.1.2).
3. Configured /etc/hosts.
4. Installed RHEL Cluster Suite 4 Update 4 on both nodes.
5. Added both nodes in the cluster manager with one quorum vote.
6. No fence devices configured (chkconfig --del fenced).
7. Configured a restricted failover domain, ordered by priority
   (node1 - 1, node2 - 2).
8. Configured a shared IP address resource (192.168.1.5) with the
   monitor link option enabled.
9. Created a service named httpd, configured as follows:

   a. Checked "Autostart this service".
   b. Selected the failover domain configured in step 7.
   c. Selected "Relocate" as the recovery policy.
   d. Added the shared IP resource from step 8 and, under it, a private
      script resource (/etc/rc.d/init.d/httpd).

 

Checking the failover:

1st case

After configuring the above, node1 is the primary node for the httpd
service.

If I restart node1, the service fails over to node2, and once node1
comes up again the service fails back to node1 (because of the
configured priority).

 

2nd case

node1 is currently running the httpd service. If I bring down the network
interface (ifconfig eth0 down), the httpd service fails over to node2.

If I then bring the interface back up (ifconfig eth0 up) on node1, the
service does not fail back to node1, and /var/log/messages says
"unable to contact the cluster infrastructure". Need your help here.

 

If I restart the cluster services on node1, the service starts on node1
again.
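
For reference, restarting the cluster services by hand on node1 could look
roughly like the sketch below. It assumes the stock RHEL4 Cluster Suite init
scripts (ccsd, cman, fenced, rgmanager) and no GFS or CLVM on top; the
function name restart_plan is made up for illustration. It only prints the
commands, so the order can be reviewed before anything is run.

```shell
#!/bin/sh
# Print the commands for a full cluster-stack restart on one node.
# Assumption: the services are named ccsd, cman, fenced and rgmanager, as in
# a stock RHEL4 Cluster Suite install without GFS/CLVM. Drop fenced from both
# lists if it is deliberately not being run.
restart_plan() {
    # Stop from the top of the stack down.
    for svc in rgmanager fenced cman ccsd; do
        echo "service $svc stop"
    done
    # Start from the bottom of the stack up.
    for svc in ccsd cman fenced rgmanager; do
        echo "service $svc start"
    done
}

restart_plan
```

To actually execute the plan as root, the output can be piped to a shell
(restart_plan | sh -x). Afterwards, clustat should show whether node1 has
rejoined the cluster and where the httpd service is running.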

 

3rd case

node1 is currently running the httpd service. If I pull the power cord
(i.e. an unclean shutdown), the service goes into recovery mode and is
not started on node2. Need your help here.
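
A possible factor here, offered as an assumption rather than a diagnosis, is
step 6: the cluster normally waits for a failed node to be fenced before its
services are recovered elsewhere, and fenced was removed from the boot
sequence. One way to check whether fenced is enabled at boot is to inspect
the output of chkconfig --list fenced. The helper enabled_in_runlevel below
is a hypothetical name, and the sample line is illustrative, not captured
from these nodes.

```shell
#!/bin/sh
# Return 0 if a "chkconfig --list <service>" line shows the service "on"
# in the given runlevel, 1 otherwise.
enabled_in_runlevel() {
    line=$1
    level=$2
    case "$line" in
        *"${level}:on"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample line in the usual chkconfig --list layout (illustrative only).
sample="fenced          0:off   1:off   2:on    3:on    4:on    5:on    6:off"

if enabled_in_runlevel "$sample" 3; then
    echo "fenced is enabled in runlevel 3"
else
    echo "fenced is NOT enabled in runlevel 3"
fi
```

On a real node the first argument would come from the tool itself, for
example enabled_in_runlevel "$(chkconfig --list fenced 2>&1)" 3.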

 

4th case

node1 is currently running the httpd service. If I stop the httpd
service (service httpd stop) or kill it with killall, failover does not
happen. Need your help here.
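
As a debugging aid for this case: a script resource is, as far as I
understand it, judged healthy or not by calling the init script with
"status" and looking at the exit code, so it is worth checking what
/etc/rc.d/init.d/httpd status returns after httpd has been stopped. The
sketch below maps an exit code to its LSB meaning; status_meaning is a
hypothetical helper name.

```shell
#!/bin/sh
# Translate an init-script "status" exit code into its LSB meaning.
status_meaning() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead, but pid file exists" ;;
        2) echo "dead, but lock file exists" ;;
        3) echo "stopped" ;;
        *) echo "unknown" ;;
    esac
}

# Show the mapping for the codes most likely to be seen.
for code in 0 1 2 3; do
    echo "exit code $code: $(status_meaning "$code")"
done
```

On node1 this could be used as
/etc/rc.d/init.d/httpd status >/dev/null 2>&1; status_meaning $?
right after stopping httpd. A result of "running" at that point would mean
the cluster has no way to notice the failure. The status check is assumed to
run periodically rather than continuously, so some delay before any failover
is to be expected.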

 

-- 
Thanx
Satya Daragani
satya.daragani at gmail.com 
+91 98850 58366 