
Re: [Linux-cluster] Failover issues when shutting down node

Hi,

I think this is related to the following bug:

https://bugzilla.redhat.com/show_bug.cgi?id=548133

Adding a sleep after stopping rgmanager, before the node continues shutting down, can help.
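
A minimal sketch of that workaround, assuming the node is taken down by stopping the cluster services by hand (or from a wrapper script) rather than relying on the stock init ordering. `service rgmanager stop` and `service cman stop` are the usual RHEL 5 init script calls; the function names, the `SETTLE_SECONDS` delay with its 30-second default, and the `DRY_RUN` switch are assumptions made up for this example and are not taken from the bug report.

```shell
#!/bin/sh
# Sketch only: stop rgmanager, wait, then stop the rest of the cluster stack.
# SETTLE_SECONDS and DRY_RUN are hypothetical knobs introduced for this example.

SETTLE_SECONDS="${SETTLE_SECONDS:-30}"   # assumed delay; tune for your cluster
DRY_RUN="${DRY_RUN:-1}"                  # 1 = only print what would be run

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

stop_cluster_node() {
    # Stop the resource group manager first so services are relocated.
    run service rgmanager stop || return 1
    # Pause so the other nodes can finish taking over the services
    # before cluster membership on this node goes away.
    run sleep "$SETTLE_SECONDS"
    # Then stop the cluster manager.
    run service cman stop
}
```

With `DRY_RUN=1` (the default in this sketch), calling `stop_cluster_node` only prints the three commands. Setting `DRY_RUN=0` and calling it as root would actually run them, so try the dry run first.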

 

Best regards,

Alfredo

From: linux-cluster-bounces redhat com [mailto:linux-cluster-bounces redhat com] On Behalf Of Marcos David
Sent: Wednesday, January 27, 2010 11:53 AM
To: linux clustering
Subject: [Linux-cluster] Failover issues when shutting down node

Hi,

After a few tests with a four-node cluster (mainly shutting one node down to check that failover was working properly), we saw the following log messages:

Jan 27 03:31:02 node2_pub clurgmgrd[4240]: <err> #75: Failed changing service status
Jan 27 03:31:02 node2_pub clurgmgrd[4240]: <debug> Stopping failed service service:PID_PA-SA-R2
Jan 27 03:31:07 node2_pub clurgmgrd[4240]: <notice> Stopping service service:PID_PA-SA-R2

...
other checks
...
Jan 27 03:31:25 node2_pub openais[3480]: [TOTEM] entering GATHER state from 12.
Jan 27 03:31:27 node2_pub clurgmgrd[4240]: <err> #52: Failed changing RG status

Jan 27 03:31:27 node2_pub clurgmgrd[4240]: <crit> #13: Service service:PID_PA-SA-R2 failed to stop cleanly
Jan 27 03:31:27 node2_pub clurgmgrd[4240]: <debug> Handling failure request for RG service:PID_PA-SA-R2

Jan 27 03:31:30 node2_pub openais[3480]: [TOTEM] entering GATHER state from 11.

The problem is that the same service ended up running on two nodes at once, which isn't supposed to happen.

The service in question consists of a virtual IP and a script.
The script's stop action never returns an error under any circumstances.

The cluster is running RHEL 5.3.

What could have caused these errors?


Thanks for any insight and help ;)

