[Linux-cluster] Clustering tomcat

Digimer lists at alteeve.ca
Wed Apr 11 15:44:19 UTC 2012


On 04/11/2012 07:24 AM, Sadvary, Bill wrote:
> 
> Hi,
> 
> I'm having some difficulty getting a Tomcat cluster service up and running with CentOS 6.2 and Tomcat 6.
> 
> The service won't start tomcat and it keeps ping-ponging back and forth between the servers every 30 seconds.  
> 
> Below is the cluster.conf file, "messages" file and the rgmanager.log
> 
> Any help would be appreciated.
> 
> Thanks,
> -Bill
> 
> 
> Here's my cluster.conf
> ---------------------------
> 
> <?xml version="1.0"?>
> <cluster config_version="11" name="AUTHCLUSTERDEV">
>         <cman expected_votes="1" two_node="1"/>
>         <clusternodes>
>                 <clusternode name="AUTHCLUSTER1DEV" nodeid="1">
>                         <fence>
>                                 <method name="single"/>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="AUTHCLUSTER2DEV" nodeid="2">
>                         <fence>
>                                 <method name="single"/>
>                         </fence>
>                 </clusternode>
>         </clusternodes>
>         <rm>
>                 <failoverdomains>
>                         <failoverdomain name="failoverDom" nofailback="1" ordered="0" restricted="0">
>                                 <failoverdomainnode name="AUTHCLUSTER1DEV" priority="1"/>
>                                 <failoverdomainnode name="AUTHCLUSTER2DEV" priority="1"/>
>                         </failoverdomain>
>                 </failoverdomains>
>                 <resources>
>                         <ip address="172.16.223.69" monitor_link="1"/>
>                         <tomcat-6 config_file="/etc/tomcat6/tomcat6.conf" name="tomcat6" shutdown_wait="30"/>
>                 </resources>
>                 <service domain="failoverDom" name="ipservice" recovery="relocate">
>                         <ip ref="172.16.223.69">
>                                 <tomcat-6 ref="tomcat6"/>
>                         </ip>
>                 </service>
>         </rm>
>         <logging debug="on"/>
> </cluster>
> 
> Here's the "messages" file after one full cycle of ping-pongs
> ------------------------------------------------------------------------
> Apr 10 10:09:44 DKNAUTH1DEV rgmanager[2191]: Service service:ipservice is now running on member 2
> Apr 10 10:10:55 DKNAUTH1DEV rgmanager[2191]: Recovering failed service service:ipservice
> Apr 10 10:10:56 DKNAUTH1DEV rgmanager[8695]: [ip] Adding IPv4 address 172.16.223.69/28 to eth2
> Apr 10 10:11:00 DKNAUTH1DEV rgmanager[8837]: [tomcat-6] Starting Service tomcat-6:tomcat6
> Apr 10 10:11:00 DKNAUTH1DEV ntpd[1938]: Listening on interface #81 eth2, 172.16.223.69#123 Enabled
> Apr 10 10:11:01 DKNAUTH1DEV rgmanager[2191]: Service service:ipservice started
> Apr 10 10:12:09 DKNAUTH1DEV rgmanager[9694]: [tomcat-6] Checking Existence Of File /var/run/cluster/tomcat-6/tomcat-6:tomcat6.pid [tomcat-6:tomcat6] > Failed
> Apr 10 10:12:09 DKNAUTH1DEV rgmanager[9714]: [tomcat-6] Monitoring Service tomcat-6:tomcat6 > Service Is Not Running
> Apr 10 10:12:09 DKNAUTH1DEV rgmanager[2191]: status on tomcat-6 "tomcat6" returned 1 (generic error)
> Apr 10 10:12:09 DKNAUTH1DEV rgmanager[2191]: Stopping service service:ipservice
> Apr 10 10:12:09 DKNAUTH1DEV rgmanager[9805]: [tomcat-6] Stopping Service tomcat-6:tomcat6
> Apr 10 10:12:10 DKNAUTH1DEV rgmanager[9825]: [tomcat-6] Checking Existence Of File /var/run/cluster/tomcat-6/tomcat-6:tomcat6.pid [tomcat-6:tomcat6] > Failed - File Doesn'
> Apr 10 10:12:10 DKNAUTH1DEV rgmanager[9845]: [tomcat-6] Stopping Service tomcat-6:tomcat6 > Succeed
> Apr 10 10:12:10 DKNAUTH1DEV rgmanager[9896]: [ip] Removing IPv4 address 172.16.223.69/28 from eth2
> Apr 10 10:12:11 DKNAUTH1DEV ntpd[1938]: Deleting interface #81 eth2, 172.16.223.69#123, interface stats: received=0, sent=0, dropped=0, active_time=71 secs
> Apr 10 10:12:20 DKNAUTH1DEV rgmanager[2191]: Service service:ipservice is recovering
> Apr 10 10:12:24 DKNAUTH1DEV rgmanager[2191]: Service service:ipservice is now running on member 2
> 
> The rgmanager.log for the same time duration
> --------------------------------------------------------
> Apr 10 10:09:44 rgmanager Service service:ipservice is now running on member 2
> Apr 10 10:09:49 rgmanager 2 events processed
> Apr 10 10:10:55 rgmanager Recovering failed service service:ipservice
> Apr 10 10:10:56 rgmanager [ip] Link for eth2: Detected
> Apr 10 10:10:56 rgmanager [ip] Adding IPv4 address 172.16.223.69/28 to eth2
> Apr 10 10:10:56 rgmanager [ip] Pinging addr 172.16.223.69 from dev eth2
> Apr 10 10:10:59 rgmanager [ip] Sending gratuitous ARP: 172.16.223.69 00:15:5d:98:91:05 brd ff:ff:ff:ff:ff:ff
> Apr 10 10:11:00 rgmanager [tomcat-6] Verifying Configuration Of tomcat-6:tomcat6
> Apr 10 10:11:00 rgmanager [tomcat-6] Verifying Configuration Of tomcat-6:tomcat6 > Succeed
> Apr 10 10:11:00 rgmanager [tomcat-6] Starting Service tomcat-6:tomcat6
> Apr 10 10:11:00 rgmanager 1 events processed
> Apr 10 10:11:00 rgmanager [tomcat-6] Looking For IP Addresses
> Apr 10 10:11:01 rgmanager [tomcat-6] 1 IP addresses found for ipservice/tomcat6
> Apr 10 10:11:01 rgmanager [tomcat-6] Looking For IP Addresses > Succeed -  IP Addresses Found
> Apr 10 10:11:01 rgmanager [tomcat-6] Checking: SHA1 checksum of config file /tomcat-6/tomcat-6:tomcat6/conf/server.xml
> Apr 10 10:11:01 rgmanager [tomcat-6] Checking: SHA1 checksum > succeed
> Apr 10 10:11:01 rgmanager [tomcat-6] Generating New Config File /tomcat-6/tomcat-6:tomcat6/conf/server.xml From /usr/share/tomcat6/conf/server.xml
> Apr 10 10:11:01 rgmanager [tomcat-6] Generating New Config File /tomcat-6/tomcat-6:tomcat6/conf/server.xml From /usr/share/tomcat6/conf/server.xml > Suc
> Apr 10 10:11:01 rgmanager [tomcat-6] Starting Service tomcat-6:tomcat6 > Succeed
> Apr 10 10:11:01 rgmanager Service service:ipservice started
> Apr 10 10:11:07 rgmanager 1 events processed
> Apr 10 10:11:29 rgmanager [ip] Checking 172.16.223.69, Level 0
> Apr 10 10:11:29 rgmanager [ip] 172.16.223.69 present on eth2
> Apr 10 10:11:29 rgmanager [ip] Link for eth2: Detected
> Apr 10 10:11:29 rgmanager [ip] Link detected on eth2
> Apr 10 10:11:49 rgmanager [ip] Checking 172.16.223.69, Level 0
> Apr 10 10:11:49 rgmanager [ip] 172.16.223.69 present on eth2
> Apr 10 10:11:49 rgmanager [ip] Link for eth2: Detected
> Apr 10 10:11:49 rgmanager [ip] Link detected on eth2
> Apr 10 10:12:09 rgmanager [ip] Checking 172.16.223.69, Level 10
> Apr 10 10:12:09 rgmanager [ip] 172.16.223.69 present on eth2
> Apr 10 10:12:09 rgmanager [ip] Link for eth2: Detected
> Apr 10 10:12:09 rgmanager [ip] Link detected on eth2
> Apr 10 10:12:09 rgmanager [ip] Local ping to 172.16.223.69 succeeded
> Apr 10 10:12:09 rgmanager [tomcat-6] Verifying Configuration Of tomcat-6:tomcat6
> Apr 10 10:12:09 rgmanager [tomcat-6] Verifying Configuration Of tomcat-6:tomcat6 > Succeed
> Apr 10 10:12:09 rgmanager [tomcat-6] Monitoring Service tomcat-6:tomcat6
> Apr 10 10:12:09 rgmanager [tomcat-6] Checking Existence Of File /var/run/cluster/tomcat-6/tomcat-6:tomcat6.pid [tomcat-6:tomcat6] > Failed
> Apr 10 10:12:09 rgmanager [tomcat-6] Monitoring Service tomcat-6:tomcat6 > Service Is Not Running
> Apr 10 10:12:09 rgmanager status on tomcat-6 "tomcat6" returned 1 (generic error)
> Apr 10 10:12:09 rgmanager Stopping service service:ipservice
> Apr 10 10:12:09 rgmanager [tomcat-6] Verifying Configuration Of tomcat-6:tomcat6
> Apr 10 10:12:09 rgmanager [tomcat-6] Verifying Configuration Of tomcat-6:tomcat6 > Succeed
> Apr 10 10:12:09 rgmanager [tomcat-6] Stopping Service tomcat-6:tomcat6
> Apr 10 10:12:10 rgmanager [tomcat-6] Checking Existence Of File /var/run/cluster/tomcat-6/tomcat-6:tomcat6.pid [tomcat-6:tomcat6] > Failed - File Doesn'
> Apr 10 10:12:10 rgmanager [tomcat-6] Stopping Service tomcat-6:tomcat6 > Succeed
> Apr 10 10:12:10 rgmanager [ip] Removing IPv4 address 172.16.223.69/28 from eth2
> Apr 10 10:12:20 rgmanager Service service:ipservice is recovering
> Apr 10 10:12:20 rgmanager Sent remote-start request to 2
> Apr 10 10:12:24 rgmanager Service service:ipservice is now running on member 2
> Apr 10 10:12:29 rgmanager 2 events processed
> Apr 10 10:12:39 rgmanager Forwarding req. to AUTHCLUSTER2DEV.
> Apr 10 10:12:40 rgmanager FW: Forwarding disable request to 2
> Apr 10 10:12:55 rgmanager 1 events processed

I've not used tomcat (or its RA), so I can't speak to it specifically.
It looks like the RA's status check is returning a bad exit code though
(1, "generic error", right after it fails to find the PID file under
/var/run/cluster/tomcat-6/). If you look at
/usr/share/cluster/tomcat-6.sh, you might be able to suss out what it
is failing on.

As an aside, you need a proper fence device. As it is now, the 'single'
fence method on both nodes is empty (no device is defined in it, from
what I see), so a node failure will hang your cluster. Have you tested
a node failure?
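
For illustration only, a node entry with a real fence device might look
something like the sketch below. It assumes IPMI-based fencing through
the fence_ipmilan agent; the device names, addresses and credentials
are placeholders I made up, not anything from your setup, so swap in
whichever fence agent actually matches your hardware or hypervisor.

```xml
<!-- Sketch only: both blocks go inside <cluster> in cluster.conf. -->
<!-- Device names, IPs, login and passwd are hypothetical placeholders. -->
<clusternodes>
        <clusternode name="AUTHCLUSTER1DEV" nodeid="1">
                <fence>
                        <method name="single">
                                <!-- 'name' must match a <fencedevice> defined below -->
                                <device name="ipmi_node1" action="reboot"/>
                        </method>
                </fence>
        </clusternode>
        <clusternode name="AUTHCLUSTER2DEV" nodeid="2">
                <fence>
                        <method name="single">
                                <device name="ipmi_node2" action="reboot"/>
                        </method>
                </fence>
        </clusternode>
</clusternodes>
<fencedevices>
        <!-- Assumes each node has a BMC reachable at these (made-up) addresses -->
        <fencedevice agent="fence_ipmilan" name="ipmi_node1" ipaddr="192.0.2.11" login="admin" passwd="secret"/>
        <fencedevice agent="fence_ipmilan" name="ipmi_node2" ipaddr="192.0.2.12" login="admin" passwd="secret"/>
</fencedevices>
```

Whatever agent you end up with, remember to increment config_version
when you edit cluster.conf, and confirm fencing really works (for
example with 'fence_node <nodename>') before trusting it.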

-- 
Digimer
Papers and Projects: https://alteeve.com



