
[Linux-cluster] Active-Active configuration of arbitrary services



We are running RHCS on RHEL 4.5 and have a basic two-node HA cluster configuration in place and working for a critical application. The config looks like this:

 

<?xml version="1.0"?>
<cluster config_version="16" name="routing_cluster">
        <fence_daemon post_fail_delay="0" post_join_delay="10"/>
        <clusternodes>
                <clusternode name="host1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="manual" nodename="host1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="host2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="manual" nodename="host2"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman dead_node_timeout="10" expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_manual" name="manual"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="routing_servers" ordered="1" restricted="1">
                                <failoverdomainnode name="host1" priority="1"/>
                                <failoverdomainnode name="host2" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <script file="/etc/init.d/rsd" name="rsd"/>
                        <ip address="123.456.78.9" monitor_link="1"/>
                </resources>
                <service autostart="1" domain="routing_servers" name="routing_daemon" recovery="relocate">
                        <ip ref="123.456.78.9"/>
                        <script ref="rsd"/>
                </service>
        </rm>
</cluster>

 

The cluster takes about 15-20 seconds to notice that the daemon is down and to relocate the service to the other node. Because the daemon itself is slow to start, that failover window is too long for us; we now need the daemon to be running on the secondary as well, with only the VIP transferred when it aborts on the primary.
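What I had in mind is roughly the following (untested; I don't know whether rgmanager accepts a service containing only an <ip> resource): drop the <script> out of the relocating service so that the service carries nothing but the VIP, and start the rsd daemon on both nodes through plain init (chkconfig rsd on). The <rm> section would then look something like:

```xml
<rm>
        <failoverdomains>
                <failoverdomain name="routing_servers" ordered="1" restricted="1">
                        <failoverdomainnode name="host1" priority="1"/>
                        <failoverdomainnode name="host2" priority="2"/>
                </failoverdomain>
        </failoverdomains>
        <resources>
                <ip address="123.456.78.9" monitor_link="1"/>
        </resources>
        <!-- VIP-only service: relocation moves just the address,
             while rsd runs permanently on both nodes via init -->
        <service autostart="1" domain="routing_servers" name="routing_vip" recovery="relocate">
                <ip ref="123.456.78.9"/>
        </service>
</rm>
```

The obvious catch is that rgmanager would then no longer run the init script's status check, so a daemon crash wouldn't trigger the VIP move by itself; presumably that monitoring has to come from somewhere else.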

 

I have found the cluster suite documentation to be lacking, particularly in its reliance on GUI configuration tools without reference to, or explanation of, the underlying command-line tools. I managed to figure out most of it myself (we don't run GUIs on servers), but I'm not sure how to keep both daemons active, still using the init script for status checks, while only moving the VIP. Is this possible? Am I overlooking the obvious?
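For the record, these are the command-line pieces I've been driving the cluster with instead of the GUI, in case it helps anyone else searching the archives:

```
# show cluster member and service state without the GUI
clustat

# push an edited cluster.conf to the other nodes
# (remember to bump config_version first)
ccs_tool update /etc/cluster/cluster.conf

# manually relocate a service to a specific member
clusvcadm -r routing_daemon -m host2
```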

 

Thanks in advance for suggestions/help!

 

Glenn

