[Linux-cluster] Problems with relocation of apache and fence_vmware

Thu Aug 30 15:28:17 UTC 2012

Never mind, I am good now. I have figured out the syntax for fence_vmware
and it works beautifully.

Here it is, in case someone else struggles to get this working in the
future:

...
        <clusternodes>
                <clusternode name="node1.localdomain" nodeid="1" votes="1">
                        <fence>
                                <method name="fence_vmware">
                                        <device name="vmware" port="node1.localdomain"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node2.localdomain" nodeid="2" votes="1">
                        <fence>
                                <method name="fence_vmware">
                                        <device name="vmware" port="node2.localdomain"/>
                                </method>
                        </fence>
                </clusternode>
..
        <fencedevices>
                <fencedevice agent="fence_vmware" ipaddr="a.b.c.d" login="xxx" name="vmware" passwd="xxx"/>
        </fencedevices>
...
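
For anyone who wants to sanity-check a similar config before loading it, here
is a minimal Python sketch. It checks that every <device> under a node refers
to a defined <fencedevice>, and prints the manual status check that would
correspond to each node. The <cluster> wrapper, its name and config_version,
and the helper names fence_plan and status_command are my own assumptions for
illustration. The mapping of ipaddr/login/passwd/port to
--ip/--username/--password/--plug is inferred from the manual fence_vmware
test quoted below, so verify it against your agent version.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the snippet above. The <cluster> wrapper and its
# attributes are assumptions added so the fragment parses on its own.
CLUSTER_CONF = """\
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1.localdomain" nodeid="1" votes="1">
      <fence>
        <method name="fence_vmware">
          <device name="vmware" port="node1.localdomain"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.localdomain" nodeid="2" votes="1">
      <fence>
        <method name="fence_vmware">
          <device name="vmware" port="node2.localdomain"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_vmware" ipaddr="a.b.c.d" login="xxx" name="vmware" passwd="xxx"/>
  </fencedevices>
</cluster>
"""


def fence_plan(conf_text):
    """Return {node name: [merged attribute dict for each fence device]}.

    Raises ValueError if a <device> names a fence device that is not defined.
    """
    root = ET.fromstring(conf_text)
    devices = {fd.get("name"): dict(fd.attrib) for fd in root.iter("fencedevice")}
    plan = {}
    for node in root.iter("clusternode"):
        merged = []
        for dev in node.iter("device"):
            name = dev.get("name")
            if name not in devices:
                raise ValueError(
                    f"{node.get('name')}: undefined fence device {name!r}"
                )
            # Per-node attributes (such as port) are layered over the shared ones.
            merged.append({**devices[name], **dev.attrib})
        plan[node.get("name")] = merged
    return plan


def status_command(attrs):
    """Build the manual status check as an argument list (assumed flag mapping)."""
    return [
        attrs["agent"],
        "--action=status",
        f"--ip={attrs['ipaddr']}",
        f"--username={attrs['login']}",
        f"--password={attrs['passwd']}",
        f"--plug={attrs['port']}",
    ]


if __name__ == "__main__":
    for node, devs in fence_plan(CLUSTER_CONF).items():
        for attrs in devs:
            print(node, "->", " ".join(status_command(attrs)))
```

The point of the merge step is that the shared settings (agent, ipaddr, login,
passwd) live once in <fencedevice>, while each node supplies only its own port.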

I will be running a series of fail-over scenarios (node and service
failures have worked very well so far) and will report back with the results
if there are any concerns. Thanks for helping me this far. I really
appreciate it.

Param

On Thu, Aug 30, 2012 at 1:37 PM, PARAM KRISH <mkparam at gmail.com> wrote:

> *Background:*
> I am using two VMs hosted in my internal lab. Each has two interfaces, one
> configured with a valid IP and the other down. I have kept the VIP in the
> same network. My intention is to have Apache configured as a cluster
> service on these two nodes and to fail over when the node or the
> interface goes down. I am trying to use fence_vmware as the fencing
> device. The two VMs are on an ESX 4.1 host and the guest OS in the VMs is
> RHEL 6.0 32-bit.
>
>
> I am seeing the following problems in my setup now:
>
> 1. When I start the Apache service from LUCI, it starts fine on a node.
> But if I kill the httpd process on that node manually, the cluster does
> not detect that the service is down, so it neither restarts nor relocates it.
> 2. The same happens if I do "ip addr del <VIP>"; it just detects that the
> node is down but does not restart or relocate the service.
> 3. Whenever I reboot the nodes, they come online, the service starts
> fine on either node and both nodes are in quorum, but fail-over never
> happens if I stop the active node.
> 4. I am not sure what fence format I must put in cluster.conf, since I
> have no way to test whether it works.
>
> Manual tests:
> 1. I manually run something like
> "fence_vmware --action=status --ip=10.72.145.145 --username=<login>
> --password=<password> --plug=<vm-name>", which works fine on both nodes.
> 2. Apache starts and stops fine from both nodes when I run
> "rg_test test /etc/cluster/cluster.conf start service WEB"
>
> Cluster.conf is attached herewith.
> rgmanager.log is attached herewith.
>
> Please let me know of any specific debug commands that I can run manually
> to find out what is going wrong here, particularly with the "relocation"
> of the service and the "fencing"; both consistently fail.
>
> Please help. I have spent more than 10 days trying to set this up in
> my internal lab as a proof of concept to show my business heads that
> RHEL cluster works for our production requirements.
>
> -Param
>