
Re: [Linux-cluster] Problems with relocation of apache and fence_vmware



Never mind. I am good now. I have figured out the syntax for fence_vmware and it works beautifully now. 

Here it is, in case it saves someone else the headache in the future:

...
        <clusternodes>
                <clusternode name="node1.localdomain" nodeid="1" votes="1">
                        <fence>
                                <method name="fence_vmware">
                                        <device name="vmware" port="node1.localdomain"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node2.localdomain" nodeid="2" votes="1">
                        <fence>
                                <method name="fence_vmware">
                                        <device name="vmware" port="node2.localdomain"/>
                                </method>
                        </fence>
                </clusternode>
..
        <fencedevices>
                <fencedevice agent="fence_vmware" ipaddr="a.b.c.d" login="xxx" name="vmware" passwd="xxx"/>
        </fencedevices>
...
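
For anyone adapting the fragment above, one mistake that is easy to make is a <device name="..."> that does not match any <fencedevice name="...">. The following is only a minimal sketch of a cross-check for that, not part of the cluster tooling. The function name check_fence_refs, the enclosing <cluster> element, and its name and config_version values are assumptions added so that the sample is well-formed; the elided parts of the real file are not reproduced.

```python
import xml.etree.ElementTree as ET

# Sample mirroring the fragment above. The <cluster> wrapper and its
# attributes are assumptions added only to make the XML well-formed.
SAMPLE = """
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1.localdomain" nodeid="1" votes="1">
      <fence>
        <method name="fence_vmware">
          <device name="vmware" port="node1.localdomain"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.localdomain" nodeid="2" votes="1">
      <fence>
        <method name="fence_vmware">
          <device name="vmware" port="node2.localdomain"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_vmware" ipaddr="a.b.c.d" login="xxx" name="vmware" passwd="xxx"/>
  </fencedevices>
</cluster>
"""


def check_fence_refs(xml_text):
    """Return a list of problems found in the fencing references (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Names declared under <fencedevices>.
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    problems = []
    for node in root.iter("clusternode"):
        devices = node.findall("./fence/method/device")
        if not devices:
            problems.append("%s: no fence device configured" % node.get("name"))
        for dev in devices:
            # Each per-node device must point at a declared fencedevice.
            if dev.get("name") not in defined:
                problems.append(
                    "%s: device '%s' is not defined in <fencedevices>"
                    % (node.get("name"), dev.get("name"))
                )
    return problems


if __name__ == "__main__":
    for line in check_fence_refs(SAMPLE) or ["fence references look consistent"]:
        print(line)
```

This only checks that the names line up. It does not contact the ESX host or confirm that fencing actually works.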

I will be running a series of fail-over scenarios (node and service failures have worked very well so far) and will report back if any problems come up. Thanks for all the help so far. I really appreciate it.

Param


On Thu, Aug 30, 2012 at 1:37 PM, PARAM KRISH <mkparam gmail com> wrote:
Background:
I am using two VMs hosted in my internal lab, each with two interfaces: one configured with a valid IP and the other down. I have kept the VIP in the same network. My intention is to configure Apache as a cluster service on these two nodes and have it fail over when the node or the interface goes down. I am trying to use fence_vmware as the fencing device. The two VMs run on an ESX 4.1 host, and the guest OS in the VMs is RHEL 6.0 32-bit.


I am seeing the following problems in my setup:

1. When I start the Apache service from LUCI, it starts fine on a node. But if I manually kill the httpd process on that node, the cluster does not detect that the service is down, so it neither restarts nor relocates it.
2. The same happens if I run "ip addr del <VIP>": it just detects the node is down but does not restart or relocate the service.
3. Whenever I reboot the nodes, they come back online, the service starts properly on either node, and both nodes are in quorum, but the fail-over never happens if I stop the active node.
4. I am not sure what fence format I must put in cluster.conf, since I have no way to test whether it works.

Manual tests:
1. I manually run something like this:
"fence_vmware --action="" --ip=10.72.145.145 --username=<login> --password=<password> --plug=<vm-name>"
which works fine on both nodes.
2. Apache starts and stops fine from both nodes when I run:
"rg_test test /etc/cluster/cluster.conf start service WEB"

Cluster.conf is attached herewith.
rgmanager.log is attached herewith.

Please let me know of any specific debug commands that I can run manually to find out what is going wrong here, particularly with the "relocation" of the service and the "fencing"; both fail consistently.

Please help. I have spent more than 10 days now trying to set this up in my internal lab to show my business heads, as a proof of concept, that RHEL cluster indeed works for our production requirement and is worth buying.

-Param

