
[Linux-cluster] LVS not failing over properly



I have an LVS-NAT setup in the lab that mostly works. There is a primary and a hot-backup LVS node, with two web servers behind them. I can point my web browser at the virtual IP and get the Apache test page, and the httpd access logs on the two real web servers show that the load is being distributed. The problem appears when I test failover of the LVS nodes. When I shut the primary node down, the backup at least attempts to take over, and according to its log it seems to do so successfully:

Aug 25 18:21:44 lb2 pulse[5064]: partner dead: activating lvs
Aug 25 18:21:44 lb2 lvs[5083]: starting virtual service glassfish active: 80
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.11.12.10 on eth1.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Withdrawing address record for 10.11.12.10 on eth1.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.11.12.10 on eth1.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.100.13.220 on eth0.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Withdrawing address record for 10.100.13.220 on eth0.
Aug 25 18:21:44 lb2 avahi-daemon[3136]: Registering new address record for 10.100.13.220 on eth0.
Aug 25 18:21:44 lb2 lvs[5083]: create_monitor for glassfish/gf1 running as pid 5094
Aug 25 18:21:44 lb2 nanny[5094]: starting LVS client monitor for 10.100.13.220:80
Aug 25 18:21:44 lb2 nanny[5095]: starting LVS client monitor for 10.100.13.220:80
Aug 25 18:21:44 lb2 lvs[5083]: create_monitor for glassfish/gf2 running as pid 5095
Aug 25 18:21:44 lb2 nanny[5094]: making 10.11.12.1:80 available
Aug 25 18:21:44 lb2 nanny[5095]: making 10.11.12.2:80 available
Aug 25 18:21:49 lb2 pulse[5085]: gratuitous lvs arps finished


The problem is that attempts from my web browser to refresh the page are unsuccessful. The lvs.cf is synchronized between the lvs nodes. Here's a copy of the config:


serial_no = 49
primary = 10.100.13.96
primary_private = 10.11.12.8
service = lvs
backup_active = 1
backup = 10.100.13.87
backup_private = 10.11.12.9
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 10
network = nat
nat_router = 10.11.12.10 eth1:1
nat_nmask = 255.255.255.0
debug_level = NONE
monitor_links = 1
virtual glassfish {
    active = 1
    address = 10.100.13.220 eth0:1
    vip_nmask = 255.255.255.0
    port = 80
    send = "GET / HTTP/1.0\r\n\r\n"
    expect = "HTTP"
    use_regex = 0
    load_monitor = none
    scheduler = wlc
    protocol = tcp
    timeout = 6
    reentry = 15
    quiesce_server = 0
    server gf1 {
        address = 10.11.12.1
        active = 1
        weight = 1
    }
    server gf2 {
        address = 10.11.12.2
        active = 1
        weight = 1
    }
}
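
To make it easier to see which addresses are supposed to move to the backup on failover, here is a minimal Python sketch that reads a config like the one above and lists the floating addresses (the NAT router address and each virtual service address) with the interface alias they are bound to. It only handles the subset of lvs.cf syntax shown in this post, and the function names parse_lvs_cf and floating_addresses are made up for illustration; they are not part of Piranha.

```python
def parse_lvs_cf(text):
    """Parse the lvs.cf subset shown in this post into nested dicts.

    Assumption: only 'key = value' lines and 'kind name { ... }' blocks
    appear. This is not a full parser for the Piranha config grammar.
    """
    root = {}
    stack = [root]  # innermost open block is stack[-1]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("{"):
            # e.g. "virtual glassfish {" or "server gf1 {"
            kind, name = line[:-1].split()
            block = {}
            stack[-1].setdefault(kind, {})[name] = block
            stack.append(block)
        elif line == "}":
            stack.pop()
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1][key.strip()] = value.strip()
    return root


def floating_addresses(cfg):
    """Return (label, ip, device) for every address that fails over."""
    found = []
    if "nat_router" in cfg:
        ip, device = cfg["nat_router"].split()
        found.append(("nat_router", ip, device))
    for name, virt in cfg.get("virtual", {}).items():
        ip, device = virt["address"].split()
        found.append((name, ip, device))
    return found
```

Run against the config above, this reports 10.11.12.10 on eth1:1 and 10.100.13.220 on eth0:1, which are the same two addresses that show up in the avahi-daemon lines of the backup's log.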


I believe the problem lies in ARP (the gratuitous ARPs sent after failover), but I'm not sure how to diagnose it. There are no firewalls between my browser and the LVS nodes, and I'm using a fairly dumb 100 Mb switch (I also tried a smarter switch).
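
One check that might narrow this down is to compare the MAC address cached for the virtual IP before and after the failover. Below is a minimal Python sketch of that idea, assuming the machine being checked is a Linux host that exposes its ARP cache in /proc/net/arp. The function names are hypothetical. If the browser machine is not on the same subnet as the VIP, the cache that matters is the one on the nearest router instead.

```python
def parse_arp_table(text):
    """Map IP address -> MAC address from /proc/net/arp-style text.

    Expected columns: IP address, HW type, Flags, HW address, Mask, Device.
    """
    table = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 6:
            table[fields[0]] = fields[3].lower()
    return table


def read_arp_table(path="/proc/net/arp"):
    """Snapshot the local ARP cache (assumes a Linux host)."""
    with open(path) as handle:
        return parse_arp_table(handle.read())


def vip_mac_changed(before, after, vip):
    """True if both snapshots have an entry for vip and the MACs differ."""
    old, new = before.get(vip), after.get(vip)
    return old is not None and new is not None and old != new
```

The idea would be to call read_arp_table() once while the primary is active and again after shutting it down, then pass both snapshots to vip_mac_changed() with "10.100.13.220". If it returns False and the entry still holds the primary's MAC, that would point to the gratuitous ARPs not reaching or not updating that host.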

Any help would be greatly appreciated.

Thanks,

James

