nanny "Read Timed Out" Errors

Fri Nov 6 00:40:14 UTC 2009

Thanks, Kit,

It's my understanding the weight is a relative number -- relative to
the other real servers in the pool. For example, server 1 has a weight
of 1. Server 2 has a weight of 2. Server 2 will get more traffic sent
to it by the load balancer because it has a higher weight. So it's
okay for the weight to be set as is.

See http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Virtual_Server_Administration/s2-piranha-virtservs-rs-VSA.html

"Weight
An integer value indicating this host's capacity relative to that of
other hosts in the pool. The value can be arbitrary, but treat it as a
ratio in relation to other real servers in the pool. For more on
server weight, see Section 1.3.2, “Server Weight and Scheduling”."
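
To make the "ratio" reading concrete, here is a rough sketch in Python. It
assumes a weighted scheduler (wrr or wlc) and connections of similar length;
expected_share is a made-up helper for illustration, not part of Piranha or
ipvsadm. As far as I know, the lc scheduler used in the config quoted below
does not look at weights when picking a server, so there the weight would
matter even less.

```python
from fractions import Fraction


def expected_share(weights):
    """Return each server's long-run share of new connections.

    `weights` maps a server name to its integer weight. The share is
    simply weight / sum(weights), which is what "relative" means here.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one server needs a positive weight")
    return {name: Fraction(w, total) for name, w in weights.items()}


# The example from the text: server 1 has weight 1, server 2 has weight 2.
shares = expected_share({"server1": 1, "server2": 2})
for name, share in shares.items():
    print(f"{name}: {share} of new connections")
```

With a single real server in the pool, any positive weight gives that server
the whole share, which is why a weight of 1 should be harmless here.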

The hub is an interesting idea but I can't use one in this situation.

So it looks like my main issue here is that nanny can't talk to the
real servers for some odd reason (even though the server itself has
verified connectivity to them, as discussed previously). It seems I
have eliminated the probable causes of that issue, yet it persists.
I wonder what I'm missing.
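
One way to narrow this down is to imitate the health check by hand. The
sketch below is only my assumption of what a nanny-style check amounts to
(connect, optionally send a string, read, compare against the expect string);
probe and its parameters are hypothetical names, not nanny's actual code.

```python
import socket


def probe(host, port, expect, send=None, timeout=6.0):
    """Connect, optionally send a request, then look for `expect` in the reply.

    Returns a short status string instead of raising, so the outcome is
    easy to compare with the messages in /var/log/messages.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return f"CONNECT failed: {exc}"
    with sock:
        try:
            if send is not None:
                sock.sendall(send.encode("ascii"))
            # A server that waits for a request sends nothing here,
            # so a read without a prior send runs into the timeout.
            reply = sock.recv(4096).decode("latin-1")
        except socket.timeout:
            return "READ timed out"
        except OSError as exc:
            return f"READ failed: {exc}"
    return "up" if expect in reply else f"unexpected reply: {reply[:40]!r}"
```

Calling probe("192.168.3.38", 80, expect="HTTP") and then the same call with
send="GET / HTTP/1.0\r\n\r\n" would show whether the real server only answers
after it receives a request. A web server does not speak first, so if the
check reads without sending anything, a READ timeout is what I would expect
even though connectivity is fine. Whether that applies to my lvs.cf, which
has an expect string but no send string, is something I still need to verify.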

Mike
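
P.S. On the "same subnet" question from earlier in the thread: with a netmask
of 255.255.248.0 the director's network is wider than 192.168.3.x. The sketch
below checks the addresses that appear in this thread against the director's
address and mask. same_subnet is just an illustrative helper built on
Python's standard ipaddress module.

```python
import ipaddress


def same_subnet(director_ip, netmask, candidate_ip):
    """True if candidate_ip falls inside the director's IP network."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{director_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(candidate_ip) in network


director, mask = "192.168.3.28", "255.255.248.0"
for ip in ("192.168.3.38", "192.168.3.40", "192.168.0.69",
           "192.168.18.29", "65.39.179.197"):
    print(f"{ip:15} same subnet as director: {same_subnet(director, mask, ip)}")
```

This matches the broadcast address 192.168.7.255 in the ifconfig output: the
network covers 192.168.0.x through 192.168.7.x, so the old virtual IP
192.168.0.69 was inside it, while 192.168.18.29 and the public address were
not.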

On 05/11/2009, Kit Gerrits <kitgerrits at gmail.com> wrote:
>
> Shouldn't weight be somewhere around the maximum number of sessions for
> that host?
>
> The hosts should be on the same subnet because of the way it handles the
> MAC table.
> (Maybe a HUB would be better than a switch?)
>
> More info:
> http://www.linuxvirtualserver.org/VS-DRouting.html
>
>
> Kit
>
> -----Original Message-----
> From: piranha-list-bounces at redhat.com
> [mailto:piranha-list-bounces at redhat.com] On Behalf Of mojorising
> Sent: Thursday 5 November 2009 19:53
> To: Piranha clustering/HA technology
> Subject: Re: nanny "Read Timed Out" Errors
>
> Okay. It seems I missed a critical piece of my config file when I
> copy/pasted it to you. Sorry about that.
>
> So here is my lvs.cf file now
>
> [root@omsbuild ~]# cat /etc/sysconfig/ha/lvs.cf
> serial_no = 93
> primary = 192.168.3.28
> service = lvs
> backup = 0.0.0.0
> heartbeat = 1
> heartbeat_port = 539
> keepalive = 6
> deadtime = 18
> network = direct
> debug_level = NONE
> virtual test1 {
>      active = 1
>      address = 192.168.3.40 eth0:1
>      vip_nmask = 255.255.248.0
>      port = 80
>      expect = "HTTP"
>      use_regex = 0
>      load_monitor = none
>      scheduler = lc
>      protocol = tcp
>      timeout = 6
>      reentry = 15
>      quiesce_server = 0
>      server kiwidev4 {
>          address = 192.168.3.38
>          active = 1
>          port = 80
>          weight = 1
>      }
> }
>
>
> I took out those other machines because I can not change their IPs (I'm
> just using them for testing). So in their place, I put a machine
> (kiwidev4) that happens to be on the same subnet as the LVS box.
> kiwidev4 was always there and active but that part of the config file was
> accidentally clipped off from my message.  :(
>
> I can not change those iptables rules at this time because that
> kiwidev4 box may be in use for some other testing at the moment. Can we do
> this without making the specified changes to iptables? It seems we
> shouldn't need to do that. I will eventually be using LVS to balance
> traffic to Windows machines as well so I need to be able to do without
> iptables for that reason also.
>
>
> Mike
>
>
> On 04/11/2009, Tapan Thapa <tapan.thapa2000 at gmail.com> wrote:
>> Hello Mike,
>>
>> Now your network status looks good.
>>
>> But I still cannot see any real server on the same network, i.e.
>> 192.168.3.x.
>>
>> As per your lvs.cf, you have configured two real servers. The first is
>> server Speedy and the second is server test1, and currently neither is
>> active (active = 0). They should be active (active = 1).
>>
>> Also, your real servers are not in the right subnet.
>>
>> Your real servers should be on the same 192.168.3.x network.
>>
>> Your example lvs.cf should look like:
>>
>> serial_no = 93
>> primary = 192.168.3.28
>> service = lvs
>> backup = 0.0.0.0
>> heartbeat = 1
>> heartbeat_port = 539
>> keepalive = 6
>> deadtime = 18
>> network = direct
>> debug_level = NONE
>> virtual test1 {
>>     active = 1
>>     address = 192.168.3.40 eth0:1
>>     vip_nmask = 255.255.248.0
>>     port = 80
>>     expect = "HTTP"
>>     use_regex = 0
>>     load_monitor = none
>>     scheduler = lc
>>     protocol = tcp
>>     timeout = 6
>>     reentry = 15
>>     quiesce_server = 0
>>     server Speedy {
>>         address = 192.168.3.29
>>         active = 1
>>         port = 80
>>         weight = 1
>>     }
>>     server test1 {
>>         address = 192.168.3.30
>>         active = 1
>>         port = 80
>>         weight = 1
>>     }
>> }
>>
>> Please change the IP address of the Speedy server to 192.168.3.29 and
>> that of the test1 server to 192.168.3.30, with a subnet mask of
>> 255.255.248.0, and restart the network and httpd services.
>>
>> Then run the commands below on both real servers (not on the LVS
>> server):
>>
>> chkconfig iptables on
>> iptables -F
>> iptables -t nat -A PREROUTING -p tcp --dport 80 -d 192.168.3.40 -j REDIRECT
>> service iptables save
>>
>> Then please restart the pulse service on the Linux Director (LVS
>> server) and wait for 2 minutes. Then check the output of the
>> ipvsadm -L -n command and let me know if there are any issues.
>>
>>
>> Regards
>> Tapan Thapa
>> India
>>
>>
>>
>> On Thu, Nov 5, 2009 at 1:25 AM, mojorising <moj0rising at aim.com> wrote:
>>
>>> Tapan, sorry for confusing you. I overlooked my virtual IP and
>>> accidentally left it with an IP on the wrong net. This is now
>>> corrected.
>>>
>>> Those real servers on other nets are still in my configuration but
>>> they are "down," as they were before. I do have one real server up on
>>> the proper net -- 192.168.3.38.
>>>
>>>
>>> My present network interface set-up:
>>>
>>> eth0      Link encap:Ethernet  HWaddr 00:50:56:AE:14:E3
>>>           inet addr:192.168.3.28  Bcast:192.168.7.255  Mask:255.255.248.0
>>>           inet6 addr: fe80::250:56ff:feae:14e3/64 Scope:Link
>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>           RX packets:36180876 errors:1122 dropped:1234 overruns:0 frame:0
>>>           TX packets:8729361 errors:0 dropped:0 overruns:0 carrier:0
>>>           collisions:0 txqueuelen:1000
>>>           RX bytes:72196093 (68.8 MiB)  TX bytes:610192805 (581.9 MiB)
>>>           Interrupt:177 Base address:0x1400
>>>
>>> eth0:1    Link encap:Ethernet  HWaddr 00:50:56:AE:14:E3
>>>           inet addr:192.168.3.40  Bcast:192.168.7.255  Mask:255.255.248.0
>>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>           Interrupt:177 Base address:0x1400
>>>
>>>
>>> A problem is that I am still getting those error messages from nanny:
>>>
>>> Nov  4 11:06:52 omsbuild nanny[16919]: READ to 192.168.3.38:80 timed out
>>> Nov  4 11:06:53 omsbuild nanny[20548]: READ to 192.168.3.38:80 timed out
>>> Nov  4 11:07:04 omsbuild nanny[16919]: READ to 192.168.3.38:80 timed out
>>> Nov  4 11:07:05 omsbuild nanny[20548]: READ to 192.168.3.38:80 timed out
>>>
>>>
>>> My lvs.cf file now:
>>>
>>> serial_no = 93
>>> primary = 192.168.3.28
>>> service = lvs
>>> backup = 0.0.0.0
>>> heartbeat = 1
>>> heartbeat_port = 539
>>> keepalive = 6
>>> deadtime = 18
>>> network = direct
>>> debug_level = NONE
>>> virtual test1 {
>>>     active = 1
>>>     address = 192.168.3.40 eth0:1
>>>     vip_nmask = 255.255.248.0
>>>     port = 80
>>>     expect = "HTTP"
>>>     use_regex = 0
>>>     load_monitor = none
>>>     scheduler = lc
>>>     protocol = tcp
>>>     timeout = 6
>>>     reentry = 15
>>>     quiesce_server = 0
>>>     server Speedy {
>>>         address = 192.168.18.29
>>>         active = 0
>>>         port = 80
>>>         weight = 1
>>>     }
>>>     server test1 {
>>>         address = 65.39.179.197
>>>         active = 0
>>>         port = 80
>>>         weight = 1
>>>     }
>>>
>>>
>>>
>>> Mike
>>>
>>>
>>> On 03/11/2009, Tapan Thapa <tapan.thapa2000 at gmail.com> wrote:
>>> > Hello Mike,
>>> >
>>> > Thanks for providing helpful information.
>>> >
>>> > Now, as I understand from your configuration, you have two networks
>>> > on eth0:
>>> >
>>> > 1. 192.168.3.x (on eth0)
>>> > 2. 192.168.0.x (on eth0:1) (Is it mapped to any external IP
>>> > address? Please provide netstat -rn output here.)
>>> >
>>> > One of your real servers is on a completely different subnet
>>> > (192.168.18.x) and your second real server is on the public IP
>>> > 65.39.179.197, and currently neither of them is active.
>>> >
>>> > I don't think this configuration will work.
>>> >
>>> > Your configuration should be like:
>>> >
>>> > 1. Any network like 192.168.0.x on (eth0).
>>> > 2. Floating IP Address/Virtual IP Address 192.168.0.254 on (eth0:1).
>>> > It must be mapped to a public IP address in case you want to access
>>> > this VIP from outside of your network. During testing it is not
>>> > required to map it to any public IP address.
>>> > 3. Your real servers should be on the same network 192.168.0.x (i.e.
>>> > 192.168.0.1/2/3).
>>> >
>>> > If you are planning to use Linux Director in Direct mode then there
>>> > must be an existing gateway available.
>>> >
>>> > All real servers and the Linux Director should point their default
>>> > gateway at that router/gateway.
>>> >
>>> > As far as your question about the port/service listing is concerned:
>>> > even if your Linux Director works properly, it will not listen on
>>> > port 80, but your load balancing will work. (I was also confused by
>>> > this for 2 days, and after 2 days I realized that load balancing was
>>> > working although port 80 was not listening.)
>>> >
>>> > Note: Please stick with one configuration. When you posted your
>>> > problem, your Linux Director was working in Direct mode and now
>>> > it is working in tunnel mode. (I have no experience of tunnel mode
>>> > but I can help you with direct and NAT mode.)
>>> >
>>> > Regards
>>> > Tapan Thapa
>>> > India
>>> >
>>> > On Wed, Nov 4, 2009 at 1:16 AM, mojorising <moj0rising at aim.com> wrote:
>>> >
>>> >> Thanks for your offers of help!
>>> >>
>>> >> I have made some changes since reading your message saying the
>>> >> servers should all be on the same net -- now I have one real
>>> >> server and it is on the same network as the load balancer. The
>>> >> output of the ipvsadm command you requested is below.
>>> >>
>>> >> [root@omsbuild ~]# ipvsadm -L -n
>>> >> IP Virtual Server version 1.2.1 (size=4096)
>>> >> Prot LocalAddress:Port Scheduler Flags
>>> >>   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
>>> >> TCP  192.168.0.69:80 lc
>>> >>
>>> >> NIC/IP information:
>>> >>
>>> >> [root@omsbuild ~]# ifconfig -a
>>> >> eth0      Link encap:Ethernet  HWaddr 00:50:56:AE:14:E3
>>> >>           inet addr:192.168.3.28  Bcast:192.168.7.255  Mask:255.255.248.0
>>> >>           inet6 addr: fe80::250:56ff:feae:14e3/64 Scope:Link
>>> >>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>> >>           RX packets:35121740 errors:1120 dropped:1231 overruns:0 frame:0
>>> >>           TX packets:8682408 errors:0 dropped:0 overruns:0 carrier:0
>>> >>           collisions:0 txqueuelen:1000
>>> >>           RX bytes:4182471094 (3.8 GiB)  TX bytes:606337720 (578.2 MiB)
>>> >>           Interrupt:177 Base address:0x1400
>>> >>
>>> >> eth0:1    Link encap:Ethernet  HWaddr 00:50:56:AE:14:E3
>>> >>           inet addr:192.168.0.69  Bcast:192.168.7.255  Mask:255.255.248.0
>>> >>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>> >>           Interrupt:177 Base address:0x1400
>>> >>
>>> >>
>>> >> I'm still getting the same errors from nanny even though the real
>>> >> server is now on the same net as the load balancer:
>>> >>
>>> >> Nov  3 10:44:22 omsbuild nanny[16919]: READ to 192.168.3.38:80 timed out
>>> >>
>>> >> As represented in eth0:1, my virtual server is listening on
>>> >> 192.168.0.69:80. If I do netstat, do a port/service check from a
>>> >> workstation to that IP or similar, shouldn't I see the load
>>> >> balancer listening on 80? Right now, I do not see the load
>>> >> balancer waiting for connections on port 80.
>>> >>
>>> >> Here is what my lvs.cf file looks like now:
>>> >>
>>> >> serial_no = 89
>>> >> primary = 192.168.3.28
>>> >> service = lvs
>>> >> backup = 0.0.0.0
>>> >> heartbeat = 1
>>> >> heartbeat_port = 539
>>> >> keepalive = 6
>>> >> deadtime = 18
>>> >> network = tunnel
>>> >> debug_level = NONE
>>> >> virtual test1 {
>>> >>     active = 1
>>> >>     address = 192.168.0.69 eth0:1
>>> >>     vip_nmask = 255.255.248.0
>>> >>     port = 80
>>> >>     expect = "HTTP"
>>> >>     use_regex = 0
>>> >>     load_monitor = none
>>> >>     scheduler = lc
>>> >>     protocol = tcp
>>> >>     timeout = 6
>>> >>     reentry = 15
>>> >>     quiesce_server = 0
>>> >>     server Speedy {
>>> >>         address = 192.168.18.29
>>> >>         active = 0
>>> >>         port = 80
>>> >>         weight = 1
>>> >>     }
>>> >>     server test1 {
>>> >>         address = 65.39.179.197
>>> >>         active = 0
>>> >>         port = 80
>>> >>         weight = 1
>>> >>     }
>>> >>
>>> >>
>>> >> Mike
>>> >>
>>> >>
>>> >> On 02/11/2009, Tapan Thapa <tapan.thapa2000 at gmail.com> wrote:
>>> >> > Hello Mike,
>>> >> >
>>> >> > I am not an expert in IPVS but I recently set up IPVS with the
>>> >> > help of Piranha and I am quite comfortable with it.
>>> >> >
>>> >> > Please let me know your network diagram and also the output of
>>> >> > the command below.
>>> >> >
>>> >> > ipvsadm -L -n
>>> >> >
>>> >> > ----------------------------------------
>>> >> > I think your network diagram should be---
>>> >> >
>>> >> > Linux Director ----(One NIC)--->First Real Server (One NIC)
>>> >> >                             --->Second Real Server (One NIC)
>>> >> > ----------------------------------------------
>>> >> >
>>> >> > Your Linux Director and your real servers should be on the same
>>> >> > network segment. Please also post the network card IP information
>>> >> > of your Linux Director (where you have installed Piranha).
>>> >> >
>>> >> >
>>> >> > Regards
>>> >> > Tapan Thapa
>>> >> > India
>>> >> >
>>> >> > On Tue, Nov 3, 2009 at 6:09 AM, mojorising <moj0rising at aim.com>
>>> wrote:
>>> >> >
>>> >> >> Hello!
>>> >> >>
>>> >> >> I have set up a test load balancer with IPVS and Piranha-GUI.
>>> >> >> For some reason, when I attempt to connect to one of the two web
>>> >> >> servers I have set up via the load balancer's virtual IP, the load
>>> >> >> balancer does not seem to pass those requests on to the real
>>> >> >> servers.
>>> >> >>
>>> >> >> The firewall on the Piranha box is off and I can successfully
>>> >> >> establish HTTP sessions with netcat and telnet from the Piranha
>>> >> >> box as well as from my workstation. So the web services are
>>> >> >> running and connectivity to them is good.
>>> >> >>
>>> >> >> The error I'm getting in /var/log/messages is (public IP
>>> >> >> changed for privacy):
>>> >> >>
>>> >> >> Nov  2 14:28:09 omsbuild nanny[13583]: READ to 65.39.169.xxx:80 timed out
>>> >> >> Nov  2 14:28:10 omsbuild nanny[13582]: READ to 192.168.18.29:80 timed out
>>> >> >>
>>> >> >> It looks like nanny can't talk to the web servers but I can't
>>> >> >> figure out why. That may not be the only problem I have here
>>> >> >> but it's probably one of them. All the other services are up
>>> >> >> and seem to be running fine.
>>> >> >>
>>> >> >> I've googled around quite a bit and checked the documentation
>>> >> >> but I haven't found anything in those places that gets me to a
>>> >> >> solution.
>>> >> >>
>>> >> >> Can anyone out there give me a little push in the right
>>> >> >> direction as to what the problem might be?
>>> >> >>
>>> >> >>
>>> >> >> Thank you!
>>> >> >>
>>> >> >> Mike
>>> >> >>
>>> >> >>
>>> >> >> My lvs.cf file:
>>> >> >>
>>> >> >> serial_no = 76
>>> >> >> primary = 192.168.3.28
>>> >> >> service = lvs
>>> >> >> backup = 0.0.0.0
>>> >> >> heartbeat = 1
>>> >> >> heartbeat_port = 539
>>> >> >> keepalive = 6
>>> >> >> deadtime = 18
>>> >> >> network = direct
>>> >> >> debug_level = NONE
>>> >> >> virtual test1 {
>>> >> >>     active = 1
>>> >> >>     address = 192.168.0.69 eth0:1
>>> >> >>     vip_nmask = 255.255.248.0
>>> >> >>     port = 3128
>>> >> >>     expect = "HTTP"
>>> >> >>     use_regex = 0
>>> >> >>     load_monitor = none
>>> >> >>     scheduler = lc
>>> >> >>     protocol = tcp
>>> >> >>     timeout = 6
>>> >> >>     reentry = 15
>>> >> >>     quiesce_server = 0
>>> >> >>     server Speedy {
>>> >> >>         address = 192.168.18.29
>>> >> >>         active = 1
>>> >> >>         port = 80
>>> >> >>         weight = 1
>>> >> >>     }
>>> >> >>     server test1 {
>>> >> >>         address = 65.39.169.xxx
>>> >> >>         active = 1
>>> >> >>         port = 80
>>> >> >>         weight = 1
>>> >> >>     }
>>> >> >>
>>> >> >> _______________________________________________
>>> >> >> Piranha-list mailing list
>>> >> >> Piranha-list at redhat.com
>>> >> >> https://www.redhat.com/mailman/listinfo/piranha-list
>>> >> >>