
Re: AW: Still much more than 350 sockets needed!



On Thu, 2006-04-27 at 13:37 -0700, Wes Shull wrote:
> On 4/27/06, Andrew Haley <aph redhat com> wrote:
> > It works for me, bouncing data back and forth.  What does it do on
> > your box?  What is the output?
> 
> unadulterated output at http://kuoi.asui.uidaho.edu/~wes/out.txt ; here
> is an abbreviated annotated version:
> 
> Creating listener on 3000
> Connecting to listener on 127.0.0.1:3000
> [...]
> Creating listener on 3499
> Connecting to listener on 127.0.0.1:3499
> 
> I'm doing the connect asynchronously so all this really says is that
> it's not hitting the file descriptor limit...
> 
> client: 3499<-marco
> server: 3499 accepted
> [...]
> client: 3482<-marco
> server: 3482 accepted
> 
> 18 of the connections are made and the server side sees the data, but...
> 
> client: 3481<-marco
> [...]
> client: 3000<-marco
> 
> The other 500-18 connections are never accepted.  Tcl channels also go
> writable on error (and I just realized my code doesn't check for that)
> so it's probable it shouldn't even have been trying to send on these
> because they never connected.
> 
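
The writable-on-error behaviour is not specific to Tcl: a socket with a
non-blocking connect in progress becomes writable whether the connect
succeeded or failed, and SO_ERROR tells the two apart. A minimal Python
sketch of that check follows; the helper names start_connect and
connect_result are made up for illustration and are not part of sock.tcl.

```python
import errno
import os
import select
import socket


def start_connect(host, port):
    """Begin a non-blocking connect and return the socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex((host, port))
    # EINPROGRESS is the normal answer for a non-blocking connect.
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


def connect_result(sock, timeout=5.0):
    """Wait for the connect to finish; return 0 or an errno value."""
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        return errno.ETIMEDOUT
    # Writable does not mean connected: a failed connect also wakes the
    # socket, so ask the kernel how the attempt actually ended.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
```

Only a return value of 0 from connect_result means it is safe to start
sending on that socket.
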
> client_readable: 3000 went away
> [...]
> client_readable: 3481 went away
> 
> Here the client side catches that the other 500-18 connections didn't make it.
> 
> server: 3482<-marco
> [...]
> server: 3499<-marco
> 
> server: 3499->polo
> [...]
> server: 3482->polo
> 
> client: 3482->polo
> [...]
> client: 3499->polo
> 
> client: 3499<-marco
> [...]
> client: 3482<-marco
> 
> These last four sequences continue ad infinitum; it's successfully
> passing data back and forth, but only on those 18 connections.
> 
> Just for S&G, I ran this on a FreeBSD 6.0-RELEASE-p4 system, and while
> it doesn't behave exactly the same, it didn't do what I expected there
> either.  Just to make sure I'm not running into some issue with
> different Tcl builds (limited event queue length, limited # of active
> connections), I'll rewrite this in C (bleah) tonight and see if it
> behaves any better.

It seems the kernel is refusing to accept() any more connections, the
reason being that it takes 2 descriptors for each of your listeners:-

$ strace -e trace=socket,accept,connect ./sock.tcl 

Creating listener on 3000
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5
Connecting to listener on 127.0.0.1:3000
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
connect(6, {sa_family=AF_INET, sin_port=htons(3000),
sin_addr=inet_addr("127.0.0.1")}, 16)
= -1 EINPROGRESS (Operation now in progress)
Creating listener on 3001
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 7
Connecting to listener on 127.0.0.1:3001
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 8
connect(8, {sa_family=AF_INET, sin_port=htons(3001),
sin_addr=inet_addr("127.0.0.1")}, 16)
= -1 EINPROGRESS (Operation now in progress)

...

So when you come to connect you only get:
ulimit(1024) - 2*500 - initialFD(5) = 19 
for new connections:-

server: 3484 accepted
client: 3483<-marco
accept(971, {sa_family=AF_INET, sin_port=htons(44952),
sin_addr=inet_addr("127.0.0.1")}, [8589934608]) = 1021
server: 3483 accepted
client: 3482<-marco
accept(969, {sa_family=AF_INET, sin_port=htons(56034),
sin_addr=inet_addr("127.0.0.1")}, [8589934608]) = 1022
server: 3482 accepted
client: 3481<-marco
accept(967, {sa_family=AF_INET, sin_port=htons(38003),
sin_addr=inet_addr("127.0.0.1")}, [8589934608]) = 1023
server: 3481 accepted
client: 3480<-marco
accept(965, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3479<-marco
accept(963, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3478<-marco
accept(961, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3477<-marco
accept(959, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3476<-marco
...
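
The descriptor arithmetic above can be written out as a short Python
sketch. The function name accept_budget is made up for illustration; the
numbers are the ones from this thread (limit 1024, 500 ports, descriptors
0-4 already in use).

```python
import resource


def accept_budget(soft_limit, pairs, already_open):
    """Descriptors left for accept() once every listener/client pair exists."""
    # Each port costs two descriptors up front, as in the strace output:
    # one socket() for the listener and one socket() for the client.
    return soft_limit - 2 * pairs - already_open


# Figures from this thread.
print(accept_budget(1024, 500, 5))  # 19

# The same estimate against this process's actual soft limit.
soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(accept_budget(soft, 500, 5))
```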

If you run it as root and raise the limit to 10k then you see 1500 file
descriptors once the test is fully connected (a listener, a client and an
accepted socket for each of the 500 ports).
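
A process can also raise its own soft limit without root, as long as it
stays at or below the hard limit. This is a hedged Python sketch of that,
assuming Linux; raise_nofile_limit is a made-up helper name, and Tcl itself
has no such call that I know of.

```python
import resource


def raise_nofile_limit(wanted):
    """Raise the soft RLIMIT_NOFILE towards `wanted`; return the soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        # Only root may go past the hard limit, so cap the request.
        wanted = min(wanted, hard)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # already high enough, leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return wanted


# 1500 descriptors are needed for the fully connected test.
print(raise_nofile_limit(1500))
```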

Perhaps the most confusing thing is that the second FD for each listener
doesn't appear in /proc/<pid>/fd until something connects. Notice how
you only see odd-numbered FDs for the unconnected listeners below.

$ ls -l /proc/29381/fd | sort -n -k 9 
 0 -> /dev/pts/0
 1 -> /home/jburgess/socket/out
 2 -> /home/jburgess/socket/out
 3 -> pipe:[8919517]
 4 -> pipe:[8919517]
 5 -> socket:[8919518]
 7 -> socket:[8919520]
 9 -> socket:[8919522]
 11 -> socket:[8919524]
 13 -> socket:[8919526]

	Jon


