AW: Still much more than 350 sockets needed!

Jon Burgess jburgess at uklinux.net
Thu Apr 27 21:38:35 UTC 2006


On Thu, 2006-04-27 at 13:37 -0700, Wes Shull wrote:
> On 4/27/06, Andrew Haley <aph at redhat.com> wrote:
> > It works for me, bouncing data back and forth.  What does it do on
> > your box?  What is the output?
> 
> unadulterated output at http://kuoi.asui.uidaho.edu/~wes/out.txt ; here
> is an abbreviated annotated version:
> 
> Creating listener on 3000
> Connecting to listener on 127.0.0.1:3000
> [...]
> Creating listener on 3499
> Connecting to listener on 127.0.0.1:3499
> 
> I'm doing the connect asynchronously so all this really says is that
> it's not hitting the file descriptor limit...
> 
> client: 3499<-marco
> server: 3499 accepted
> [...]
> client: 3482<-marco
> server: 3482 accepted
> 
> 18 of the connections are made and the server side sees the data, but...
> 
> client: 3481<-marco
> [...]
> client: 3000<-marco
> 
> The other 500-18 connections are never accepted.  Tcl channels also go
> writable on error (and I just realized my code doesn't check for that)
> so it's probable it shouldn't even have been trying to send on these
> because they never connected.
> 
> client_readable: 3000 went away
> [...]
> client_readable: 3481 went away
> 
> Here the client side catches that the other 500-18 connections didn't make it.
> 
> server: 3482<-marco
> [...]
> server: 3499<-marco
> 
> server: 3499->polo
> [...]
> server: 3482->polo
> 
> client: 3482->polo
> [...]
> client: 3499->polo
> 
> client: 3499<-marco
> [...]
> client: 3482<-marco
> 
> These last four sequences continue ad infinitum; it's successfully
> passing data back and forth, but only on those 18 connections.
> 
> Just for S&G, I ran this on a FreeBSD 6.0-RELEASE-p4 system, and while
> it doesn't behave exactly the same, it didn't do what I expected there
> either.  Just to make sure I'm not running into some issue with
> different Tcl builds (limited event queue length, limited # of active
> connections), I'll rewrite this in C (bleah) tonight and see if it
> behaves any better.
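
On the point above about channels going writable on error: here is a
rough sketch, in Python rather than Tcl, of the usual check after a
non-blocking connect. The function name finish_connect and the timeout
value are made up for illustration; the idea is just to read SO_ERROR
once the socket turns writable, before sending anything on it.

```python
import errno
import select
import socket


def finish_connect(sock, timeout=1.0):
    """Return 0 if the pending connect succeeded, else an errno value.

    finish_connect is a hypothetical helper, not part of any library.
    """
    # A socket with a connect in progress becomes writable on success
    # *and* on failure, so writability alone proves nothing.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        return errno.ETIMEDOUT
    # SO_ERROR holds the pending error for the socket (0 means none).
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
```

A caller would start the connect with connect_ex() on a non-blocking
socket and only send its "marco" once finish_connect() returns 0.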

It seems the kernel is refusing to accept() any more connections. The
reason is that each of your listeners takes 2 descriptors:-

$ strace -e trace=socket,accept,connect ./sock.tcl 

Creating listener on 3000
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5
Connecting to listener on 127.0.0.1:3000
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
connect(6, {sa_family=AF_INET, sin_port=htons(3000),
sin_addr=inet_addr("127.0.0.1")}, 16)
= -1 EINPROGRESS (Operation now in progress)
Creating listener on 3001
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 7
Connecting to listener on 127.0.0.1:3001
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 8
connect(8, {sa_family=AF_INET, sin_port=htons(3001),
sin_addr=inet_addr("127.0.0.1")}, 16)
= -1 EINPROGRESS (Operation now in progress)

...

So when you come to connect you only get:
ulimit(1024) - 2*500 - initialFD(5) = 19 
for new connections:-

server: 3484 accepted
client: 3483<-marco
accept(971, {sa_family=AF_INET, sin_port=htons(44952),
sin_addr=inet_addr("127.0.0.1")}, [8589934608]) = 1021
server: 3483 accepted
client: 3482<-marco
accept(969, {sa_family=AF_INET, sin_port=htons(56034),
sin_addr=inet_addr("127.0.0.1")}, [8589934608]) = 1022
server: 3482 accepted
client: 3481<-marco
accept(967, {sa_family=AF_INET, sin_port=htons(38003),
sin_addr=inet_addr("127.0.0.1")}, [8589934608]) = 1023
server: 3481 accepted
client: 3480<-marco
accept(965, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3479<-marco
accept(963, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3478<-marco
accept(961, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3477<-marco
accept(959, 0x7fffffa7d020, [8589934608]) = -1 EMFILE (Too many open
files)
client: 3476<-marco
...
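
The descriptor budget above can be written out as a small Python
sketch. The function and parameter names are my own, and the numbers
are simply the ones from this run (limit 1024, 500 listener/client
pairs, 5 descriptors already open at startup).

```python
def spare_descriptors(limit, pairs, already_open):
    """Descriptors left for accept() once every pair holds two sockets."""
    return limit - 2 * pairs - already_open


# Values taken from the run above; prints 19.
print(spare_descriptors(1024, 500, 5))
```

Each successful accept() consumes one of those spare descriptors, so
once they are gone the remaining accept() calls fail with EMFILE.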

If you run it as root and raise the limit to 10k, then you see 1500 file
descriptors once the test is fully connected.
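
For reference, a process can also inspect and raise its own soft limit,
up to the hard limit, without being root. This is a minimal Python
sketch using the standard resource module; the helper name
raise_nofile_soft_limit is an assumption of mine, and going beyond the
hard limit would still need root or a changed system configuration.

```python
import resource


def raise_nofile_soft_limit(wanted):
    """Try to raise the soft RLIMIT_NOFILE to `wanted`; return the new soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may not exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    # Only ever raise the limit, never lower it.
    if soft != resource.RLIM_INFINITY and wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


print(raise_nofile_soft_limit(10000))
```

With 3 descriptors per connected pair plus the 5 open at startup, this
test needs a limit of at least 3*500 + 5 = 1505 to connect fully.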

Perhaps the most confusing thing is that the second FD for each listener
doesn't appear in /proc/<pid>/fd until something connects. Notice how
you only see odd-numbered FDs for the unconnected listeners below.

$ ls -l /proc/29381/fd | sort -n -k 9 
 0 -> /dev/pts/0
 1 -> /home/jburgess/socket/out
 2 -> /home/jburgess/socket/out
 3 -> pipe:[8919517]
 4 -> pipe:[8919517]
 5 -> socket:[8919518]
 7 -> socket:[8919520]
 9 -> socket:[8919522]
 11 -> socket:[8919524]
 13 -> socket:[8919526]
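
The same listing can be taken from inside a process. The following is
only a sketch and assumes a Linux-style /proc; open_fds is a name I
made up. It maps each open descriptor number to the target of its
/proc/self/fd symlink, which makes it easy to diff the table before
and after creating a socket.

```python
import os
import socket


def open_fds():
    """Map each open fd number to its /proc/self/fd link target (Linux only)."""
    fds = {}
    for name in os.listdir("/proc/self/fd"):
        try:
            fds[int(name)] = os.readlink(f"/proc/self/fd/{name}")
        except FileNotFoundError:
            # The descriptor listdir() itself used; it is closed by now.
            continue
    return fds


if os.path.isdir("/proc/self/fd"):
    before = open_fds()
    s = socket.socket()
    after = open_fds()
    # Show only the descriptors that appeared in between.
    for fd in sorted(set(after) - set(before)):
        print(fd, "->", after[fd])
    s.close()
```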

	Jon





More information about the fedora-devel-list mailing list