Problem with slow httplib connections on Windows (and maybe other platforms)

  • Thread starter Christoph Zwerschke

Christoph Zwerschke

It took me a while to track down the cause of the following problem.

The symptom was that testing a local web app with twill was fast
on Python 2.3, but very slow on Python 2.4-2.6 on a Win XP box.

This boiled down to the problem that if you run a SimpleHTTPServer
for localhost like this,

import BaseHTTPServer, SimpleHTTPServer
BaseHTTPServer.HTTPServer(('localhost', 8000),
    SimpleHTTPServer.SimpleHTTPRequestHandler).serve_forever()

and access it using httplib.HTTPConnection on the same host like this

import httplib
httplib.HTTPConnection('localhost', 8000).connect()

then this call is fast using Py 2.3, but slow with Py 2.4-2.6.

I found that this was caused by a mismatch of the ip version used
by SimpleHTTPServer and HTTPConnection for a "localhost" argument.

What actually happens is the following:

* BaseHTTPServer binds only to the IPv4 address of localhost, because
it's based on TCPServer which has address_family=AF_INET by default.

* HTTPConnection.connect() however tries to connect to all IP addresses
of localhost, in the order determined by socket.getaddrinfo('localhost').

With Py 2.3 (without IPv6 support) this is only the IPv4 address,
but with Py 2.4-2.6 the order is (on my Win XP host) the IPv6 address
first, then the IPv4 address. Since the IPv6 address is checked first,
this gives a timeout and causes the slow connect() call. The order in
which getaddrinfo returns IPv4/v6 addresses under Linux seems to vary
depending on the glibc version, so it may be a problem on other
platforms, too.
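
To check the order on a given machine, the addresses can be listed directly.
This is only a minimal sketch, written for Python 3 rather than the Py 2.x
versions discussed in this thread; the helper name resolution_order is made
up for illustration.

```python
import socket

def resolution_order(host, port):
    """Return (family name, IP address) pairs in the order getaddrinfo yields them."""
    infos = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    # Each entry is (family, socktype, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return [(names.get(family, str(family)), sockaddr[0])
            for family, _socktype, _proto, _canonname, sockaddr in infos]
```

Calling resolution_order('localhost', 8000) on an affected host should list
the IPv6 entry before the IPv4 one; the result depends on the platform and
its resolver configuration.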

You can see the cause of the slow connect() like this:

import httplib
conn = httplib.HTTPConnection('localhost', 8000)
conn.set_debuglevel(1)
conn.connect()

This is what I get:

connect: (localhost, 8000)
connect fail: ('localhost', 8000)
connect: (localhost, 8000)

The first (failing) connect is the attempt to connect to the IPv6
address, which BaseHTTPServer doesn't listen on. (This is the debug
output of Py 2.5, which really should be improved to show the IP address
that is actually used. Unfortunately, in Py 2.6 the debug output when
connecting has even fallen prey to a refactoring. I think it should
be added again; otherwise set_debuglevel() is pretty meaningless for
connect().)
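
To see which address each attempt actually uses, the connect loop can be
reproduced by hand. The following is a rough sketch in Python 3, not the
httplib code itself; connect_verbose and its return format are hypothetical
names chosen for this example.

```python
import socket
import time

def connect_verbose(host, port, timeout=5.0):
    """Try each resolved address in order.

    Returns (connected socket or None, attempts), where attempts is a list
    of (ip_address, succeeded, seconds) tuples, one per address tried.
    """
    attempts = []
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        started = time.monotonic()
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        except OSError:
            # Record the failed attempt and move on to the next address.
            if sock is not None:
                sock.close()
            attempts.append((sockaddr[0], False, time.monotonic() - started))
            continue
        attempts.append((sockaddr[0], True, time.monotonic() - started))
        return sock, attempts
    return None, attempts
```

Printing the attempts list for ('localhost', 8000) should make a slow,
failing IPv6 attempt visible next to the fast IPv4 one, if that is indeed
what happens on the host in question.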

Can we do something about the mismatch that SimpleHTTPServer only serves
IPv4, but HTTPConnection tries to connect with IPv6 first?

I guess other people also stumbled over this, maybe without even
noticing and just wondering about the slow performance. E.g.:
http://schotime.net/blog/index.php/2008/05/27/slow-tcpclient-connection-sockets/

One possible solution would be to improve the TCPServer in the standard
lib so that it determines the address_family and real server_address
based on the first return value of socket.getaddrinfo, like this:

class TCPServer(BaseServer):
    ...

    def __init__(self, server_address, RequestHandlerClass):
        if server_address and len(server_address) == 2:
            (self.address_family, dummy, dummy, dummy,
                server_address) = socket.getaddrinfo(*server_address)[0]
        else:
            raise TypeError("server_address must be a 2-tuple")
        BaseServer.__init__(self, server_address, RequestHandlerClass)
        ...

That way, whether you serve as or connect to 'localhost', you will
always consistently do this via IPv4 or IPv6, depending on what is
preferred on your platform.
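
The same idea can be tried without patching the standard library by putting
it into a subclass. This is only a sketch under assumptions: it targets
Python 3 (where SocketServer is named socketserver), and the class name
ResolvingTCPServer is invented for this example.

```python
import socket
import socketserver

class ResolvingTCPServer(socketserver.TCPServer):
    """Hypothetical subclass: take address_family from the first getaddrinfo result."""

    def __init__(self, server_address, RequestHandlerClass, bind_and_activate=True):
        if not server_address or len(server_address) != 2:
            raise TypeError("server_address must be a 2-tuple")
        host, port = server_address
        family, _socktype, _proto, _canonname, sockaddr = socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)[0]
        # Set the family on the instance before TCPServer creates its socket.
        self.address_family = family
        super().__init__(sockaddr, RequestHandlerClass, bind_and_activate)
```

A server created as ResolvingTCPServer(('localhost', 8000), handler) would
then bind to whichever address getaddrinfo lists first, which is also the
address a client resolving 'localhost' tries first.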

Does this sound reasonable? Any better ideas?

-- Christoph
 
rdmurray

Christoph Zwerschke said:
> What actually happens is the following:
>
> * BaseHTTPServer binds only to the IPv4 address of localhost, because
> it's based on TCPServer which has address_family=AF_INET by default.
>
> * HTTPConnection.connect() however tries to connect to all IP addresses
> of localhost, in the order determined by socket.getaddrinfo('localhost').
>
> With Py 2.3 (without IPv6 support) this is only the IPv4 address,
> but with Py 2.4-2.6 the order is (on my Win XP host) the IPv6 address
> first, then the IPv4 address. Since the IPv6 address is checked first,
> this gives a timeout and causes the slow connect() call. The order by
> which getaddrinfo returns IPv4/v6 under Linux seems to vary depending
> on the glibc version, so it may be a problem on other platforms, too.

Based on something I read in another thread, this appears to be a problem
only under Windows. Everybody else implemented the TCP/IP stack according
to spec, and the IPv6 connect attempt times out immediately, producing
no slowdown.

Microsoft, however....

--RDM
 
Christoph Zwerschke

> Based on something I read in another thread, this appears to be a problem
> only under Windows. Everybody else implemented the TCP/IP stack according
> to spec, and the IPv6 connect attempt times out immediately, producing
> no slowdown.
>
> Microsoft, however....

The order in which getaddrinfo returns IPv4 and IPv6 is probably not
written in the specs (Posix 1003.1g and RFC 2553). The fact that Windows
returns IPv6 addresses first is not wrong in itself.

For this discussion, see also
http://www.ops.ietf.org/lists/v6ops/v6ops.2002/msg00869.html
https://bugzilla.redhat.com/show_bug.cgi?id=190495

But yes, I also wonder why the connect to the IPv6 loopback address does
not time out more quickly on Windows.

-- Christoph
 
rdmurray

Christoph Zwerschke said:
> The order in which getaddrinfo returns IPv4 and IPv6 is probably not
> written in the specs (Posix 1003.1g and RFC 2553). The fact that Windows
> returns IPv6 addresses first is not wrong in itself.
>
> For this discussion, see also
> http://www.ops.ietf.org/lists/v6ops/v6ops.2002/msg00869.html
> https://bugzilla.redhat.com/show_bug.cgi?id=190495
>
> But yes, I also wonder why the connect to the IPv6 loopback address does
> not time out more quickly on Windows.

Right, it's not the order of the returned items that's the Microsoft
weirdness, it's the long timeout on an attempt to connect to something
that doesn't exist. There was a long discussion about this, and it might
even have been on python-dev, but I can't lay my hands on the thread.
In short, Microsoft retries and waits a while when the far end says
"no thanks" to a connection attempt, instead of immediately returning
the connection failure the way Linux and others do. This applies
to IPv4, too.
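
How long a refused connection takes on a particular platform can be measured
with a few lines. This is a rough Python 3 sketch; the function name
time_refused_connect is hypothetical, and it assumes the port that was just
released stays closed for the duration of the measurement.

```python
import socket
import time

def time_refused_connect(host="127.0.0.1"):
    """Connect to a port nobody listens on; return (seconds elapsed, error or None)."""
    # Pick a port that is very likely closed: bind to an ephemeral port,
    # note its number, then release it again.
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.bind((host, 0))
    port = probe.getsockname()[1]
    probe.close()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(10.0)
    error = None
    started = time.monotonic()
    try:
        client.connect((host, port))
    except OSError as exc:
        error = exc
    elapsed = time.monotonic() - started
    client.close()
    return elapsed, error
```

Comparing the elapsed time on Windows and on Linux should show whether the
delay described here is present on the machines at hand.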

--RDM
 
Steve Holden

> Right, it's not the order of the returned items that's the Microsoft
> weirdness, it's the long timeout on an attempt to connect to something
> that doesn't exist. There was a long discussion about this, and it might
> even have been on python-dev, but I can't lay my hands on the thread.
> In short, Microsoft retries and waits a while when the far end says
> "no thanks" to a connection attempt, instead of immediately returning
> the connection failure the way Linux and others do. This applies
> to IPv4, too.

Search for the subject line "socket.create_connection slow" - this was
discovered by Kristjan Valur Jonsson. It certainly seems like a
Microsoft weirdness.

regards
Steve
 
Christoph Zwerschke

Steve said:
> Search for the subject line "socket.create_connection slow" - this was
> discovered by Kristjan Valur Jonsson. It certainly seems like a
> Microsoft weirdness.

Thanks for the pointer, Steve. I hadn't seen that yet. I agree that's
actually the real problem here. The solution suggested in that thread,
using a dual-stacked socket for the TCPserver, seems a good one to me.
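
A dual-stacked server socket could look roughly like the sketch below. It is
not the code from the other thread, only an illustration in Python 3 (3.8 or
later, for socket.has_dualstack_ipv6); DualStackTCPServer and make_server
are made-up names, and the fallback to plain IPv4 is an assumption about
what one would want on hosts without usable IPv6.

```python
import socket
import socketserver

class DualStackTCPServer(socketserver.TCPServer):
    """Hypothetical server listening on IPv6 that also accepts IPv4 clients."""
    address_family = socket.AF_INET6

    def server_bind(self):
        # Clearing IPV6_V6ONLY lets the IPv6 socket accept IPv4 connections
        # as well (they show up as IPv4-mapped IPv6 addresses).
        self.socket.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        super().server_bind()

def make_server(port, handler):
    """Return a dual-stack server where possible, else a plain IPv4 one."""
    if socket.has_dualstack_ipv6():
        try:
            return DualStackTCPServer(("::", port), handler)
        except OSError:
            pass  # IPv6 is compiled in but not usable on this host
    return socketserver.TCPServer(("", port), handler)
```

With such a server, a client that tries the IPv6 loopback address first
would find something listening there, so the slow failing attempt should not
occur in the first place.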

-- Christoph
 