urllib leaves sockets open?

Chris Tavares

Hi all. I'm currently tracking down a problem in a little script[1] I have,
and I was hoping that those more experienced than myself could weigh in.

The script's job is to grab the status page off a D-Link home router. This is
a really simple job: I just use urllib.urlopen() to grab the status page.
The router uses HTTP Basic authentication, so I've subclassed FancyURLopener
to supply the credentials.

This all worked fine with an older router, but with the newer model there's
a long delay between sending the authentication information and actually
getting the response back. When just going in via a browser, there is no such
delay.

I did a little work with a tracing proxy, and I noticed something
interesting. urllib first makes an HTTP request without authentication
information. This gets back an HTTP 401 error code, as expected. urllib then
opens a second socket, and sends the Authorization header, again just as
expected.

Here's what I noticed: the socket for the first, failed request is still
connected. It looks like the router only allows a single HTTP connection at
a time. As a result, the second, authenticated request doesn't get its
response until some kind of timeout fires and the first socket disconnects.

Is this normal behavior for urllib? Is there a way to force that initial
socket closed earlier? Is there something else I need to do?
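
One way to sidestep the unauthenticated first request is to send the credentials up front, so the 401 round trip never happens. The following is a minimal sketch in Python 3 syntax (`urllib.request`; under Python 2 the same calls live in `urllib2`), not a confirmed fix for this router. The helper names `basic_auth_value`, `build_status_request`, and `fetch_status` are made up for illustration, and the address and credentials are the same placeholders used in the script.

```python
import base64
import urllib.request

# Placeholder values, mirroring the script in this post.
ROUTER_ADDRESS = "xxx"
ROUTER_PORT = 80
ROUTER_USER = "user"
ROUTER_PASSWORD = "password"


def basic_auth_value(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_status_request(host, port, user, password):
    """Build a request that already carries credentials on the first attempt."""
    url = "http://%s:%s/status.htm" % (host, port)
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_value(user, password))
    return request


def fetch_status(timeout=10):
    """Fetch the status page; the response is closed when the block exits."""
    request = build_status_request(
        ROUTER_ADDRESS, ROUTER_PORT, ROUTER_USER, ROUTER_PASSWORD
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Because the `Authorization` header is already on the first request, only one connection should be needed, assuming the router accepts preemptive Basic credentials. The `with` block closes the response as soon as the body has been read.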

Thanks for any insight,

-Chris

[1] The script in question is:

import urllib

router_address = "xxx"
router_port = 80
router_user = "user"
router_password = "password"

class DI604Opener(urllib.FancyURLopener):
    # Supply the router credentials instead of prompting on the terminal.
    def prompt_user_passwd(self, host, realm):
        return (router_user, router_password)

# Make urllib.urlopen() use the subclass for every request.
urllib._urlopener = DI604Opener()

#
# Kick off the process when run from the command line
#
if __name__ == "__main__":
    status_page = urllib.urlopen("http://%s:%s/status.htm" % (router_address,
                                                              router_port))
    print status_page.read()
 
Paul Rubin

Chris Tavares said:
Is this normal behavior for urllib? Is there a way to force that initial
socket closed earlier? Is there something else I need to do?

I'd say open a SourceForge bug. There may be a way around it with the
fancy opener methods of urllib2, but it's a bug if regular urllib
opens a second socket without closing the first one. For HTTP 1.1
it should be able to use just one socket anyway.
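
For reference, the urllib2-style approach registers the credentials with a handler instead of subclassing an opener. A rough sketch in Python 3 syntax follows (`urllib.request`; Python 2 exposes the same class names in `urllib2`). The function name `make_router_opener` and the example address are illustrative assumptions. Note that this handler still waits for the 401 before retrying, and nothing in this thread establishes whether it closes the first socket before opening the second.

```python
import urllib.request


def make_router_opener(base_url, user, password):
    """Return (opener, password_manager) set up for HTTP Basic auth."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # A realm of None means "use these credentials for any realm".
    password_mgr.add_password(None, base_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    opener = urllib.request.build_opener(handler)
    return opener, password_mgr
```

A caller would then use `opener.open("http://<router>/status.htm")` in place of `urllib.urlopen()`, ideally closing the returned response once it has been read.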
 
Chris Tavares

Paul Rubin said:
I'd say open a SourceForge bug. There may be a way around it with the
fancy opener methods of urllib2, but it's a bug if regular urllib
opens a second socket without closing the first one. For HTTP 1.1
it should be able to use just one socket anyway.

Thanks, I'll do some poking around in urllib first and see if I can narrow
it down.

Is there a way to do HTTP 1.1 with urllib? The docs say 0.9 and 1.0 only.
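
One possible route to HTTP 1.1 is to drop below urllib to the lower-level HTTP client module (`http.client` in Python 3, `httplib` in Python 2), whose `HTTPConnection` sends HTTP/1.1 requests and can reuse a single socket. The sketch below is an illustration only: `StubRouter` is a hypothetical local stand-in for the router, and `fetch_on_one_socket` and `demo` are made-up names. It sends the unauthenticated probe and the authenticated retry over the same connection.

```python
import base64
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Header value for the placeholder credentials "user" / "password".
EXPECTED = "Basic " + base64.b64encode(b"user:password").decode("ascii")


class StubRouter(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the router: Basic auth with keep-alive."""

    protocol_version = "HTTP/1.1"
    client_ports = []  # source port of every request the stub has seen

    def do_GET(self):
        StubRouter.client_ports.append(self.client_address[1])
        if self.headers.get("Authorization") == EXPECTED:
            status, body = 200, b"status page"
        else:
            status, body = 401, b"auth required"
        self.send_response(status)
        if status == 401:
            self.send_header("WWW-Authenticate", 'Basic realm="stub"')
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_on_one_socket(host, port):
    """Send the 401 probe and the authenticated retry on one connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/status.htm")
        first = conn.getresponse()
        first.read()  # drain the body so the connection can be reused
        conn.request("GET", "/status.htm", headers={"Authorization": EXPECTED})
        second = conn.getresponse()
        return first.status, second.status, second.read()
    finally:
        conn.close()


def demo():
    """Run the exchange against the local stub and return the results."""
    server = HTTPServer(("127.0.0.1", 0), StubRouter)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_on_one_socket("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Both requests reach the stub from the same client port, which is what a single reused socket looks like from the server side. Whether the real router honors keep-alive is an open question that this sketch does not answer.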

Thanks,

-Chris
 
