Urllib2 urlopen and read - difference

koranthala · Apr 15, 2010

Hi,
Suppose I am doing the following:
req = urllib2.urlopen('http://www.python.org')
data = req.read()

When is the actual data received? Is it received by the first line, or
only when req.read() is called?
My understanding is that by the time urlopen returns, we have already
received all the data, and req.read() just reads it from the file
descriptor.
But when I read the source code of pylot, I found the following:
# this sends the HTTP request and returns as soon as it is
# done connecting and sending
resp = opener.open(request)
connect_end_time = self.default_timer()
content = resp.read()
req_end_time = self.default_timer()

This seems to suggest that the data is received only when you call
resp.read(), which left me confused.

If someone could help me out, it would be very helpful.

J. Cliff Dyer · Apr 15, 2010

My understanding (please correct me if I'm wrong) is that when you call
open, you send a request to the server and get a response object back.
The server immediately begins sending data (you can't control when it
sends, once you've requested it). When you call read() on your
response object, it reads the data it has already received, and if
that isn't enough to satisfy your read call, it blocks until it has
enough.

So your opener returns as soon as the request is sent, and read() blocks
if it doesn't have enough data to handle your request.

Cheers,
Cliff
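
The blocking behaviour of read() described above can be observed with a small local experiment. The following is only a sketch under stated assumptions: it uses Python 3's urllib.request (the successor to urllib2) and a throwaway local server, and the names SlowBodyHandler, DELAY and timed are hypothetical, made up for this illustration. The server sends the first half of the body, pauses, then sends the rest.

```python
import threading
import time
import urllib.request  # Python 3 successor to urllib2
from http.server import BaseHTTPRequestHandler, HTTPServer

DELAY = 0.5  # hypothetical pause between the two halves of the body


class SlowBodyHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: sends the body in two parts with a pause."""

    def do_GET(self):
        first, second = b"hello", b" world"
        self.send_response(200)
        self.send_header("Content-Length", str(len(first) + len(second)))
        self.end_headers()
        self.wfile.write(first)   # first 5 bytes go out right away
        self.wfile.flush()
        time.sleep(DELAY)         # the rest is held back
        self.wfile.write(second)

    def log_message(self, *args):
        pass  # keep the demo quiet


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


server = HTTPServer(("127.0.0.1", 0), SlowBodyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

# An empty ProxyHandler keeps any proxy environment variables out of the way.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
resp = opener.open(url)

head, t_head = timed(resp.read, 5)  # these bytes are already on their way
tail, t_tail = timed(resp.read)     # blocks until the server sends the rest
resp.close()

server.shutdown()
server.server_close()

print("read(5): %r in %.3fs" % (head, t_head))
print("read():  %r in %.3fs" % (tail, t_tail))
```

On a typical run the read(5) call returns almost at once, while the final read() takes roughly DELAY seconds, because it has to wait for data the server has not sent yet.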

Aahz · Apr 26, 2010

Close. urlopen() returns after it receives the HTTP header (that's why
you can get an HTTP exception on e.g. 404 without the read()).
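
A way to check this locally, as a rough sketch rather than a definitive test: the code below assumes Python 3's urllib.request in place of urllib2, and a hypothetical local server (Handler, DELAY and the /missing path are made up for the illustration) that sends its headers, waits, and only then sends the body. It times open() and read() separately, much like the pylot snippet, and also requests a path that returns 404.

```python
import threading
import time
import urllib.error
import urllib.request  # Python 3 successor to urllib2
from http.server import BaseHTTPRequestHandler, HTTPServer

DELAY = 0.5  # hypothetical pause between the headers and the body


class Handler(BaseHTTPRequestHandler):
    """Hypothetical handler: headers first, body after a pause."""

    def do_GET(self):
        if self.path == "/missing":
            self.send_error(404)
            return
        body = b"payload"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()        # status line and headers go out now
        time.sleep(DELAY)         # the body is held back
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

# An empty ProxyHandler keeps any proxy environment variables out of the way.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

start = time.perf_counter()
resp = opener.open(base + "/")    # returns once the headers have arrived
status = resp.status              # already known, no read() needed
connect_end = time.perf_counter()
content = resp.read()             # waits for the delayed body
req_end = time.perf_counter()
resp.close()

open_time = connect_end - start
read_time = req_end - connect_end

# The 404 surfaces from open() itself, before any read() call.
try:
    opener.open(base + "/missing")
    error_code = None
except urllib.error.HTTPError as err:
    error_code = err.code
    err.close()

server.shutdown()
server.server_close()

print("open: %.3fs, read: %.3fs" % (open_time, read_time))
print("status:", status, "error code:", error_code)
```

In this setup open() should come back well before DELAY has elapsed, with the status code already available, and most of the elapsed time should show up in read(). The HTTPError for /missing is raised by open(), which matches the point about getting a 404 without calling read().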
