urllib, urlretrieve method, how to get headers?

Даниил Рыжков · Jul 1, 2011

Hello, everyone!

How can I get headers with urlretrieve? I want to send request and get
headers with necessary information before I execute urlretrieve(). Or
are there any alternatives for urlretrieve()?

Peter Otten · Jul 1, 2011

Даниил Рыжков said:
How can I get headers with urlretrieve? I want to send request and get
headers with necessary information before I execute urlretrieve(). Or
are there any alternatives for urlretrieve()?

It's easy to do it manually:

Connect to website and inspect headers:

>>> import urllib2
>>> f = urllib2.urlopen("http://www.python.org")
>>> f.headers["Content-Type"]
'text/html'

Write page content to file:

>>> with open("tmp.html", "w") as dest:
...     dest.writelines(f)
...

Did we get what we expected?

>>> with open("tmp.html") as f: print f.read().split("title")[1]
...
>Python Programming Language – Official Website</
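
For readers on Python 3, where urllib2 became urllib.request, the same approach might look roughly like the sketch below: open the URL, look at the headers, and only then decide whether to write the body to disk. The function name fetch_headers_then_save and its accept callback are made up for illustration and are not part of urllib. Note also that urlretrieve() itself returns a (filename, headers) tuple, but only after the download has finished.

```python
import shutil
import urllib.request


def fetch_headers_then_save(url, dest_path, accept=lambda headers: True):
    """Open url, inspect its headers, and save the body only if accepted.

    `accept` is a hypothetical callback that receives the response headers
    (an email.message.Message-like object) and returns True or False.
    Returns a (headers_dict, saved) tuple.
    """
    with urllib.request.urlopen(url) as response:
        headers = dict(response.headers.items())
        saved = False
        if accept(response.headers):
            # Stream the body to disk instead of loading it all into memory.
            with open(dest_path, "wb") as dest:
                shutil.copyfileobj(response, dest)
            saved = True
        return headers, saved


# Example call (not executed here):
# headers, saved = fetch_headers_then_save(
#     "http://www.python.org", "tmp.html",
#     accept=lambda h: h.get_content_type() == "text/html")
```

The headers are available as soon as urlopen() returns, before any of the body has been read by your code, so the accept check costs nothing extra when the answer is "no".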

Даниил Рыжков · Jul 1, 2011

Thanks, everyone!
Problem solved.

Даниил Рыжков · Jul 1, 2011

Hello again!
Another question: urlopen() reads full file's content, but how can I
get page by small parts?

Regards,
Daniil

Kushal Kumaran · Jul 1, 2011

Даниил Рыжков said:
Hello again!
Another question: urlopen() reads full file's content, but how can I
get page by small parts?

Set the Range header for HTTP requests. The format is specified here:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35. Note
that web servers are not *required* to support this header.

In [10]: req = urllib2.Request('http://cdimage.debian.org/debian-cd/6.0.2.1/amd64/iso-cd/debian-6.0.2.1-amd64-CD-1.iso',
   ....:     headers={'Range': 'bytes=0-499'})

In [11]: f = urllib2.urlopen(req)

In [12]: data = f.read()

In [13]: len(data)
Out[13]: 500

In [14]: print f.headers
Date: Fri, 01 Jul 2011 16:59:39 GMT
Server: Apache/2.2.14 (Unix)
Last-Modified: Sun, 26 Jun 2011 16:54:45 GMT
ETag: "ebff2f-28700000-4a6a04ab27f10"
Accept-Ranges: bytes
Content-Length: 500
Age: 225
Content-Range: bytes 0-499/678428672
Connection: close
Content-Type: application/octet-stream
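
A rough Python 3 version of the session above, for anyone who wants to reuse it: it builds the Range request, checks whether the server actually honoured it, and parses the Content-Range header shown in the output. The helper names range_request, parse_content_range and fetch_range are invented for this sketch. A server that ignores Range typically answers 200 with the whole body rather than 206, which is why the status is checked.

```python
import re
import urllib.request


def range_request(url, first, last):
    """Build a Request asking for bytes first..last (inclusive)."""
    return urllib.request.Request(
        url, headers={"Range": "bytes=%d-%d" % (first, last)})


def parse_content_range(value):
    """Parse 'bytes 0-499/678428672' into (first, last, total).

    total is None when the server reports it as '*' (unknown).
    """
    match = re.match(r"bytes (\d+)-(\d+)/(\d+|\*)$", value.strip())
    if match is None:
        raise ValueError("unrecognised Content-Range: %r" % value)
    first, last, total = match.groups()
    return int(first), int(last), (None if total == "*" else int(total))


def fetch_range(url, first, last):
    """Return (data, (first, last, total)) for a partial download."""
    with urllib.request.urlopen(range_request(url, first, last)) as response:
        if response.status != 206:
            # The server ignored the Range header and is sending everything.
            raise RuntimeError("server did not return partial content")
        return response.read(), parse_content_range(
            response.headers["Content-Range"])
```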

Chris Rebert · Jul 1, 2011

Даниил Рыжков said:
Hello again!
Another question: urlopen() reads full file's content, but how can I
get page by small parts?

I don't think that's true. Just pass .read() the number of bytes you
want to read, just as you would with an actual file object.

Cheers,
Chris
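
To illustrate the .read(n) suggestion, here is a small sketch of a chunked read loop. It works on any file-like object, including the response returned by urlopen(). The name iter_chunks and the 8192-byte default are arbitrary choices for this example, and io.BytesIO stands in for a real response so the snippet runs without a network connection.

```python
import io


def iter_chunks(stream, chunk_size=8192):
    """Yield successive pieces of `stream`, at most chunk_size bytes each."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means the stream is exhausted.
            break
        yield chunk


# Stand-in for a urlopen() response: 20000 bytes of dummy data.
fake_response = io.BytesIO(b"x" * 20000)
sizes = [len(chunk) for chunk in iter_chunks(fake_response)]
print(sizes)
```

With a real response you would loop the same way and write each chunk to an open file, so only one chunk is held in memory at a time. Unlike the Range header approach, this still downloads the whole resource; it only controls how much is read per call.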

Даниил Рыжков · Jul 2, 2011

Thanks, everyone!
Problem solved.
