Is it possible to get image size before/without downloading?

aldonnelley · Jul 22, 2006

Hi there: a bit of a left-field question, I think.
I'm writing a program that analyses image files downloaded with a basic
crawler, and it's slow, mainly because I only want to analyse files
within a certain size range, and I'm having to download all the files
on the page, open them, get their size, and then only analyse the ones
that are in that size range.
Is there a way (in python, of course!) to get the size of images before
or without downloading them? I've checked around, and I can't seem to
find anything promising...

Anybody got any clues?

Cheers, Al.

Josiah Manson · Jul 22, 2006

In the headers of an HTTP response, most servers will specify a
Content-Length, which is the number of bytes in the body of the response.
Normally, when using the GET method, the headers are returned with the
body following. It is possible to make a HEAD request to the server
instead, which returns only the headers and will hopefully tell you
the file size.

If you want to know the actual dimensions of the image, I don't know of
anything in HTTP that will tell you. You will probably just have to
download the image to find that out. Relevant HTTP specs below if you
care.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

The above is true regardless of language. In Python there appears to be
an httplib module. I would call request with the HEAD method.

http://docs.python.org/lib/httpconnection-objects.html
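
A minimal sketch of the HEAD approach described above, written for Python 3, where httplib became http.client and urllib.request can send a HEAD request directly. The function names and the size limits are hypothetical, chosen only for illustration. Servers are not obliged to send Content-Length, so the sketch treats a missing or malformed value as "unknown".

```python
from urllib.request import Request, urlopen


def parse_content_length(value):
    """Turn a Content-Length header value into an int, or None if unusable."""
    if value is None or not value.strip().isdigit():
        return None
    return int(value)


def head_content_length(url, timeout=10):
    """Send a HEAD request and return the reported body size, or None."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers.get("Content-Length"))


def in_size_range(size, smallest, largest):
    """True when a known size lies within [smallest, largest] bytes."""
    return size is not None and smallest <= size <= largest


# Hypothetical usage: only download images between 10 kB and 500 kB.
# if in_size_range(head_content_length(url), 10_000, 500_000):
#     download(url)
```

When the size is unknown, the crawler still has to decide whether to download the file anyway or skip it; the sketch skips it.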

aldonnelley · Jul 22, 2006

Thanks Josiah

I thought as much... Still, it'll help me immensely to cut the
downloads from a page to only those that are within a file-size range,
even if this gets me some images that are out-of-spec dimensionally.

Cheers, Al.

(Oh, and if anyone still has a bright idea about how to get image
dimensions without downloading, it'd be great to hear!)

Peter Otten · Jul 22, 2006

aldonnelley said:
Hi there: a bit of a left-field question, I think.
I'm writing a program that analyses image files downloaded with a basic
crawler, and it's slow, mainly because I only want to analyse files
within a certain size range, and I'm having to download all the files
on the page, open them, get their size, and then only analyse the ones
that are in that size range.
Is there a way (in python, of course!) to get the size of images before
or without downloading them? I've checked around, and I can't seem to
find anything promising...

Anybody got any clues?

The PIL can determine the size of an image from some "large enough" chunk at
the beginning of the image, e.g.:

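from io import BytesIO
from urllib.request import urlopen

from PIL import Image  # Pillow; the original PIL used a bare "import Image"

# Read only the first 512 bytes; some files need a larger chunk.
with urlopen("http://www.python.org/images/success/nasa.jpg") as f:
    s = BytesIO(f.read(512))
print(Image.open(s).size)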

Peter
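
A related sketch, not something proposed in the thread: HTTP also lets a client ask for just the first bytes of a file with a Range header, so the server does not need to send the rest. The helper names below are hypothetical, and the code assumes Python 3. Servers are free to ignore Range and send the whole body, so the read is capped either way.

```python
from urllib.request import Request, urlopen


def range_header(num_bytes):
    """Build a Range header asking for the first num_bytes bytes."""
    if num_bytes < 1:
        raise ValueError("num_bytes must be positive")
    # Byte ranges are inclusive, so the first 512 bytes are 0-511.
    return {"Range": "bytes=0-%d" % (num_bytes - 1)}


def fetch_prefix(url, num_bytes=512, timeout=10):
    """Return at most num_bytes bytes from the start of the resource."""
    request = Request(url, headers=range_header(num_bytes))
    with urlopen(request, timeout=timeout) as response:
        # A server that ignores Range replies with the full body,
        # so never read more than was asked for.
        return response.read(num_bytes)
```

The returned bytes can be wrapped in a file-like object and handed to an image library in the same way as the chunk in the example above.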

Marc 'BlackJack' Rintsch · Jul 22, 2006

aldonnelley said:
(Oh, and if anyone still has a bright idea about how to get image
dimensions without downloading, it'd be great to hear!)

Most image formats have some sort of header with the dimensions
information so it's enough to download this header. Depends on the image
format how much of the file has to be read and how the information is
encoded.

Ciao,
Marc 'BlackJack' Rintsch
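
To illustrate the point about headers, here is a minimal sketch that reads the dimensions straight out of the first bytes of a PNG or GIF file using only the standard library. The function name is hypothetical. JPEG is left out deliberately: its dimensions sit in a marker segment whose position varies from file to file, so it needs a small scanning loop rather than a fixed offset.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def image_size_from_prefix(data):
    """Return (width, height) from the first bytes of a PNG or GIF, else None."""
    # PNG: 8-byte signature, then the IHDR chunk (4-byte length, b"IHDR"),
    # then width and height as big-endian 32-bit integers.
    if len(data) >= 24 and data[:8] == PNG_SIGNATURE and data[12:16] == b"IHDR":
        return struct.unpack(">II", data[16:24])
    # GIF: 6-byte signature, then the logical screen width and height
    # as little-endian 16-bit integers.
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<HH", data[6:10])
    # Unknown format, or not enough bytes to decide.
    return None
```

With this layout, 24 bytes are enough for a PNG and 10 for a GIF, which is far less than the whole file.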
