Determining if file is valid image file

  • Thread starter André
A

André

Other than installing PIL, is there a "simple" way using Python only
to determine if a file is a valid image file?

I'd be happy if I could at least identify valid image files for gif,
jpeg and png. Pointers to existing modules or examples would be
appreciated.

The reason why I'd prefer not using PIL is that I'd like to bundle
such a function/module in my app.

André
 
A

André

I should have added: I'm interested in validating the file *content*
- not the filename :)
 
L

Larry Bates

André said:
I should have added: I'm interested in validating the file *content*
- not the filename :)
And what's wrong with bundling PIL in your application?

-Larry
 
J

Jarek Zgoda

André wrote:
I should have added: I'm interested in validating the file *content*
- not the filename :)

Is the module imghdr enough for your needs?
 
T

Thomas Jollans

I should have added: I'm interested in validating the file *content*
- not the filename :)

The file name has nothing to do with the type :p

A straightforward way you won't like: read the specs for all formats you're
interested in and write the function yourself ;-)
 
K

kyosohma

A

André

kyosohma said:
Use the md5 module to create checksums. Links below:

Sorry, I fail to see how this helps me to identify if a file I
retrieve from somewhere is a valid image file...

kyosohma said:
http://www.peterbe.com/plog/using-m...htm
http://docs.python.org/lib/module-md5.html

Larry is right too...what's wrong with bundling PIL or any third party
module?

Why not bundle PIL? Because I'm trying to keep the size of my app
as small as possible.
I don't mind bundling some other modules from third parties (in fact,
I already include three modules from ElementTree...).

André
 
D

Dave Hughes

André said:
Other than installing PIL, is there a "simple" way using Python only
to determine if a file is a valid image file?

I'd be happy if I could at least identify valid image files for gif,
jpeg and png. Pointers to existing modules or examples would be
appreciated.

The reason why I'd prefer not using PIL is that I'd like to bundle
such a function/module in my app.

Any reason you don't want to bundle PIL? The license looks like a
fairly standard BSD style license to me which I don't think precludes
you from bundling it (other than having to reproduce the (very small)
license text in any documentation).

Otherwise, it depends on exactly what you mean by "valid". You could do
something as simple as check the "magic" number in the header of the
file. Most image formats have something like this:

* PNG: byte sequence 89 50 4E 47 0D 0A 1A 0A
* GIF: "GIF89a" or "GIF87a"
* JPG: byte sequence FF D8 FF E0 nn nn 4A 46 49 46 00 (for JFIF)

Naturally, this won't guarantee the rest of the file is valid, but
might be sufficient for your purposes (it's one of the methods the
"file" command uses for recognizing file types).


HTH,

Dave.
--
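
For what it's worth, the magic numbers Dave lists translate into a few lines of standard-library Python. This is only a minimal sketch: the names `sniff_image_type` and `sniff_image_file` are made up for illustration, and the JPEG test is deliberately looser than the JFIF sequence above (it accepts anything starting with FF D8 FF, so Exif files pass too), which is an assumption of this sketch rather than something from the thread.

```python
# Leading "magic" bytes for the three formats discussed in this thread.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"       # 89 50 4E 47 0D 0A 1A 0A
GIF_MAGICS = (b"GIF87a", b"GIF89a")
JPEG_MAGIC = b"\xff\xd8\xff"           # start-of-image marker plus the next marker's FF


def sniff_image_type(header):
    """Return 'png', 'gif', 'jpeg' or None, looking at the leading bytes only."""
    if header.startswith(PNG_MAGIC):
        return "png"
    if header.startswith(GIF_MAGICS):   # startswith accepts a tuple of prefixes
        return "gif"
    if header.startswith(JPEG_MAGIC):
        return "jpeg"
    return None


def sniff_image_file(path):
    """Read the first 16 bytes of the file at `path` and sniff them."""
    with open(path, "rb") as f:
        return sniff_image_type(f.read(16))
```

As Dave says, a match only tells you the file starts like an image; nothing after the header is inspected.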
 
B

brad

André said:
I should have added: I'm interested in validating the file *content*
- not the filename :)

Some formats have identifying headers... I think jpeg is an example of
this. Open it with a hex editor or just read the first few bytes and see
for yourself.

Brad
 
J

Jarek Zgoda

André wrote:
Yes, thanks.

Be aware that broken images (i.e. partially downloaded) in many cases
pass the imghdr.what() test. This function checks for patterns in files,
just like "file" utility.
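
Jarek's caveat is easy to reproduce. The sketch below feeds imghdr nothing but the eight-byte PNG signature, which no viewer could display, and it is still reported as a PNG. Note that imghdr was removed from the standard library in Python 3.13, so the import is guarded; the fallback branch and the helper name `identified_as_png` are assumptions of this example, not part of imghdr.

```python
try:
    import imghdr            # standard library up to Python 3.12
except ImportError:          # the module was removed in Python 3.13
    imghdr = None

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def identified_as_png(data):
    """True if `data` is identified as PNG from its leading bytes."""
    if imghdr is not None:
        # With h= given, imghdr tests the bytes directly and ignores the file argument.
        return imghdr.what(None, h=data) == "png"
    # Fallback (assumption of this sketch): do the same signature test by hand.
    return data.startswith(PNG_MAGIC)


# Only the signature: no IHDR chunk, no pixel data.
truncated = PNG_MAGIC
print(identified_as_png(truncated))   # True
```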
 
A

André

Jarek Zgoda said:
Be aware that broken images (i.e. partially downloaded) in many cases
pass the imghdr.what() test. This function checks for patterns in files,
just like "file" utility.

That's all I need; I'm not concerned about broken images. I am
writing a web app and need to prevent someone using redirection to
send malicious content when I'm supposedly loading an image file. So,
what I plan to do is open the file using urlopen, preload the image
and see if it is valid; if so, I pass it on to the browser.

To find out more, look for "redirect" on the following page (it is the
first occurrence of that word)
http://ha.ckers.org/xss.html
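
A rough sketch of the plan André describes (open the URL, look at what actually comes back, and only then treat it as an image) might look like the following. The function names, the signature table and the 10-second timeout are all assumptions made for illustration. `urlopen` follows redirects by default, so the bytes being sniffed are those of the final response.

```python
import io
from urllib.request import urlopen

# Leading bytes for the formats mentioned in the thread.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"\xff\xd8\xff": "jpeg",
}


def sniff_stream(stream):
    """Read the first bytes of a binary file-like object and name the format."""
    head = stream.read(16)
    for magic, kind in SIGNATURES.items():
        if head.startswith(magic):
            return kind
    return None


def fetch_image_type(url, timeout=10):
    """Open `url` (redirects are followed) and sniff whatever is finally served."""
    with urlopen(url, timeout=timeout) as response:
        return sniff_stream(response)


# Offline demonstration with in-memory data instead of a real URL.
print(sniff_stream(io.BytesIO(b"GIF89a" + b"\x00" * 20)))        # gif
print(sniff_stream(io.BytesIO(b"<script>alert(1)</script>")))    # None
```

This only checks how the response starts. If the content is then passed on to the browser, it is safer to serve the bytes that were checked rather than fetch the URL a second time.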

 
T

Terry Reedy

André wrote:
[...]
Be aware that broken images (i.e. partially downloaded) in many cases
pass the imghdr.what() test.

To put it another way, the only way to determine whether a coded file is
valid may be to decode it. And even then, it may be corrupted in the sense
that the decoded version may have artifacts not in the original. I have
seen the latter both in jpeg images and movie DVDs.

tjr
 
J

Jarek Zgoda

Terry Reedy wrote:
Other than installing PIL, is there a "simple" way using Python only
to determine if a file is a valid image file?
[...]
Be aware that broken images (i.e. partially downloaded) in many cases
pass the imghdr.what() test.

To put it another way, the only way to determine whether a coded file is
valid may be to decode it. And even then, it may be corrupted in the sense
that the decoded version may have artifacts not in the original. I have
seen the latter both in jpeg images and movie DVDs.

That's what I mean: images that can't be read using PIL are sometimes
recognized by imghdr.what(), as happens with "file" too. Both of
these tools are identification (not validation) utilities. To be
sure the image is "really valid", you have to use some image
manipulation program (or library), like ImageMagick (or PIL). Sometimes
imghdr.what() is enough, sometimes you need more. ;)
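
To illustrate the gap between identification and validation without PIL, here is a standard-library sketch that goes one step beyond the header for PNG only: it walks the chunk list and checks each chunk's CRC. It is not a decoder (the pixel data is never decompressed), so it is still weaker than loading the image with PIL or ImageMagick. The helper names `png_structure_ok` and `make_chunk` are hypothetical.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_structure_ok(data):
    """True only if every chunk up to IEND is complete and its CRC matches."""
    if not data.startswith(PNG_MAGIC):
        return False
    pos = len(PNG_MAGIC)
    first = True
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            return False                      # chunk runs past the end: truncated
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[end - 4:end])
        if (zlib.crc32(ctype + payload) & 0xFFFFFFFF) != crc:
            return False                      # corrupted chunk
        if first and ctype != b"IHDR":
            return False                      # IHDR must be the first chunk
        first = False
        if ctype == b"IEND":
            return True                       # reached the end marker intact
        pos = end
    return False                              # data ended before IEND


def make_chunk(ctype, payload=b""):
    """Build one PNG chunk (used here only to fabricate demo data)."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)


# A hand-built 1x1 greyscale PNG, then the same bytes cut short.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
tiny = (PNG_MAGIC + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"IDAT", zlib.compress(b"\x00\x00")) + make_chunk(b"IEND"))
print(png_structure_ok(tiny))         # True
print(png_structure_ok(tiny[:40]))    # False: cut off inside the IDAT chunk
```

A header-only check would accept both of those inputs, which is the case Jarek is warning about.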
 
