Determining if file is valid image file

Discussion in 'Python' started by André, Aug 2, 2007.

  1. Other than installing PIL, is there a "simple" way using Python only
    to determine if a file is a valid image file?

    I'd be happy if I could at least identify valid image files for gif,
    jpeg and png. Pointers to existing modules or examples would be
    appreciated.

    The reason why I'd prefer not using PIL is that I'd like to bundle
    such a function/module in my app.

    André
    André, Aug 2, 2007
    #1

  2. I should have added: I'm interested in validating the file *content*
    - not the filename :)
    André, Aug 2, 2007
    #2

  3. Larry Bates

    André wrote:
    > The reason why I'd prefer not using PIL is that I'd like to bundle
    > such a function/module in my app.

    And what's wrong with bundling PIL in your application?

    -Larry
    Larry Bates, Aug 2, 2007
    #3
  4. Jarek Zgoda

    André wrote:
    > I should have added: I'm interested in validating the file *content*
    > - not the filename :)


    Is the module imghdr enough for your needs?

    --
    Jarek Zgoda
    Skype: jzgoda | GTalk: | voice: +48228430101

    "We read Knuth so you don't have to." (Tim Peters)
    Jarek Zgoda, Aug 2, 2007
    #4
  5. On Thursday 02 August 2007, André wrote:
    > I should have added: I'm interested in validating the file *content*
    > - not the filename :)


    The file name has nothing to do with the type :p

    A straightforward way you won't like: read the specs for all formats you're
    interested in and write the function yourself ;-)
    Thomas Jollans, Aug 2, 2007
    #5
  6. Guest

    On Aug 2, 9:35 am, Thomas Jollans wrote:
    > A straightforward way you won't like: read the specs for all formats you're
    > interested in and write the function yourself ;-)


    Use the md5 module to create checksums. Links below:

    http://www.peterbe.com/plog/using-md5-to-check-equality-between-files
    http://effbot.org/librarybook/md5.htm
    http://docs.python.org/lib/module-md5.html

    Larry is right too...what's wrong with bundling PIL or any third party
    module?

    Mike
    , Aug 2, 2007
    #6
  7. On Aug 2, 11:34 am, Jarek Zgoda wrote:
    > Is the module imghdr enough for your needs?


    Yes, thanks.
    André, Aug 2, 2007
    #7
  8. On Aug 2, 11:38 am, wrote:
    > Use the md5 module to create checksums. Links below:


    Sorry, I fail to see how this helps me to identify if a file I
    retrieve from somewhere is a valid image file...

    > Larry is right too...what's wrong with bundling PIL or any third party
    > module?


    Why not bundle PIL? Because I'm trying to keep the size of my app as
    small as possible. I don't mind bundling some other modules from third
    parties (in fact, I already include three modules from ElementTree...).

    André
    André, Aug 2, 2007
    #8
  9. Dave Hughes

    André wrote:

    > Other than installing PIL, is there a "simple" way using Python only
    > to determine if a file is a valid image file?
    >
    > I'd be happy if I could at least identify valid image files for gif,
    > jpeg and png. Pointers to existing modules or examples would be
    > appreciated.
    >
    > The reason why I'd prefer not using PIL is that I'd like to bundle
    > such a function/module in my app.


    Any reason you don't want to bundle PIL? The license looks like a
    fairly standard BSD style license to me which I don't think precludes
    you from bundling it (other than having to reproduce the (very small)
    license text in any documentation).

    Otherwise, it depends on exactly what you mean by "valid". You could do
    something as simple as check the "magic" number in the header of the
    file. Most image formats have something like this:

    * PNG: byte sequence 89 50 4E 47 0D 0A 1A 0A
    * GIF: "GIF89a" or "GIF87a"
    * JPG: byte sequence FF D8 FF E0 nn nn 4A 46 49 46 00 (for JFIF)

    Naturally, this won't guarantee the rest of the file is valid, but
    might be sufficient for your purposes (it's one of the methods the
    "file" command uses for recognizing file types).


    HTH,

    Dave.
    --
    Dave Hughes, Aug 2, 2007
    #9
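
The magic numbers listed in the post above translate fairly directly into a header check. What follows is a minimal sketch in current Python 3, not code from the thread. The names `sniff_image_type` and `sniff_file` are made up for illustration, and the JPEG test is deliberately relaxed to the first three bytes (FF D8 FF) so that non-JFIF files such as Exif also match; that relaxation is an assumption, not part of the list above.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"        # 89 50 4E 47 0D 0A 1A 0A
GIF_MAGICS = (b"GIF87a", b"GIF89a")
JPEG_MAGIC = b"\xff\xd8\xff"            # start-of-image marker plus first byte of the next marker


def sniff_image_type(header):
    """Guess the image type from the first bytes of a file, or return None."""
    if header.startswith(PNG_MAGIC):
        return "png"
    if header[:6] in GIF_MAGICS:
        return "gif"
    if header.startswith(JPEG_MAGIC):
        return "jpeg"
    return None


def sniff_file(path):
    """Read only the first few bytes of the file at `path` and sniff them."""
    with open(path, "rb") as f:
        return sniff_image_type(f.read(16))
```

As the post says, a match only means the file starts like an image; it says nothing about whether the rest of the file is intact.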
  10. brad

    André wrote:

    > I should have added: I'm interested in validating the file *content*
    > - not the filename :)


    Some formats have identifying headers... I think jpeg is an example of
    this. Open it with a hex editor or just read the first few bytes and see
    for yourself.

    Brad
    brad, Aug 2, 2007
    #10
  11. Jarek Zgoda

    André wrote:
    >> Is the module imghdr enough for your needs?
    >
    > Yes, thanks.


    Be aware that broken images (e.g. partially downloaded ones) in many cases
    pass the imghdr.what() test. This function checks for patterns in files,
    just like the "file" utility.

    --
    Jarek Zgoda
    http://jpa.berlios.de/
    Jarek Zgoda, Aug 2, 2007
    #11
  12. On Aug 2, 4:25 pm, Jarek Zgoda wrote:
    > Be aware that broken images (i.e. partially downloaded) in many cases
    > pass the imghdr.what() test. This function checks for patterns in files,
    > just like "file" utility.


    That's all I need; I'm not concerned about broken images. I am
    writing a web app and need to prevent someone using redirection to
    send malicious content when I'm supposedly loading an image file. So,
    what I plan to do is open the file using urlopen, preload the image
    and see if it is valid; if so, I pass it on to the browser.

    To find out more, look for "redirect" on the following page (it is the
    first occurrence of that word)
    http://ha.ckers.org/xss.html
    André, Aug 2, 2007
    #12
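
The plan described in the post above (fetch with urlopen, check the content, then pass it on) could look roughly like the sketch below. It assumes current Python 3, where `urlopen` lives in `urllib.request`. The function name `fetch_if_image`, the `opener` parameter (present only so the check can be exercised without network access), the size cap, and the `example.invalid` URL are all hypothetical choices made for illustration.

```python
import io
from urllib.request import urlopen

MAX_BYTES = 5 * 1024 * 1024  # arbitrary size cap, chosen for illustration

SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpeg"),
)


def fetch_if_image(url, opener=urlopen):
    """Download `url`; return (kind, data) if the body starts like an image, else None."""
    with opener(url) as response:
        # Read one byte past the cap so oversized bodies can be detected.
        data = response.read(MAX_BYTES + 1)
    if len(data) > MAX_BYTES:
        return None
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind, data
    return None


# Offline demonstration: stand-in openers replace real network access.
def fake_gif(url):
    return io.BytesIO(b"GIF89a" + b"\x00" * 16)


def fake_html(url):
    return io.BytesIO(b"<html><script>alert(1)</script></html>")


print(fetch_if_image("http://example.invalid/pic.gif", opener=fake_gif)[0])
print(fetch_if_image("http://example.invalid/pic.gif", opener=fake_html))
```

Because `urlopen` follows HTTP redirects by default, the check is applied to whatever body is finally returned. A matching signature is still only a first filter: bytes that begin with a valid image header can carry other content after it, so this alone should not be treated as proof that the data is harmless.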
  13. Terry Reedy

    "Jarek Zgoda" wrote in message
    news:f8tbd6$rnh$...
    André wrote:

    >>>> Other than installing PIL, is there a "simple" way using Python only
    >>>> to determine if a file is a valid image file?

    [...]
    > Be aware that broken images (i.e. partially downloaded) in many cases
    > pass the imghdr.what() test.


    To put it another way, the only way to determine whether a coded file is
    valid may be to decode it. And even then, it may be corrupted in the sense
    that the decoded version may have artifacts not in the original. I have
    seen the latter both in jpeg images and movie DVDs.

    tjr
    Terry Reedy, Aug 2, 2007
    #13
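
Short of fully decoding an image, a format's own integrity data can catch some damage. As an illustration for PNG only, the sketch below walks the chunk list and verifies each chunk's CRC using just the standard library. The layout it relies on (8-byte signature, then chunks of length, type, data, and a CRC over type plus data, ending with IEND) comes from the PNG specification. The names `png_structure_ok` and `make_chunk` are hypothetical, and this is not a full validator: it does not decompress or interpret the pixel data.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_structure_ok(data):
    """True if `data` is a PNG signature followed by CRC-correct chunks ending in IEND."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4          # header + payload + CRC
        if end > len(data):
            return False                    # chunk runs past the end: truncated file
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and payload, not the length field.
        if zlib.crc32(data[pos + 4:end - 4]) & 0xFFFFFFFF != stored_crc:
            return False
        pos = end
        if chunk_type == b"IEND":
            return pos == len(data)         # strict: reject trailing bytes
    return False                            # ran out of data before IEND


def make_chunk(chunk_type, payload):
    """Build one PNG chunk (used here only to construct a test image)."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


# A 1x1 8-bit greyscale PNG assembled by hand for the demonstration.
tiny_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + make_chunk(b"IEND", b"")
)

print(png_structure_ok(tiny_png))        # complete file
print(png_structure_ok(tiny_png[:-5]))   # same file with the tail cut off
```

A partially downloaded PNG would still pass a header-only test but fails here because its last chunk is incomplete. GIF and JPEG have no per-chunk checksums, so an equivalent check for those formats would need a different approach.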
  14. Jarek Zgoda

    Terry Reedy wrote:

    >>>>> Other than installing PIL, is there a "simple" way using Python only
    >>>>> to determine if a file is a valid image file?

    > [...]
    >> Be aware that broken images (i.e. partially downloaded) in many cases
    >> pass the imghdr.what() test.

    >
    > To put it another way, the only way to determine whether a coded file is
    > valid may be to decode it. And even then, it may be corrupted in the sense
    > that the decoded version may have artifacts not in the original. I have
    > seen the latter both in jpeg images and movie DVDs.


    That's what I mean: images that can't be read using PIL are sometimes
    recognized by imghdr.what(), as happens with "file" too. Both of
    these tools are identification (not validation) utilities. To be
    sure the image is "really valid", you have to use some image
    manipulation program (or library), like ImageMagick (or PIL). Sometimes
    imghdr.what() is enough, sometimes you need more. ;)

    --
    Jarek Zgoda
    http://jpa.berlios.de/
    Jarek Zgoda, Aug 2, 2007
    #14
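
The difference between identification and validation is easy to demonstrate. The sketch below feeds `imghdr.what()` a buffer holding nothing but a PNG signature and the start of the first chunk. Note for present-day readers: `imghdr` was deprecated in Python 3.11 and removed in 3.13, so the example falls back to a plain signature comparison when the module is missing. The helper name `identify` and that fallback are assumptions added for illustration.

```python
import warnings

try:
    with warnings.catch_warnings():
        # Python 3.11 and 3.12 warn that imghdr is deprecated.
        warnings.simplefilter("ignore", DeprecationWarning)
        import imghdr
except ImportError:
    imghdr = None  # Python 3.13+: module no longer in the standard library

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def identify(header):
    """Identify image bytes with imghdr when available, else by PNG signature only."""
    if imghdr is not None:
        # With h given, imghdr.what() ignores the file argument.
        return imghdr.what(None, h=header)
    return "png" if header.startswith(PNG_SIGNATURE) else None


# Signature plus the first 8 bytes of an IHDR chunk: no image data at all.
truncated_png = PNG_SIGNATURE + b"\x00\x00\x00\rIHDR"
print(identify(truncated_png))
```

The buffer is identified as PNG even though no decoder could produce an image from it, which is the behaviour described in the post above.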