Finding MIME type for a data stream

Discussion in 'Python' started by Tobiah, Mar 8, 2012.

  1. Tobiah

    Tobiah Guest

    I'm pulling image data from a database blob, and serving
    it from a web2py app. I have to send the correct
    Content-Type header, so I need to detect the image type.

    Everything that I've found on the web so far, needs a file
    name on the disk, but I only have the data.

    It looks like the 'magic' package might be of use, but
    I can't find any documentation for it.

    Also, it seems like image/png works for other types
    of image data, while image/foo does not, yet I'm afraid
    that not every browser will play along as nicely.

    Thanks!

    Tobiah
     
    Tobiah, Mar 8, 2012
    #1

  2. Tobiah

    Dave Angel Guest

    On 03/08/2012 04:55 PM, Tobiah wrote:
    > I'm pulling image data from a database blob, and serving
    > it from a web2py app. I have to send the correct
    > Content-Type header, so I need to detect the image type.
    >
    > Everything that I've found on the web so far, needs a file
    > name on the disk, but I only have the data.
    >
    > It looks like the 'magic' package might be of use, but
    > I can't find any documentation for it.
    >
    > Also, it seems like image/png works for other types
    > of image data, while image/foo does not, yet I'm afraid
    > that not every browser will play along as nicely.
    >
    > Thanks!
    >
    > Tobiah


    First step, ask the authors of the database what format of data this
    blob is in.

    Failing that, write the same data locally as a binary file, and see what
    application can open it. Or if you're on a Linux system, run file on
    it. "file" can identify most data formats (not just images) just by
    looking at the data.

    That assumes, of course, that there's any consistency in the data coming
    out of the database. What happens if next time this blob is an Excel
    spreadsheet?

    --

    DaveA
     
    Dave Angel, Mar 8, 2012
    #2

  3. Tobiah

    Tobiah Guest

    On 03/08/2012 02:11 PM, Dave Angel wrote:
    > On 03/08/2012 04:55 PM, Tobiah wrote:
    >> I'm pulling image data from a database blob, and serving
    >> it from a web2py app. I have to send the correct
    >> Content-Type header, so I need to detect the image type.
    >>
    >> Everything that I've found on the web so far, needs a file
    >> name on the disk, but I only have the data.
    >>
    >> It looks like the 'magic' package might be of use, but
    >> I can't find any documentation for it.
    >>
    >> Also, it seems like image/png works for other types
    >> of image data, while image/foo does not, yet I'm afraid
    >> that not every browser will play along as nicely.
    >>
    >> Thanks!
    >>
    >> Tobiah

    >
    > First step, ask the authors of the database what format of data this
    > blob is in.
    >
    > Failing that, write the same data locally as a binary file, and see what
    > application can open it. Or if you're on a Linux system, run file on
    > it. "file" can identify most data formats (not just images) just by
    > looking at the data.
    >
    > That assumes, of course, that there's any consistency in the data coming
    > out of the database. What happens if next time this blob is an Excel
    > spreadsheet?
    >



    I should simplify my question. Let's say I have a string
    that contains image data called 'mystring'.

    I want to do

    mime_type = some_magic(mystring)

    and get back 'image/jpeg' or 'image/png' or whatever is
    appropriate for the image data.

    Thanks!

    Tobiah
     
    Tobiah, Mar 8, 2012
    #3
  4. Tobiah

    Tobiah Guest

    Also, I realize that I could write the data to a file
    and then use one of the modules that want a file path.
    I would prefer not to do that.

    Thanks
     
    Tobiah, Mar 8, 2012
    #4
  5. Tobiah

    Dave Angel Guest

    On 03/08/2012 05:28 PM, Tobiah wrote:
    > <snip>
    >
    >
    > I should simplify my question. Let's say I have a string
    > that contains image data called 'mystring'.
    >
    > I want to do
    >
    > mime_type = some_magic(mystring)
    >
    > and get back 'image/jpeg' or 'image/png' or whatever is
    > appropriate for the image data.
    >
    > Thanks!
    >
    > Tobiah


    I have to assume you're talking python 2, since in python 3, strings
    cannot generally contain image data. In python 2, characters are pretty
    much interchangeable with bytes.

    Anyway, I don't know any way in the standard lib to distinguish
    arbitrary image formats. (There very well could be one.) The file
    program I referred to was an external utility, which you could run with
    the subprocess module.
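
    A minimal sketch of calling the external file utility from Python
    without a temporary file, assuming a Unix-like system where a "file"
    binary that accepts --brief and --mime-type is on the PATH. The helper
    name mime_from_file_utility is made up for illustration:

```python
import subprocess


def mime_from_file_utility(data):
    """Ask the external `file` program for the MIME type of a byte string."""
    # The trailing "-" tells `file` to read from standard input,
    # so the data never has to be written to disk.
    proc = subprocess.Popen(
        ["file", "--brief", "--mime-type", "-"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    out, _ = proc.communicate(data)
    return out.decode("ascii", "replace").strip()
```

    Spawning a process per image is comparatively slow, so this is
    probably best done once when the blob is stored rather than on
    every request.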

    If you're looking for a specific, small list of file formats, you could
    make yourself a signature list. Most (not all) formats distinguish
    themselves in the first few bytes. For example, a standard zip file
    starts with "PK" for Phil Katz. A Windows exe starts with "MZ" for
    Mark Zbikowski. And I believe a jpeg file starts with the hex bytes
    ff d8 ff, usually followed by e0 or e1.
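
    A signature list for the three formats that come up in this thread
    could be sketched roughly as follows. The names SIGNATURES and
    sniff_image_mime are hypothetical, and the fallback type is an
    assumption about what the caller wants for unknown data:

```python
# Leading byte signatures for PNG, GIF and JPEG.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_image_mime(data, default="application/octet-stream"):
    """Return a MIME type based on the first few bytes of `data`."""
    for magic_bytes, mime_type in SIGNATURES:
        if data.startswith(magic_bytes):
            return mime_type
    # Nothing matched: fall back rather than guess an image type.
    return default
```

    This only inspects the header, so it says nothing about whether
    the rest of the data is a valid image.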

    If you'd like to see a list of available modules, help() is your
    friend. You can start with help("modules") to see quite a long list.
    And I was surprised how many image related things already are there. So
    maybe there's something I don't know about that could help.

    --

    DaveA
     
    Dave Angel, Mar 8, 2012
    #5
  6. Tobiah

    Tobiah Guest


    > I have to assume you're talking python 2, since in python 3, strings
    > cannot generally contain image data. In python 2, characters are pretty
    > much interchangeable with bytes.


    Yeah, python 2


    > if you're looking for a specific, small list of file formats, you could
    > make yourself a signature list. Most (not all) formats distinguish
    > themselves in the first few bytes.


    Yeah, maybe I'll just do that. I'm allowing users to paste
    images into a rich-text editor, so I'm pretty much looking
    at .png, .gif, or .jpg. Those should be pretty easy to
    distinguish by looking at the first few bytes.

    Pasting images may sound weird, but I'm using a jquery
    widget called cleditor that takes image data from the
    clipboard and replaces it with inline base64 data.
    The html from the editor ends up as an email, and the
    inline images cause the emails to be tossed in the
    spam folder for most people. So I'm parsing the
    emails, storing the image data, and replacing the
    inline images with an img tag that points to a
    web2py app that takes arguments that tell it which
    image to pull from the database.

    Now that I think of it, I could use php to detect the
    image type, and store that in the database. Not quite
    as clean, but that would work.

    Tobiah
     
    Tobiah, Mar 8, 2012
    #6
  7. On Thu, 08 Mar 2012 15:40:13 -0800, Tobiah <> declaimed
    the following in gmane.comp.python.general:


    > Pasting images may sound weird, but I'm using a jquery
    > widget called cleditor that takes image data from the
    > clipboard and replaces it with inline base64 data.


    In Windows, I'd expect "device independent bitmap" to be the result
    of a clipboard image...
    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Mar 9, 2012
    #7
  8. On 8-3-2012 23:34, Tobiah wrote:
    > Also, I realize that I could write the data to a file
    > and then use one of the modules that want a file path.
    > I would prefer not to do that.
    >
    > Thanks
    >


    Use StringIO then, instead of a file on disk
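
    As a rough illustration of the idea: an in-memory buffer behaves
    like an open binary file, so it can be handed to any library that
    accepts a file object (it does not help with libraries that insist
    on a path string). The sketch uses io.BytesIO, which plays the role
    of StringIO for byte data; the helper name as_file and the sample
    bytes are made up:

```python
import io


def as_file(data):
    """Wrap raw bytes in a seekable, file-like object held in memory."""
    return io.BytesIO(data)


fake_file = as_file(b"GIF89a" + b"\x00" * 16)  # stand-in for the blob
header = fake_file.read(6)  # reads like an ordinary binary file
fake_file.seek(0)           # rewind so a library can parse from the start
```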

    Irmen
     
    Irmen de Jong, Mar 9, 2012
    #8
  9. Tobiah

    Jon Clements Guest

    On Thursday, 8 March 2012 23:40:13 UTC, Tobiah wrote:
    > > I have to assume you're talking python 2, since in python 3, strings
    > > cannot generally contain image data. In python 2, characters are pretty
    > > much interchangeable with bytes.

    >
    > Yeah, python 2
    >
    >
    > > if you're looking for a specific, small list of file formats, you could
    > > make yourself a signature list. Most (not all) formats distinguish
    > > themselves in the first few bytes.

    >
    > Yeah, maybe I'll just do that. I'm allowing users to paste
    > images into a rich-text editor, so I'm pretty much looking
    > at .png, .gif, or .jpg. Those should be pretty easy to
    > distinguish by looking at the first few bytes.
    >
    > Pasting images may sound weird, but I'm using a jquery
    > widget called cleditor that takes image data from the
    > clipboard and replaces it with inline base64 data.
    > The html from the editor ends up as an email, and the
    > inline images cause the emails to be tossed in the
    > spam folder for most people. So I'm parsing the
    > emails, storing the image data, and replacing the
    > inline images with an img tag that points to a
    > web2py app that takes arguments that tell it which
    > image to pull from the database.
    >
    > Now that I think of it, I could use php to detect the
    > image type, and store that in the database. Not quite
    > as clean, but that would work.
    >
    > Tobiah


    Something like the following might be worth a go:
    (untested)

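    from StringIO import StringIO  # Python 2; on Python 3 use io.BytesIO
    from PIL import Image
    img = Image.open(StringIO(blob))  # blob holds the raw image bytes
    print img.format  # e.g. 'PNG' or 'JPEG'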

    HTH
    Jon.

    PIL: http://www.pythonware.com/library/pil/handbook/image.htm
     
    Jon Clements, Mar 9, 2012
    #9
  10. Tobiah

    Peter Otten Guest

    Tobiah wrote:

    > I'm pulling image data from a database blob, and serving
    > it from a web2py app. I have to send the correct
    > Content-Type header, so I need to detect the image type.
    >
    > Everything that I've found on the web so far, needs a file
    > name on the disk, but I only have the data.
    >
    > It looks like the 'magic' package might be of use, but
    > I can't find any documentation for it.


    After some trial and error and a look into example.py:

    >>> import magic
    >>> m = magic.open(magic.MAGIC_MIME_TYPE)
    >>> m.load()
    0
    >>> sample = open("tmp.png", "rb").read()
    >>> m.buffer(sample)
    'image/png'
     
    Peter Otten, Mar 9, 2012
    #10
  11. Tobiah

    Tobiah Guest

    On 03/08/2012 06:12 PM, Irmen de Jong wrote:
    > On 8-3-2012 23:34, Tobiah wrote:
    >> Also, I realize that I could write the data to a file
    >> and then use one of the modules that want a file path.
    >> I would prefer not to do that.
    >>
    >> Thanks
    >>

    >
    > Use StringIO then, instead of a file on disk
    >
    > Irmen
    >


    Nice. Thanks.
     
    Tobiah, Mar 9, 2012
    #11
  12. Tobiah

    Tobiah Guest

    On 03/08/2012 06:04 PM, Dennis Lee Bieber wrote:
    > On Thu, 08 Mar 2012 15:40:13 -0800, Tobiah <> declaimed
    > the following in gmane.comp.python.general:
    >
    >
    >> Pasting images may sound weird, but I'm using a jquery
    >> widget called cleditor that takes image data from the
    >> clipboard and replaces it with inline base64 data.

    >
    > In Windows, I'd expect "device independent bitmap" to be the result
    > of a clipboard image...


    This jquery editor seems to detect the image data and
    translate it into an inline image like:

    <img src="...

    I'm parsing those out with regular expressions and decoding
    the base64, and putting the resulting image data into a blob.
    Hmm... there's the mime type right there.
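
    A sketch of pulling the MIME type and the decoded bytes out of such
    inline images in one pass. It assumes the editor emits standard
    data: URIs of the form data:<mime>;base64,<payload> in a
    double-quoted src attribute; the pattern and the function name
    extract_inline_images are illustrative, not the code used here:

```python
import base64
import re

# Matches src="data:<mime>;base64,<payload>" inside an img tag.
DATA_URI_IMG = re.compile(
    r'<img[^>]*?\bsrc="data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,'
    r'(?P<payload>[A-Za-z0-9+/=\s]+)"',
    re.IGNORECASE,
)


def extract_inline_images(html):
    """Return a list of (mime_type, raw_bytes) for inline base64 images."""
    images = []
    for match in DATA_URI_IMG.finditer(html):
        raw = base64.b64decode(match.group("payload"))
        images.append((match.group("mime"), raw))
    return images
```

    Storing the declared type next to the blob avoids sniffing at
    serve time, at the cost of trusting whatever the editor put in
    the URI.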
     
    Tobiah, Mar 9, 2012
    #12
  13. Tobiah

    Tobiah Guest

    > Something like the following might be worth a go:
    > (untested)
    >
    > from PIL import Image
    > img = Image.open(StringIO(blob))
    > print img.format
    >


    This worked quite nicely. I didn't
    see a list of all returned formats though
    in the docs. The one image I had returned

    PNG

    So I'm doing:

    mime_type = "image/%s" % img.format.lower()

    I'm hoping that will work for any image type.
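
    Lowercasing the format name gives a valid type for PNG, GIF and
    JPEG, but it may not hold for every format PIL can report. A more
    defensive variant is an explicit lookup with a fallback. The table
    below is a small hand-picked sample, not PIL's full format list,
    and mime_for_format is a hypothetical helper:

```python
# Format names as reported by img.format, mapped to MIME types.
FORMAT_TO_MIME = {
    "PNG": "image/png",
    "GIF": "image/gif",
    "JPEG": "image/jpeg",
    "BMP": "image/bmp",
    "TIFF": "image/tiff",
}


def mime_for_format(fmt, default="application/octet-stream"):
    """Look up the MIME type for a PIL format name, with a safe fallback."""
    # img.format can be None for images not loaded from a file.
    return FORMAT_TO_MIME.get((fmt or "").upper(), default)
```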

    Thanks,

    Tobiah
     
    Tobiah, Mar 9, 2012
    #13