Find similar images using python

Discussion in 'Python' started by Thomas W, Mar 29, 2006.

  1. Thomas W

    Thomas W Guest

    How can I use Python to find images that look quite similar? I thought
    I'd scale the images down to 32x32, convert them to a standard
    palette of 256 colors, and then compare the results pixel for pixel, but
    it seems as if this would take a very long time when processing
    lots of images.

    Any hint/clue on this subject would be appreciated.

    Best regards,
    Thomas
     
    Thomas W, Mar 29, 2006
    #1

  2. Thomas W

    Guest

    Use PIL..of course..

    Sudharshan S
     
    , Mar 29, 2006
    #2

  3. Thomas W wrote:
    > How can I use python to find images that looks quite similar? Thought
    > I'd scale the images down to 32x32 and convert it to use a standard
    > palette of 256 colors then compare the result pixel for pixel etc, but
    > it seems as if this would take a very long time to do when processing
    > lots of images.
    >
    > Any hint/clue on this subject would be appreciated.


    This question is immensely non-trivial unless you can form a precise
    definition of "images that look quite similar." It is one of those
    deceptive problems that seem straightforward, but become less and less
    well-defined the more you study it. If you can solve this you can
    get a PhD, get rich, get famous, or a combination of all three.

    The US Supreme Court gave up on identifying pornography, because
    the best definition anyone ever came up with was "I know it when I
    see it," a judgment quite reasonable in a legal system based on
    trusted authorities, but not a good one in a society "ruled by laws
    and not men."

    --Scott David Daniels
     
    Scott David Daniels, Mar 29, 2006
    #3
  4. Thomas W wrote:

    > How can I use python to find images that looks quite similar? Thought
    > I'd scale the images down to 32x32 and convert it to use a standard
    > palette of 256 colors then compare the result pixel for pixel etc, but
    > it seems as if this would take a very long time to do when processing
    > lots of images.
    >
    > Any hint/clue on this subject would be appreciated.


    You are aware that this is one of the most sophisticated research areas in
    CS in general? Your approach is by no means appropriate for even the
    slightest of differences in the image - after all, you're only reducing
    resolution. That doesn't, e.g., account for different lighting conditions -
    you wouldn't be able to match two still photographs of a house taken by a
    mounted camera at dusk and at dawn. And so on. So unless you have
    a _very_ homogeneous image source, this is a far more complicated task - if
    not undoable.

    Diez
     
    Diez B. Roggisch, Mar 29, 2006
    #4
  5. Thomas W

    Guest

    , Mar 29, 2006
    #5
  6. wrote:

    > I dont get it..cant the matching take place efficiently with PIL, only
    > that you need to have a condition i.e if the mismatch exceeds a certain
    > threshold, they are not similar,
    >
    >

    http://gumuz.looze.net/wordpress/index.php/archives/2005/06/06/python-webcam-fun-motion-detection/
    >
    > Check the above link, only difference is that instead of files as in your
    > case, the code here compares two pixels of consecutive frames for
    > changes..


    No, the difference is fundamental: two consecutive frames from a still-mounted
    camera are - except for noise and changing lighting conditions - the same.
    Detecting a difference in case of motion is easy.

    But similarity between two images is a totally different beast. I would say
    that an image of my grandma with me on her knee and another one with my
    brother are very similar. But your approach would certainly fail to say
    so...

    diez
     
    Diez B. Roggisch, Mar 29, 2006
    #6
  7. [Thomas]
    > How can I use python to find images that looks quite similar?


    Have you looked at http://www.imgseek.net/ ? It's an Open Source Python photo
    collection manager that does exactly what you're asking for.

    --
    Richie Hindle
     
    Richie Hindle, Mar 29, 2006
    #7
  8. Thomas W

    Andrew Guest

    I did this once for a motion detection algorithm. I used luminance
    calculations to determine this. I basically broke the image into a
    grid of nine (3x3) areas and calculated the luminance for each
    section, and if it changed significantly then there had been
    motion. The more sections, the more precise the detection will be.
    This was written in Visual Basic 4 and was able to do about 10 frames
    per second, so you can get decent performance.

    I got the luminance calculation from a classic math/computer science
    equation book. I should know the title but I'm blanking on it.
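
    A minimal Python sketch of the grid idea described above, assuming
    Pillow (the maintained PIL fork). The original was Visual Basic and is
    not shown in this thread, so the function names, the threshold of 10
    grey levels, and the use of Pillow's "L" conversion as the luminance
    formula are all assumptions made for illustration.

```python
from PIL import Image, ImageStat


def grid_luminance(img, rows=3, cols=3):
    """Return the mean grey level (0-255) of each cell in a rows x cols grid."""
    # Mode "L" applies Pillow's built-in luma weighting of R, G and B.
    grey = img.convert("L")
    width, height = grey.size
    means = []
    for r in range(rows):
        for c in range(cols):
            box = (c * width // cols, r * height // rows,
                   (c + 1) * width // cols, (r + 1) * height // rows)
            means.append(ImageStat.Stat(grey.crop(box)).mean[0])
    return means


def motion_detected(frame_a, frame_b, threshold=10.0):
    """True if any grid cell's mean grey level changed by more than threshold.

    The threshold is a hypothetical value; tune it for the camera in use.
    """
    cells_a = grid_luminance(frame_a)
    cells_b = grid_luminance(frame_b)
    return any(abs(a - b) > threshold for a, b in zip(cells_a, cells_b))
```

    Raising rows and cols gives finer detection at the cost of more
    sensitivity to noise.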

    Andy
     
    Andrew, Mar 29, 2006
    #8
  9. Thomas W

    nikie Guest

    > How can I use python to find images that looks quite similar? Thought
    > I'd scale the images down to 32x32 and convert it to use a standard
    > palette of 256 colors then compare the result pixel for pixel etc, but
    > it seems as if this would take a very long time to do when processing
    > lots of images.
    >
    > Any hint/clue on this subject would be appreciated.


    A company I used to work for has been doing research in this area
    (finding differences between images) for years, and the results are
    still hardly generalizable, so don't expect to get perfect results after
    a weekend ;-)

    I'm not sure what you mean by "similar": I assume for the moment that
    you want to detect if you really have the same photo, but scanned with
    a different resolution, or with a different scanner or with a digital
    camera that's slightly out of focus. This is still hard enough!

    There are many approaches to this problem. Downsampling the image might
    work (use supersampling!), but it doesn't cover rotations, different
    borders, or blur, so you'll have to put some additional effort into
    the comparison algorithm. Also, converting the images to a paletted
    format is almost definitely the wrong way - convert them to greyscale,
    or work on 24 bit (RGB or HSV).
    Another approach that you might try is comparing the image histograms:
    they aren't affected by geometric transformations, and should still
    contain some information about the original image. Even if they aren't
    sufficient, they might help you to narrow down your search, so you have
    more processing time for advanced algorithms.
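
    A rough sketch of the histogram comparison, assuming Pillow. The
    function names and the choice of histogram intersection as the
    similarity measure are assumptions; other distance measures would
    work equally well as a first-pass filter.

```python
from PIL import Image


def normalized_histogram(img):
    """256-bin greyscale histogram, scaled so the bins sum to 1."""
    hist = img.convert("L").histogram()
    total = float(sum(hist))
    return [count / total for count in hist]


def histogram_intersection(img_a, img_b):
    """Similarity in [0, 1]; 1 means the two histograms are identical."""
    hist_a = normalized_histogram(img_a)
    hist_b = normalized_histogram(img_b)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

    Because the histogram is normalized, images of different sizes can be
    compared directly, and a rotated copy scores the same as the original.
    Two unrelated images can still share a histogram, so a high score is
    only a hint that a closer comparison is worthwhile.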

    If you have performance problems, NumPy and Psyco might both be worth a
    look.
     
    nikie, Mar 29, 2006
    #9
  10. Thomas W

    John J. Lee Guest

    Richie Hindle <> writes:

    > [Thomas]
    > > How can I use python to find images that looks quite similar?

    >
    > Have you looked at http://www.imgseek.net/ ? It's an Open Source Python photo
    > collection manager that does exactly what you're asking for.


    Maybe... I don't recall if it had a duplicate search feature. What I
    remember is the GUI in which you scribbled a picture, and asked it to
    pull up images that looked like that. Amusing, though didn't seem to
    work terribly well. No bad reflection on the author: it's a hard
    problem of course.


    John
     
    John J. Lee, Mar 31, 2006
    #10
  11. John J. Lee schreef:
    > Richie Hindle <> writes:
    >
    >> [Thomas]
    >>> How can I use python to find images that looks quite similar?

    >> Have you looked at http://www.imgseek.net/ ? It's an Open Source Python photo
    >> collection manager that does exactly what you're asking for.

    >
    > Maybe... I don't recall if it had a duplicate search feature. What I
    > remember is the GUI in which you scribbled a picture, and asked it to
    > pull up images that looked like that. Amusing, though didn't seem to
    > work terribly well. No bad reflection on the author: it's a hard
    > problem of course.


    It does have a duplicate search feature (see screenshot
    http://www.imgseek.net/sshot/e1c93fe060c6622a6aee90a22921c49a.png for
    example), though I don't know how well it works.

    --
    If I have been able to see further, it was only because I stood
    on the shoulders of giants. -- Isaac Newton

    Roel Schroeven
     
    Roel Schroeven, Mar 31, 2006
    #11
  12. Thomas W wrote:

    >How can I use python to find images that looks quite similar? Thought
    >I'd scale the images down to 32x32 and convert it to use a standard
    >palette of 256 colors then compare the result pixel for pixel etc, but
    >it seems as if this would take a very long time to do when processing
    >lots of images.
    >
    >Any hint/clue on this subject would be appreciated.
    >
    >

    This really depends on what is meant by "quite similar".

    If you mean "to the human eye, the two pictures are identical",
    as in the case of a tool to get rid of trivially-different duplications,
    then you can use the technique you propose. I don't imagine that
    you can save any time over that process. You'd use something
    like PIL to do the comparisons, of course -- I suspect you want to
    do something like:

    1) resize both
    2) quantize the colors
    3) subtract the two images
    4) resize to 1x1
    5) threshold the result (i.e. we've used PIL to sum the differences)

    Strictly speaking, it might be more mathematically ideal to take
    the sum of the squares of the pixel differences (a sum-of-squared-differences
    measure, similar in spirit to chi-square). Either way, letting PIL do the
    arithmetic avoids comparing pixel-by-pixel in a Python loop, which would
    be painfully slow.

    This is conceptually equivalent to using an "epsilon" to test "equality"
    of floating point numbers.
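
    The steps above can be sketched as follows, assuming Pillow. Two
    liberties are taken and should be read as assumptions: the images are
    converted to greyscale rather than quantized to a palette (as
    suggested elsewhere in this thread), and ImageStat is used to average
    the difference image instead of resizing it to 1x1. The epsilon of 8
    grey levels is a hypothetical starting point.

```python
from PIL import Image, ImageChops, ImageStat


def _difference_stats(img_a, img_b, size=(32, 32)):
    """Resize both images to a common size and return stats of |a - b|."""
    a = img_a.convert("L").resize(size)
    b = img_b.convert("L").resize(size)
    # ImageChops.difference computes the per-pixel absolute difference in C.
    return ImageStat.Stat(ImageChops.difference(a, b))


def mean_abs_difference(img_a, img_b):
    """Average absolute grey-level difference (0 = identical thumbnails)."""
    return _difference_stats(img_a, img_b).mean[0]


def rms_difference(img_a, img_b):
    """Root-mean-square difference, which penalizes large local changes more."""
    return _difference_stats(img_a, img_b).rms[0]


def looks_same(img_a, img_b, epsilon=8.0):
    """Threshold the mean difference, like an epsilon test on floats."""
    return mean_abs_difference(img_a, img_b) <= epsilon
```

    All per-pixel work happens inside Pillow, so no Python-level pixel
    loop is needed.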

    The more general case of matching images with similar content (but
    which would be recognizably different to the human eye), is a much
    more challenging cutting-edge AI problem, as has already been
    mentioned -- but I was going to mention imgSeek myself (I see someone's
    already given you the link).
     
    Terry Hancock, Mar 31, 2006
    #12
  13. On 29 Mar 2006 05:06:10 -0800, rumours say that "Thomas W"
    <> might have written:

    >How can I use python to find images that looks quite similar? Thought
    >I'd scale the images down to 32x32 and convert it to use a standard
    >palette of 256 colors then compare the result pixel for pixel etc, but
    >it seems as if this would take a very long time to do when processing
    >lots of images.


    I see someone suggested imgseek. This uses a Haar transform to compare
    images (check on it). I did make a module based on imgseek, and together
    with PIL, I manage my archive of email attachments (it's incredible how many
    different versions of the same picture people send you: gif, jpg in
    different sizes etc) and it works fairly well.
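
    Neither imgSeek's code nor the module mentioned here appears in this
    thread, so the following is only a rough sketch of what a
    Haar-transform signature might look like, assuming Pillow and NumPy.
    The function names, the 64x64 working size, and keeping the 40
    largest coefficients are all illustrative assumptions, not imgSeek's
    actual parameters.

```python
import numpy as np
from PIL import Image


def haar_1d(vec):
    """Full 1-D Haar decomposition (averages and differences).

    The length of vec must be a power of two.
    """
    out = np.asarray(vec, dtype=float).copy()
    n = len(out)
    while n > 1:
        half = n // 2
        avg = (out[0:n:2] + out[1:n:2]) / 2.0
        dif = (out[0:n:2] - out[1:n:2]) / 2.0
        out[:half] = avg
        out[half:n] = dif
        n = half
    return out


def haar_2d(matrix):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = np.array([haar_1d(r) for r in matrix])
    return np.array([haar_1d(c) for c in rows.T]).T


def signature(img, size=64, keep=40):
    """Set of (position, sign) pairs for the largest non-zero coefficients."""
    grey = np.asarray(img.convert("L").resize((size, size))).astype(float)
    coeffs = haar_2d(grey).ravel()
    coeffs[0] = 0.0  # drop the overall average so brightness alone is ignored
    top = np.argsort(np.abs(coeffs))[-keep:]
    return {(int(i), bool(coeffs[i] > 0)) for i in top if coeffs[i] != 0}


def similarity(sig_a, sig_b):
    """Fraction of shared (position, sign) pairs, between 0 and 1."""
    return len(sig_a & sig_b) / max(len(sig_a), len(sig_b), 1)
```

    Comparing small signatures instead of full images is what makes this
    kind of approach practical for a large archive: each image is
    transformed once, and later comparisons are cheap set operations.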

    E-mail me if you want the module, I don't think I have it currently online
    anywhere.
    --
    TZOTZIOY, I speak England very best.
    "Dear Paul,
    please stop spamming us."
    The Corinthians
     
    Christos Georgiou, Mar 31, 2006
    #13
  14. Christos Georgiou wrote:
    > .... I did make a module based on imgseek, and together with PIL,
    > I manage my archive of email attachments (it's incredible how many
    > different versions of the same picture people send you: gif, jpg
    > in different sizes etc) and it works fairly well.
    >
    > E-mail me if you want the module, I don't think I have it currently online
    > anywhere.


    This sounds like a great recipe for the cookbook:
    http://aspn.activestate.com/ASPN/Cookbook/Python

    --
    -Scott David Daniels
     
    Scott David Daniels, Apr 1, 2006
    #14
  15. Thomas W

    Ravi Teja Guest

    Finding similar images is not at all a trivial task. Entire PhD
    dissertations have been devoted to it. The solutions are still very
    unreliable. If you want to find out more, you can read the
    research out of the ongoing ImageCLEF track. I worked with them
    briefly a couple of years ago in the context of medical images.

    http://ir.shef.ac.uk/imageclef/
     
    Ravi Teja, Apr 1, 2006
    #15
  16. On Fri, 31 Mar 2006 15:10:11 -0800, rumours say that Scott David Daniels
    <> might have written:

    >Christos Georgiou wrote:
    >> .... I did make a module based on imgseek, and together with PIL,
    >> I manage my archive of email attachments (it's incredible how many
    >> different versions of the same picture people send you: gif, jpg
    >> in different sizes etc) and it works fairly well.
    >>
    >> E-mail me if you want the module, I don't think I have it currently online
    >> anywhere.


    >This sounds like a great recipe for the cookbook:
    > http://aspn.activestate.com/ASPN/Cookbook/Python


    Actually, it should go to the CheeseShop, since it is a python module that
    is a bridge between PIL and the C module (I don't believe multi-file modules
    are appropriate for the cookbook, but ICBW); however, my web space is out of
    reach for some months now (in a web server at a previous company I worked
    for), and I'm in the process of fixing that :)
    --
    TZOTZIOY, I speak England very best.
    "Dear Paul,
    please stop spamming us."
    The Corinthians
     
    Christos Georgiou, Apr 4, 2006
    #16

Similar Threads
  1. Tem

    Find similar items

    Tem, Jan 30, 2008, in forum: ASP .Net
    Replies:
    20
    Views:
    895
  2. Tem

    find similar content

    Tem, Mar 1, 2008, in forum: ASP .Net
    Replies:
    9
    Views:
    310
    Mike C#
    Mar 4, 2008
  3. Replies:
    14
    Views:
    252
    William James
    Sep 25, 2005
  4. ADvantage
    Replies:
    1
    Views:
    255
  5. richard
    Replies:
    0
    Views:
    109
    richard
    Oct 2, 2008