ImSim: Image Similarity

Discussion in 'Python' started by n00m, Mar 5, 2011.

  1. n00m

    n00m Guest

    Let me present my newborn project (in Python) ImSim:

    http://sourceforge.net/projects/imsim/

    Its README.txt:
    ---------------------------------------------------------------------
    ImSim is a python script for finding the most similar pic(s) to
    a given one among a set/list/db of your pics.
    The script is very short and very easy to follow and understand.
    Its sample output looks like this:

      bears2.jpg
    --------------------
      bears2.jpg    0.00
      bears3.jpg   55.33
      bears1.jpg   68.87
        sky1.jpg   83.84
        sky2.jpg   84.41
         ff1.jpg   91.35
       lake1.jpg   95.14
      water1.jpg   96.94
         ff2.jpg  102.36
      roses1.jpg  115.02
      roses2.jpg  130.02

    Done!

    The *lower* the numeric value, the *more similar* the pic is to the
    tested pic. If the value is > 70, the pictures are almost certainly
    completely different (from totally different domains, so to speak).

    What "similarity" is, and how it can/could/should be estimated, is a
    point I'm leaving for your consideration/contemplation/arguing etc.

    Several sample pics (*.jpg) are included in the .zip.
    And of course the stuff requires PIL (Python Imaging Library), see:
    Home-page: http://www.pythonware.com/products/pil
    Download-URL: http://effbot.org/zone/pil-changes-116.htm
     
    n00m, Mar 5, 2011
    #1

  2. At least you could've tried to make the script more usable by adding
    the possibility to supply command line arguments, instead of editing
    the source every time you want to compare a couple of images.
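
    Something along these lines would do, as a minimal sketch. It assumes Python 3 and the standard argparse module; the names `parse_args`, `test_pic` and `candidates` are made up for illustration and are not taken from ImSim:

```python
import argparse


def parse_args(argv=None):
    """Parse one test image followed by one or more candidate images."""
    parser = argparse.ArgumentParser(
        description="Rank candidate images by similarity to a test image.")
    parser.add_argument("test_pic", help="image to compare against")
    parser.add_argument("candidates", nargs="+", help="images to rank")
    # argv=None makes argparse fall back to sys.argv[1:]
    return parser.parse_args(argv)


# Example call with an explicit argument list instead of the real command line
args = parse_args(["bears2.jpg", "bears1.jpg", "sky1.jpg"])
print(args.test_pic, args.candidates)
```

    The script could then be run as `python imsim.py bears2.jpg *.jpg` (file name hypothetical) without touching the source.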

    On Sat, Mar 5, 2011 at 11:23 AM, n00m <> wrote:
    > Let me present my newborn project (in Python) ImSim:
    >
    > http://sourceforge.net/projects/imsim/
    >
    > Its README.txt:
    > ---------------------------------------------------------------------
    > ImSim is a python script for finding the most similar pic(s) to
    > a given one among a set/list/db of your pics.
    > The script is very short and very easy to follow and understand.
    > Its sample output looks like this:
    >
    >  bears2.jpg
    > --------------------
    >  bears2.jpg    0.00
    >  bears3.jpg   55.33
    >  bears1.jpg   68.87
    >    sky1.jpg   83.84
    >    sky2.jpg   84.41
    >     ff1.jpg   91.35
    >   lake1.jpg   95.14
    >  water1.jpg   96.94
    >     ff2.jpg  102.36
    >  roses1.jpg  115.02
    >  roses2.jpg  130.02
    >
    > Done!
    >
    > The *less* numeric value -- the *more similar* this pic is to the
    > tested pic. If this value > 70 almost for sure these pictures are
    > absolutely different (from totally different domains, so to speak).
    >
    > What is "similarity" and how can/could/should it be estimated this
    > point I'm leaving for your consideration/contemplation/arguing etc.
    >
    > Several sample pics (*.jpg) are included into .zip.
    > And of course the stuff requires PIL (Python Imaging Library), see:
    > Home-page: http://www.pythonware.com/products/pil
    > Download-URL: http://effbot.org/zone/pil-changes-116.htm
     
    Grigory Javadyan, Mar 5, 2011
    #2

  3. n00m

    n00m Guest

    I uploaded a new version of the subject with a
    VERY MINOR correction in it. Namely, in line #55:

    print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)

    instead of

    print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)

    I.e. I normalized it to base = 100.
    Now the similarity values can't be greater than 100
    and can be treated as "regular" percentages (%).
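
    Where the 3600.0 comes from can be checked with a little arithmetic. The sketch below assumes what Mel's rewrite further down the thread suggests: every picture is resized to 300x300, and each pixel is counted once in a row-band histogram and once in a column-band histogram. The constant and function names are illustrative only:

```python
W = H = 300                       # assumed resize target
PIXELS = W * H                    # 90000 pixels per picture
ENTRIES_SUM = 2 * PIXELS          # each pixel lands in one row band and one column band
MAX_RAW_DIFF = 2 * ENTRIES_SUM    # upper bound on sum(|a - b|) for two such tables


def to_percent(raw_diff):
    """New scaling: raw difference divided by 3600."""
    return raw_diff / 3600.0


def old_to_percent(old_value):
    """Convert a value printed by the old version (raw * 0.001) to the new scale."""
    return old_value * 1000 / 3600.0


print(to_percent(MAX_RAW_DIFF))         # 100.0, so the score cannot exceed 100
print(round(old_to_percent(70), 2))     # 19.44, the old "70" threshold on the new scale
print(round(old_to_percent(55.33), 2))  # 15.37, matches bears3.jpg in the new output
```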

    Also, due to this change, the *empirical* threshold of
    "system alarmity" moved down from "number 70" to "20%".

      bears2.jpg
    --------------------
      bears2.jpg    0.00
      bears3.jpg   15.37
      bears1.jpg   19.13
        sky1.jpg   23.29
        sky2.jpg   23.45
         ff1.jpg   25.37
       lake1.jpg   26.43
      water1.jpg   26.93
         ff2.jpg   28.43
      roses1.jpg   31.95
      roses2.jpg   36.12

    Done!
     
    n00m, Mar 5, 2011
    #3
  4. Mel

    Mel Guest

    n00m wrote:

    >
    > I uploaded a new version of the subject with a
    > VERY MINOR correction in it. Namely, in line #55:
    >
    > print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)
    >
    > instead of
    >
    > print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)
    >
    > I.e. I normalized it to base = 100.
    > Now the values of similarity can't be greater than 100
    > and can be treated as some "regular" percents (%%).
    >
    > Also, due to this change, the *empirical* threshold of
    > "system alarmity" moved down from "number 70" to "20%".
    >
    > bears2.jpg
    > --------------------
    > bears2.jpg 0.00
    > bears3.jpg 15.37
    > bears1.jpg 19.13
    > sky1.jpg 23.29
    > sky2.jpg 23.45
    > ff1.jpg 25.37
    > lake1.jpg 26.43
    > water1.jpg 26.93
    > ff2.jpg 28.43
    > roses1.jpg 31.95
    > roses2.jpg 36.12


    I'd like to see a *lot* more structure in there, with modularization, so the
    internal functions could be used from another program. Once I'd figured out
    what it was doing, I had this:


    from PIL import Image
    from PIL import ImageStat

    def row_column_histograms (file_name):
        '''Reduce the image to a 5x5 square of b/w brightness levels 0..3
        Return two brightness histograms across Y and X
        packed into a 10-item list of 4-item histograms.'''
        im = Image.open (file_name)
        im = im.convert ('L') # convert to 8-bit b/w
        w, h = 300, 300
        im = im.resize ((w, h))
        imst = ImageStat.Stat (im)
        sr = imst.mean[0] # average pixel level in layer 0
        sr_low, sr_mid, sr_high = (sr*2)/3, sr, (sr*4)/3
        def foo (t):
            if t < sr_low: return 0
            if t < sr_mid: return 1
            if t < sr_high: return 2
            return 3
        im = im.point (foo) # reduce to brightness levels 0..3
        yhist = [[0]*4 for i in xrange(5)]
        xhist = [[0]*4 for i in xrange(5)]
        for y in xrange (h):
            for x in xrange (w):
                k = im.getpixel ((x, y))
                yhist[y / 60][k] += 1
                xhist[x / 60][k] += 1
        return yhist + xhist


    def difference_ranks (test_histogram, sample_histograms):
        '''Return a list of difference ranks between the test histograms and
        each of the samples.'''
        result = [0]*len (sample_histograms)
        for k, s in enumerate (sample_histograms): # for each image
            for i in xrange(10): # for each histogram slot
                for j in xrange(4): # for each brightness level
                    result[k] += abs (s[i][j] - test_histogram[i][j])
        return result


    if __name__ == '__main__':
        import getopt, sys
        opts, args = getopt.getopt (sys.argv[1:], '', [])
        if not args:
            args = [
                'bears1.jpg',
                'bears2.jpg',
                'bears3.jpg',
                'roses1.jpg',
                'roses2.jpg',
                'ff1.jpg',
                'ff2.jpg',
                'sky1.jpg',
                'sky2.jpg',
                'water1.jpg',
                'lake1.jpg',
            ]
            test_pic = 'bears2.jpg'
        else:
            test_pic, args = args[0], args[1:]

        z = [row_column_histograms (a) for a in args]
        test_z = row_column_histograms (test_pic)

        file_ranks = zip (difference_ranks (test_z, z), args)
        file_ranks.sort()

        print '%12s' % (test_pic,)
        print '--------------------'
        for r in file_ranks:
            print '%12s %7.2f' % (r[1], r[0] / 3600.0,)



    (omitting a few comments that wrapped around.) The test-case still agrees
    with your archived version:

    mwilson@tecumseth:~/sandbox/im_sim$ python image_rank.py bears2.jpg *.jpg
      bears2.jpg
    --------------------
      bears2.jpg    0.00
      bears3.jpg   15.37
      bears1.jpg   19.20
        sky1.jpg   23.20
        sky2.jpg   23.37
         ff1.jpg   25.30
       lake1.jpg   26.38
      water1.jpg   26.98
         ff2.jpg   28.43
      roses1.jpg   32.01
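
    For anyone on Python 3 without PIL at hand, the same row/column histogram comparison can be sketched on a plain grid of grey values. This is only an illustration under assumptions: the grid is taken to be already greyscale and resized, and the helper names `difference` and `percent` are made up here rather than taken from the script above:

```python
def row_column_histograms(pixels, bands=5):
    """pixels: list of rows of grey values. Returns 2*bands histograms of 4 levels."""
    h, w = len(pixels), len(pixels[0])
    mean = sum(map(sum, pixels)) / (w * h)
    low, mid, high = mean * 2 / 3, mean, mean * 4 / 3

    def level(t):
        # Quantize a grey value to 0..3 relative to the mean brightness
        if t < low:
            return 0
        if t < mid:
            return 1
        if t < high:
            return 2
        return 3

    yhist = [[0] * 4 for _ in range(bands)]
    xhist = [[0] * 4 for _ in range(bands)]
    for y, row in enumerate(pixels):
        for x, t in enumerate(row):
            k = level(t)
            yhist[y * bands // h][k] += 1   # horizontal band this pixel falls in
            xhist[x * bands // w][k] += 1   # vertical band this pixel falls in
    return yhist + xhist


def difference(hist_a, hist_b):
    """Sum of absolute differences over all histogram cells."""
    return sum(abs(a - b)
               for row_a, row_b in zip(hist_a, hist_b)
               for a, b in zip(row_a, row_b))


def percent(hist_a, hist_b, n_pixels):
    """Scale so the largest possible difference is 100 (d / 3600 for 300x300)."""
    return 100.0 * difference(hist_a, hist_b) / (4 * n_pixels)


# Tiny 10x10 demo: a left-to-right gradient and its mirror image
gradient = [[25 * (x // 2) for x in range(10)] for y in range(10)]
mirrored = [row[::-1] for row in gradient]
h1, h2 = row_column_histograms(gradient), row_column_histograms(mirrored)
print(percent(h1, h1, 100), percent(h1, h2, 100))
```

    The row bands of the two demo pictures are identical, so the whole difference comes from the column bands; that is the part of the measure that notices the mirroring.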


    I'd vaguely wanted to do something like this for a while, but I never dug
    far enough into PIL to even get started. An additional kind of ranking that
    takes colour into account would also be good -- that's the first one I never
    did.

    Cheers, Mel.
     
    Mel, Mar 5, 2011
    #4
  5. n00m

    n00m Guest

    On Mar 5, 7:10 pm, Mel <> wrote:
    > n00m wrote:
    >
    > > I uploaded a new version of the subject with a
    > > VERY MINOR correction in it. Namely, in line #55:

    >
    > >     print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)

    >
    > > instead of

    >
    > >     print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)

    >
    > > I.e. I normalized it to base = 100.
    > > Now the values of similarity can't be greater than 100
    > > and can be treated as some "regular" percents (%%).

    >
    > > Also, due to this change, the *empirical* threshold of
    > > "system alarmity" moved down from "number 70" to "20%".

    >
    > >   bears2.jpg
    > > --------------------
    > >   bears2.jpg    0.00
    > >   bears3.jpg   15.37
    > >   bears1.jpg   19.13
    > >     sky1.jpg   23.29
    > >     sky2.jpg   23.45
    > >      ff1.jpg   25.37
    > >    lake1.jpg   26.43
    > >   water1.jpg   26.93
    > >      ff2.jpg   28.43
    > >   roses1.jpg   31.95
    > >   roses2.jpg   36.12

    >
    > I'd like to see a *lot* more structure in there, with modularization, so the
    > internal functions could be used from another program.  Once I'd figured out
    > what it was doing, I had this:
    >
    > from PIL import Image
    > from PIL import ImageStat
    >
    > def row_column_histograms (file_name):
    >     '''Reduce the image to a 5x5 square of b/w brightness levels 0..3
    >     Return two brightness histograms across Y and X
    >     packed into a 10-item list of 4-item histograms.'''
    >     im = Image.open (file_name)
    >     im = im.convert ('L')       # convert to 8-bit b/w
    >     w, h = 300, 300
    >     im = im.resize ((w, h))
    >     imst = ImageStat.Stat (im)
    >     sr = imst.mean[0]   # average pixel level in layer 0
    >     sr_low, sr_mid, sr_high = (sr*2)/3, sr, (sr*4)/3
    >     def foo (t):
    >         if t < sr_low: return 0
    >         if t < sr_mid: return 1
    >         if t < sr_high: return 2
    >         return 3
    >     im = im.point (foo) # reduce to brightness levels 0..3
    >     yhist = [[0]*4 for i in xrange(5)]
    >     xhist = [[0]*4 for i in xrange(5)]
    >     for y in xrange (h):
    >         for x in xrange (w):
    >             k = im.getpixel ((x, y))
    >             yhist[y / 60][k] += 1
    >             xhist[x / 60][k] += 1
    >     return yhist + xhist
    >
    > def difference_ranks (test_histogram, sample_histograms):
    >     '''Return a list of difference ranks between the test histograms and
    > each of the samples.'''
    >     result = [0]*len (sample_histograms)
    >     for k, s in enumerate (sample_histograms):  # for each image
    >         for i in xrange(10):    # for each histogram slot
    >             for j in xrange(4): # for each brightness level
    >                 result[k] += abs (s[i][j] - test_histogram[i][j])
    >     return result
    >
    > if __name__ == '__main__':
    >     import getopt, sys
    >     opts, args = getopt.getopt (sys.argv[1:], '', [])
    >     if not args:
    >         args = [
    >             'bears1.jpg',
    >             'bears2.jpg',
    >             'bears3.jpg',
    >             'roses1.jpg',
    >             'roses2.jpg',
    >             'ff1.jpg',
    >             'ff2.jpg',
    >             'sky1.jpg',
    >             'sky2.jpg',
    >             'water1.jpg',
    >             'lake1.jpg',
    >         ]
    >         test_pic = 'bears2.jpg'
    >     else:
    >         test_pic, args = args[0], args[1:]
    >
    >     z = [row_column_histograms (a) for a in args]
    >     test_z = row_column_histograms (test_pic)
    >
    >     file_ranks = zip (difference_ranks (test_z, z), args)      
    >     file_ranks.sort()
    >
    >     print '%12s' % (test_pic,)
    >     print '--------------------'
    >     for r in file_ranks:
    >         print '%12s %7.2f' % (r[1], r[0] / 3600.0,)
    >
    > (omitting a few comments that wrapped around.)  The test-case still agrees
    > with your archived version:
    >
    > mwilson@tecumseth:~/sandbox/im_sim$ python image_rank.py bears2.jpg *.jpg
    >   bears2.jpg
    > --------------------
    >   bears2.jpg    0.00
    >   bears3.jpg   15.37
    >   bears1.jpg   19.20
    >     sky1.jpg   23.20
    >     sky2.jpg   23.37
    >      ff1.jpg   25.30
    >    lake1.jpg   26.38
    >   water1.jpg   26.98
    >      ff2.jpg   28.43
    >   roses1.jpg   32.01
    >
    > I'd vaguely wanted to do something like this for a while, but I never dug
    > far enough into PIL to even get started.  An additional kind of ranking that
    > takes colour into account would also be good -- that's the first one I never
    > did.
    >
    >         Cheers,         Mel.



    Very nice, Mel.

    As for using color info...
    my current strong opinion is: the colors must be forgotten for good.
    Paradoxically, "profound" elaboration and detailing can/will
    spoil/undermine the whole thing. Just my current imo.


    ===========================
    Vitali
     
    n00m, Mar 5, 2011
    #5
  6. Jorgen Grahn

    Jorgen Grahn Guest

    On Sat, 2011-03-05, Grigory Javadyan wrote:
    > At least you could've tried to make the script more usable by adding
    > the possibility to supply command line arguments, instead of editing
    > the source every time you want to compare a couple of images.
    >
    > On Sat, Mar 5, 2011 at 11:23 AM, n00m <> wrote:
    >> Let me present my newborn project (in Python) ImSim:
    >>
    >> http://sourceforge.net/projects/imsim/
    >>
    >> Its README.txt:
    >> ---------------------------------------------------------------------
    >> ImSim is a python script for finding the most similar pic(s) to
    >> a given one among a set/list/db of your pics.
    >> The script is very short and very easy to follow and understand.
    >> Its sample output looks like this:

    ....
    >> The *less* numeric value -- the *more similar* this pic is to the
    >> tested pic. If this value > 70 almost for sure these pictures are
    >> absolutely different (from totally different domains, so to speak).
    >>
    >> What is "similarity" and how can/could/should it be estimated this
    >> point I'm leaving for your consideration/contemplation/arguing etc.


    So basically you're saying you won't tell the users what the program
    *does*. I don't get that.

    Is it better than this?
    - scale each image to 100x100
    - go black&white in such a way that half the pixels are black
    - XOR the images and count the mismatches

    That takes care of JPEG quality, scaling and possibly gamma
    correction, but not cropping or rotation. I'm sure there are better,
    well-known algorithms.
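
    The three steps above can be sketched roughly as follows. Loading and scaling to 100x100 (e.g. with PIL) is left out, so the inputs are assumed to be equal-length flat lists of grey values; thresholding at the median is just one way to read "half the pixels are black", and the function names are illustrative:

```python
from statistics import median


def binarize(pixels):
    """Threshold at the median so roughly half the pixels come out black (0)."""
    m = median(pixels)
    return [1 if p > m else 0 for p in pixels]


def mismatch_fraction(pixels_a, pixels_b):
    """XOR the two black/white images and return the share of differing pixels."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must be scaled to the same size first")
    bits_a, bits_b = binarize(pixels_a), binarize(pixels_b)
    mismatches = sum(a ^ b for a, b in zip(bits_a, bits_b))
    return mismatches / len(bits_a)


ramp = list(range(16))                 # dark to bright
brighter = [p + 50 for p in ramp]      # same picture, overall brightness shifted
reversed_ramp = ramp[::-1]             # bright to dark

print(mismatch_fraction(ramp, brighter))       # 0.0
print(mismatch_fraction(ramp, reversed_ramp))  # 1.0
```

    Because the threshold follows each image's own median, a uniform brightness shift leaves the result unchanged, which is the point of the "half the pixels are black" step.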

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
     
    Jorgen Grahn, Mar 5, 2011
    #6
  7. n00m

    n00m Guest

    >
    > Is it better than this?
    > - scale each image to 100x100
    > - go black&white in such a way that half the pixels are black
    > - XOR the images and count the mismatches



    It's *much* better but I'm not *much* about to prove it.



    > I'm sure there are better,
    > well-known algorithms.



    The best well-known algorithm is to hire a man with good eyesight
    to do the job of comparing, ranking and selecting the pictures.
     
    n00m, Mar 5, 2011
    #7
  8. n00m

    n00m Guest

    n00m, Mar 5, 2011
    #8
  9. Mel

    Mel Guest

    n00m wrote:

    > As for using color info...
    > my current strong opinion is: the colors must be forgot for good.
    > Paradoxically but "profound" elaboration and detailization can/will
    > spoil/undermine the whole thing. Just my current imo.


    Yeah. I guess including color info cubes the complexity of the answer.
    Might be too complicated to know what to do with an answer like that.

    Mel.
     
    Mel, Mar 6, 2011
    #9
  10. n00m

    n00m Guest

    On Mar 6, 6:10 am, Mel <> wrote:
    > n00m wrote:
    > > As for using color info...
    > > my current strong opinion is: the colors must be forgot for good.
    > > Paradoxically but "profound" elaboration and detailization can/will
    > > spoil/undermine the whole thing. Just my current imo.

    >
    > Yeah.  I guess including color info cubes the complexity of the answer. 
    > Might be too complicated to know what to do with an answer like that.
    >
    >         Mel.


    Uhmm, Mel. Totally agree with you.
    +
    I included "roses1.jpg" & "roses2.jpg" on purpose:
    the 1st one is a painting by Abbott Handerson Thayer,
    the 2nd is its copy by some obscure Russian painter.
    But it's of course a creative & revamped copy.

    In a strict sense they are 2 different images (look at their colors etc.);
    on the other hand they are closely related to each other.
    Plus, we can't tell *in principle* which is the original and which is the
    copy, or which colors are "right/good" and which are "wrong/bad".
     
    n00m, Mar 6, 2011
    #10
  11. n00m

    n00m Guest

    n00m, Mar 6, 2011
    #11
  12. n00m

    n00m Guest

    Obviously if we used it in practice (in a web-museum?)
    all pics' matrices should be precalculated only once and
    stored in a table with forty fields v00 ... v93 like:

    -----------------------------------------------
    pic_title      v00    v01    v02   ...   v93
    -----------------------------------------------
    bears2.jpg    1234   4534   8922   ...   333
    ....
    ....
    -----------------------------------------------

    Then the SQL query will look like this:

    select top 3 pic_title from table
    order by
        abs(v00 - w[0][0]) +
        abs(v01 - w[0][1]) +
        .... +
        abs(v93 - w[9][3])

    Here w[][] is the matrix of a newly-entering picture.
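
    A runnable version of that idea, as a sketch only: it uses Python 3 with the standard sqlite3 module and an in-memory database, the table name `pics` and the helper names are invented for the example, and since SQLite has no `top 3` it uses `LIMIT` instead:

```python
import sqlite3

# Column names v00 ... v93: 10 histograms x 4 brightness levels
COLS = ["v%d%d" % (i, j) for i in range(10) for j in range(4)]


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pics (pic_title TEXT PRIMARY KEY, %s)"
                 % ", ".join(c + " INTEGER" for c in COLS))
    return conn


def add_pic(conn, title, matrix):
    """Store a precalculated 10x4 matrix as one row."""
    flat = [v for row in matrix for v in row]
    marks = ", ".join(["?"] * (len(flat) + 1))
    conn.execute("INSERT INTO pics VALUES (%s)" % marks, [title] + flat)


def most_similar(conn, w, limit=3):
    """Order stored pics by sum of |v_ij - w[i][j]| and return the closest titles."""
    flat = [v for row in w for v in row]
    order = " + ".join("abs(%s - ?)" % c for c in COLS)
    sql = "SELECT pic_title FROM pics ORDER BY %s LIMIT ?" % order
    return [r[0] for r in conn.execute(sql, flat + [limit])]


conn = make_db()
base = [[i * 4 + j for j in range(4)] for i in range(10)]   # made-up matrix
add_pic(conn, "near.jpg", [[v + 1 for v in row] for row in base])
add_pic(conn, "far.jpg", [[v + 100 for v in row] for row in base])
add_pic(conn, "exact.jpg", base)
print(most_similar(conn, base, limit=2))   # ['exact.jpg', 'near.jpg']
```

    The query still scans every row, so this only saves the image processing, not the comparison itself.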


    P.S.
    If someone encounters 2 apparently unrelated pics
    for which ImSim gives a mutual diff. value of
    *** less than 20% ***, please email them to me.
     
    n00m, Mar 6, 2011
    #12
  13. John Bokma

    John Bokma Guest

    n00m <> writes:

    > http://www.nga.gov/search/index.shtm
    > http://deyoung.famsf.org/search-collections
    > etc
    > Seems they all offer search only by keywords and this kind.
    > What about to submit e.g. roses2.jpg (copy) and to find its
    > original? Assume we don't know its author neither its title


    Title: TinEye, author: http://ideeinc.com/
    Search: http://www.tineye.com/

    Example:
    http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

    Notice how it finds modified images.

    --
    John Bokma j3b

    Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
    Freelance Perl & Python Development: http://castleamber.com/
     
    John Bokma, Mar 6, 2011
    #13
  14. n00m

    n00m Guest

    On Mar 6, 8:55 pm, John Bokma <> wrote:
    > n00m <> writes:
    > >http://www.nga.gov/search/index.shtm
    > >http://deyoung.famsf.org/search-collections
    > > etc
    > > Seems they all offer search only by keywords and this kind.
    > > What about to submit e.g. roses2.jpg (copy) and to find its
    > > original? Assume we don't know its author neither its title

    >
    > Title: TinEye, author:http://ideeinc.com/
    > Search:http://www.tineye.com/
    >
    > Example:
    >  http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/
    >
    > Notice how it finds modified images.
    >
    > --
    > John Bokma                                                               j3b
    >
    > Blog:http://johnbokma.com/   Facebook:http://www.facebook.com/j.j.j.bokma
    >     Freelance Perl & Python Development:http://castleamber.com/



    It's for kids.
    Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see
    his message)
     
    n00m, Mar 6, 2011
    #14
  15. n00m

    n00m Guest

    On Mar 6, 10:17 pm, n00m <> wrote:
    > On Mar 6, 8:55 pm, John Bokma <> wrote:
    >
    >
    >
    > > n00m <> writes:
    > > >http://www.nga.gov/search/index.shtm
    > > >http://deyoung.famsf.org/search-collections
    > > > etc
    > > > Seems they all offer search only by keywords and this kind.
    > > > What about to submit e.g. roses2.jpg (copy) and to find its
    > > > original? Assume we don't know its author neither its title

    >
    > > Title: TinEye, author:http://ideeinc.com/
    > > Search:http://www.tineye.com/

    >
    > > Example:
    > >  http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

    >
    > > Notice how it finds modified images.

    >
    > > --
    > > John Bokma                                                               j3b

    >
    > > Blog:http://johnbokma.com/  Facebook:http://www.facebook.com/j.j.j.bokma
    > >     Freelance Perl & Python Development:http://castleamber.com/

    >
    > It's for kids.
    > Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see
    > his message)



    Even his algo would be overkill.
    Comparing metadata/EXIF of the image files will be enough in 99% of cases.
     
    n00m, Mar 6, 2011
    #15
  16. John Bokma

    John Bokma Guest

    n00m <> writes:

    > On Mar 6, 10:17 pm, n00m <> wrote:
    >> On Mar 6, 8:55 pm, John Bokma <> wrote:
    >>
    >>
    >>
    >> > n00m <> writes:
    >> > >http://www.nga.gov/search/index.shtm
    >> > >http://deyoung.famsf.org/search-collections
    >> > > etc
    >> > > Seems they all offer search only by keywords and this kind.
    >> > > What about to submit e.g. roses2.jpg (copy) and to find its
    >> > > original? Assume we don't know its author neither its title

    >>
    >> > Title: TinEye, author:http://ideeinc.com/
    >> > Search:http://www.tineye.com/

    >>
    >> > Example:
    >> >  http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

    >>
    >> > Notice how it finds modified images.

    >>
    >> > --
    >> > John Bokma                                                               j3b

    >>
    >> > Blog:http://johnbokma.com/  Facebook:http://www.facebook.com/j.j.j.bokma
    >> >     Freelance Perl & Python Development:http://castleamber.com/

    >>
    >> It's for kids.
    >> Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see
    >> his message)

    >
    >
    > Even his algo will be an overhead.
    > Comparing meta-data/EXIF of image files will be enough in 99% cases.


    Yes, yes, we get it. You're so much smarter (but not smart enough to not
    quote a signature...). Anyway, I guess that's the reason big names use
    tineye and not your algorithm...

    --
    John Bokma j3b

    Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
    Freelance Perl & Python Development: http://castleamber.com/
     
    John Bokma, Mar 6, 2011
    #16
  17. n00m

    n00m Guest

    As for "proper" quoting: I read/post to this group via my web-browser.
    And for me everything looks OK. I don't even quite understand what
    exactly you mean by your remark. I'm not a facebookie/forumish/twitterish
    thing.
    Btw I don't know what Twitter is. I don't need it, neither to know
    nor to use it. Oh... Pres. Medvedev knows what Twitter is and uses it.
     
    n00m, Mar 6, 2011
    #17
  18. John Bokma

    John Bokma Guest

    n00m <> writes:

    > As for "proper" quoting: I read/post to this group via my web-browser.
    > And for me everything looks OK. I don't even quite understand what
    > exactly
    > do you mean by your remark. I'm not a facebookie/forumish/twitterish
    > thing.


    Exactly. It's Usenet, something I've been using for, oh, just over 20
    years now, and even then it was not new. You know, before the web thing
    you're talking about...

    --
    John Bokma j3b

    Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
    Freelance Perl & Python Development: http://castleamber.com/
     
    John Bokma, Mar 6, 2011
    #18
  19. n00m

    n00m Guest

    On Mar 6, 7:54 pm, n00m <> wrote:
    > If someone will encounter 2 apparently unrelated pics
    > but for which ImSim gives value of their mutual diff.
    > *** less than 20% *** please emailed them to me.


    Never mind, people.
    I've found such a pair of images in my .zipped project.
    It's "sky1.jpg" and "lake1.jpg", with sim. value < 15%.

        sky1.jpg
    --------------------
        sky1.jpg    0.00
        sky2.jpg    0.77
       lake1.jpg   14.28 <-----
      bears2.jpg   23.29
      bears3.jpg   26.60
      roses2.jpg   29.41
      roses1.jpg   31.36
         ff1.jpg   33.47
      bears1.jpg   36.60
         ff2.jpg   39.52
      water1.jpg   40.11

    But a funny thing happens here.
    At first glance it's a false positive: some modern South East
    Asian town and a lake somewhere in Russia, more than 100 years
    ago. Nothing similar in them?

    In both pics we see:
    -- a lot of water in the foreground;
    -- a lot of blue sky at sunny mid-day;
    -- a bit of light white cloud in the sky.

    In short,
    the notion of similarity can be speculated about endlessly.
     
    n00m, Mar 7, 2011
    #19
  20. Just admit that your algorithm doesn't work that well already :)
    Or give a solid formal definition of "similarity" and prove that your
    algo works with that definition.

    On Mon, Mar 7, 2011 at 4:22 PM, n00m <> wrote:
    >
    > In short,
    > the notion of similarity can be speculated about just endlessly.
    >
     
    Grigory Javadyan, Mar 7, 2011
    #20
