wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

Discussion in 'Python' started by Token Type, Oct 7, 2012.

  1. Token Type

    Token Type Guest

    In order to solve the following question, http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html:
    ★ Use one of the predefined similarity measures to score the similarity of each of the following pairs of words. Rank the pairs in order of decreasing similarity. How close is your ranking to the order given here, an order that was established experimentally by (Miller & Charles, 1998): car-automobile, gem-jewel, journey-voyage, boy-lad, coast-shore, asylum-madhouse, magician-wizard, midday-noon, furnace-stove, food-fruit, bird-cock, bird-crane, tool-implement, brother-monk, lad-brother, crane-implement, journey-car, monk-oracle, cemetery-woodland, food-rooster, coast-hill, forest-graveyard, shore-woodland, monk-slave, coast-forest, lad-wizard, chord-smile, glass-magician, rooster-voyage, noon-string.

    (1) First, I put the word pairs in a list, e.g.
    pairs = [(car, automobile), (gem, jewel), (journey, voyage)]. According to http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html, I need to put them in the following format to calculate the semantic similarity: wn.synset('right_whale.n.01').path_similarity(wn.synset('minke_whale.n.01')).

    In this case, I need to use a loop to iterate over each element in the above pairs. How can I refer to each element in the above pairs, i.e. pairs = [(car, automobile), (gem, jewel), (journey, voyage)]? What's the index for 'car' and for 'automobile'? Thanks for your tips.

    (2) Since I can't solve the above index issue, I tried using a dictionary as follows:
    >>> import nltk
    >>> from nltk.corpus import wordnet as wn
    >>> pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage'}
    >>> for key in pairs:
    ...     word1 = wn.synset(str(key) + '.n.01')
    ...     word2 = wn.synset(str(pairs[key])+'.n.01')
    ...     similarity = word1.path_similarity(word2)
    ...     print key+'-'+pairs[key],similarity
    ...
    car-automobile 1.0
    journey-voyage 0.25
    gem-jewel 0.125

    Now it seems that I can calculate the semantic similarity for each pair in the above dictionary. However, I want to sort by the similarity value before printing the results. Can dictionary elements be sorted according to their values? This is one of the requirements of the exercise. How can we get each pair of words (e.g. car-automobile, journey-voyage, gem-jewel) sorted according to its similarity value?
    Thanks for your tips.
    Token Type, Oct 7, 2012
    #1

  2. Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    In your for loop save the data in a list rather than print it out and
    sort according to this
    http://wiki.python.org/moin/HowTo/Sorting#Operator_Module_Functions

    --
    Cheers.

    Mark Lawrence.
    Mark Lawrence, Oct 7, 2012
    #2
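To illustrate the suggestion above, here is a minimal sketch of sorting by value with `operator.itemgetter`. It assumes the scores are already computed; the `scores` dict below simply copies the three values printed in the first post rather than calling NLTK, and the names `scores` and `ranked` are illustrative only.

```python
from operator import itemgetter

# Scores copied from the output shown in the first post (no NLTK call here).
scores = {'car-automobile': 1.0, 'journey-voyage': 0.25, 'gem-jewel': 0.125}

# A dict is not reordered by value in place, but its (key, value) pairs
# can be sorted into a new list, keyed on element 1 (the value).
ranked = sorted(scores.items(), key=itemgetter(1), reverse=True)

for label, similarity in ranked:
    print(label, similarity)
```

The result is a list of `(label, similarity)` tuples in decreasing order of similarity, which is the ranking the exercise asks for.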

  3. Token Type

    Terry Reedy Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    On 10/7/2012 12:15 PM, Token Type wrote:

    > In this case, I need to use loop to iterate each element in the above
    > pairs. How can I refer to each element in the above pairs, i.e. pairs
    > = [(car, automobile), (gem, jewel), (journey, voyage) ]. What's the
    > index for 'car' and for 'automobile'? Thanks for your tips.


    >>> pairs = [('car', 'automobile'), ('gem', 'jewel')]
    >>> pairs[0][0]
    'car'
    >>> pairs[1][1]
    'jewel'
    >>> for a,b in pairs: a,b
    ...
    ('car', 'automobile')
    ('gem', 'jewel')

    --
    Terry Jan Reedy
    Terry Reedy, Oct 7, 2012
    #3
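The indexing and unpacking shown above can be combined with the synset naming from the first post. This is only a sketch: it assumes, as the original post does, that every word is looked up as its first noun sense (`.n.01`), and `synset_name` is a hypothetical helper introduced here for illustration. No NLTK call is made.

```python
# Word pairs as a list of 2-tuples; each word is a quoted string.
pairs = [('car', 'automobile'), ('gem', 'jewel'), ('journey', 'voyage')]

def synset_name(word):
    """Hypothetical helper: name of the first noun sense of a word."""
    return word + '.n.01'

# pairs[i] is a tuple; pairs[i][0] and pairs[i][1] are its two words.
first_word = pairs[0][0]
second_word = pairs[0][1]

# Tuple unpacking in the loop header avoids explicit indexes altogether.
names = [(synset_name(a), synset_name(b)) for a, b in pairs]
for name1, name2 in names:
    print(name1, name2)
```

Each `name1`/`name2` string is in the form that `wn.synset(...)` expects in the original post.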
  4. Token Type

    yujian Guest

    How to control the internet explorer?

    I want to save all the URLs in current opened windows, and then close
    all the windows.
    yujian, Oct 8, 2012
    #4
  5. Token Type

    alex23 Guest

    Re: How to control the internet explorer?

    On Oct 8, 1:03 pm, yujian <> wrote:
    > I want to save all the URLs in current opened windows,  and then close
    > all the windows.


    Try mechanize or Selenium.
    alex23, Oct 8, 2012
    #5
  6. Token Type

    Token Type Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    Yes, thanks for all your tips. I did try sorted with itemgetter. However, the results come out the same, as shown below, whether I set reverse=True or reverse=False. Isn't it strange? Thanks.

    >>> import nltk
    >>> from nltk.corpus import wordnet as wn
    >>> pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage'}
    >>> for key in pairs:
    ...     list_simi=[]
    ...     from operator import itemgetter
    ...     word1 = wn.synset(str(key) + '.n.01')
    ...     word2 = wn.synset(str(pairs[key])+'.n.01')
    ...     similarity = word1.path_similarity(word2)
    ...     list_simi.append((key+'-'+pairs[key],similarity))
    ...     sorted(list_simi, key=itemgetter(1), reverse=True)
    ...
    [('car-automobile', 1.0)]
    [('journey-voyage', 0.25)]
    [('gem-jewel', 0.125)]
    >>> for key in pairs:
    ...     list_simi=[]
    ...     from operator import itemgetter
    ...     word1 = wn.synset(str(key) + '.n.01')
    ...     word2 = wn.synset(str(pairs[key])+'.n.01')
    ...     similarity = word1.path_similarity(word2)
    ...     list_simi.append((key+'-'+pairs[key],similarity))
    ...     sorted(list_simi, key=itemgetter(1), reverse=False)
    ...
    [('car-automobile', 1.0)]
    [('journey-voyage', 0.25)]
    [('gem-jewel', 0.125)]
    Token Type, Oct 9, 2012
    #6
  8. Token Type

    Token Type Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    Dear all, the problem has been solved as follows. Thanks anyway:
    >>> import nltk
    >>> from nltk.corpus import wordnet as wn
    >>> pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage'}
    >>> list_simi=[]
    >>> for key in pairs:
    ...     word1 = wn.synset(str(key) + '.n.01')
    ...     word2 = wn.synset(str(pairs[key])+'.n.01')
    ...     similarity = word1.path_similarity(word2)
    ...     list_simi.append((key+'-'+pairs[key],similarity))
    ...
    >>> from operator import itemgetter
    >>> sorted(list_simi, key=itemgetter(1), reverse=False)
    [('gem-jewel', 0.125), ('journey-voyage', 0.25), ('car-automobile', 1.0)]
    >>> sorted(list_simi, key=itemgetter(1), reverse=True)
    [('car-automobile', 1.0), ('journey-voyage', 0.25), ('gem-jewel', 0.125)]
    >>> sorted(list_simi, key=itemgetter(1))
    [('gem-jewel', 0.125), ('journey-voyage', 0.25), ('car-automobile', 1.0)]
    Token Type, Oct 9, 2012
    #8
  10. Token Type

    Ian Kelly Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    On Mon, Oct 8, 2012 at 9:13 PM, Token Type <> wrote:
    > yes, thanks all your tips. I did try sorted with itemgetter. However, the sorted results are same as follows whether I set reverse=True or reverse= False. Isn't it strange? Thanks.


    First of all, "sorted" does not sort the list in place as you seem to
    be expecting.
    It returns a new sorted list. Since your code does not store the
    return value of the sorted call anywhere, the sorted list is discarded
    and only the original list is kept. If you want to sort a list in
    place, use the list.sort method instead.

    Second, you're not sorting the overall list. On each iteration your
    code: 1) assigns a new empty list to list_simi; 2) processes one of
    the pairs; 3) adds the pair to the empty list; and 4) sorts the list.
    On the next iteration you then start all over again with a new empty
    list, and so when you get to the sorting step you're only sorting one
    item each time. You need to accumulate the list instead of wiping it
    out on each iteration, and only sort it after the loop has completed.
    Ian Kelly, Oct 9, 2012
    #10
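Both points above can be seen in a small sketch. It assumes the similarity scores are already known: `known_scores` is a stand-in for the `path_similarity` calls and copies the values from the thread, so no NLTK call is made and the variable names are illustrative only.

```python
from operator import itemgetter

# Stand-in for path_similarity: scores copied from the thread.
known_scores = {('car', 'automobile'): 1.0,
                ('gem', 'jewel'): 0.125,
                ('journey', 'voyage'): 0.25}

list_simi = []  # created once, before the loop, so it accumulates
for (word1, word2), similarity in known_scores.items():
    list_simi.append((word1 + '-' + word2, similarity))

# sorted() returns a new list and leaves list_simi untouched;
# the return value has to be stored to be of any use.
descending = sorted(list_simi, key=itemgetter(1), reverse=True)

# list.sort() reorders list_simi in place and returns None.
result = list_simi.sort(key=itemgetter(1))
```

After this runs, `descending` holds the pairs from most to least similar, `list_simi` itself is in ascending order, and `result` is `None`.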
  11. Token Type

    alex23 Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    On Oct 9, 1:13 pm, Token Type <> wrote:
    > yes, thanks all your tips. I did try sorted with itemgetter.
    > However, the sorted results are same as follows whether I
    > set reverse=True or reverse= False. Isn't it strange? Thanks.


    That's because you're sorting each entry individually, not the entire
    result. For every key-value pair, you create a new empty list, append
    one tuple, and then sort it. The consistent order you're seeing is the
    outcome of stepping through the dictionary keys.

    This is untested, but it should be closer to what you're after, I
    think. First it creates `list_simi` as a generator, then it sorts it.

    import nltk
    from nltk.corpus import wordnet as wn
    from operator import itemgetter

    pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage'}

    def find_similarity(word1, word2):
        # Look up the first noun sense of each word
        as_synset = lambda word: wn.synset( str(word) + '.n.01' )
        return as_synset(word1).path_similarity( as_synset(word2) )

    similarity_value = itemgetter(1)

    # items() works on both Python 2 and Python 3
    list_simi = (
        ('%s-%s' % (word1, word2), find_similarity(word1, word2) )
        for word1, word2 in pairs.items()
    )
    list_simi = sorted(list_simi, key=similarity_value, reverse=True)
    alex23, Oct 9, 2012
    #11
  12. Token Type

    Token Type Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    Thanks indeed for all your suggestions. When I try my code above, what puzzles me is that as the dictionary grows, some data go missing from the sorted result. Quite odd. The pairs include {'journey':'voyage'}, but the sorted result has no ('journey-voyage', 0.25), which did appear in my first post, a small-scale experiment. I am quite puzzled....

    >>> pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage','boy':'lad','coast':'shore', 'asylum':'madhouse', 'magician':'wizard', 'midday':'noon', 'furnace':'stove', 'food':'fruit', 'bird':'cock', 'bird':'crane', 'tool':'implement', 'brother':'monk', 'lad':'brother', 'crane':'implement', 'journey':'car', 'monk':'oracle', 'cemetery':'woodland', 'food':'rooster', 'coast':'hill', 'forest':'graveyard', 'shore':'woodland', 'monk':'slave', 'coast':'forest','lad':'wizard', 'chord':'smile', 'glass':'magician', 'rooster':'voyage', 'noon':'string'}
    >>> list_simi=[]
    >>> for key in pairs:
    ...     word1 = wn.synset(str(key) + '.n.01')
    ...     word2 = wn.synset(str(pairs[key])+'.n.01')
    ...     similarity = word1.path_similarity(word2)
    ...     list_simi.append((key+'-'+pairs[key],similarity))
    ...
    >>> from operator import itemgetter
    >>> sorted(list_simi, key=itemgetter(1), reverse=True)
    [('midday-noon', 1.0), ('car-automobile', 1.0), ('tool-implement', 0.5), ('boy-lad', 0.3333333333333333), ('lad-wizard', 0.2), ('monk-slave', 0.2), ('shore-woodland', 0.2), ('magician-wizard', 0.16666666666666666), ('brother-monk', 0.125), ('asylum-madhouse', 0.125), ('gem-jewel', 0.125), ('cemetery-woodland', 0.1111111111111111), ('bird-crane', 0.1111111111111111), ('glass-magician', 0.1111111111111111), ('crane-implement', 0.1), ('chord-smile', 0.09090909090909091), ('coast-forest', 0.09090909090909091), ('furnace-stove', 0.07692307692307693), ('forest-graveyard', 0.07142857142857142), ('food-rooster', 0.0625), ('noon-string', 0.058823529411764705), ('journey-car', 0.05), ('rooster-voyage', 0.041666666666666664)]
    Token Type, Oct 9, 2012
    #12
  13. Token Type

    alex23 Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    On Oct 9, 2:16 pm, Token Type <> wrote:
    > When I try my above codes, what puzzles me is that when
    > the data in the dictionary increase, some data become
    > missing in the sorted result. Quite odd. In the pairs,
    > we have {'journey':'voyage'} but in the sorted result no (
    > 'journey-voyage',0.25)
    >
    > >>> pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage','boy':'lad','coast':'shore', 'asylum':'madhouse', 'magician':'wizard', 'midday':'noon', 'furnace':'stove', 'food':'fruit', 'bird':'cock', 'bird':'crane', 'tool':'implement', 'brother':'monk', 'lad':'brother', 'crane':'implement','journey':'car', 'monk':'oracle', 'cemetery':'woodland', 'food':'rooster','coast':'hill', 'forest':'graveyard', 'shore':'woodland', 'monk':'slave', 'coast':'forest','lad':'wizard', 'chord':'smile', 'glass':'magician', 'rooster':'voyage', 'noon':'string'}


    Keys are unique in dictionaries. You have two uses of 'journey'; the
    second will overwrite the first.

    Do you _need_ these items to be a dictionary? Are you doing any look
    up? If not, just make it a list of tuples:

    pairs = [ ('car', 'automobile'), ('gem', 'jewel') ...]

    Then make your main loop:

    for word1, word2 in pairs:

    If you do need a dictionary for other reasons, you might want to try a
    dictionary of lists:

    pairs = {
        'car': ['automobile', 'vehicle'],
        'gem': ['jewel'],
    }

    # Iterating a dict directly yields only its keys; use items() for key-value pairs
    for word1, synonyms in pairs.items():
        for word2 in synonyms:
            ...
    alex23, Oct 9, 2012
    #13
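The overwriting described above is easy to reproduce. The sketch below uses a three-pair subset of the data from the thread; the names `as_dict` and `as_list` are illustrative only.

```python
# A dict literal with a repeated key keeps only the last value for that key,
# so ('journey', 'voyage') is silently replaced by ('journey', 'car').
as_dict = {'journey': 'voyage', 'coast': 'shore', 'journey': 'car'}

# A list of tuples keeps every pair, including repeated first words.
as_list = [('journey', 'voyage'), ('coast', 'shore'), ('journey', 'car')]

print(len(as_dict), as_dict['journey'])
print(len(as_list))

# Looping over the list visits all three pairs.
labels = [word1 + '-' + word2 for word1, word2 in as_list]
```

The same effect explains the earlier post: the 30 pairs share only 23 distinct first words, so the dictionary holds 23 entries and the sorted result has 23 items.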
  14. Token Type

    Token Type Guest

    Re: wordnet semantic similarity: how to refer to elements of a pair in a list? can we sort dictionary according to the value?

    Thanks indeed for your tips. Now I understand the difference between tuples and dictionaries better.
    Token Type, Oct 9, 2012
    #14