Is there a unique method in python to unique a list?


Chris Angelico

Is there a unique method in python to unique a list? thanks

I don't believe there's a method for that, but if you don't care about
order, try turning your list into a set and then back into a list.
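
A minimal sketch of that suggestion, with one order-preserving alternative added for comparison. The helper names and the sample `words` list are made up for illustration and are not from the original post; the ordered variant assumes Python 3.7 or later, where dicts keep insertion order.

```python
def unique_unordered(items):
    """Return the distinct items; the order of the result is arbitrary."""
    return list(set(items))


def unique_ordered(items):
    """Return the distinct items, keeping the first occurrence of each."""
    # dict keys are unique and (Python 3.7+) remember insertion order.
    return list(dict.fromkeys(items))


words = ["dog", "cat", "dog", "bird", "cat"]  # hypothetical sample data
print(sorted(unique_unordered(words)))  # sorted only to get a stable display
print(unique_ordered(words))
```

Both approaches require the items to be hashable, which is the same restriction that comes up later in this thread.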

ChrisA
 

Chris Angelico

However, if I don't put list(set(lemma_list)) to a variable name, it works
much faster.

Try backdenting that statement. You're currently doing it at every
iteration of the loop - that's why it's so much slower.

But you'll probably find it better to work with the set directly,
instead of uniquifying a list as a separate operation.
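
The code being discussed is not shown in full at this point in the thread, so the following is only a rough sketch of the three shapes involved: de-duplicating inside the loop, de-duplicating once after it, and building a set from the start. The function names and the `groups` data are hypothetical stand-ins.

```python
def collect_slow(groups):
    """Rebuild the de-duplicated list on every pass through the loop."""
    names = []
    unique_names = []
    for group in groups:
        names.extend(group)
        unique_names = list(set(names))  # repeated work: runs once per group
    return unique_names


def collect_hoisted(groups):
    """De-duplicate once, after the loop has finished."""
    names = []
    for group in groups:
        names.extend(group)
    return list(set(names))


def collect_as_set(groups):
    """Skip the intermediate list and grow a set directly."""
    unique_names = set()
    for group in groups:
        unique_names.update(group)
    return unique_names


groups = [["dog", "hound"], ["cat"], ["dog", "puppy"]]  # hypothetical data
print(sorted(collect_slow(groups)))
print(sorted(collect_hoisted(groups)))
print(sorted(collect_as_set(groups)))
```

All three produce the same distinct names; the first one simply repeats the set construction for every group, which is where the extra time goes.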

ChrisA
 

Token Type

Try backdenting that statement. You're currently doing it at every
iteration of the loop - that's why it's so much slower.

Thanks. It works now.
synset_list = list(wn.all_synsets(pos))
sense_number = 0
lemma_list = []
for synset in synset_list:
    lemma_list.extend(synset.lemma_names)
for lemma in list(set(lemma_list)):
    sense_number_new = len(wn.synsets(lemma, pos))
    sense_number = sense_number + sense_number_new
return sense_number/len(set(lemma_list))


But you'll probably find it better to work with the set directly,
instead of uniquifying a list as a separate operation.

Yes, but the second method below still runs faster if I don't assign list(set(lemma_list)) to a separate variable. Why does this happen?
synset_list = list(wn.all_synsets(pos))
sense_number = 0
lemma_list = []
for synset in synset_list:
    lemma_list.extend(synset.lemma_names)
for lemma in list(set(lemma_list)):
    sense_number_new = len(wn.synsets(lemma, pos))
    sense_number = sense_number + sense_number_new
return sense_number/len(set(lemma_list))
 

Paul Rubin

Token Type said:
synset_list = list(wn.all_synsets(pos))
sense_number = 0
lemma_list = []
for synset in synset_list:
    lemma_list.extend(synset.lemma_names)
for lemma in list(set(lemma_list)):
    sense_number_new = len(wn.synsets(lemma, pos))
    sense_number = sense_number + sense_number_new
return sense_number/len(set(lemma_list))

I think you mean (untested):

synsets = wn.all_synsets(pos)
sense_number = 0
lemma_set = set()
for synset in synsets:
    lemma_set.add(synset.lemma_names)
for lemma in lemma_set:
    sense_number += len(wn.synsets(lemma,pos))
return sense_number / len(lemma_set)
 

Paul Rubin

Paul Rubin said:
I think you mean (untested):

synsets = wn.all_synsets(pos)
sense_number = 0
lemma_set = set()
for synset in synsets:
    lemma_set.add(synset.lemma_names)
for lemma in lemma_set:
    sense_number += len(wn.synsets(lemma,pos))
return sense_number / len(lemma_set)

Or even:

lemma_set = set(lemma for synset in wn.all_synsets(pos) for lemma in synset.lemma_names)
sense_number = sum(len(wn.synsets(lemma, pos)) for lemma in lemma_set)
return sense_number / len(lemma_set)
 

Token Type

Thanks. I tried using set() as you suggested, but without success. Please see:
synsets = list(wn.all_synsets('n'))
synsets[:5]
[Synset('entity.n.01'), Synset('physical_entity.n.01'), Synset('abstraction.n.06'), Synset('thing.n.12'), Synset('object.n.01')]
lemma_set = set()
for synset in synsets:
    lemma_set.add(synset.lemma_names)

Traceback (most recent call last):
  File "<pyshell#43>", line 2, in <module>
    lemma_set.add(synset.lemma_names)
TypeError: unhashable type: 'list'

for synset in synsets:
    lemma_set.add(set(synset.lemma_names))

Traceback (most recent call last):
  File "<pyshell#45>", line 2, in <module>
    lemma_set.add(set(synset.lemma_names))
TypeError: unhashable type: 'set'
 

Chris Angelico

lemma_set.add(synset.lemma_names)

That tries to add the whole list as a single object, which doesn't
work because lists can't go into sets. There are two solutions,
depending on what you want to do.

1) If you want each addition to remain discrete, make a tuple instead:
lemma_set.add(tuple(synset.lemma_names))

2) If you want to add the elements of that list individually into the
set, use update:
lemma_set.update(synset.lemma_names)

I'm thinking you probably want option 2 here.
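
The two options can be compared side by side in a small sketch. `lemma_lists` is a made-up stand-in for the lists that synset.lemma_names produces in the thread, so the example runs without NLTK; the exact wording of the error message may differ between Python versions.

```python
# Hypothetical stand-in for the lists that synset.lemma_names yields.
lemma_lists = [["dog", "domestic_dog"], ["cat"], ["dog", "domestic_dog"]]

# Option 1: one hashable tuple per list, so each group stays intact.
groups = set()
for names in lemma_lists:
    groups.add(tuple(names))

# Option 2: update() adds each element of the list on its own.
lemmas = set()
for names in lemma_lists:
    lemmas.update(names)

# add() with a plain list fails because lists are mutable and unhashable.
error_message = ""
try:
    set().add(["dog"])
except TypeError as exc:
    error_message = str(exc)

print(groups)
print(lemmas)
print(error_message)
```

With option 1 the set holds two tuples (the repeated group collapses into one); with option 2 it holds the three individual lemma strings, which is the shape the sense-counting loop needs.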

ChrisA
 
