delete duplicates in list

christof hoeke

Alex said:
christof hoeke wrote:
...
i have a list from which i want a simpler list without the duplicates


Canonical is:

import sets
simplerlist = list(sets.Set(thelist))

if you're all right with destroying order, as your example solution suggests.
But dict.fromkeys(a).keys() is probably faster. Your assertion:

there should be an easier or more intuitive solution, maybe with a list
comprehension?


doesn't seem self-evident to me. A list-comprehension might be, e.g:

[ x for i, x in enumerate(a) if i==a.index(x) ]

and it does have the advantages of (a) keeping order AND (b) not
requiring hashable (nor even inequality-comparable!) elements -- BUT
it has the non-indifferent cost of being O(N*N) while the others
are about O(N). If you really want something similar to your approach:

b = [x for x in a if x not in b]


you'll have, o horrors!-), to do a loop, so name b is always bound to
"the result list so far" (in the LC, name b is only bound at the end):

b = []
for x in a:
    if x not in b:
        b.append(x)

However, this is O(N*N) too. In terms of "easier or more intuitive",
I suspect only this latter solution might qualify.
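For readers on a current Python, here is a minimal sketch of the trade-off described above. It assumes the elements are hashable for the first function; the names `unique_in_order` and `unique_in_order_unhashable` are made up for illustration and are not from the thread. The first keeps order and stays roughly O(N) by remembering seen items in a set, the second is the plain loop from above, which needs only `==` but is O(N*N).

```python
def unique_in_order(items):
    """Return a new list without duplicates, keeping first occurrences.

    Assumes every item is hashable; roughly O(N).
    """
    seen = set()   # items already copied to the result
    result = []
    for x in items:
        if x not in seen:
            seen.add(x)
            result.append(x)
    return result


def unique_in_order_unhashable(items):
    """Same result using only ==, so unhashable items work too; O(N*N)."""
    result = []
    for x in items:
        if x not in result:   # linear scan of the result so far
            result.append(x)
    return result


print(unique_in_order([3, 1, 3, 2, 1]))              # [3, 1, 2]
print(unique_in_order_unhashable([[1], [2], [1]]))   # [[1], [2]]
```

The set-based version is usually the one to reach for; the second is only worth its cost when the elements cannot be hashed.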


Alex
i was looking at the cookbook site but could not find the solution in
the short time i was looking, so thanks for the link to Bernard.

but thanks to all for the interesting discussion, which i was at least partly
able to follow as i am still a novice pythoner.

as for speed, i did not really care as my script will not be used regularly.
but it seems i guessed right using python's dictionary functions; it
still seems like the easiest and fastest way (i should add that the
list i am using consists only of strings, which can be dictionary keys
then).
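A note for anyone reading this thread today: the `sets` module no longer exists in Python 3 (the built-in `set` replaces it), and `dict.keys()` returns a view rather than a list. A rough modern equivalent of the dictionary approach, assuming Python 3.7 or later where dicts keep insertion order, might look like this. The sample list `words` is made up for illustration.

```python
# Hypothetical sample data: a list of strings with repeats.
words = ["spam", "eggs", "spam", "ham", "eggs"]

# dict.fromkeys keeps one key per distinct string; since Python 3.7 the
# keys come back in first-seen order, so wrapping in list() is enough.
unique_words = list(dict.fromkeys(words))
print(unique_words)        # ['spam', 'eggs', 'ham']

# set() also removes duplicates but promises no particular order,
# so sort it if a stable result is needed.
as_set = set(words)
print(sorted(as_set))      # ['eggs', 'ham', 'spam']
```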


thanks again
chris




thanks to all
 
Bengt Richter

hello,
this must have come up before, so i am already sorry for asking but a
quick googling did not give me any answer.

i have a list from which i want a simpler list without the duplicates
an easy but somehow contrived solution would be
a = [1, 2, 2, 3]
d = {}.fromkeys(a)
b = d.keys()
print b
[1, 2, 3]

there should be an easier or more intuitive solution, maybe with a list
comprehension?

something like
b = [x for x in a if x not in b]
print b
[]

does not work though.

If you want to replace the original list without a temporary new list,
and your original is sorted (or you don't mind having it sorted), then
you could do the following (not tested beyond what you see ;-), which
as an extra benefit doesn't require hashability:
>>> def elimdups(thelist):
...     thelist.sort() # remove if you just want to eliminate adjacent duplicates
...     i = 0
...     for item in thelist:
...         if item==thelist[i]: continue
...         i += 1
...         thelist[i] = item
...     del thelist[i+1:]
...
>>> a = [1, 2, 2, 3]
>>> elimdups(a)
>>> a
[1, 2, 3]
>>> a=[]
>>> elimdups(a)
>>> a
[]
>>> a = [123]
>>> elimdups(a)
>>> a
[123]
>>> a = ['a', ['b', 2], ['c',3], ['b',2], 'd']
>>> a
['a', ['b', 2], ['c', 3], ['b', 2], 'd']
>>> elimdups(a)
>>> a
[['b', 2], ['c', 3], 'a', 'd']
>>> a = list('deaacbb')
>>> elimdups(a)
>>> a
['a', 'b', 'c', 'd', 'e']

Not sure how this was decided, but that's the way it works:
True

Hm, it would have been nicer to have an optional sort flag as
a second parameter. Oh, well, another time...
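One possible shape for the optional sort flag mentioned above is sketched below. The keyword name `sort` and its default are assumptions, not something from the original post. Note also that on Python 3 the mixed list-and-string example from the session would raise a TypeError during sorting, since lists and strings no longer compare with `<`.

```python
def elimdups(thelist, sort=True):
    """Remove duplicates from thelist in place.

    sort is a hypothetical flag: with sort=False the list is left in its
    current order and only adjacent duplicates are removed.
    """
    if sort:
        thelist.sort()
    i = 0                      # index of the last item kept so far
    for item in thelist:
        if item == thelist[i]:
            continue           # same as the last kept item, skip it
        i += 1
        thelist[i] = item      # move the new item down into place
    del thelist[i + 1:]        # drop the leftover tail


a = list("deaacbb")
elimdups(a)
print(a)    # ['a', 'b', 'c', 'd', 'e']

b = list("deaacbb")
elimdups(b, sort=False)
print(b)    # ['d', 'e', 'a', 'c', 'b']
```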

Regards,
Bengt Richter
 
Michael Hudson

Thomas Heller said:
[about object.__hash__]
Bernhard Herzog wrote:
...

I don't know why the official docs don't mention this fact -- I do
mention it in "Python in a Nutshell", of course (p. 132).


The reason this is done is because it's useful.

Well, Guido seems to disagree. Per his request, when I complained about
the default __hash__() method he asked me to enter a bug about it:
http://www.python.org/sf/660098. But it doesn't seem easy at all to fix
this...

That's actually something different: new-style classes that define
__eq__ remain hashable. I don't think making new-style classes that
*don't* define __eq__ *un*hashable is on the table, is it? If it is,
let me complain about that :)
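For context, a small sketch of how this looks on Python 3, which differs from the Python 2 behaviour discussed here: a class that defines `__eq__` without `__hash__` gets `__hash__` set to None and becomes unhashable, while a class that defines neither stays hashable by identity. The class names `Plain` and `WithEq` are made up for illustration.

```python
class Plain:
    """Defines neither __eq__ nor __hash__: hashable by identity."""
    def __init__(self, value):
        self.value = value


class WithEq:
    """Defines __eq__ only: on Python 3 this makes instances unhashable."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, WithEq) and self.value == other.value


p1, p2 = Plain(1), Plain(1)
# Two distinct objects with equal contents count as different set members.
print(len({p1, p2, p1}))       # 2

try:
    {WithEq(1)}
    with_eq_hashable = True
except TypeError:
    with_eq_hashable = False
print(with_eq_hashable)        # False on Python 3
```

So set- or dict-based duplicate removal on plain instances only drops repeats of the very same object, which may or may not be what is wanted.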

Cheers,
mwh
 
