Unique Elements in a List

superprad · May 9, 2005

Is there an easy way to grab the Unique elements from a list?
For Example:
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

what I am looking for is the unique elements 0.4 and 0.9 with their
index from the list.
Probably something like a Hash Table approach!!
I would like to get this done without unnecessary overhead.And the list
could be essentially anything strings,floats,int etc...

Or is it already avaliable as an attr to a list or an array?
I dont remember seeing anything like that.

superprad · May 9, 2005

OK I need to be more clear I guess!Unique Elements I mean, elements
that are non repeating. so in the above list 0.4, 0.9 are unique as
they exist only once in the list.

Roie Kerstein · May 9, 2005

Is there an easy way to grab the Unique elements from a list?

Yes.
The set data type.
Look in the reference of python 2.4.

James Stroud · May 9, 2005

Is there an easy way to grab the Unique elements from a list?
For Example:
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

what I am looking for is the unique elements 0.4 and 0.9 with their
index from the list.
Probably something like a Hash Table approach!!
I would like to get this done without unnecessary overhead.And the list
could be essentially anything strings,floats,int etc...

Or is it already avaliable as an attr to a list or an array?
I dont remember seeing anything like that.

from sets import Set

data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

[x for x in Set(data) if data.count(x) == 1]

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/

runes · May 10, 2005

This is not the most beautiful idiom, but it works...

d = {}
for k in data:
try:
d[k] += 1
except:
d[k] = 1

for k,v in d.items():
if v == 1:
print k

Michael J. Fromberger · May 10, 2005

"[email protected] said:
Is there an easy way to grab the Unique elements from a list?
For Example:
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

what I am looking for is the unique elements 0.4 and 0.9 with their
index from the list.
Probably something like a Hash Table approach!!
I would like to get this done without unnecessary overhead.And the list
could be essentially anything strings,floats,int etc...

Or is it already avaliable as an attr to a list or an array?
I dont remember seeing anything like that.

From your comments downthread, it seems you want to find those elements
of the input sequence which occur exactly once, and return not only
these elements, but also their positions.

One reasonable solution might be as follows:

def unique_elts(seq):
elts = {}
for pos, elt in enumerate(seq):
elts.setdefault(elt, []).append(pos)

return [ (x, p[0]) for (x, p) in elts.iteritems()
if len(p) == 1 ]

This returns a list of tuples of the form (x, pos), where x is an
element of seq that occurs exactly once, and pos is its index in the
original sequence. This implementation traverses the input sequence
exactly once, and requires storage proportional to the length of the
input.

-M

bearophileHUGS · May 10, 2005

runes:

d = {}
for k in data:
try:
d[k] += 1
except:
d[k] = 1

for k,v in d.items():
if v == 1:
print k

For this problem, if duplicated keys are common enough, this version is
probably the faster.

With this *different* problem, it seems that sometimes the faster
solution is using "in":
http://pyalot.blogspot.com/2005/04/dictionary-speed.html

Bye,
Bearophile

Aahz · May 10, 2005

This is not the most beautiful idiom, but it works...

d = {}
for k in data:
try:
d[k] += 1
except:
d[k] = 1

for k,v in d.items():
if v == 1:
print k

Only if "works" does not include "order preserving".

Paul Rubin · May 10, 2005

[email protected] said:
Is there an easy way to grab the Unique elements from a list?
For Example:
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

Untested: here's an iterator that gives you the index and value,
without reordering. Uses dicts instead of sets for backwards compatibility.

def unique_elements(data):
seen = {}
for n,x in data:
if x not in seen:
seen[x] = 1
yield n,x

Rune Strand · May 10, 2005

You're right. I somehow missed the index part :-o. It's easy to fix
though.

Fredrik Lundh · May 10, 2005

Paul said:
Is there an easy way to grab the Unique elements from a list?
For Example:
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

Click to expand...

Untested: here's an iterator that gives you the index and value,
without reordering. Uses dicts instead of sets for backwards compatibility.

def unique_elements(data):
seen = {}
for n,x in data:
if x not in seen:
seen[x] = 1
yield n,x

you forgot enumerate()

and if you fix that, you'll notice that the output doesn't quite match
the OP's spec:

> For Example:
> data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
>
> what I am looking for is the unique elements 0.4 and 0.9 with their
> index from the list.

here's a straight-forward variation that gives the specified output,
in the original order:

def unique_elements(data):
count = {}
data = list(enumerate(data))
for n,x in data:
count[x] = count.setdefault(x, 0) + 1
for n,x in data:
if count[x] == 1:
yield x, n # value with index

depending on the data, it might be more efficient to store the
"last seen index" in a dictionary, and sort the result on the way
out (if necessary). something like

from operator import itemgetter

def unique_elements(data):
seen = {}; index = {}
for n, x in enumerate(data):
if x in seen:
del index[x]
else:
index[x] = seen[x] = n
index = index.items()
index.sort(key=itemgetter(1)) # leave this out if order doesn't matter
return index

could work.

</F>

Edvard Majakari · May 10, 2005

James Stroud said:
from sets import Set

data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

[x for x in Set(data) if data.count(x) == 1]

Um.

....I must have missed something, but I'll post nevertheless:

wouldn't just

[x for x in data if data.count(x) == 1]

suffice? it is also "stable" preserving order of items. Lemme demo:

[x for x in Set(data) if data.count(x) == 1]

Click to expand...

Click to expand...

[0.90000000000000002, 0.40000000000000002]

[x for x in data if data.count(x) == 1]

Click to expand...

Click to expand...

[0.40000000000000002, 0.90000000000000002]

Though I'll admit I also thought of Sets first, because I didn't remember
there was such a nice method list.count().

--
# Edvard Majakari Software Engineer
# PGP PUBLIC KEY available Soli Deo Gloria!
One day, when he was naughty, Mr Bunnsy looked over the hedge into Farmer
Fred's field and it was full of fresh green lettuces. Mr Bunnsy, however, was
not full of lettuces. This did not seem fair. --Mr Bunnsy has an adventure

Max M · May 10, 2005

Fredrik said:
depending on the data, it might be more efficient to store the
"last seen index" in a dictionary, and sort the result on the way
out (if necessary). something like

form sets import Set

data = list(Set([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]))

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Edvard Majakari · May 10, 2005

Michael J. Fromberger said:
One reasonable solution might be as follows:

def unique_elts(seq):
elts = {}
for pos, elt in enumerate(seq):
elts.setdefault(elt, []).append(pos)

return [ (x, p[0]) for (x, p) in elts.iteritems()
if len(p) == 1 ]

This is neat. Being lazy (the wrong way) I've never payed much attention why
would I need dict.setdefault() when I have things as dict.get(k, [def]). This
was a nice example of good use for setdefault() because it works like
dict.get() except it also sets the value if it didn't exist.

+1 IOTW (idiom of the week).

--
# Edvard Majakari Software Engineer
# PGP PUBLIC KEY available Soli Deo Gloria!

$_ = '456476617264204d616a616b6172692c20612043687269737469616e20'; print
join('',map{chr hex}(split/(\w{2})/)),uc substr(crypt(60281449,'es'),2,4),"\n";

Fredrik Lundh · May 10, 2005

Max said:
depending on the data, it might be more efficient to store the
"last seen index" in a dictionary, and sort the result on the way
out (if necessary). something like

Click to expand...

form sets import Set

data = list(Set([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]))

read the OP's spec again.

</F>

Bengt Richter · May 10, 2005

OK I need to be more clear I guess!Unique Elements I mean, elements
that are non repeating. so in the above list 0.4, 0.9 are unique as
they exist only once in the list.

You want to be careful of your definitions, especially with floating point values,
which may surprise the uninitiated. Dicts and sets hash numerically equal values
to the same hash, and then do equality test, so you get legitimate results like:

>>> set([9*.1-8*.1, .1]) set([0.10000000000000001, 0.099999999999999978])
>>> set([123, 123.0, 123L]) set([123])
>>> set([123.0, 123, 123L]) set([123.0])
>>> set([123L, 123, 123.0])

Click to expand...

Click to expand...

set([123L])

You may want to consider creating representations other than the original data
to use in your uniqueness testing, depending on your definition.

If you are happy with the way dicts and sets compare elements, you could do
a dict that keeps counts, or FTHOI something different (hot off untested griddle,
so test before relying ;-)

>>> data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]
>>> once=set()
>>> more=set()
>>> for el in data:

Click to expand...

Click to expand...

... if el in once: more.add(el)
... else: once.add(el)
... set([0.90000000000000002, 0.40000000000000002])

Not the most efficient space-wise, not sure about speed.

Regards,
Bengt Richter

Max M · May 10, 2005

Fredrik said:
Max M wrote:

depending on the data, it might be more efficient to store the
"last seen index" in a dictionary, and sort the result on the way
out (if necessary). something like

Click to expand...

form sets import Set

data = list(Set([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]))

Click to expand...

read the OP's spec again.

Well if you want to haggle over minor details like a spec, it's easy to
be critical!

data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

freqs = {}
for position in xrange(len(data)):
freqs.setdefault(data[position], []).append(position)
unique = [(value, positions[0]) for (value, positions) in freqs.items()
if len(positions) == 1]

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science

Paul Rubin · May 10, 2005

Fredrik Lundh said:
you forgot enumerate()

Oh, oops.

and if you fix that, you'll notice that the output doesn't quite match
the OP's spec:

Oh, I see, he wanted the elements that only occur once. I misunderstood.

Aahz · May 10, 2005

James Stroud said:
James Stroud said:

from sets import Set

data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

[x for x in Set(data) if data.count(x) == 1]

Click to expand...

Um.

...I must have missed something, but I'll post nevertheless:

wouldn't just

[x for x in data if data.count(x) == 1]

suffice? it is also "stable" preserving order of items. Lemme demo:

Only for small datasets -- this is an O(N^2) algorithm.

Steven Bethard · May 10, 2005

Is there an easy way to grab the Unique elements from a list?
For Example:
data = [0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9]

what I am looking for is the unique elements 0.4 and 0.9 with their
index from the list.

This is basically the same as Fredrik Lundh's solution, but here it is
anyway:

py> def f(lst):
.... counts = dicttools.counts(lst)
.... return [(i, elem)
.... for i, elem in enumerate(lst)
.... if counts[elem] == 1]
....
py> f([0.1,0.5,0.6,0.4,0.1,0.5,0.6,0.9])
[(3, 0.40000000000000002), (7, 0.90000000000000002)]

Where dicttools.counts is defined as:

def counts(iterable, key=None):
result = {}
for item in iterable:
# apply key function if necessary
if key is None:
k = item
else:
k = key(item)
# increment key's count
try:
result[k] += 1
except KeyError:
result[k] = 1
return result

(I use this so often that I have a module called dicttools to hold it.
You don't actually need the key= argument for your problem, but I just
copied the code directly.)

STeVe

generating unique set of dicts from a list of dicts	0	Jan 10, 2012
Hi, I am a webflow user. I am looking for CSS code that can KEEP ALL ELEMENTS POSITIONED in the SAME spot across all resolutions	0	Oct 27, 2023
unique-ifying a list	5	Aug 7, 2009
unique elements from list of lists	6	Feb 9, 2007
Best way to check if all elements in a List are unique	35	Mar 1, 2010
FAQ 4.41 How can I remove duplicate elements from a list or array?	0	Mar 1, 2011
unique elements in a list....	9	Apr 17, 2006
shuffling elements of a list	22	May 31, 2006

Unique Elements in a List

superprad

superprad

Roie Kerstein

James Stroud

runes

Michael J. Fromberger

bearophileHUGS

Aahz

Paul Rubin

Rune Strand

Fredrik Lundh

Edvard Majakari

Max M

Edvard Majakari

Fredrik Lundh

Bengt Richter

Max M

Paul Rubin

Aahz

Steven Bethard

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads