unique elements from list of lists

Tekkaman · Feb 9, 2007

I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item

def uniter2(listOfLists):
for item in reduce(
lambda x,y: x|y,
[set(list_) for list_ in listOfLists]
): yield item

speed test with timeit says the first one is slightly faster. What
bothers me is that it builds a set from an iterator and then another
iterator from the set. Is there a way to implement this using only
iterators? I also tried a "full python" implementation (by that I mean
one that does not use the built-in set and keeps track of previously
yielded items in a list) but the one I pulled out is about 180 times
slower. Here it is:

def uniter3(listOfLists):
done = []
for list_ in listOfLists:
for item in list_:
if not item in done:
done.append(item)
yield item

Thanks in advance for any contribution.

-- Simone

Peter Otten · Feb 9, 2007

Tekkaman said:
I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item

def uniter(lists):
return iter(set(chain(*lists)))

This avoids the explicit for-loop.

Peter

azrael · Feb 9, 2007

try something else. im posting the code from a kiosk which has no
python, sooo..... no code. only explanation

if my memory works well there is a function in python that takes a
multidimensional list and returns its values as a one-dimension list.

def main():
list =unknownFunction([['a', 'b', 'd'], ['b', 'c'], ['a', 'c',
'd']) # =a, b, d, b, c, a, c, d
temp = []
for i in list:
check(i, temp, list)
sort(list)

def check(pos, temp, list ):
for i in temp:
if temp== list[pos]
del list[pos]
check(pos, temp, list)
temp.append(list[pos])

im not sure if this should work but the meaning is:
the first three elements will be appended into the list directly
because there was no like this in the temp
while there are elements in the list take a pivot value and check if
they are unique in the list.

check:
while there are elements in the temporary list check if the pivot
exists in the temp. if false then append it to temp.

if true delete the element and go into the recursion on the same index
why?

temp a, b, d
list a, b, d, (b), c, a, c, d

temp a, b, d
list a, b, d, (c) a, c, d

temp a, b, d, c
list a, b, d, c, (a) c, d

temp a, b, d, c
list a, b, d, c, (c), d

temp a, b, d, c
list a, b, d, c, (d)

temp a, b, d, c
list a, b, d, c

list a, b, c, d

one way to do it. this works well if you need the list in the order
they appear. sort it and it works well

but i think that there is the possibility to do it somthing like the
merge sort. maybee it would work. try it. Merge the sublists and
remove the duplicates. (L1,L2,L3) -> (L12, L3) -> (L123)
this should work pretty well

Tekkaman je napisao/la:

I have a list of lists and I want to define an iterator (let's call
that uniter) over all unique elements, in any order. For example,
calling:

sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

must return ['a', 'b', 'c', 'd']. I tried the following
implementations:

from itertools import chain
def uniter1(listOfLists):
for item in set(chain(*listOfLists)): yield item

def uniter2(listOfLists):
for item in reduce(
lambda x,y: x|y,
[set(list_) for list_ in listOfLists]
): yield item

speed test with timeit says the first one is slightly faster. What
bothers me is that it builds a set from an iterator and then another
iterator from the set. Is there a way to implement this using only
iterators? I also tried a "full python" implementation (by that I mean
one that does not use the built-in set and keeps track of previously
yielded items in a list) but the one I pulled out is about 180 times
slower. Here it is:

def uniter3(listOfLists):
done = []
for list_ in listOfLists:
for item in list_:
if not item in done:
done.append(item)
yield item

Thanks in advance for any contribution.

-- Simone

Click to expand...

azrael · Feb 9, 2007

tra using the firs sublist (list[1]) as cell.then take zhe second
sublist and take a value from it at once and if the value from list[2]
doesnt exist in list[1] then insert it into list[1] at the correct
place. Something like the insertionsort.

bearophileHUGS · Feb 9, 2007

Tekkaman:

If the sublists contain hashable elements you can use this:

def uniter(lists):
merge = set()
for sub in lists:
merge = merge.union(sub)
for el in merge:
yield el

data = [['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]
print list(uniter(data))

But often this too may be enough:

def uniter(lists):
merge = set()
for sub in lists:
merge = merge.union(sub)
return merge

Bye to the gentle Pegas too,
bearophile

Tekkaman · Feb 9, 2007

Thanks everybody!
Azrael: your suggestions involve python-level membership testing and
dummy list construction just like my uniter3 example, so I'm afraid
they would not go any faster. There's no built-in sequence flattening
function that I know of btw.
bearophile: your implementation is very similar to my uniter2
(subsequent set unions) but without the additional lambda def
overhead, and in fact it goes faster. Not faster than uniter though.
Peter: your solution is the fastest, removing the explicit for loop
resulted in a 30% speed gain. Looks like using set() is a must.

-- Simone

azrael · Feb 10, 2007

no heart feelings. i was just throwing ideas. no time to testing it.

Count each unique element in list of lists	1	Nov 8, 2013
generating unique set of dicts from a list of dicts	0	Jan 10, 2012
Range / empty list issues??	1	Dec 10, 2023
Changing string value that is an element of a list	3	Feb 10, 2025
Extracting elements over multiple lists?	15	Nov 7, 2011
How do I transfer from Microsoft 365 to Google Workspace?	8	Jan 20, 2025
Converting a list of lists to a single list	8	Jul 23, 2013
Dealing with Lists	14	Sep 10, 2013

unique elements from list of lists

Tekkaman

Peter Otten

azrael

azrael

bearophileHUGS

Tekkaman

azrael

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads