unique elements from list of lists

Discussion in 'Python' started by Tekkaman, Feb 9, 2007.

  1. Tekkaman

    Tekkaman Guest

    I have a list of lists and I want to define an iterator (let's call
    that uniter) over all unique elements, in any order. For example,
    calling:

    sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))

    must return ['a', 'b', 'c', 'd']. I tried the following
    implementations:

    from itertools import chain
    def uniter1(listOfLists):
    for item in set(chain(*listOfLists)): yield item

    def uniter2(listOfLists):
    for item in reduce(
    lambda x,y: x|y,
    [set(list_) for list_ in listOfLists]
    ): yield item

    speed test with timeit says the first one is slightly faster. What
    bothers me is that it builds a set from an iterator and then another
    iterator from the set. Is there a way to implement this using only
    iterators? I also tried a "full python" implementation (by that I mean
    one that does not use the built-in set and keeps track of previously
    yielded items in a list) but the one I pulled out is about 180 times
    slower. Here it is:

    def uniter3(listOfLists):
    done = []
    for list_ in listOfLists:
    for item in list_:
    if not item in done:
    done.append(item)
    yield item

    Thanks in advance for any contribution.

    -- Simone
    Tekkaman, Feb 9, 2007
    #1
    1. Advertising

  2. Tekkaman

    Peter Otten Guest

    Tekkaman wrote:

    > I have a list of lists and I want to define an iterator (let's call
    > that uniter) over all unique elements, in any order. For example,
    > calling:
    >
    > sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))
    >
    > must return ['a', 'b', 'c', 'd']. I tried the following
    > implementations:
    >
    > from itertools import chain
    > def uniter1(listOfLists):
    > for item in set(chain(*listOfLists)): yield item


    def uniter(lists):
    return iter(set(chain(*lists)))

    This avoids the explicit for-loop.

    Peter
    Peter Otten, Feb 9, 2007
    #2
    1. Advertising

  3. Tekkaman

    azrael Guest

    try something else. im posting the code from a kiosk which has no
    python, sooo..... no code. only explanation

    if my memory works well there is a function in python that takes a
    multidimensional list and returns its values as a one-dimension list.

    def main():
    list =unknownFunction([['a', 'b', 'd'], ['b', 'c'], ['a', 'c',
    'd']) # =a, b, d, b, c, a, c, d
    temp = []
    for i in list:
    check(i, temp, list)
    sort(list)

    def check(pos, temp, list ):
    for i in temp:
    if temp== list[pos]
    del list[pos]
    check(pos, temp, list)
    temp.append(list[pos])

    im not sure if this should work but the meaning is:
    the first three elements will be appended into the list directly
    because there was no like this in the temp
    while there are elements in the list take a pivot value and check if
    they are unique in the list.

    check:
    while there are elements in the temporary list check if the pivot
    exists in the temp. if false then append it to temp.

    if true delete the element and go into the recursion on the same index
    why?

    temp a, b, d
    list a, b, d, (b), c, a, c, d

    temp a, b, d
    list a, b, d, (c) a, c, d

    temp a, b, d, c
    list a, b, d, c, (a) c, d

    temp a, b, d, c
    list a, b, d, c, (c), d

    temp a, b, d, c
    list a, b, d, c, (d)


    temp a, b, d, c
    list a, b, d, c

    list a, b, c, d

    one way to do it. this works well if you need the list in the order
    they appear. sort it and it works well


    but i think that there is the possibility to do it somthing like the
    merge sort. maybee it would work. try it. Merge the sublists and
    remove the duplicates. (L1,L2,L3) -> (L12, L3) -> (L123)
    this should work pretty well




    Tekkaman je napisao/la:
    > I have a list of lists and I want to define an iterator (let's call
    > that uniter) over all unique elements, in any order. For example,
    > calling:
    >
    > sorted(uniter([['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]))
    >
    > must return ['a', 'b', 'c', 'd']. I tried the following
    > implementations:
    >
    > from itertools import chain
    > def uniter1(listOfLists):
    > for item in set(chain(*listOfLists)): yield item
    >
    > def uniter2(listOfLists):
    > for item in reduce(
    > lambda x,y: x|y,
    > [set(list_) for list_ in listOfLists]
    > ): yield item
    >
    > speed test with timeit says the first one is slightly faster. What
    > bothers me is that it builds a set from an iterator and then another
    > iterator from the set. Is there a way to implement this using only
    > iterators? I also tried a "full python" implementation (by that I mean
    > one that does not use the built-in set and keeps track of previously
    > yielded items in a list) but the one I pulled out is about 180 times
    > slower. Here it is:
    >
    > def uniter3(listOfLists):
    > done = []
    > for list_ in listOfLists:
    > for item in list_:
    > if not item in done:
    > done.append(item)
    > yield item
    >
    > Thanks in advance for any contribution.
    >
    > -- Simone
    azrael, Feb 9, 2007
    #3
  4. Tekkaman

    azrael Guest

    tra using the firs sublist (list[1]) as cell.then take zhe second
    sublist and take a value from it at once and if the value from list[2]
    doesnt exist in list[1] then insert it into list[1] at the correct
    place. Something like the insertionsort.
    azrael, Feb 9, 2007
    #4
  5. Tekkaman

    Guest

    Tekkaman:

    If the sublists contain hashable elements you can use this:

    def uniter(lists):
    merge = set()
    for sub in lists:
    merge = merge.union(sub)
    for el in merge:
    yield el

    data = [['a', 'b', 'd'], ['b', 'c'], ['a', 'c', 'd']]
    print list(uniter(data))

    But often this too may be enough:

    def uniter(lists):
    merge = set()
    for sub in lists:
    merge = merge.union(sub)
    return merge

    Bye to the gentle Pegas too,
    bearophile
    , Feb 9, 2007
    #5
  6. Tekkaman

    Tekkaman Guest

    Thanks everybody!
    Azrael: your suggestions involve python-level membership testing and
    dummy list construction just like my uniter3 example, so I'm afraid
    they would not go any faster. There's no built-in sequence flattening
    function that I know of btw.
    bearophile: your implementation is very similar to my uniter2
    (subsequent set unions) but without the additional lambda def
    overhead, and in fact it goes faster. Not faster than uniter though.
    Peter: your solution is the fastest, removing the explicit for loop
    resulted in a 30% speed gain. Looks like using set() is a must.

    -- Simone
    Tekkaman, Feb 9, 2007
    #6
  7. Tekkaman

    azrael Guest

    no heart feelings. i was just throwing ideas. no time to testing it.


    On Feb 9, 3:55 pm, "Tekkaman" <> wrote:
    > Thanks everybody!Azrael: your suggestions involve python-level membership testing and
    > dummy list construction just like my uniter3 example, so I'm afraid
    > they would not go any faster. There's no built-in sequence flattening
    > function that I know of btw.
    > bearophile: your implementation is very similar to my uniter2
    > (subsequent set unions) but without the additional lambda def
    > overhead, and in fact it goes faster. Not faster than uniter though.
    > Peter: your solution is the fastest, removing the explicit for loop
    > resulted in a 30% speed gain. Looks like using set() is a must.
    >
    > -- Simone
    azrael, Feb 11, 2007
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==

    List of lists of lists of lists...

    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==, May 8, 2006, in forum: Python
    Replies:
    5
    Views:
    392
    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==
    May 15, 2006
  2. antar2
    Replies:
    2
    Views:
    382
    Bighead
    Jul 17, 2008
  3. antar2
    Replies:
    3
    Views:
    363
    Niklas Norrthon
    Jul 23, 2008
  4. ToshiBoy
    Replies:
    6
    Views:
    832
    ToshiBoy
    Aug 12, 2008
  5. Token Type
    Replies:
    9
    Views:
    341
    Chris Angelico
    Sep 9, 2012
Loading...

Share This Page