Is there such an idiom?

Discussion in 'Python' started by Per, Mar 20, 2006.

  1. Per

    Per Guest

    http://jaynes.colorado.edu/PythonIdioms.html

    """Use dictionaries for searching, not lists. To find items in common
    between two lists, make the first into a dictionary and then look for
    items in the second in it. Searching a list for an item is linear-time,
    while searching a dict for an item is constant time. This can often let
    you reduce search time from quadratic to linear."""

    Is this correct?
    s = [1,2,3,4,5...]
    t = [4,5,6,,8,...]
    how to find whether there is/are common item(s) between two list in
    linear-time?
    how to find the number of common items between two list in linear-time?
    Per, Mar 20, 2006
    #1
    1. Advertising

  2. Per

    Rene Pijlman Guest

    Per:
    >how to find whether there is/are common item(s) between two list
    >in linear-time?


    To find items in common between two lists, make the first into a
    dictionary and then look for items in the second in it.

    --
    René Pijlman

    Wat wil jij leren? http://www.leren.nl
    Rene Pijlman, Mar 20, 2006
    #2
    1. Advertising

  3. Per

    Ron Adam Guest

    Per wrote:
    > http://jaynes.colorado.edu/PythonIdioms.html
    >
    > """Use dictionaries for searching, not lists. To find items in common
    > between two lists, make the first into a dictionary and then look for
    > items in the second in it. Searching a list for an item is linear-time,
    > while searching a dict for an item is constant time. This can often let
    > you reduce search time from quadratic to linear."""
    >
    > Is this correct?
    > s = [1,2,3,4,5...]
    > t = [4,5,6,,8,...]
    > how to find whether there is/are common item(s) between two list in
    > linear-time?
    > how to find the number of common items between two list in linear-time?


    I'm not sure if it works in linear time, but if there are no duplicates
    in each list, sets would be the easiest way to do these with.

    >>> s = set([1,2,3,4,5,6])
    >>> t = set([4,5,6,7,8,9])
    >>> s.intersection(t)

    set([4, 5, 6])
    >>> len(s.intersection(t))

    3

    Cheers,
    Ron
    Ron Adam, Mar 20, 2006
    #3
  4. Per wrote:

    > how to find the number of common items between two list in linear-time?
    >


    Not really sure about linear-time, but you could try the following:

    >>> a=[1,2,3,4]
    >>> b=[3,4,5,6]
    >>> set(a) & set(b)

    set([3, 4])


    --Irmen
    Irmen de Jong, Mar 20, 2006
    #4
  5. Per

    Per Guest

    Thanks Ron,
    surely set is the simplest way to understand the question, to see
    whether there is a non-empty intersection. But I did the following
    thing in a silly way, still not sure whether it is going to be linear
    time.
    def foo():
    l = [...]
    s = [...]
    dic = {}
    for i in l:
    dic = 0
    k=0
    while k <len(s):
    if s[k] in dic:
    return True
    else: pass
    k+=1
    if k == len(s):
    return False


    I am still a rookie, and partly just migrated from Haskell...
    I am not clear about how making one of the lists a dictionary is
    helpful
    Per, Mar 20, 2006
    #5
  6. Per wrote:
    > Thanks Ron,
    > surely set is the simplest way to understand the question, to see
    > whether there is a non-empty intersection. But I did the following
    > thing in a silly way, still not sure whether it is going to be linear
    > time.
    > def foo():
    > l = [...]
    > s = [...]
    > dic = {}
    > for i in l:
    > dic = 0
    > k=0
    > while k <len(s):
    > if s[k] in dic:
    > return True
    > else: pass
    > k+=1
    > if k == len(s):
    > return False
    >
    >
    > I am still a rookie, and partly just migrated from Haskell...
    > I am not clear about how making one of the lists a dictionary is
    > helpful
    >

    The dict-based approach is to do something like this:

    >>> ls1 = [1,3,5,7,9]
    >>> ls2 = [3,5,6,7,10]
    >>> d1 = dict.fromkeys(ls1)
    >>> d1

    {1: None, 3: None, 9: None, 5: None, 7: None}

    Note the values, are irrelevant - we care only about the keys
    Now, membership testing takes linear time:

    >>> [item for item in ls2 if item in d1]

    [3, 5, 7]
    >>>


    But, as you say, this approach is unnecessary, given sets.

    HTH
    Michael
    Michael Spencer, Mar 20, 2006
    #6
  7. Per

    Ron Adam Guest

    Per wrote:
    > Thanks Ron,
    > surely set is the simplest way to understand the question, to see
    > whether there is a non-empty intersection. But I did the following
    > thing in a silly way, still not sure whether it is going to be linear
    > time.
    > def foo():
    > l = [...]
    > s = [...]
    > dic = {}
    > for i in l:
    > dic = 0
    > k=0
    > while k <len(s):
    > if s[k] in dic:
    > return True
    > else: pass
    > k+=1
    > if k == len(s):
    > return False
    >
    >
    > I am still a rookie, and partly just migrated from Haskell...
    > I am not clear about how making one of the lists a dictionary is
    > helpful


    Lets compare them by checking different length with no overlap which is
    the worst case situation.


    ## is_interstion comparison

    def ii_set(a, b):
    return len(set(a).intersection(b))>0

    def ii_dict(l, s):
    dic = {}
    for i in l:
    dic = 0
    for i in s:
    if i in dic:
    return True
    return False

    def ii_dict2(l, s):
    dic = dict.fromkeys(l)
    for i in s:
    if i in dic:
    return True
    return False

    import time

    foos = [ii_set, ii_dict, ii_dict2]
    lsts = [10, 100, 1000, 10000, 100000, 1000000]

    for f in foos:
    for lst in lsts:
    a = range(lst)
    b = range(lst, lst*2)
    start = time.clock()
    assert f(a,b) == False
    t = time.clock()-start
    print f.__name__, lst, t
    print



    ii_set 10 1.25714301678e-005
    ii_set 100 2.45841301059e-005
    ii_set 1000 0.000162031766607
    ii_set 10000 0.0020256764477
    ii_set 100000 0.0238681173166
    ii_set 1000000 0.23067484834

    ii_dict 10 2.31873045317e-005
    ii_dict 100 6.73269926764e-005
    ii_dict 1000 0.000442234976792
    ii_dict 10000 0.0047891561637
    ii_dict 100000 0.0502407428877
    ii_dict 1000000 0.506360165887

    ii_dict2 10 2.70984161395e-005
    ii_dict2 100 5.55936578532e-005
    ii_dict2 1000 0.000317358770458
    ii_dict2 10000 0.00366638776716
    ii_dict2 100000 0.0394256811969
    ii_dict2 1000000 0.39200764343


    The sets solution seems to be the fastest. And it is usually is when
    you can move more of your task into the built-in methods which are
    programmed in C.

    From what I recently read here (not sure where) in another thread, in
    python 2.3 sets were implemented as a python module using dictionaries,
    and in 2.4 it was written in C code based on the dictionary C code.

    Cheers,
    Ron
    Ron Adam, Mar 20, 2006
    #7
  8. Ron Adam <> wrote:
    ...
    > From what I recently read here (not sure where) in another thread, in
    > python 2.3 sets were implemented as a python module using dictionaries,
    > and in 2.4 it was written in C code based on the dictionary C code.


    ....and in 2.5 they're due to be further optimized, using their own
    specialized code rather than piggybacking on dictionaries'.


    Alex
    Alex Martelli, Mar 20, 2006
    #8
  9. Per wrote:
    > http://jaynes.colorado.edu/PythonIdioms.html

    [snip]

    > Is this correct?
    > s = [1,2,3,4,5...]
    > t = [4,5,6,,8,...]
    > how to find whether there is/are common item(s) between two list in
    > linear-time?
    > how to find the number of common items between two list in linear-time?


    A common technique if both lists are sorted is to iterate through both
    lists in parallel, advancing the smaller iterator each time. This is
    the merge algorithm that is used by a merge sort, and it is O(s+t).

    For two lists, the algorithm would go something like:

    while not finished:
    if s_iter_val < t_iter_val:
    move s_iter on
    elif s_iter_val > t_iter_val:
    move t_iter on
    else:
    include / yield the value
    move both iters on

    For more on the standard merge algorithm, see:
    http://en.wikipedia.org/wiki/Merge_algorithm

    For an intersection merge, I hacked the recursive solution from
    there...

    def merge(a, b):
    if len(a) == 0: return []
    if len(b) == 0: return []
    if a[0] < b[0]: return merge(a[1:], b)
    elif a[0] > b[0]: return merge(a, b[1:])
    else: return a[0:1] + merge(a[1:], b[1:])

    #-------------------------8<-------------------------
    import unittest

    class TestMerge(unittest.TestCase):
    def test_merge(self):
    self.assertEquals(merge([1,2],[]), [])
    self.assertEquals(merge([],[1,2]), [])
    self.assertEquals(merge([1,3,5],[2,4,6]), [])
    self.assertEquals(merge([1,2,3],[3,4,5]), [3])
    self.assertEquals(merge([1,2,3,5,6,7],[3,4,5,7,8]), [3,5,7])

    if __name__ == "__main__":
    unittest.main()


    HTH,
    Bruce
    Bruce Cropley, Mar 20, 2006
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?ISO-8859-1?Q?Ernesto_Garc=EDa_Garc=EDa?=

    Is there a commas-in-between idiom?

    =?ISO-8859-1?Q?Ernesto_Garc=EDa_Garc=EDa?=, Nov 5, 2006, in forum: Python
    Replies:
    14
    Views:
    458
    Peter van Kampen
    Nov 9, 2006
  2. Christian Hackl
    Replies:
    9
    Views:
    379
    James Kanze
    Jun 11, 2007
  3. Mark Probert

    Idiom -- is there a better way?

    Mark Probert, Oct 20, 2004, in forum: Ruby
    Replies:
    7
    Views:
    110
    John Carter
    Oct 21, 2004
  4. Lyle Johnson

    Is there a name for this idiom?

    Lyle Johnson, Jul 6, 2007, in forum: Ruby
    Replies:
    3
    Views:
    93
    Brad Ediger
    Jul 6, 2007
  5. Kenneth McDonald

    Is there a standard "assert" idiom in Ruby?

    Kenneth McDonald, Aug 2, 2007, in forum: Ruby
    Replies:
    8
    Views:
    156
    John Carter
    Aug 3, 2007
Loading...

Share This Page