Is there such an idiom?

Per · Mar 20, 2006

http://jaynes.colorado.edu/PythonIdioms.html

"""Use dictionaries for searching, not lists. To find items in common
between two lists, make the first into a dictionary and then look for
items in the second in it. Searching a list for an item is linear-time,
while searching a dict for an item is constant time. This can often let
you reduce search time from quadratic to linear."""

Is this correct?
s = [1,2,3,4,5...]
t = [4,5,6,,8,...]
how to find whether there is/are common item(s) between two list in
linear-time?
how to find the number of common items between two list in linear-time?

Rene Pijlman · Mar 20, 2006

Per:

how to find whether there is/are common item(s) between two list
in linear-time?

To find items in common between two lists, make the first into a
dictionary and then look for items in the second in it.

Ron Adam · Mar 20, 2006

Per said:
http://jaynes.colorado.edu/PythonIdioms.html

"""Use dictionaries for searching, not lists. To find items in common
between two lists, make the first into a dictionary and then look for
items in the second in it. Searching a list for an item is linear-time,
while searching a dict for an item is constant time. This can often let
you reduce search time from quadratic to linear."""

Is this correct?
s = [1,2,3,4,5...]
t = [4,5,6,,8,...]
how to find whether there is/are common item(s) between two list in
linear-time?
how to find the number of common items between two list in linear-time?

I'm not sure if it works in linear time, but if there are no duplicates
in each list, sets would be the easiest way to do these with.

s = set([1,2,3,4,5,6])
t = set([4,5,6,7,8,9])
s.intersection(t) set([4, 5, 6])
len(s.intersection(t))

Click to expand...

Click to expand...

3

Cheers,
Ron

Irmen de Jong · Mar 20, 2006

Per said:
how to find the number of common items between two list in linear-time?

Not really sure about linear-time, but you could try the following:

>>> a=[1,2,3,4]
>>> b=[3,4,5,6]
>>> set(a) & set(b)

Click to expand...

Click to expand...

set([3, 4])

--Irmen

Per · Mar 20, 2006

Thanks Ron,
surely set is the simplest way to understand the question, to see
whether there is a non-empty intersection. But I did the following
thing in a silly way, still not sure whether it is going to be linear
time.
def foo():
l = [...]
s = [...]
dic = {}
for i in l:
dic = 0
k=0
while k <len(s):
if s[k] in dic:
return True
else: pass
k+=1
if k == len(s):
return False

I am still a rookie, and partly just migrated from Haskell...
I am not clear about how making one of the lists a dictionary is
helpful

Michael Spencer · Mar 20, 2006

Per said:
Thanks Ron,
surely set is the simplest way to understand the question, to see
whether there is a non-empty intersection. But I did the following
thing in a silly way, still not sure whether it is going to be linear
time.
def foo():
l = [...]
s = [...]
dic = {}
for i in l:
dic = 0
k=0
while k <len(s):
if s[k] in dic:
return True
else: pass
k+=1
if k == len(s):
return False

I am still a rookie, and partly just migrated from Haskell...
I am not clear about how making one of the lists a dictionary is
helpful

The dict-based approach is to do something like this:

>>> ls1 = [1,3,5,7,9]
>>> ls2 = [3,5,6,7,10]
>>> d1 = dict.fromkeys(ls1)
>>> d1

Click to expand...

Click to expand...

Click to expand...

{1: None, 3: None, 9: None, 5: None, 7: None}

Note the values, are irrelevant - we care only about the keys
Now, membership testing takes linear time:

>>> [item for item in ls2 if item in d1] [3, 5, 7]
>>>

Click to expand...

Click to expand...

Click to expand...

But, as you say, this approach is unnecessary, given sets.

HTH
Michael

Ron Adam · Mar 20, 2006

Per said:
Thanks Ron,
surely set is the simplest way to understand the question, to see
whether there is a non-empty intersection. But I did the following
thing in a silly way, still not sure whether it is going to be linear
time.
def foo():
l = [...]
s = [...]
dic = {}
for i in l:
dic = 0
k=0
while k <len(s):
if s[k] in dic:
return True
else: pass
k+=1
if k == len(s):
return False

I am still a rookie, and partly just migrated from Haskell...
I am not clear about how making one of the lists a dictionary is
helpful

Lets compare them by checking different length with no overlap which is
the worst case situation.

## is_interstion comparison

def ii_set(a, b):
return len(set(a).intersection(b))>0

def ii_dict(l, s):
dic = {}
for i in l:
dic = 0
for i in s:
if i in dic:
return True
return False

def ii_dict2(l, s):
dic = dict.fromkeys(l)
for i in s:
if i in dic:
return True
return False

import time

foos = [ii_set, ii_dict, ii_dict2]
lsts = [10, 100, 1000, 10000, 100000, 1000000]

for f in foos:
for lst in lsts:
a = range(lst)
b = range(lst, lst*2)
start = time.clock()
assert f(a,b) == False
t = time.clock()-start
print f.__name__, lst, t
print

ii_set 10 1.25714301678e-005
ii_set 100 2.45841301059e-005
ii_set 1000 0.000162031766607
ii_set 10000 0.0020256764477
ii_set 100000 0.0238681173166
ii_set 1000000 0.23067484834

ii_dict 10 2.31873045317e-005
ii_dict 100 6.73269926764e-005
ii_dict 1000 0.000442234976792
ii_dict 10000 0.0047891561637
ii_dict 100000 0.0502407428877
ii_dict 1000000 0.506360165887

ii_dict2 10 2.70984161395e-005
ii_dict2 100 5.55936578532e-005
ii_dict2 1000 0.000317358770458
ii_dict2 10000 0.00366638776716
ii_dict2 100000 0.0394256811969
ii_dict2 1000000 0.39200764343

The sets solution seems to be the fastest. And it is usually is when
you can move more of your task into the built-in methods which are
programmed in C.

From what I recently read here (not sure where) in another thread, in
python 2.3 sets were implemented as a python module using dictionaries,
and in 2.4 it was written in C code based on the dictionary C code.

Cheers,
Ron

Alex Martelli · Mar 20, 2006

Ron Adam said:
From what I recently read here (not sure where) in another thread, in
python 2.3 sets were implemented as a python module using dictionaries,
and in 2.4 it was written in C code based on the dictionary C code.

....and in 2.5 they're due to be further optimized, using their own
specialized code rather than piggybacking on dictionaries'.

Alex

Bruce Cropley · Mar 20, 2006

Per said:
http://jaynes.colorado.edu/PythonIdioms.html [snip]

Is this correct?
s = [1,2,3,4,5...]
t = [4,5,6,,8,...]
how to find whether there is/are common item(s) between two list in
linear-time?
how to find the number of common items between two list in linear-time?

A common technique if both lists are sorted is to iterate through both
lists in parallel, advancing the smaller iterator each time. This is
the merge algorithm that is used by a merge sort, and it is O(s+t).

For two lists, the algorithm would go something like:

while not finished:
if s_iter_val < t_iter_val:
move s_iter on
elif s_iter_val > t_iter_val:
move t_iter on
else:
include / yield the value
move both iters on

For more on the standard merge algorithm, see:
http://en.wikipedia.org/wiki/Merge_algorithm

For an intersection merge, I hacked the recursive solution from
there...

def merge(a, b):
if len(a) == 0: return []
if len(b) == 0: return []
if a[0] < b[0]: return merge(a[1:], b)
elif a[0] > b[0]: return merge(a, b[1:])
else: return a[0:1] + merge(a[1:], b[1:])

#-------------------------8<-------------------------
import unittest

class TestMerge(unittest.TestCase):
def test_merge(self):
self.assertEquals(merge([1,2],[]), [])
self.assertEquals(merge([],[1,2]), [])
self.assertEquals(merge([1,3,5],[2,4,6]), [])
self.assertEquals(merge([1,2,3],[3,4,5]), [3])
self.assertEquals(merge([1,2,3,5,6,7],[3,4,5,7,8]), [3,5,7])

if __name__ == "__main__":
unittest.main()

HTH,
Bruce

Is there a better way to do this snippet?	6	Apr 3, 2012
PHP RSS Feed Aggregator changing to todays date everytime feed is aggregated	1	Jan 11, 2022
dictionary idiom needed	6	Dec 11, 2008
Is there a commas-in-between idiom?	14	Nov 5, 2006
Naive idiom questions	10	Jan 31, 2008
An empty initializer is invalid for an array with unspecified bound	0	Jul 1, 2020
questions of idiom	3	Jun 7, 2010
SENTINEL CONTROL LOOP WHEN DEALING WITH TWO ARRAYS	1	Oct 26, 2023

Is there such an idiom?

Per

Rene Pijlman

Ron Adam

Irmen de Jong

Per

Michael Spencer

Ron Adam

Alex Martelli

Bruce Cropley

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads