dict.has_key(x) versus 'x in dict'

Paul Melis · Dec 6, 2006

Hello,

I've always been using the has_key() method to test if a dictionary
contains a certain key. Recently I tried the same using 'in', e.g.

d = { ... }
if k in d:
...

and found that it seems to perform a lot better when lots of key-tests
are to be performed. I also noticed that has_key() is scheduled to be
removed from future (C)Python versions.

Does the 'in' way of testing have an optimized implementation compared
to has_key()? Is that the reason has_key() is being phased out?

Thanks,
Paul

Fredrik Lundh · Dec 6, 2006

Paul said:
I've always been using the has_key() method to test if a dictionary
contains a certain key. Recently I tried the same using 'in', e.g.

d = { ... }
if k in d:
...

and found that it seems to perform a lot better when lots of key-tests
are to be performed. I also noticed that has_key() is scheduled to be
removed from future (C)Python versions.

Does the 'in' way of testing have an optimized implementation compared
to has_key()?

no, but full method calls are a lot slower than the under-the-hood C-level
call used by the "in" operator.

this is why e.g.

string[:len(prefix)] == prefix

is often a lot faster than

string.startswith(prefix)

and why

if character in string:
string = string.replace(character, replacement)

is faster than

string = string.replace(character, replacement)

if the character isn't there most of the time, despite the fact that the "replace"
method doesn't actually do something if the character isn't found.

</F>

Bjoern Schliessmann · Dec 6, 2006

Paul said:
I've always been using the has_key() method to test if a
dictionary contains a certain key. Recently I tried the same using
'in', e.g.

d = { ... }
if k in d:
...

Wouldn't be "if k in d.keys()" be the exact replacement?

Regards,

Björn

Fredrik Lundh · Dec 6, 2006

Bjoern said:
Wouldn't be "if k in d.keys()" be the exact replacement?

no. that would convert an O(1) operation to an O(n) operation, which
would be rather silly.

</F>

Peter Otten · Dec 6, 2006

Bjoern said:
Wouldn't be "if k in d.keys()" be the exact replacement?

No, 'k in d' is equivalent to 'd.has_key(k)', only with less (constant)
overhead for the function call. 'k in d.keys()' on the other hand creates a
list of keys which is then searched linearly -- about the worst thing you
can do about both speed and memory footprint. Some numbers:

$ python2.5 -m timeit -s"N = 1; d = dict.fromkeys(range(N))" "N in d.keys()"
1000000 loops, best of 3: 0.693 usec per loop
$ python2.5 -m timeit -s"N = 1000; d = dict.fromkeys(range(N))" "N in
d.keys()"
10000 loops, best of 3: 77.2 usec per loop # ouch!

$ python2.5 -m timeit -s"N = 1; d = dict.fromkeys(range(N))" "N in d"
10000000 loops, best of 3: 0.192 usec per loop
~ $ python2.5 -m timeit -s"N = 1000; d = dict.fromkeys(range(N))" "N in d"
10000000 loops, best of 3: 0.192 usec per loop

$ python2.5 -m timeit -s"N = 1; d = dict.fromkeys(range(N))" "d.has_key(N)"
1000000 loops, best of 3: 0.376 usec per loop
$ python2.5 -m timeit -s"N = 1000; d = dict.fromkeys(range(N))"
"d.has_key(N)"
1000000 loops, best of 3: 0.385 usec per loop

Peter

Roberto Bonvallet · Dec 6, 2006

Fredrik Lundh wrote:
[...]

this is why e.g.

string[:len(prefix)] == prefix

is often a lot faster than

string.startswith(prefix)

This is interesting. In which cases does the former form perform better?
[I won't stop using str.startswith anyway

]

Regards.

Andy Dingley · Dec 6, 2006

Paul said:
I've always been using the has_key() method to test if a dictionary
contains a certain key. Recently I tried the same using 'in', e.g.

I've found using the set type to be the quickest way to do many of
these tasks. That leads me to another problem: how to cast / convert
sets as something else afterwards

What's the best (i.e. Pythonic) way to do this?
I need to generate a set (lots of intersections involved), but then I
need to display it sorted

lstBugsChanged = [ bugId for bugId in setBugsChanged ]
lstBugsChanged.sort()

Roberto Bonvallet · Dec 6, 2006

Andy said:
I need to generate a set (lots of intersections involved), but then I
need to display it sorted

lstBugsChanged = [ bugId for bugId in setBugsChanged ]
lstBugsChanged.sort()

In Python > 2.4:

sorted(setBugsChanged)

Fredrik Lundh · Dec 6, 2006

Andy said:
I need to generate a set (lots of intersections involved), but then I
need to display it sorted

lstBugsChanged = [ bugId for bugId in setBugsChanged ]
lstBugsChanged.sort()

lstBugsChanged = sorted(setBugsChanged)

</F>

Roberto Bonvallet · Dec 6, 2006

I said:
In Python > 2.4:

lastline.replace(">", ">=")

Bjoern Schliessmann · Dec 6, 2006

Peter said:
No, 'k in d' is equivalent to 'd.has_key(k)', only with less
(constant) overhead for the function call.

Ah, thx. Thought the "x in d" syntax might search in d.values() too.

Regards,

Björn

Paul Melis · Dec 6, 2006

Bjoern said:
Ah, thx. Thought the "x in d" syntax might search in d.values() too.

I don't think it does

Python 2.4.3 (#1, Nov 19 2006, 13:16:36)
[GCC 3.4.5 (Gentoo 3.4.5-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.False

Paul

Fredrik Lundh · Dec 6, 2006

Roberto said:
this is why e.g.

string[:len(prefix)] == prefix

is often a lot faster than

string.startswith(prefix)

Click to expand...

This is interesting. In which cases does the former form perform better?

no time to doublecheck right now, but iirc, last time we benchmarked
this, slicing was faster when len(prefix) < 300 characters or so.

</F>

Andy Dingley · Dec 6, 2006

Roberto said:
lstBugsChanged = [ bugId for bugId in setBugsChanged ]

Click to expand...

In Python > 2.4:

Hmmm. Thanks. Another reason to twist the admin's arm and get them to
upgrade the last 2.3.4 boxen

sorted(setBugsChanged)

Out of interest, whats the Pythonic way to simply cast (sic) the set to
a list, assuming I don't need it sorted? The list comprehension?

Peter Otten · Dec 6, 2006

Out of interest, whats the Pythonic way to simply cast (sic) the set to
a list, assuming I don't need it sorted? The list comprehension?

list(setBugsChanged)

Peter

Roberto Bonvallet · Dec 6, 2006

Andy said:
Out of interest, whats the Pythonic way to simply cast (sic) the set to
a list, assuming I don't need it sorted? The list comprehension?

mySet = set(myList)
myList = list(mySet)

Paul McGuire · Dec 6, 2006

Peter Otten said:
list(setBugsChanged)

Peter

Note that this is not really a "cast" in the C sense of the word.
list(setBugsChanged) constructs and returns a new list containing the
elements of setBugsChanges, and leaves setBugsChanged unchanged.

-- Paul

Bjoern Schliessmann · Dec 6, 2006

Paul said:
I don't think it does

Thanks for trying, I was too lazy

Regards,

Björn

Duncan Booth · Dec 6, 2006

Paul Melis said:
Ah, thx. Thought the "x in d" syntax might search in d.values() too.

Click to expand...

I don't think it does

Python 2.4.3 (#1, Nov 19 2006, 13:16:36)
[GCC 3.4.5 (Gentoo 3.4.5-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.False

It is easy enough to to check if you remember that 'in' maps to the
__contains__ method:
Help on built-in function __contains__:

__contains__(...)
D.__contains__(k) -> True if D has a key k, else False

skip · Dec 7, 2006

Peter> No, 'k in d' is equivalent to 'd.has_key(k)', only with less
Peter> (constant) overhead for the function call. 'k in d.keys()' on the
Peter> other hand creates a list of keys which is then searched linearly
Peter> -- about the worst thing you can do about both speed and memory
Peter> footprint.

I will admit that way back when (maybe 8 yrs ago) I actually did this in a
piece of frequently executed code that's been stable for a looong time. I
have no idea why I might have written it that way. Brain fart I suppose. I
only noticed my mistake a couple months ago during a trip into the
containing function for some other stuff. Man, was I embarrassed...

Skip

Python 3: dict & dict.keys()	4	Jul 24, 2013
Unexpected comparisons in dict lookup	1	Mar 18, 2014
dict and __cmp__() question	2	Sep 7, 2005
Help for my project in the last minute	0	Apr 23, 2022
Resolving OpenSSL version mismatch in OS X 10.5.8	3	Sep 22, 2009
Builtin classes list, set, dict reimplemented via B-trees	1	Sep 14, 2005
Python profiler	0	Oct 3, 2005
Why is dictionary.keys() a list and not a set?	90	Nov 23, 2005

dict.has_key(x) versus 'x in dict'

Paul Melis

Fredrik Lundh

Bjoern Schliessmann

Fredrik Lundh

Peter Otten

Roberto Bonvallet

Andy Dingley

Roberto Bonvallet

Fredrik Lundh

Roberto Bonvallet

Bjoern Schliessmann

Paul Melis

Fredrik Lundh

Andy Dingley

Peter Otten

Roberto Bonvallet

Paul McGuire

Bjoern Schliessmann

Duncan Booth

skip

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads