Unexpected comparisons in dict lookup

Steven D'Aprano · Mar 18, 2014

I stumbled across this unexpected behaviour with Python 2.7 and 3.3. When
you look up a key in a dict, the key is sometimes compared against other
keys twice instead of just once.

First, a class that reports when it is being tested for equality, with a
fixed hash so that we get collisions inside the dict:

class Spam:
def __init__(self, label):
self.label = label
def __str__(self):
return self.label
def __hash__(self):
return 100
def __eq__(self, other):
print("comparing %s with %s" % (self, other))
return self is other

Create a dict and prepare for collisions:

x = Spam("x")
y = Spam("y")
d = {100: 1} # hash(100) == 100

When we add x to the dict, it collides with the existing key 100, so I
expect one call to Spam.__eq__, and that's exactly what I get:

py> d[x] = 200
comparing x with 100

But when I try adding y to the dict, it collides with 100, then it
collides with x, then it apparently collides with x a second time:

py> d[y] = 300
comparing y with 100
comparing x with y
comparing x with y

I expected only two calls to __eq__, not three: first comparing with 100,
then comparing with x. Checking for keys gives the same result:

py> x in d
comparing x with 100
True
py> y in d
comparing y with 100
comparing x with y
comparing x with y
True

What's more, checking for a key which is not present also compares three
times instead of twice:

py> Spam("z") in d
comparing z with 100
comparing x with z
comparing x with z
comparing y with z
False

I expected it to compare z with 100, then x *once*, then y, then return
False.

Why is the dict lookup comparing against x twice? It doesn't seem to be
the fault of x. If I create a slightly different dict, with the
collisions in a different order:

py> e = {x: 100}
py> e[100] = 200
comparing x with 100
py> e[y] = 300
comparing x with y
comparing y with 100
comparing y with 100

Ian Kelly · Mar 18, 2014

I stumbled across this unexpected behaviour with Python 2.7 and 3.3. When
you look up a key in a dict, the key is sometimes compared against other
keys twice instead of just once.

From what I can see in the code, it adds a perturbation based on the

upper bits of the hash value to the probing scheme, to reduce
collisions for keys with unequal hashes. On the downside, this
cancels the guarantee that each bucket can only be checked at most
once. The perturbation gradually shifts to 0 after a few iterations,
so every bucket can still be reached within O(n) iterations.

See the comments starting at "Major subtleties ahead":
http://hg.python.org/cpython/file/f8b40d33e45d/Objects/dictobject.c#l106

Dict comprehension help	0	Dec 5, 2012
Blue J Ciphertext Program	2	Nov 22, 2023
Python 3: dict & dict.keys()	4	Jul 23, 2013
Secure Keyboard v2.0 Modern C++ Virtual Keyboard for Windows (Glassmorphism UI, Clipboard Auto-Clear)	0	Mar 26, 2026
Python code problem	2	Apr 23, 2023
Help for my project in the last minute	0	Apr 23, 2022
Using non-dict namespaces in functions	3	Mar 17, 2012
Overriding dict constructor	3	Sep 20, 2010

Unexpected comparisons in dict lookup

Steven D'Aprano

Ian Kelly

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads