Fast constant functions for Py2.5's defaultdict()

R

Raymond Hettinger

FWIW, here are three ways of writing constant functions for
collections.defaultdict():

d = defaultdict(int) # slowest way; works only for zero
d = defaultdict(lambda: 0) # faster way; works for any constant
d = defaultdict(itertools.repeat(0).next) # fastest way; works
for any constant

Another approach is to use the __missing__ method for dictionary
subclasses:

class zerodict (dict):
def __missing__ (self, key):
return 0 # fast on initial miss, but slow on
non-misses; works for any constant

The itertools.repeat(const).next approach wins on speed and
flexibility.


Raymond
 
G

Giovanni Bajo

FWIW, here are three ways of writing constant functions for
collections.defaultdict():

d = defaultdict(int) # slowest way; works only for zero
d = defaultdict(lambda: 0) # faster way; works for any constant
d = defaultdict(itertools.repeat(0).next) # fastest way; works
for any constant

Another approach is to use the __missing__ method for dictionary
subclasses:

class zerodict (dict):
def __missing__ (self, key):
return 0 # fast on initial miss, but slow on
non-misses; works for any constant

The itertools.repeat(const).next approach wins on speed and
flexibility.

But it's the most unreadable too. I'm surprised that defaultdict(int) is
slower than the lambda one though. What's the reason?
 
R

Raymond Hettinger

But it's the most unreadable too.

Not really. It's unusual but plenty readable (no surprise that
repeat(0) repeatedly gives you zero). I think it more surprising that
int() with no arguments gives you a zero.
I'm surprised that defaultdict(int) is
slower than the lambda one though. What's the reason?

All that comes to mind is that int() has to call
PyArg_ParseTupleAndKeywords() while the lambda is unburdened by
argument passing.


Raymond
 
M

Michele Simionato

Not really. It's unusual but plenty readable (no surprise that
repeat(0) repeatedly gives you zero). I think it more surprising that
int() with no arguments gives you a zero.

Well, if I was doing code review of some of my coworkers I would ask
them
to use them int if the constant was zero and lambda otherwise. If they
wanted
to use itertools.repeat(const).next they should prove me that the
speed
increase is absolutely significant in their actual use case and
they should put a big comment in the code explaining why they
preferred
the cryptic defaultdict(itertools.repeat(0).next) over the obvious
defaultdict(int).

Michele Simionato
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,598
Members
45,156
Latest member
KetoBurnSupplement
Top