How to memoize functions?

Chris Reedy · Jun 26, 2003

I would like to memoize (I think I've got that right) a collection of
functions for a project I'm working.

<Aside for those not familiar with memoizing functions:>

Given:

def foo(a,b,c):
return <result of large ugly computation>

Memoization of foo would look like:

def foo(a,b,c):
if <this set of arguments has already been handled>
return <result of prior computation>
else
<do large ugly computation>
<save result>
return <result>

</Aside>

The obvious way to memoize a function would be to keep a dictionary with
keys being tuples (or maybe dictionaries) of previous argument lists
and values being the results of the previous computations.

Unfortunately, this will really mess up garbage collection, since
objects that are arguments could never be collected. Something like
weakref.WeakKeyDictionary seems to be the obvious solution. However, a
WeakKeyDictionary won't work since it can't use a tuple as a key, and
wouldn't do the right thing, even if it could.

The only solution I've got so far is to implement a new class, imitating
WeakKeyDictionary. I started down this road, but the implementation
started getting a little involved, since I needed two dictionaries, one
for the argument list -> result mapping and one to keep track of which
objects appear in which argument lists (since an argument tuple must be
deleted when _any_ of its arguments becomes garbage).

So ... does anyone have any suggestions and/or has anyone already done
something like this (I couldn't find anything in the cookbook at
ActiveState).

Thanks in advance, Chris

Jerome Alet · Jun 26, 2003

Le Thu, 26 Jun 2003 16:29:07 -0400, Chris Reedy a écrit :

The obvious way to memoize a function would be to keep a dictionary with
keys being tuples (or maybe dictio naries) of previous argument lists
and values being the results of the previous computations.

Unfortunately, this will really mess up garbage collection, since
objects that are arguments could never be collected.

first I must say that my answer is probably completely stupid.

however why would you want to use the original arguments tuple as a key ?

depending on the computational intensity of your original function, why
not compute some sort of hash (like an hex md5sum) on all the arguments
(converted to strings and concatenated) ?

this way your original arguments would still be garbage collectable (well,
at least I suppose), since only the hash would be used for the key.

again, it may be stupid, I don't know python internal enough

hth

Jerome Alet

Jeff Epler · Jun 26, 2003

The argument tuple is only "materialized" for a moment (it disappears
after the called function returns, if it was taken by *args, or else
before the first line of the function if it was taken by named args).
It will disappear from a weakkeydict immediately as a result.

The common memoize, written here using nested scopes, never trims old
values from the dictionary:
def memoize(f):
d = {}
def g(*args):
if not d.has_key(args):
d[args] = f(*args)
return d[args]
return g
You could make d be a cache of limited size with code like
if not d.has_key(args):
if len(d) > 1000:
d.pop() # XXX
d[args] = f(*args)
the XXX line removes "some key" from the dictionary. The actual item is
not specified, but turns out to have some relationship to the hash value
of args. This may or may not be acceptable to you. Otherwise, you might
implement a true pseudorandom replacement algorithm, or the standard LRU,
or LFU. These may require that you maintain a heap or other auxiliary
list (corresponding to the keys) to achieve O(1) for the "remove an old
memoized value" code.

I just think weakrefs are not the solution to this problem...

Jeff

Chris Reedy · Jun 27, 2003

Jerome said:
Le Thu, 26 Jun 2003 16:29:07 -0400, Chris Reedy a écrit :

first I must say that my answer is probably completely stupid.

however why would you want to use the original arguments tuple as a key ?

Actually, I chose that just because it was the most obvious. The next
choice was a tuple of the ids of the arguments ...

depending on the computational intensity of your original function, why
not compute some sort of hash (like an hex md5sum) on all the arguments
(converted to strings and concatenated) ?

This is certainly a possibility. However ...

this way your original arguments would still be garbage collectable (well,
at least I suppose), since only the hash would be used for the key.

That's true. Unfortunately, that misses the other half of the problem
(which, admittedly, I didn't mention) which is that I would also like to
be able to collect the results of the function, which could be complex
data structures, as well as the arguments (which could be other
instances of the same complex structures).

Chris

problem using both Memoize::Expire and DB_File	2	Jul 20, 2005
Arguments for map'ped functions	1	Feb 5, 2009
I'm tempted to quit out of frustration	1	Aug 13, 2023
[ Please help ] how to create Python functions in C++ at runtime &call them	0	Nov 21, 2007
Does GCC optimize variadic functions to death?	14	Mar 16, 2010
Functions	3	Aug 18, 2003
Brain going crazy with recursive functions	2	Dec 7, 2008
How to protect functions from being called unsynchronized	2	May 24, 2010

How to memoize functions?

Chris Reedy

Jerome Alet

Jeff Epler

Chris Reedy

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads