How to memoize functions?

C

Chris Reedy

I would like to memoize (I think I've got that right) a collection of
functions for a project I'm working.

<Aside for those not familiar with memoizing functions:>

Given:

def foo(a,b,c):
return <result of large ugly computation>

Memoization of foo would look like:

def foo(a,b,c):
if <this set of arguments has already been handled>
return <result of prior computation>
else
<do large ugly computation>
<save result>
return <result>

</Aside>

The obvious way to memoize a function would be to keep a dictionary with
keys being tuples (or maybe dictionaries) of previous argument lists
and values being the results of the previous computations.

Unfortunately, this will really mess up garbage collection, since
objects that are arguments could never be collected. Something like
weakref.WeakKeyDictionary seems to be the obvious solution. However, a
WeakKeyDictionary won't work since it can't use a tuple as a key, and
wouldn't do the right thing, even if it could.

The only solution I've got so far is to implement a new class, imitating
WeakKeyDictionary. I started down this road, but the implementation
started getting a little involved, since I needed two dictionaries, one
for the argument list -> result mapping and one to keep track of which
objects appear in which argument lists (since an argument tuple must be
deleted when _any_ of its arguments becomes garbage).

So ... does anyone have any suggestions and/or has anyone already done
something like this (I couldn't find anything in the cookbook at
ActiveState).

Thanks in advance, Chris
 
J

Jerome Alet

Le Thu, 26 Jun 2003 16:29:07 -0400, Chris Reedy a écrit :
The obvious way to memoize a function would be to keep a dictionary with
keys being tuples (or maybe dictio naries) of previous argument lists
and values being the results of the previous computations.

Unfortunately, this will really mess up garbage collection, since
objects that are arguments could never be collected.

first I must say that my answer is probably completely stupid.

however why would you want to use the original arguments tuple as a key ?

depending on the computational intensity of your original function, why
not compute some sort of hash (like an hex md5sum) on all the arguments
(converted to strings and concatenated) ?

this way your original arguments would still be garbage collectable (well,
at least I suppose), since only the hash would be used for the key.

again, it may be stupid, I don't know python internal enough

hth

Jerome Alet
 
J

Jeff Epler

The argument tuple is only "materialized" for a moment (it disappears
after the called function returns, if it was taken by *args, or else
before the first line of the function if it was taken by named args).
It will disappear from a weakkeydict immediately as a result.

The common memoize, written here using nested scopes, never trims old
values from the dictionary:
def memoize(f):
d = {}
def g(*args):
if not d.has_key(args):
d[args] = f(*args)
return d[args]
return g
You could make d be a cache of limited size with code like
if not d.has_key(args):
if len(d) > 1000:
d.pop() # XXX
d[args] = f(*args)
the XXX line removes "some key" from the dictionary. The actual item is
not specified, but turns out to have some relationship to the hash value
of args. This may or may not be acceptable to you. Otherwise, you might
implement a true pseudorandom replacement algorithm, or the standard LRU,
or LFU. These may require that you maintain a heap or other auxiliary
list (corresponding to the keys) to achieve O(1) for the "remove an old
memoized value" code.

I just think weakrefs are not the solution to this problem...

Jeff
 
C

Chris Reedy

Jerome said:
Le Thu, 26 Jun 2003 16:29:07 -0400, Chris Reedy a écrit :




first I must say that my answer is probably completely stupid.

however why would you want to use the original arguments tuple as a key ?

Actually, I chose that just because it was the most obvious. The next
choice was a tuple of the ids of the arguments ...
depending on the computational intensity of your original function, why
not compute some sort of hash (like an hex md5sum) on all the arguments
(converted to strings and concatenated) ?

This is certainly a possibility. However ...
this way your original arguments would still be garbage collectable (well,
at least I suppose), since only the hash would be used for the key.

That's true. Unfortunately, that misses the other half of the problem
(which, admittedly, I didn't mention) which is that I would also like to
be able to collect the results of the function, which could be complex
data structures, as well as the arguments (which could be other
instances of the same complex structures).

Chris
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,598
Members
45,153
Latest member
NamKaufman
Top