Memoizing decorator

D

Daishi Harada

Hi,

I'm trying to find/write a memoizing decorator
that works both for functions and methods.
I've been looking at the following two samples:

http://wiki.python.org/moin/PythonDecoratorLibrary#head-11870a08b0fa59a8622201abfac735ea47ffade5
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/325205

AFAICT, both of these seem to be targetting bare
functions (the second seems to break with:
unbound methods must have non-NULL im_class
).

I'm using the following code to test the decorators:

---
def test_memoize(memoize_fn):
@memoize_fn
def fn(x):
print '(eval %s)' % str(x),
return x
print 'fn', fn(1)
print 'fn', fn(1)

class A(object):
@memoize_fn
def fn(self, x):
print '(eval %s)' % str(x),
return x
a = A()
print 'meth', a.fn(1)
print 'meth', a.fn(1)
---

In the method test, I get:
fn() takes exactly 2 arguments (1 given)

As I understand things the difficulty here
is due to the way Python resolves between
functions, unbound methods, and bound
methods, which I have a basic sense of,
and how user-defined callables fits into
this picture, which I'm not very clear on.
I *think* the reason this exception comes
up is because user-defined callables don't
automagically get a 'self' prepended to
the argument list.

So I've managed to get these working for both
functions and methods by wrapping them in
yet another function (the following is for the
cookbook example, replacing 'cachedmethod'):

---
def memoize(function):
im = Memoize(function)
def fn(*args, **kwargs):
return im(*args, **kwargs)
return fn
---

However, this seems wasteful in that there
are now 2 "wrapper" function invocations
before one gets to the underlying function.

I would appreciate any advice on how to
proceed/clean this up.

Thanks,
Daishi
 
S

Scott David Daniels

Daishi said:
... I've managed to get these working for both
functions and methods by wrapping them in
yet another function (the following is for the
cookbook example, replacing 'cachedmethod'):

---
def memoize(function):
im = Memoize(function)
def fn(*args, **kwargs):
return im(*args, **kwargs)
return fn
---
But, this is equivalent to:
def memoize(function):
im = Memoize(function)
return im

Which is (in turn) equivalent to:
memoize = Memoize

So you can just use
@Memoize
def function (....
....


--Scott David Daniels
(e-mail address removed)
 
B

Bengt Richter

Hi,

I'm trying to find/write a memoizing decorator
that works both for functions and methods.
ISTM you can't do that without defining exactly what
the instance that a method is bound to (i.e., the self argument)
contributes to the function/method result. It could be anything,
and it could guarantee that results are no more cacheable than random.random()
results. If you are saying (from the look of your test_memoize) that you want
to ignore the instance (self) entirely, you are effectively saying the method
might as well be a staticmethod.
I've been looking at the following two samples:

http://wiki.python.org/moin/PythonDecoratorLibrary#head-11870a08b0fa59a8622201abfac735ea47ffade5
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/325205

AFAICT, both of these seem to be targetting bare
functions (the second seems to break with:
unbound methods must have non-NULL im_class
).

I'm using the following code to test the decorators:

---
def test_memoize(memoize_fn):
@memoize_fn
def fn(x):
print '(eval %s)' % str(x),
return x
print 'fn', fn(1)
print 'fn', fn(1)

class A(object):
@memoize_fn
def fn(self, x):
print '(eval %s)' % str(x),
return x
a = A()
print 'meth', a.fn(1)
print 'meth', a.fn(1)
---

In the method test, I get:
fn() takes exactly 2 arguments (1 given)

As I understand things the difficulty here
is due to the way Python resolves between
functions, unbound methods, and bound
methods, which I have a basic sense of,
and how user-defined callables fits into
this picture, which I'm not very clear on.
I *think* the reason this exception comes
up is because user-defined callables don't
automagically get a 'self' prepended to
the argument list.
No, users can define callables that will get a "self' prepended" --
it's a matter of defining it so it has a __get__ method, which
functions do (whether bound to names in class variable name space or not,
it just the attribute lookup via instance finding it in the class that
triggers the __get__ method-binding)

So I've managed to get these working for both
functions and methods by wrapping them in
yet another function (the following is for the
cookbook example, replacing 'cachedmethod'):

---
def memoize(function):
im = Memoize(function)
def fn(*args, **kwargs):
return im(*args, **kwargs)
return fn
---

However, this seems wasteful in that there
are now 2 "wrapper" function invocations
before one gets to the underlying function.

I would appreciate any advice on how to
proceed/clean this up.
First, I would suggest thinking about the exact semantics of method
memoization. E.g. suppose you write

class Oops(object):
def __init__(self, factor): self.factor = factor
@memoize_fn
def mul(self, x):
return self.factor * x

You would need a separate cache for every instance to use x as a cache lookup key,
or look up with say (x, id(self)), either way assuming that self.factor can't change.

But if you wanted something just to pass the test, and you really wanted to use
the same decorator for both, then you could wrap the decorated function/method
in a custom descriptor that would do what you want whether called as a function
or as a bound method. Here's a hack, caching just on the basis of non-self args,
tested only as far as you see.
... @memoize_fn
... def fn(x):
... print '(eval %s)' % str(x),
... return x
... print 'fn', fn(1)
... print 'fn', fn(1)
... class A(object):
... @memoize_fn
... def fn(self, x):
... print '(eval %s)' % str(x),
... return x
... a = A()
... print 'meth', a.fn(1)
... print 'meth', a.fn(1)
... ... _cache = {}
... # define closure-seeing (f & _cache) __call__ for method use
... def __call__(self, *args, **kw):
... try: return _cache[args]
... except KeyError:
... if self.inst is not None:
... retval = _cache[args] = f(self.inst, *args, **kw)
... else:
... retval = _cache[args] = f(*args, **kw)
... return retval
... class FunMethod(object):
... def __init__(self): self.inst=None
... def __get__(self, inst, cls=None):
... self.inst = inst
... return self
... FunMethod.__call__ = __call__ # to get access to f & _cache
... return FunMethod()
... fn (eval 1) 1
fn 1
meth (eval 1) 1
meth 1


Regards,
Bengt Richter
 
A

Alex Martelli

Scott David Daniels said:
But, this is equivalent to:
def memoize(function):
im = Memoize(function)
return im

It's only equivalent if Memoize returns a function, or another
descriptor whose __get__ behaves like a function's. In other words, if
Memoize is a class with a __call__ method, you need to add to it a
__get__ method able to bind the `self' argument (and if you want
complete emulation of Memoize being a function, its __get__ method must
also be able to produce the type-checking equivalent of the unbound
method objects that a function's __get__ would build).

Only when such a __get__ method is in class Memoize does this
simplification work, and similarly for the further ones you show:
Which is (in turn) equivalent to:
memoize = Memoize

So you can just use
@Memoize
def function (....


Alex
 
D

Daishi Harada

Hi Bengt,

Thanks for your reply.
First, I would suggest thinking about the exact semantics of method
memoization. E.g. suppose you write

class Oops(object):
def __init__(self, factor): self.factor = factor
@memoize_fn
def mul(self, x):
return self.factor * x

You would need a separate cache for every instance to use x as a cache lookup key,
or look up with say (x, id(self)), either way assuming that self.factor can't change.

You're right; I wasn't thinking clearly enough.
I was in fact hoping to get the same memoize
to work with classmethods also.

I wonder if you wouldn't mind taking a look at
the following - it seems to "work" as given,
but I'm likely missing something again.
In particular, if one flips the order of the
'classmethod' and 'memoize_fn' decorators,
I get a 'classmethod not callable' error,
and I'm losing track of what's really going on.

(I've also "pushed" all of your logic into a
single descriptor class - am I forgetting
something by doing so?)

Thanks again and in advance for your
response,

Daishi

---

def test_memoize(memoize_fn):
def printsep():
print '-'*30

@memoize_fn
def fn(x):
print '(eval %s)' % str(x),
return x
print 'fn', fn(1); printsep()
print 'fn', fn(1); printsep()

class A(object):
n_insts = 0
def __init__(self, n):
self.n = n
A.n_insts += 1

@memoize_fn
def fn(self, x):
print '(eval %s %s)' % (str(self), str(x))
return x+self.n

# The following will memoize the wrong answer
# unless all instances are created first.
@classmethod
@memoize_fn
def clsfn(cls, x):
print '(clsmeth %s %s)' % (str(cls), str(x))
return x+cls.n_insts

def test_class(a):
print 'meth', a.fn(1); printsep()
print 'meth', a.fn(1); printsep()
print 'clsmeth', A.clsfn(1); printsep()
print 'clsmeth', A.clsfn(1); printsep()
a1 = A(1)
a2 = A(2)
test_class(a1)
test_class(a2)

class DescriptorMemoize(object):
def __init__(self, fn):
self.fn = fn
self._cache = {}
self._inst = None
def __get__(self, inst, owner):
print '__get__', inst, owner
self._inst = inst
return self
def __call__(self, *args):
if self._inst is not None:
args = (self._inst,)+args
try:
return self._cache[args]
except KeyError:
value = self.fn(*args)
self._cache[args] = value
return value
except TypeError:
return self.fn(*args)
 
D

Daishi Harada

Hi Bengt,

Thanks for your reply.
First, I would suggest thinking about the exact semantics of method
memoization. E.g. suppose you write

class Oops(object):
def __init__(self, factor): self.factor = factor
@memoize_fn
def mul(self, x):
return self.factor * x

You would need a separate cache for every instance to use x as a cache lookup key,
or look up with say (x, id(self)), either way assuming that self.factor can't change.

You're right; I wasn't thinking clearly enough.
I was in fact hoping to get the same memoize
to work with classmethods also.

I wonder if you wouldn't mind taking a look at
the following - it seems to "work" as given,
but I'm likely missing something again.
In particular, if one flips the order of the
'classmethod' and 'memoize_fn' decorators,
I get a 'classmethod not callable' error,
and I'm losing track of what's really going on.

(I've also "pushed" all of your logic into a
single descriptor class - am I forgetting
something by doing so?)

Thanks again and in advance for your
response,

Daishi

---

def test_memoize(memoize_fn):
def printsep():
print '-'*30

@memoize_fn
def fn(x):
print '(eval %s)' % str(x),
return x
print 'fn', fn(1); printsep()
print 'fn', fn(1); printsep()

class A(object):
n_insts = 0
def __init__(self, n):
self.n = n
A.n_insts += 1

@memoize_fn
def fn(self, x):
print '(eval %s %s)' % (str(self), str(x))
return x+self.n

# The following will memoize the wrong answer
# unless all instances are created first.
@classmethod
@memoize_fn
def clsfn(cls, x):
print '(clsmeth %s %s)' % (str(cls), str(x))
return x+cls.n_insts

def test_class(a):
print 'meth', a.fn(1); printsep()
print 'meth', a.fn(1); printsep()
print 'clsmeth', A.clsfn(1); printsep()
print 'clsmeth', A.clsfn(1); printsep()
a1 = A(1)
a2 = A(2)
test_class(a1)
test_class(a2)

class DescriptorMemoize(object):
def __init__(self, fn):
self.fn = fn
self._cache = {}
self._inst = None
def __get__(self, inst, owner):
print '__get__', inst, owner
self._inst = inst
return self
def __call__(self, *args):
if self._inst is not None:
args = (self._inst,)+args
try:
return self._cache[args]
except KeyError:
value = self.fn(*args)
self._cache[args] = value
return value
except TypeError:
return self.fn(*args)
 
D

Daishi Harada

Hi,

Sorry about the previous dup; I'm posting through
Google Groups which seems to have burped.

Anyways, I've "improved" things (or at least
got things passing more tests). I now bind
the cache to each object instance (and class
for classmethods).

At least one issue still remains, mostly due
to the fact that I still don't really grok how
descriptors are supposed to work.
I managed to get the

@memoize
@classmethod

case to not break by "delegating" to the
classmethod's __get__ from my memoize's
in order to obtain the "correct" underlying
method/function. It seems, however, that
the classmethod implementation doesn't
do the same delegation, so now the flip
case of

@classmethod
@memoize

is having issues where AFAICT there's no
way for the memoize to know what class
it's bound to because classmethod doesn't
invoke it as a decorator. Therefore in that
configuration the memoize is still using
the cache bound to the underlying function
instead of the class object as would be
natural.

Any thoughts/comments would still be
appreciated.

Thanks,
Daishi


---
# All my debugging prints still remain ...
import types
import sys

class DescriptorMemoize(object):
def __init__(self, fn):
print '__init__', self, fn
self.fn = fn
self._cache = {}
def __call__(self, *args):
# Regular invocation case
print '__call__', self, args
return self._callmain(self.fn, self._cache, args)
def _callmain(self, fn, cache, args):
# The basic logic here is the same as in the
# wiki decorator library.
print '_callmain', fn, cache, args
try:
return cache[args]
except KeyError:
value = fn(*args)
cache[args] = value
return value
except TypeError:
return fn(*args)
def __get__(self, inst, owner):
# Method invocation case
print '__get__', self, inst, owner
# Get the right underlying method
try:
fn = self.fn.__get__(inst, owner)
print 'got bound'
except AttributeError:
fn = self.fn
print 'using default fn'
# Get the right cache
# (Generally get the instance's cache)
if (inst is None or
isinstance(self.fn, classmethod)):
print 'using owner cache', owner
obj = owner
else:
print 'using inst cache', inst
obj = inst
try:
# Bypass regular attribute lookup
maincache = obj.__dict__['_memoizecache']
print 'found maincache in obj', obj
except KeyError:
maincache = {}
obj._memoizecache = maincache
print 'created new maincache on', obj
cache = maincache.setdefault(self.fn, {})
print 'maincache+id , cache', maincache, id(maincache), cache
def call(*args):
print 'call', args
return self._callmain(fn, cache, args)
return call

def mk_memoize_test_class(memoize_fn):
class A(object):
n_insts = 0

def __init__(self, n):
self.n = n
A.n_insts += 1

# De-decorate so we can test things easier
def meth(self, x):
print '(eval-meth %s %s)' % (str(self), str(x))
return x+self.n
mem_meth = memoize_fn(meth)

# The following will memoize the wrong answer
# unless all instances are created first.
def clsmeth(cls, x):
print '(eval-clsmeth %s %s)' % (str(cls), str(x))
return x+cls.n_insts
cls_clsmeth = classmethod(clsmeth)
memcls_clsmeth = memoize_fn(cls_clsmeth)
clsmem_clsmeth = classmethod(memoize_fn(clsmeth))

return A

def test_memoize(memoize_fn):
def printsep():
print '-'*30
sys.stdout.flush()

@memoize_fn
def fn(x):
print '(eval %s)' % str(x)
return x

printsep()
print 'bare fn'
print fn(1)
print fn(1)
printsep()

A = mk_memoize_test_class(memoize_fn)
def test_class(a):
for f in ('meth', 'mem_meth'):
print f
m = getattr(a, f)
try:
print m(1)
print m(1)
except Exception, e:
print 'XXX', e
printsep()
for (label, o) in (('inst', a), ('cls', A)):
for f in ('cls_clsmeth', 'memcls_clsmeth',
'clsmem_clsmeth'):
print label
print f
m = getattr(o, f)
try:
print m(1)
print m(1)
except Exception, e:
print 'XXX', e
printsep()

a1 = A(1)
a2 = A(10)
printsep()
print 'Testing a1'
printsep()
test_class(a1)
printsep()
print 'Testing a2'
printsep()
test_class(a2)

if __name__ == '__main__':
test_memoize(DescriptorMemoize)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top