Memoizing Generators

Discussion in 'Python' started by Mark, Jul 8, 2003.

  1. Mark

    Mark Guest

    Hello all,

    Peter Norvig has a great recipe for memoizing Python functions available at
    http://www.norvig.com/python-iaq.html . I made good use of this until I
    tried to memoize a generator. At this point, it breaks down because the
    _generator_ object is stored as the value for the *args key.

    So, I came up with this:

    class MemoizeGenerator:
    def __init__(self, fn):
    self.cache={}
    self.fn=fn
    def __call__(self,*args):
    if self.cache.has_key(args):
    for _i in self.cache[args]:
    yield _i
    else:
    self.cache[args]=()
    for _i in self.fn(*args):
    self.cache[args]+=(_i,)
    yield _i

    Basically, if the generator hasn't been called before with these args, it
    will run the generator's iterator through and append a tuple in the
    dictionary value field for these keys. If the args have been used before,
    then it iterates through the previously computed list. It works like a
    charm for me.

    Now, the two remaining issues are 1) can we "unify" the generator
    memoization with the "standard" memoization and 2) can we deal with lists
    as well (i.e. my generator was returning tuples ... since I'm scared of
    lists in recursive generator contexts *wink*). One other memoization
    issues: what if the args (which will become keys) are mutables ... in
    particular, lists. I suppose we could just tuple() them up?

    Regards,
    Mark
    Mark, Jul 8, 2003
    #1
    1. Advertising

  2. > Now, the two remaining issues are 1) can we "unify" the generator
    > memoization with the "standard" memoization and 2) can we deal with lists
    > as well (i.e. my generator was returning tuples ... since I'm scared of
    > lists in recursive generator contexts *wink*).
    > One other memoization issues: what if the args (which will become
    > keys) are mutables ... in particular, lists. I suppose we could just
    > tuple() them up?


    How would you deal with the following generator? (On the assumption you want a
    a general purpose memoisation mechanism :)

    def retrieveData(source=None, command=True,blocksize=1024):
    if source:
    if not command:
    fh = open(source, "rb",0)
    else:
    fh = os.popen(self.command)
    else:
    fh = sys.stdin
    try:
    try:
    while 1:
    data = fh.read(blocksize)
    if data:
    yield data
    else:
    raise Finality
    except Exception, e:
    if source:
    fh.close()
    raise e
    except Finality:
    pass

    If your source is a network connection, or a real user this becomes further
    indeterminate... Not a suggestion to not try, just worth considering :)

    The other problem as I see it with your implementation is it expects the
    generator to terminate. This is far from guaranteed - generators are useful
    for dealing with infinite sequences/lists and working with functions that
    might not halt. (eg calculate the best approximation of pi that you can based
    on time available, not based on number of iterations/memory available)

    eg
    def runGenerator(fn,args,timeAlloted):
    tstart = time.time()
    gen = fn(args).next
    r = gen()
    while 1:
    if time.time()-tstart >=timeAlloted:
    break
    r = gen()
    return r

    Consider the possibility that the function is a "next move" in chess and the
    time is set by the first player to "hard" - ie lots of time. The next time a
    player comes along with the time set to "easy", the memoised version won't be
    aware of this in this scenario and returns the hard result (but very
    quickly...), or worse hits stop iteration - which the original version
    wouldn't. Or even worse - occasionally hits the StopIteration and normally
    returns the "hard" result aroung 80-90% of the time.

    That said memoising _good_ chess results from certain positions _might_ be
    considered a good thing. It's still an issue though.

    Regards,


    Michael.
    Michael Sparks, Jul 9, 2003
    #2
    1. Advertising

  3. Mark

    Mark Guest

    Hi Michael,


    Michael Sparks wrote:

    > How would you deal with the following generator? (On the assumption you
    > want a a general purpose memoisation mechanism :)


    > <*snip* a generator reading from an input source>


    > If your source is a network connection, or a real user this becomes
    > further indeterminate... Not a suggestion to not try, just worth
    > considering :)


    I think one of the standard assumptions with memoization is that you have a
    deterministic algorithm ... specifically, given an input, the output is
    fixed. So, your example of entering a file and getting different data,
    throws typical (in my opinion ... which is by no means sacred) memoization
    out the window.


    > def runGenerator(fn,args,timeAlloted):
    > tstart = time.time()
    > gen = fn(args).next
    > r = gen()
    > while 1:
    > if time.time()-tstart >=timeAlloted:
    > break
    > r = gen()
    > return r


    Now, this is a problem I like. It would be pretty damn awsome to memoize a
    function with 1) a time constraint and 2) a parameter that would indicate
    when a time constraint is no longer sufficient and thus more work needs to
    be done. Finally, if the generator could _restart_ at the point of work
    already done, you would be saving the work done to that point which is
    memoization's purpose:

    Something like:

    runGenerator(PIdigits, None, 10)

    and then later

    runGenerator(PIdigits, None, 20)

    Then our memoizer would say, well, I can get more digits in the extra 10
    seconds .... so, I need to recompute this answer. But wait, hopefully,
    I've stored both the numerical result for 10 seconds along with the
    generator that got me there. So, all I need to do is look up the 10 second
    generator and run it for another 10 seconds. Humm, can we copy generators?
    I don't seem to think so.

    So, the memoizer class would need some more logic and storage to determine
    whether to return a stored result, computer a brand new result, or
    bootstrap off a previous generator. There would also need to be a
    mechanism for determining just how long is "long enough" that more time
    needs to be allocated (a really hard problem might not get any further in
    12 seconds than it does in 9 seconds). And we'd need a copy of the
    generator so we could go from, say 10 second generator to 20 second, but
    later go from 10 to 15 seconds. Although, we could get around the problem
    by using the maximally accurate result computed so far (presuming we can
    look it up in the user specified time interval). Then we'd only need the
    generator at the farther point ... and we could just keep running it when
    the user gives us more time. Yeah, that could work nicely!

    >
    > Consider the possibility that the function is a "next move" in chess and
    > the time is set by the first player to "hard" - ie lots of time. The next
    > time a player comes along with the time set to "easy", the memoised
    > version won't be aware of this in this scenario and returns the hard


    We could paramaterize the generator (and/or the memoizer) based on the
    difficulty setting .... couldn't we?

    > result (but very quickly...), or worse hits stop iteration - which the


    This is another interesting problem .... the "quick result" is actually a
    means of making the game harder.

    > Regards,
    > Michael.


    Great ideas. I might code up the timing based memoizer for kicks.

    Regards,
    Mark
    Mark, Jul 9, 2003
    #3
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Daishi  Harada

    Memoizing decorator

    Daishi Harada, Dec 6, 2005, in forum: Python
    Replies:
    6
    Views:
    409
    Daishi Harada
    Dec 8, 2005
  2. weakref and memoizing

    , Jan 20, 2006, in forum: Python
    Replies:
    0
    Views:
    368
  3. Paul McGuire

    Memoizing and WeakValueDictionary

    Paul McGuire, Jan 4, 2009, in forum: Python
    Replies:
    1
    Views:
    315
  4. Andrea Crotti

    why memoizing is faster

    Andrea Crotti, Mar 24, 2011, in forum: Python
    Replies:
    0
    Views:
    172
    Andrea Crotti
    Mar 24, 2011
  5. Seeking advice on memoizing

    , Jan 26, 2004, in forum: Perl Misc
    Replies:
    7
    Views:
    105
    Michele Dondi
    Jan 28, 2004
Loading...

Share This Page