Generic function Memo-izer template class

Discussion in 'C++' started by rep_movsd, Nov 13, 2006.

  1. rep_movsd

    rep_movsd Guest

    Hi folks

    I was on topcoder and came across an interesting problem...

    It involved dynamic programming ( storing function results in a map to
    avoid repeated computation ) and I ended up having to write 3 functions
    that looked pretty much the same, so I thought- "Why not template - ize
    the whole thing" and ended up with a generic Memoizer template, which
    can be used to memoize any recursive function whose result depends on
    multiple recursive invocations of itself.

    Most basic case of such a fn is fibonacci

    int fib(int n)
    {
    return n < 2 ? n : fib(n-1) + fib(n -2);
    }

    this highly stack consuming function will take eternity to compute
    fib(50)

    a standard memoization would be like

    #include <map>
    using std::map;
    map<int, int> fib_memo;
    int fibm(int n)
    {
    map<int,int>::iterator fi = fib_memo.find(n);
    if(fi == fib_memo.end())
    {
    return (fib_memo[n] = fibm(n - 1) + fibm(n - 2));
    }
    else
    return fi->second;
    }

    but its tedious to do this for every function... what i came up with
    lets u express the core computation clearly in the form of helper
    classes and get the template class to memoize it for you automatically
    thusly :

    // define 3 classes needed by the template
    class FibCalc
    {
    public:
    void apply(int a, pair<bool, int>& ret)
    {
    ret.first = a < 2; // fib 0 , 1 => 0, 1
    ret.second = a;
    }
    };

    class FibGetNextIterArgs
    {
    public:
    void apply(int arg, vector<int>& rets)
    {
    rets.push_back(arg - 1);
    rets.push_back(arg - 2);
    }
    };

    class FibCalcNextIter
    {
    public:
    int apply(vector<int>& rets)
    {
    return rets[0] + rets[1];
    }
    };

    and call it like this:

    Memoize<int, int, FibCalc, FibGetNextIterArgs, FibCalcNextIter> fibo;
    int i = 1000;
    cout << fibo.apply(i);

    the Memoize definition is below:

    template<typename RET, typename ARG, class CALC, class GETNEXTITERARGS,
    class CALCNEXTITER>
    class Memoize
    {
    public:
    CALC m_Calc;
    GETNEXTITERARGS m_NextIterArgs;
    CALCNEXTITER m_CalcNextIter;
    map<ARG, RET> m_Memo;
    RET apply(ARG &a)
    {
    pair<bool, RET> r;
    m_Calc.apply(a, r);
    if(r.first)
    return (m_Memo[a] = r.second);

    map<ARG, RET>::iterator mi = m_Memo.find(a);
    if(mi != m_Memo.end())
    {
    return m_Memo[a];
    }
    else
    {
    vector<ARG> nextIterArgs;
    m_NextIterArgs.apply(a, nextIterArgs);

    vector<RET> nextIterRets;
    nextIterRets.reserve(nextIterArgs.size());
    for(size_t i = 0; i < nextIterArgs.size(); ++i)
    {
    nextIterRets.push_back(apply(nextIterArgs));
    }

    return (m_Memo[a] = m_CalcNextIter.apply(nextIterRets));
    }
    }
    };


    Voila ... Auto memoization for functions :D ( I beleive haskell has
    that )

    If any of you gurus have an idea about improving this tell me.

    One immediate thing I can see is to have 1 class instead of 3 and have
    3 members in it, maybe make it a struct to get rid of that "public:"
    I'm adding it to my topcoder bag of tricks...

    Thanks
    Vivek
    rep_movsd, Nov 13, 2006
    #1
    1. Advertising

  2. rep_movsd

    Alan Johnson Guest

    rep_movsd wrote:
    > Hi folks
    >
    > I was on topcoder and came across an interesting problem...
    >
    > It involved dynamic programming ( storing function results in a map to
    > avoid repeated computation ) and I ended up having to write 3 functions
    > that looked pretty much the same, so I thought- "Why not template - ize
    > the whole thing" and ended up with a generic Memoizer template, which
    > can be used to memoize any recursive function whose result depends on
    > multiple recursive invocations of itself.
    >
    > Most basic case of such a fn is fibonacci
    >
    > int fib(int n)
    > {
    > return n < 2 ? n : fib(n-1) + fib(n -2);
    > }
    >
    > this highly stack consuming function will take eternity to compute
    > fib(50)
    >
    > a standard memoization would be like
    >
    > #include <map>
    > using std::map;
    > map<int, int> fib_memo;
    > int fibm(int n)
    > {
    > map<int,int>::iterator fi = fib_memo.find(n);
    > if(fi == fib_memo.end())
    > {
    > return (fib_memo[n] = fibm(n - 1) + fibm(n - 2));
    > }
    > else
    > return fi->second;
    > }
    >
    > but its tedious to do this for every function... what i came up with
    > lets u express the core computation clearly in the form of helper
    > classes and get the template class to memoize it for you automatically
    > thusly :
    >
    > // define 3 classes needed by the template
    > class FibCalc
    > {
    > public:
    > void apply(int a, pair<bool, int>& ret)
    > {
    > ret.first = a < 2; // fib 0 , 1 => 0, 1
    > ret.second = a;
    > }
    > };
    >
    > class FibGetNextIterArgs
    > {
    > public:
    > void apply(int arg, vector<int>& rets)
    > {
    > rets.push_back(arg - 1);
    > rets.push_back(arg - 2);
    > }
    > };
    >
    > class FibCalcNextIter
    > {
    > public:
    > int apply(vector<int>& rets)
    > {
    > return rets[0] + rets[1];
    > }
    > };
    >
    > and call it like this:
    >
    > Memoize<int, int, FibCalc, FibGetNextIterArgs, FibCalcNextIter> fibo;
    > int i = 1000;
    > cout << fibo.apply(i);
    >
    > the Memoize definition is below:
    >
    > template<typename RET, typename ARG, class CALC, class GETNEXTITERARGS,
    > class CALCNEXTITER>
    > class Memoize
    > {
    > public:
    > CALC m_Calc;
    > GETNEXTITERARGS m_NextIterArgs;
    > CALCNEXTITER m_CalcNextIter;
    > map<ARG, RET> m_Memo;
    > RET apply(ARG &a)
    > {
    > pair<bool, RET> r;
    > m_Calc.apply(a, r);
    > if(r.first)
    > return (m_Memo[a] = r.second);
    >
    > map<ARG, RET>::iterator mi = m_Memo.find(a);
    > if(mi != m_Memo.end())
    > {
    > return m_Memo[a];
    > }
    > else
    > {
    > vector<ARG> nextIterArgs;
    > m_NextIterArgs.apply(a, nextIterArgs);
    >
    > vector<RET> nextIterRets;
    > nextIterRets.reserve(nextIterArgs.size());
    > for(size_t i = 0; i < nextIterArgs.size(); ++i)
    > {
    > nextIterRets.push_back(apply(nextIterArgs));
    > }
    >
    > return (m_Memo[a] = m_CalcNextIter.apply(nextIterRets));
    > }
    > }
    > };
    >
    >
    > Voila ... Auto memoization for functions :D ( I beleive haskell has
    > that )
    >
    > If any of you gurus have an idea about improving this tell me.
    >
    > One immediate thing I can see is to have 1 class instead of 3 and have
    > 3 members in it, maybe make it a struct to get rid of that "public:"
    > I'm adding it to my topcoder bag of tricks...
    >
    > Thanks
    > Vivek


    Using map to store the history bothers me a little, just because it
    makes the runtime different (worse) than what many would expect for
    some algorithms. Using Fibonacci as an example, you should be able to
    calculate each successive member of the sequence in constant time
    (assuming constant time operations on integers), so computing the first
    N members of the sequence should require O(N) time.

    Storing the results in a map, though, only gives you a guarantee that
    you'll be able to lookup elements in O(lg N) time, so computing the
    first N members of the sequence has time in O(N lg N).

    I assume you chose a map to conserve space in algorithms where not
    every index is used. I would suggest somehow expanding your template
    to allow the client to decide the space/speed trade off, perhaps by
    supplying a container type as a template parameter.

    --
    Alan Johnson
    Alan Johnson, Nov 13, 2006
    #2
    1. Advertising

  3. rep_movsd

    rep_movsd Guest

    Alan Johnson wrote:
    > Using map to store the history bothers me a little, just because it
    > makes the runtime different (worse) than what many would expect for
    > some algorithms. Using Fibonacci as an example, you should be able to
    > calculate each successive member of the sequence in constant time
    > (assuming constant time operations on integers), so computing the first
    > N members of the sequence should require O(N) time.


    While a tail recursive or iterative fibonacci can be written to run in
    O(n) time, that is not the point of this.

    The point is that there are some algorithms which are much easily
    expressed in a recursive fashion and this technique helps to express
    the solution in a recursive paradigm and still get solutions which do
    not have exponential space and time requirements.

    In such cases, the invocation of the function with argument x would
    depend on N number of recursive invocations with x1 to xN.
    In such situations, memoization is the only way to beat the clock.
    Haskell is one language where functions are memoized automatically.

    Take for example this function to calculate the number of possible k
    digit numbers which consist of the digits 1 to n and where the digits
    are all in non-ascending order :

    eg for k = 2 , n = 3

    11
    21
    22
    31
    32
    33

    are all non ascending.


    Heres the raw recursive implementation

    int getNonDescCombos(int k, int n)
    {
    if(k == 1)
    return n;

    int total = 0;
    for(int i = 1; i <= n; ++i)
    {
    total += getNonDescCombos(k - 1, i);
    }
    return total;
    }

    While its working is clear, No prizes for guessing that this will
    overflow the stack for even modest values of n and k.

    now look at the memoized version :

    map<pair<int, int>, int> nonDescCombos;
    int getNonDescCombos(pair<int, int> arg)
    {
    int k = arg.first;
    int n = arg.second;
    if(k == 1)
    return n;

    map<pair<int, int>, int>::iterator mi = nonDescCombos.find(arg);
    if(mi == nonDescCombos.end())
    {
    int total = 0;
    for(int i = 1; i <= n; ++i)
    {
    total += getNonDescCombos(make_pair(k - 1, i));
    }
    return (nonDescCombos[arg] = total);
    }
    else
    {
    return mi->second;
    }
    }

    getNonDescCombos( make_pair(100, 9) );
    Runs with good space and time charecteristics, but is much more
    complicated and the logic is more opaque.

    now generalized with memoizer template...

    class GetDescCombosCalc
    {
    public:
    void apply(pair<int, int> &p, pair<bool, int>& ret)
    {
    ret.first = p.second == 1;
    ret.second = p.first;
    }
    };

    class GetDescCombosGetNextIterArgs
    {
    public:
    void apply(pair<int, int>& arg, vector<pair<int, int> >& rets)
    {
    for(int i = 1; i <= arg.first; ++i)
    {
    rets.push_back(make_pair(i, arg.second - 1));
    }
    }
    };

    class GetDescCombosCalcNextIter
    {
    public:
    int apply(vector<int>& rets)
    {
    int total = 0;
    for(size_t i = 0; i < rets.size(); ++i)
    {
    total += rets;
    }

    return total;
    }
    };

    Definitely more verbose, but the logic of the recursive version is
    preserved...
    the first class does the job of calculating the simplest cases..
    the second gives a collection of values (x1 to xN) for which the
    function has to be computed before computing f(x).
    The last one computes f(x) given f(x1) to f(xN) computed from the
    previous results.
    A large number of algorithms can readily be genralized to this pattern
    with clear logic.

    Finally the invocation becomes as simple as....

    Memoize<int, pair<int, int>, GetDescCombosCalc,
    GetDescCombosGetNextIterArgs, GetDescCombosCalcNextIter> m;
    int n = 100;
    int k = 29;
    m.apply(make_pair(k, n));

    > Storing the results in a map, though, only gives you a guarantee that
    > you'll be able to lookup elements in O(lg N) time, so computing the
    > first N members of the sequence has time in O(N lg N).
    > I assume you chose a map to conserve space in algorithms where not
    > every index is used. I would suggest somehow expanding your template
    > to allow the client to decide the space/speed trade off, perhaps by
    > supplying a container type as a template parameter.


    One option would be to use a hash_map maybe, and anyway the Memoize
    class is opaque so changing the implementation would make no difference
    to the code for the computation, as long as the logic were same the
    intermediate results could be stored any which way.

    Regards
    Vivek
    rep_movsd, Nov 13, 2006
    #3
  4. Memoizing fixed point combinator (was: Generic function Memo-izer template class)

    rep_movsd wrote:
    > Hi folks
    >
    > I was on topcoder and came across an interesting problem...
    >
    > It involved dynamic programming ( storing function results in a map to
    > avoid repeated computation ) and I ended up having to write 3 functions
    > that looked pretty much the same, so I thought- "Why not template - ize
    > the whole thing" and ended up with a generic Memoizer template, which
    > can be used to memoize any recursive function whose result depends on
    > multiple recursive invocations of itself.


    // You might also like to try something like a memozied fixed point
    // combinator. ( http://en.wikipedia.org/wiki/Y_combinator )

    #include<iostream>
    #include<map>

    std::map<int,int> memo_table;

    struct Fix
    {
    int (*f)(int,Fix);

    int operator()(int x, Fix g)
    {
    return (memo_table.find(x) != memo_table.end())
    ? memo_table[x]
    : memo_table[x] = g.f(x,g);
    };
    };

    int fib(int n, Fix f) { return n < 2 ? n : f(n-1,f) + f(n-2,f); }

    int main(int argc, char *argv[])
    {
    Fix g = {fib};
    std::cout << g(40,g) << std::endl;
    return 0;
    }
    Greg Buchholz, Nov 14, 2006
    #4
  5. Re: Memoizing fixed point combinator

    * Greg Buchholz:
    >
    > #include<iostream>
    > #include<map>
    >
    > std::map<int,int> memo_table;
    >
    > struct Fix
    > {
    > int (*f)(int,Fix);
    >
    > int operator()(int x, Fix g)
    > {
    > return (memo_table.find(x) != memo_table.end())
    > ? memo_table[x]
    > : memo_table[x] = g.f(x,g);
    > };
    > };
    >
    > int fib(int n, Fix f) { return n < 2 ? n : f(n-1,f) + f(n-2,f); }
    >
    > int main(int argc, char *argv[])
    > {
    > Fix g = {fib};
    > std::cout << g(40,g) << std::endl;
    > return 0;
    > }


    What's the point of the argument passing when Fix uses a global table?
    Fix is just a bug waiting to happen. E.g., after using Fix g = {fib},
    try using Fix h = {identityFunction}.

    And what's the point of caching (which for some reason a lot of people
    now call "memoization") when that involves completely rewriting the
    function, which can be much more easily rewritten to be efficient?

    Finally, what's the point of filling in the cache at run-time when that
    will result in all values up to (n, f(n)) being placed in the cache?

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, Nov 14, 2006
    #5
  6. rep_movsd

    rep_movsd Guest

    Re: Memoizing fixed point combinator

    > What's the point of the argument passing when Fix uses a global table?
    > Fix is just a bug waiting to happen. E.g., after using Fix g = {fib},
    > try using Fix h = {identityFunction}.



    > And what's the point of caching (which for some reason a lot of people
    > now call "memoization") when that involves completely rewriting the
    > function, which can be much more easily rewritten to be efficient?


    Well you cannot express some algorithms iteratively no matter how
    cleverly you code, you will end up using an explicit stack.

    > Finally, what's the point of filling in the cache at run-time when that
    > will result in all values up to (n, f(n)) being placed in the cache?


    Well the whole point of memoization is that it forms the base of
    "dynamic" programming problems where divide-and-conquer changes to
    divide-and-conquer-memoize-to-avoid-reconquer.

    I humbly suggest http://en.wikipedia.org/wiki/Dynamic_programming

    Regards
    Vivek
    rep_movsd, Nov 15, 2006
    #6
  7. Re: Memoizing fixed point combinator

    * rep_movsd:
    >> What's the point of the argument passing when Fix uses a global table?
    >> Fix is just a bug waiting to happen. E.g., after using Fix g = {fib},
    >> try using Fix h = {identityFunction}.

    >
    >
    >> And what's the point of caching (which for some reason a lot of people
    >> now call "memoization") when that involves completely rewriting the
    >> function, which can be much more easily rewritten to be efficient?

    >
    > Well you cannot express some algorithms iteratively no matter how
    > cleverly you code, you will end up using an explicit stack.


    So?


    >> Finally, what's the point of filling in the cache at run-time when that
    >> will result in all values up to (n, f(n)) being placed in the cache?

    >
    > Well the whole point of memoization is that it forms the base of
    > "dynamic" programming problems where divide-and-conquer changes to
    > divide-and-conquer-memoize-to-avoid-reconquer.


    Again, so? AFAICS, again there's no relevance to my question.


    > I humbly suggest http://en.wikipedia.org/wiki/Dynamic_programming


    Well.

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, Nov 15, 2006
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. christopher diggins
    Replies:
    16
    Views:
    730
    Pete Becker
    May 4, 2005
  2. Replies:
    2
    Views:
    422
  3. johanatan
    Replies:
    6
    Views:
    369
    James Kanze
    Feb 19, 2008
  4. A L
    Replies:
    1
    Views:
    503
    Alf P. Steinbach /Usenet
    Aug 25, 2010
  5. Sudhanshu

    Memo field to Javascript function

    Sudhanshu, Nov 28, 2003, in forum: Javascript
    Replies:
    0
    Views:
    109
    Sudhanshu
    Nov 28, 2003
Loading...

Share This Page