std::map efficiency

Discussion in 'C++' started by Kin Pang, Feb 5, 2004.

  1. Kin Pang

    Kin Pang Guest

    Hi,

    I have a routine where I'm using a std::map from a pair of ints to double
    for caching evaluation results.
    The cache depth could be up to, say, 1000 entries.
    However, my application needs to clear and replenish the cache repeatedly,
    and profiling seems to suggest
    the performance is handicapped by repeated calls to new.
    Can anybody advise me on how to improve the performance or suggest
    alternatives, please.

    Thanks in advance.
    Kin Pang, Feb 5, 2004
    #1

  2. In article <bvue3h$ru8$>,
    "Kin Pang" <> wrote:

    > Hi,
    >
    > I have a routine where I'm using a std::map from a pair of ints to double
    > for caching evaluation results.
    > The cache depth could be up to, say, 1000 entries.
    > However, my application needs to clear and replenish the cache repeatedly,
    > and profiling seems to suggest
    > the performance is handicapped by repeated calls to new.
    > Can anybody advise me on how to improve the performance or suggest
    > alternatives, please.


    Sounds like an ideal application for a "pooling allocator". Create an
    allocator that only expects 1 object to be allocated at a time, but
    grabs a handful of them from the system (say 1000), and hands them out
    as they are needed. You can then specify that allocator in your map,
    and it will use it.

    Issues you will have to answer:

    1. Should the allocator serve only one map, or many?

    2. If serving many, are you operating in a multi-threaded environment?

    If you don't want to write your own pool allocator, you might check out
    Steve Cleary's pool at boost (www.boost.org).
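
    For illustration, a minimal sketch of that idea (not from Howard's post):
    std::map with Boost's fast_pool_allocator, the single-object pooling
    allocator from boost/pool/pool_alloc.hpp. The names here are illustrative.

    #include <map>
    #include <utility>
    #include <boost/pool/pool_alloc.hpp>

    typedef std::pair<int, int> Key;
    // fast_pool_allocator is tuned for allocating one object at a time,
    // which is exactly what std::map's per-node allocations look like.
    typedef std::map<Key, double, std::less<Key>,
                     boost::fast_pool_allocator<std::pair<const Key, double> > >
            Cache;

    int main()
    {
        Cache cache;
        cache[std::make_pair(3, 7)] = 1.25;  // node comes from the pool
        cache.clear();  // nodes return to the pool, not to the heap
        return 0;
    }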

    -Howard
    Howard Hinnant, Feb 5, 2004
    #2

  3. Cy Edmunds

    Cy Edmunds Guest

    "Kin Pang" <> wrote in message
    news:bvue3h$ru8$...
    > Hi,
    >
    > I have a routine where I'm using a std::map from a pair of ints to double
    > for caching evaluation results.
    > The cache depth could be up to, say, 1000 entries.
    > However, my application needs to clear and replenish the cache repeatedly,
    > and profiling seems to suggest
    > the performance is handicapped by repeated calls to new.
    > Can anybody advise me on how to improve the performance or suggest
    > alternatives, please.
    >
    > Thanks in advance.
    >



    You should probably be looking for a hash table. Binary trees have to be
    rebalanced a lot as you construct them, and that can be expensive.

    --
    Cy
    http://home.rochester.rr.com/cyhome/
    Cy Edmunds, Feb 6, 2004
    #3
  4. Daniel T.

    Daniel T. Guest

    "Kin Pang" <> wrote:

    > I have a routine where I'm using a std::map from a pair of ints to double
    > for caching evaluation results.
    > The cache depth could be up to, say, 1000 entries.
    > However, my application needs to clear and replenish the cache repeatedly,
    > and profiling seems to suggest
    > the performance is handicapped by repeated calls to new.
    > Can anybody advise me on how to improve the performance or suggest
    > alternatives, please.


    Use a vector< pair< int, int > >. Create a function like:

    bool compare(const pair<int, int>& lhs, const pair<int, int>& rhs) {
        return lhs.first < rhs.first;
    }

    Check out
    <http://ecell.sourceforge.net/doc/refman/html/AssocVector_8h-source.html>
    for a class that implements the map interface using a vector as I
    describe above. Use it, instead of std::map, in your program and see if
    performance improves.
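
    For a rough idea of the lookup side against the OP's actual key type
    (pair<int,int> -> double), a minimal sketch; find and keyLess are
    illustrative names, not part of AssocVector:

    #include <algorithm>
    #include <utility>
    #include <vector>

    typedef std::pair<int, int> Key;
    typedef std::pair<Key, double> Entry;

    // Order entries by key; std::pair already has lexicographic operator<.
    bool keyLess(const Entry& lhs, const Entry& rhs)
    {
        return lhs.first < rhs.first;
    }

    // Binary search in a vector kept sorted by key; returns 0 on a miss.
    const double* find(const std::vector<Entry>& cache, const Key& k)
    {
        std::vector<Entry>::const_iterator it =
            std::lower_bound(cache.begin(), cache.end(), Entry(k, 0.0),
                             keyLess);
        if (it != cache.end() && it->first == k)
            return &it->second;
        return 0;
    }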
    Daniel T., Feb 6, 2004
    #4
  5. Siemel Naran

    Siemel Naran Guest

    "Daniel T." <> wrote in message news:postmaster-
    > "Kin Pang" <> wrote:
    >
    > > I have a routine where I'm using a std::map from a pair of ints to

    double
    > > for caching evaluation results.


    > Use a vector< pair< int, int > >. Create a function like:


    Do you mean vector<pair<pair<int,int>, double> >? That's usually faster than
    map for lookup, so it's a good choice when you want to construct a sorted
    array once, then use it over and over again for lookup and lower bound. In
    this case, try to insert all the elements at once, in any order, and then
    call sort (a sketch of this pattern follows below). Inserting a single
    element into the array is expensive, because keeping the array in sorted
    order takes O(N). But who knows, it still might give better performance
    for this problem.
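
    A minimal sketch of that fill-then-sort pattern, reusing the Key, Entry
    and keyLess definitions from the earlier sketch (expensive() is only a
    placeholder for the real evaluation):

    double expensive(int a, int b);  // placeholder for the real function

    void rebuild(std::vector<Entry>& cache, const std::vector<Key>& keys)
    {
        cache.clear();  // in practice keeps capacity, so refills avoid new
        cache.reserve(keys.size());
        for (std::vector<Key>::size_type i = 0; i != keys.size(); ++i)
            cache.push_back(Entry(keys[i],
                                  expensive(keys[i].first, keys[i].second)));
        std::sort(cache.begin(), cache.end(), keyLess);  // sort once, O(N log N)
    }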

    --
    +++++++++++
    Siemel Naran
    Siemel Naran, Feb 6, 2004
    #5
  6. Daniel T.

    Daniel T. Guest

    "Siemel Naran" <> wrote:

    > "Daniel T." <> wrote in message news:postmaster-
    > > "Kin Pang" <> wrote:
    > >
    > > > I have a routine where I'm using a std::map from a pair of ints to

    > double
    > > > for caching evaluation results.

    >
    > > Use a vector< pair< int, int > >. Create a function like:

    >
    > Do you mean vector<pair<pair<int,int>,double>?


    I thought the OP was using map< int, int >; what would the double be for?


    > That's usually faster than
    > map for lookup, so it's a good choice when you want to construct a sorted
    > array once, then use it over and over again for lookup and lower bound. In
    > this case, try to insert all the elements at once, in any order, and then
    > call sort. Inserting a single element into the array is expensive, because
    > keeping the array in sorted order takes O(N). But who knows, it still
    > might give better performance for this problem.


    He said that profiling suggested that the slowdown was in memory
    allocations, because the map was constantly being made larger and then
    smaller. Using the sorted vector would help that particular problem.
    With a class that implements the map interface with a sorted vector and
    can sort the vector lazily, the OP can easily switch between the two and
    see which is better for any particular problem. A sketch of the lazy-sort
    idea follows.
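
    A minimal sketch of such a lazy-sorting cache, again reusing Key, Entry
    and keyLess from above (LazySortedCache is an illustrative name, not the
    AssocVector interface):

    class LazySortedCache
    {
        std::vector<Entry> data_;
        bool sorted_;
    public:
        LazySortedCache() : sorted_(true) {}

        void add(const Key& k, double v)
        {
            data_.push_back(Entry(k, v));  // O(1) amortized, no re-sorting
            sorted_ = false;
        }

        const double* lookup(const Key& k)
        {
            if (!sorted_) {  // sort only when a lookup actually needs it
                std::sort(data_.begin(), data_.end(), keyLess);
                sorted_ = true;
            }
            std::vector<Entry>::const_iterator it =
                std::lower_bound(data_.begin(), data_.end(), Entry(k, 0.0),
                                 keyLess);
            return (it != data_.end() && it->first == k) ? &it->second : 0;
        }

        void clear() { data_.clear(); sorted_ = true; }  // capacity is kept
    };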
    Daniel T., Feb 6, 2004
    #6
  7. "Kin Pang" <> wrote in message
    news:bvue3h$ru8$...
    | I have a routine where I'm using a std::map from a pair of ints to double
    | for caching evaluation results.
    | The cache depth could be up to, say, 1000 entries.
    | However, my application needs to clear and replenish the cache repeatedly,
    | and profiling seems to suggest
    | the performance is handicapped by repeated calls to new.
    | Can anybody advise me on how to improve the performance or suggest
    | alternatives, please.

    As others suggested, some type of hash table is likely to provide
    the best results.

    Here's a basic example:

    double expensiveCompute(int a, int b); // the expensive function

    class CachedCompute
    {
        struct Entry {
            Entry(int a_, int b_)
                : a(a_), b(b_), result(::expensiveCompute(a_, b_)) {}
            int a, b;
            double result;
        };
        enum { kCacheSize = 1024 };
        std::vector<Entry> cache_;

        static int getHashedIndex(int a, int b)
        { // A LOT OF ROOM FOR IMPROVEMENT HERE
            return (17*a + 0x193*b) % kCacheSize;
        }

    public:
        CachedCompute()
            : cache_( kCacheSize, Entry(0,0) ) // init cache with dummy contents
        { }

        double compute(int a, int b)
        {
            Entry& entry = cache_[ getHashedIndex(a,b) ];
            if( entry.a != a || entry.b != b )
                entry = Entry(a, b); // overwrite the non-matching entry
            return entry.result;
        }
    };

    Create an instance of CachedCompute and use its compute()
    member function instead of the global expensiveCompute()
    to benefit from the caching.

    The getHashedIndex() function should be optimized to avoid
    collisions (different values of a and b, used repeatedly, that
    lead to the same index).
    An application-specific optimization is sometimes possible.
    Alternatively, consider a proven hashing approach such as the
    FNV hash: http://www.isthe.com/chongo/tech/comp/fnv/index.html
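
    For comparison, a minimal sketch of getHashedIndex using 32-bit FNV-1a
    over the bytes of both arguments (the constants come from the FNV page
    above; treating this as a drop-in replacement is an assumption, not
    something benchmarked here):

    static int getHashedIndex(int a, int b)
    {
        unsigned h = 2166136261u;  // FNV-1a offset basis
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&a);
        for (unsigned i = 0; i < sizeof a; ++i) { h ^= p[i]; h *= 16777619u; }
        p = reinterpret_cast<const unsigned char*>(&b);
        for (unsigned i = 0; i < sizeof b; ++i) { h ^= p[i]; h *= 16777619u; }
        return static_cast<int>(h % kCacheSize);  // kCacheSize as above
    }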

    The table size (kCacheSize) can also be tuned (the std::vector
    is in any case more memory-efficient than std::map; the saving is
    probably about 50%).

    Another improvement might be to look at multiple hash table
    entries when calling compute (e.g. the 4 following ones); see the
    sketch below. There are various ways to handle table refreshes
    in this case...
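
    A minimal sketch of that multi-entry probe, as a variant of the
    compute() member above (kProbes and the overwrite-the-first-slot
    policy are illustrative assumptions):

    double compute(int a, int b)
    {
        enum { kProbes = 4 };
        int base = getHashedIndex(a, b);
        for (int i = 0; i < kProbes; ++i) {
            Entry& e = cache_[ (base + i) % kCacheSize ];
            if (e.a == a && e.b == b)
                return e.result;  // hit in one of the probed slots
        }
        Entry& e = cache_[base];  // miss: overwrite the first probed slot
        e = Entry(a, b);
        return e.result;
    }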


    Anyway, I would expect the snippet above to provide much better
    performance than an approach based on std::map ...
    But please let me know of your results.

    I hope this helps,
    Ivan
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- e-mail contact form
    Ivan Vecerina, Feb 6, 2004
    #7
  8. Kin Pang

    Kin Pang Guest

    Hi,

    This is a distilled version of what I'm trying to do (code snippet
    guaranteed to compile). I have an Fn class with an operator that I want
    to cache.
    Fn is used as the basis for a functional algebra for more complex
    function expressions.
    The benchmark for the caching performance is EvalCache, defined below.
    I'm comparing the performance of EvalCacheWithVec against EvalCache,
    using ideas kindly suggested by Ivan Vecerina.
    For my application, I'm finding that EvalCache outperforms
    EvalCacheWithVec by a factor of close to two.
    In fact, EvalCacheWithVec runs quicker with smaller cache sizes.
    I don't really want to move the cache outside of the class Fn.
    Any suggestions on how I can improve the performance of EvalCacheWithVec
    further?
    Thanks in advance.

    class Fn
    {
    private:
        typedef EvalCache<int,int,double> CACHE;
        // typedef EvalCacheWithVec<int,int,double> CACHE;
        SharedPtr<CACHE> _fnCache;

        SharedPtr<blah> _rep;
    public:
        double operator()( int date_, int evalDate_ ) const
        {
            //! check whether the cached value is valid and use it if so.
            std::pair<int,int> args = std::make_pair(date_, evalDate_);
            const double* cached = _fnCache->lookup(args);
            if (cached == 0) //! i.e. invalid
            {
                double res = (*_rep)(date_, evalDate_);
                _fnCache->add(args, res);
                return res;
            }
            else
            {
                return *cached;
            }
        }
    };

    template< typename T1, typename T2, typename R >
    class EvalCache
    {
    private:
        typedef std::pair<T1,T2> FNARGS;
        FNARGS _args;
        R _cached;
        bool _valid;

    public:
        ~EvalCache() {}
        EvalCache() : _valid(false) {}

        //! a return value of 0 indicates that the cached value cannot be used.
        inline const R* lookup(const FNARGS& args_)
        {
            if (_valid == false) return 0;
            if (_args == args_) return &_cached;
            else return 0;
        }

        inline void add(const FNARGS& args_, const R& res_)
        {
            _args = args_;
            _cached = res_;
            _valid = true;
        }

        inline void clear() { _valid = false; }
    };



    template<typename T1, typename T2, typename R>
    class EvalCacheWithVec
    {
    private:
        typedef std::pair<T1,T2> FNARGS;
        typedef std::pair<FNARGS,R> ENTRY;
        std::vector<ENTRY> _cached;
        R _cachedVal;

        enum { kCacheSize = 64 };
        std::bitset<kCacheSize> _valid;
        bool _noneValid;

        size_t getHashedIndex(T1 a, T2 b)
        { // A LOT OF ROOM FOR IMPROVEMENT HERE
            return (17*a + 0x193*b) % kCacheSize;
        }

    public:
        ~EvalCacheWithVec() {}
        EvalCacheWithVec()
            :
            _cached(kCacheSize),
            _valid(),
            _noneValid(true)
        {}

        //! a return value of 0 indicates that the cached value cannot be used.
        const R* lookup(const FNARGS& args_)
        {
            if (_noneValid == true) return 0;

            T1 arg1(args_.first);
            T2 arg2(args_.second);
            size_t idx = getHashedIndex(arg1, arg2);
            if (_valid[idx] == false) return 0;

            const ENTRY& entry = _cached[idx];
            if (entry.first != args_) return 0;

            _cachedVal = entry.second;
            return &_cachedVal;
        }

        void add(const FNARGS& args_, const R& res_)
        {
            T1 arg1(args_.first);
            T2 arg2(args_.second);
            size_t idx = getHashedIndex(arg1, arg2);
            _cached[idx] = std::make_pair(args_, res_);
            _valid[idx] = true;
            _noneValid = false;
        }

        void clear()
        {
            _noneValid = true;
            _valid.reset();
        }
    };



    "Ivan Vecerina" <> wrote in message
    news:c003qs$7o5$...
    > "Kin Pang" <> wrote in message
    > news:bvue3h$ru8$...
    > | I have a routine where I'm using a std::map from a pair of ints to

    double
    > | for caching evaluation results.
    > | The cache depth could be upto say 1000 in depth.
    > | However, my application needs to clear and replenish the cache

    repeatedly
    > | and profiling seems to suggests
    > | the performance is handicated by repeated calls to new.
    > | Can anybody advice me on how to improve the performance or suggests
    > | alternatives please.
    >
    > As others suggested, some type of hash table is likely to provide
    > the best results.
    >
    > Here's a basic example:
    >
    > double expensiveCompute(int a, int b); // the expensive function
    >
    > class CachedCompute
    > {
    >     struct Entry {
    >         Entry(int a_, int b_)
    >             : a(a_), b(b_), result(::expensiveCompute(a_, b_)) {}
    >         int a, b;
    >         double result;
    >     };
    >     enum { kCacheSize = 1024 };
    >     std::vector<Entry> cache_;
    >
    >     static int getHashedIndex(int a, int b)
    >     { // A LOT OF ROOM FOR IMPROVEMENT HERE
    >         return (17*a + 0x193*b) % kCacheSize;
    >     }
    >
    > public:
    >     CachedCompute()
    >         : cache_( kCacheSize, Entry(0,0) ) // init cache with dummy contents
    >     { }
    >
    >     double compute(int a, int b)
    >     {
    >         Entry& entry = cache_[ getHashedIndex(a,b) ];
    >         if( entry.a != a || entry.b != b )
    >             entry = Entry(a, b); // overwrite the non-matching entry
    >         return entry.result;
    >     }
    > };
    >
    > Create an instance of CachedCompute and use its compute()
    > member function instead of the global expensiveCompute()
    > to benefit from the caching.
    >
    > The getHashedIndex() function should be optimized to avoid
    > collisions (different values of a and b, used repeatedly, that
    > lead to the same index).
    > An application-specific optimization is sometimes possible.
    > Alternatively, consider a proven hashing approach such as the
    > FNV hash: http://www.isthe.com/chongo/tech/comp/fnv/index.html
    >
    > The table size (kCacheSize) can also be tuned (the std::vector
    > is in any case more memory-efficient than std::map; the saving is
    > probably about 50%).
    >
    > Another improvement might be to look at multiple hash table
    > entries when calling compute (e.g. the 4 following ones).
    > There are various ways to handle table refreshes in this case...
    >
    >
    > Anyway, I would expect the snippet above to provide much better
    > performance than an approach based on std::map ...
    > But please let me know of your results.
    >
    > I hope this helps,
    > Ivan
    > --
    > http://ivan.vecerina.com/contact/?subject=NG_POST <- e-mail contact form
    >
    >
    Kin Pang, Feb 10, 2004
    #8
  9. "Kin Pang" <> wrote in message
    news:c0bkku$bbt$...
    Hi Kin,
    > This is a distilled version of what I'm trying to do (code snippet
    > guaranteed to compile).


    Actually, it won't compile, since Fn needs to be defined after EvalCache.
    It would also be nice to have a complete program, with #include directives
    and a main() function, if you would like anyone to actually benchmark it;
    a sketch of such a harness follows below.
    Please also keep in mind that the solution that provides optimal
    execution speed will depend on both the sequence of call inputs
    and on the computation function that is being cached.
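
    For what it's worth, a minimal sketch of such a self-contained harness
    (the workload, the input pattern and the clock()-based timer are
    illustrative assumptions only; wrap the cached versions around
    expensiveCompute to compare them):

    #include <cstdio>
    #include <ctime>

    // stand-in for the real computation; deliberately does some work
    double expensiveCompute(int a, int b)
    {
        double r = 0;
        for (int i = 1; i <= 100; ++i)
            r += (a * 1.0001 + b) / i;
        return r;
    }

    int main()
    {
        std::clock_t t0 = std::clock();
        double sum = 0;  // checksum, so the loop can't be optimized away
        for (int rep = 0; rep < 1000; ++rep)
            for (int i = 0; i < 1000; ++i)
                sum += expensiveCompute(i % 37, i % 41);  // repeated inputs
        std::printf("%.3f s (checksum %g)\n",
                    double(std::clock() - t0) / CLOCKS_PER_SEC, sum);
        return 0;
    }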

    > I have an Fn class with an operator that I want to cache.
    > Fn is used as the basis for a functional algebra for more complex
    > function expressions.
    > The benchmark for the caching performance is EvalCache, defined below.
    > I'm comparing the performance of EvalCacheWithVec against EvalCache,
    > using ideas kindly suggested by Ivan Vecerina.
    > For my application, I'm finding that EvalCache outperforms
    > EvalCacheWithVec by a factor of close to two.


    EvalCache only caches the single last result that has been computed,
    and therefore does less work than EvalCacheWithVec.
    This hints at the following problems:
    - the function being cached executes quickly, and there is
    no benefit in caching;
    - the calling code is repeating the evaluation of the same input
    parameters several times, then moving on to the next;
    - your input sequence creates many collisions with the
    current hashing function.

    > In fact, EvalCacheWithVec runs quicker with smaller cache sizes.


    This could be a sign that memory accesses (cache misses) are
    limiting performance. It hints again at a cached function
    that executes quickly and might not need to be cached
    (unless input values strongly conflict with the hash function).

    Could you provide a more complete code example and performance
    metrics (including a comparison with no caching at all)?

    > I don't really want to move the cache outside of the class Fn.
    > Any suggestions on how I can improve the performance
    > of EvalCacheWithVec further?

    Compared to the code I had posted, two redundancies have
    been added in EvalCacheWithVec (see the sketch after this list):
    - A flag is kept (in a std::bitset) to indicate whether
    a cache entry is valid or not. Under intensive cache use,
    this will slow things down compared to the original approach of
    filling the whole cache with a valid input-output entry.
    - When a cache entry is updated, the hashed cache index
    is computed twice.
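
    A minimal sketch of removing the second redundancy: fold lookup and add
    into one member of EvalCacheWithVec, so the index is hashed only once
    per call (lookupOrCompute is an illustrative name, not part of the
    posted class; evalFn_ is any callable taking (T1, T2) and returning R):

    template<typename EVALFN>
    const R& lookupOrCompute(const FNARGS& args_, EVALFN evalFn_)
    {
        size_t idx = getHashedIndex(args_.first, args_.second); // hashed once
        ENTRY& entry = _cached[idx];
        if (!_valid[idx] || entry.first != args_)
        {
            // miss: evaluate and overwrite the slot
            entry = ENTRY(args_, evalFn_(args_.first, args_.second));
            _valid[idx] = true;
            _noneValid = false;
        }
        return entry.second;
    }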

    Further performance tuning is very dependent on what your
    application specifically does.

    Warm regards,
    Ivan
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- e-mail contact form
    Ivan Vecerina, Feb 11, 2004
    #9
