reuse HashMap$Entry (or HashMap in total) to avoid millions of allocations

Discussion in 'Java' started by Vince Darley, Sep 10, 2004.

  1. Vince Darley

    Vince Darley Guest

    An application we've built makes very extensive use of
    java.util.HashMaps to store the results of 10s of thousands of
    calculations it does. Each of those calculations involves reading
    from/writing to a HashMap, eventually filling it with, say,
    10000-100000 entries. Once complete, the information is copied out
    into a fixed matrix, and we can safely discard the HashMap (and all
    its entries of course).

    Now, the problem with this is that the code spends 40%+ of its time
    allocating new HashMap$Entry objects (according to the profiling we've
    done), and manages to allocate some 33 million of them (>1 GB of
    memory in total), although certainly no more than half a million are
    active at any one time (say 10-20 MB, according to the profiler).

    So, my question is: are there any suggestions for how we can
    restructure this code or its use of HashMap to avoid such a mad
    allocation/deallocation frenzy? It would all, presumably, run more
    quickly if we could avoid this problem.

    Any ideas?

    Vince.
     
    Vince Darley, Sep 10, 2004
    #1

  2. Vince Darley

    Chris Smith Guest

    Vince Darley wrote:
    > Now, the problem with this is that the code spends 40%+ of its time
    > allocating new HashMap$Entry objects (according the profiling we've
    > done), and manages to allocate some 33 million of them (>1 Gb of
    > memory in total), although certainly no more than 1/2 million are
    > active at any one time (say 10-20Mb, according to the profiler).
    >
    > So, my question is: are there any suggestions for how we can
    > restructure this code or its use of HashMap to avoid such a mad
    > allocation/deallocation frenzy. It would all, presumably, run more
    > quickly if we could avoid this problem.


    No, you can't reuse HashMap$Entry objects. You could write your own
    implementation of Map that uses some kind of pooling internally. The
    question is then whether your own pooling would be faster than the
    JVM's memory management. I personally doubt you could do much better.

    --
    www.designacourse.com
    The Easiest Way to Train Anyone... Anywhere.

    Chris Smith - Lead Software Developer/Technical Trainer
    MindIQ Corporation
     
    Chris Smith, Sep 10, 2004
    #2

  3. Vince Darley

    Frank Guest

    Vince Darley wrote:
    > Now, the problem with this is that the code spends 40%+ of its time
    > allocating new HashMap$Entry objects (according the profiling we've
    > done)


    I'm sure you're reading your profiling data wrong.

    Here's a dump from code which basically does nothing but add items to a
    HashMap. As you can see, HashMap$Entry.<init> doesn't show up until
    11th place, at 1.67% of CPU time.
    Map.put() is bound to take some time, but perhaps you can shave off a
    few percent by using a larger initial capacity: resize and transfer
    together account for about 10.6% of the time here.


    CPU TIME (ms) BEGIN (total = 940616) Sat Sep 11 15:45:33 2004
    rank   self   accum    count   trace  method
       1  32.60%  32.60%        1  301225  Q.main
       2  30.27%  62.87%  5300314  301127  java.util.HashMap.put
       3  11.31%  74.18%  5300314  301126  java.util.HashMap.addEntry
       4   5.43%  79.62%  5300314  301123  java.util.HashMap.hash
       5   5.40%  85.02%     1234  301130  java.util.HashMap.resize
       6   5.22%  90.24%     1234  301129  java.util.HashMap.transfer
       7   2.33%  92.57%  7262242  301128  java.util.HashMap.indexFor
       8   1.73%  94.30%  5300314  301124  java.util.HashMap.indexFor
       9   1.70%  95.99%  5300314  301122  java.lang.Integer.hashCode
      10   1.70%  97.69%  5300314  301121  java.util.HashMap.maskNull
      11   1.67%  99.36%  5300314  301125  java.util.HashMap$Entry.<init>
      12   0.10%  99.46%        2  301089  java.lang.Object.wait
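
    The initial-capacity suggestion can be sketched roughly as follows.
    This is only an illustration: the class name PresizedMapDemo, the
    helper capacityFor, and the Integer/Double key and value types are
    assumptions, not anything from the original application. Presizing
    targets the resize/transfer rows in the dump; it does not reduce
    the number of HashMap$Entry objects allocated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names and key/value types are assumptions.
final class PresizedMapDemo {

    // Capacity large enough that expectedEntries fit without a resize
    // at HashMap's default load factor of 0.75.
    static int capacityFor(int expectedEntries) {
        return (int) (expectedEntries / 0.75f) + 1;
    }

    // Fills a presized map with placeholder results.
    static Map<Integer, Double> fill(int expectedEntries) {
        Map<Integer, Double> results = new HashMap<>(capacityFor(expectedEntries));
        for (int i = 0; i < expectedEntries; i++) {
            results.put(i, i * 0.5); // stand-in for a real calculation result
        }
        return results;
    }

    public static void main(String[] args) {
        Map<Integer, Double> results = fill(100000);
        System.out.println(results.size());
    }
}
```

    If the final number of entries is only roughly known, an upper
    estimate (say the 100000 mentioned in the original post) trades
    some memory for fewer rehashes.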

    > [...] and manages to allocate some 33 million of them (>1 Gb of
    > memory in total), although certainly no more than 1/2 million are
    > active at any one time (say 10-20Mb, according to the profiler).


    The garbage collector may not be working too hard when there's plenty of
    free heap. Specify a heap size for the JVM (the -Xmx option sets the
    maximum) if your program is laying claim to too many system resources.

    -Frank
     
    Frank, Sep 11, 2004
    #3
  4. Vince Darley

    Jeff Guest

    We had the same concern.

    First, our analysis did not show the creation of Entry objects as a big
    time consumer. Our profiling did not indicate it, nor did our discussions
    with the JVM providers. Object creation is reasonably fast.

    Second, you could consider writing your own HashMap, as another contributor
    suggested. java.util.HashMap stores its entries in an array of buckets, each
    a linked chain of Entry objects. You could use its methods as a starting
    point for your own class's methods.
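
    One way such a hand-written map could avoid Entry objects entirely
    is open addressing over parallel arrays. The sketch below is an
    assumption-laden illustration, not java.util.HashMap's design and
    not the original application's code: it assumes int keys and double
    values, and the class IntDoubleOpenMap and all its method names are
    made up for this example.

```java
import java.util.Arrays;

// Hypothetical int -> double map using linear probing; no per-entry objects.
final class IntDoubleOpenMap {
    private int[] keys;
    private double[] values;
    private boolean[] used;   // marks occupied slots
    private int mask;         // capacity - 1, capacity is a power of two
    private int size;

    IntDoubleOpenMap(int expectedEntries) {
        if (expectedEntries < 0 || expectedEntries > (1 << 29)) {
            throw new IllegalArgumentException("expectedEntries out of range");
        }
        int cap = 16;
        while (cap < expectedEntries * 2L) cap <<= 1; // keep load factor <= 0.5
        allocate(cap);
    }

    private void allocate(int cap) {
        keys = new int[cap];
        values = new double[cap];
        used = new boolean[cap];
        mask = cap - 1;
        size = 0;
    }

    // Returns the slot holding key, or the first free slot in its probe sequence.
    private int slot(int key) {
        int h = key * 0x9E3779B9;            // multiplicative scramble of the key
        int i = (h ^ (h >>> 16)) & mask;
        while (used[i] && keys[i] != key) i = (i + 1) & mask;
        return i;
    }

    void put(int key, double value) {
        if ((size + 1) * 2 > keys.length) grow();
        int i = slot(key);
        if (!used[i]) { used[i] = true; keys[i] = key; size++; }
        values[i] = value;
    }

    // Returns the stored value, or the caller-supplied default if key is absent.
    double get(int key, double missing) {
        int i = slot(key);
        return used[i] ? values[i] : missing;
    }

    int size() { return size; }

    // Empties the map but keeps the arrays, so the next calculation reuses them.
    void clear() {
        Arrays.fill(used, false);
        size = 0;
    }

    private void grow() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        boolean[] oldUsed = used;
        allocate(oldKeys.length * 2);
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) put(oldKeys[i], oldValues[i]);
        }
    }
}
```

    In a workload like the one described, one instance could be kept
    per worker and clear() called between calculations, so the only
    allocations are the occasional array growth. Note that clear()
    costs time proportional to the capacity, not the number of entries,
    and that this sketch has no remove operation.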

    "Vince Darley" <> wrote in message
    news:...
    > An application we've built makes very extensive use of
    > java.util.HashMaps to store the results of 10s of thousands of
    > calculations it does. Each of those calculations involves reading
    > from/writing to a HashMap, eventually filling it with, say,
    > 10000-100000 entries. Once complete, the information is copied out
    > into a fixed matrix, and we can safely discard the HashMap (and all
    > its entries of course).
    >
    > Now, the problem with this is that the code spends 40%+ of its time
    > allocating new HashMap$Entry objects (according the profiling we've
    > done), and manages to allocate some 33 million of them (>1 Gb of
    > memory in total), although certainly no more than 1/2 million are
    > active at any one time (say 10-20Mb, according to the profiler).
    >
    > So, my question is: are there any suggestions for how we can
    > restructure this code or its use of HashMap to avoid such a mad
    > allocation/deallocation frenzy. It would all, presumably, run more
    > quickly if we could avoid this problem.
    >
    > Any ideas?
    >
    > Vince.
     
    Jeff, Sep 11, 2004
    #4
  5. Vince Darley

    emilchacko

    Try using CERN's Colt or the Trove collections. Both provide hash maps
    specialized for primitive types, which avoid per-entry objects.
     
    emilchacko, Mar 2, 2010
    #5
