Does object pooling *ever* make sense?

C

Chris

I've read recently that object allocation in recent JVMs is so fast that
it doesn't make sense to create object pools. Just create a new object
when you need it and let the garbage collector do the work.

Does this hold true when your objects are very large, though? What if
your object contains a byte [] of length 100K? Or 1Mb?

What's the breakeven point beyond which it makes sense to reuse objects?
 
L

Lew

Chris said:
I've read recently that object allocation in recent JVMs is so fast that
it doesn't make sense to create object pools. Just create a new object
when you need it and let the garbage collector do the work.

Does this hold true when your objects are very large, though? What if
your object contains a byte [] of length 100K? Or 1Mb?

What's the breakeven point beyond which it makes sense to reuse objects?

As I understand, the memory allocation procedure in the JVM is to add some
size n to a pointer p and call that region the new object. I don't think there
is anything much faster than that. It takes the same amount of time to add 10M
to p as to add 10. (<=1 cycle?)

The time spent is zeroing the memory, I'd guess, but you'd have to do that
with an object pool, too.

I surmise that there is no breakeven point.
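
A rough sketch of that bump-the-pointer idea (purely illustrative; a real allocator also deals with alignment, thread-local allocation buffers, and triggering GC when the region fills up):

final class BumpAllocator {
    private final byte[] arena;
    private int p; // next free offset

    BumpAllocator(int capacity) { arena = new byte[capacity]; }

    // "Allocating" n bytes is just advancing the pointer; the cost is the
    // same whether n is 10 or 10M.
    int allocate(int n) {
        if (p + n > arena.length) throw new OutOfMemoryError("arena exhausted");
        int start = p;
        p += n;
        return start; // offset of the new "object"
    }
}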

- Lew
 
D

Daniel Pitts

Chris said:
I've read recently that object allocation in recent JVMs is so fast that
it doesn't make sense to create object pools. Just create a new object
when you need it and let the garbage collector do the work.
Does this hold true when your objects are very large, though? What if
your object contains a byte [] of length 100K? Or 1Mb?
What's the breakeven point beyond which it makes sense to reuse objects?

As I understand, the memory allocation procedure in the JVM is to add some
size n to a pointer p and call that region the new object. I don't think there
is anything much faster than that. It takes the same amount of time to add 10M
to p as to add 10. (<=1 cycle?)

The time spent is zeroing the memory, I'd guess, but you'd have to do that
with an object pool, too.

I surmise that there is no breakeven point.

- Lew

I'd suggest running a few tests to find out.

The best suggestion I can make is to wrap all your calls to "new" in a
method somewhere, so that if you decide you need to pool, you can.
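
Something like this, say (a minimal sketch; the class and method names are made up):

// Hypothetical factory: callers ask Buffers for their byte arrays instead of
// calling new directly, so a pool can be slotted in later without touching them.
public final class Buffers {
    private Buffers() {}

    public static byte[] acquire(int size) {
        return new byte[size]; // plain allocation today; a pool could be consulted here
    }

    public static void release(byte[] buf) {
        // no-op today; a pooling implementation would take buf back here
    }
}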

Generally, it's bad engineering to worry about speed before it becomes
a problem. The first priority is clarity (next to functionality, of course).
If you find that there is a speed (or memory) issue, THEN you run a
profiler (don't ever "guess" or speculate on what the problem is;
you'll most likely be wrong, no matter how good you are).

The profiler will tell you what you need to optimize.

Hope this helps.
 
G

Gordon Beaton

I've read recently that object allocation in recent JVMs is so fast
that it doesn't make sense to create object pools. Just create a new
object when you need it and let the garbage collector do the work.

Does this hold true when your objects are very large, though? What
if your object contains a byte [] of length 100K? Or 1Mb?

What's the breakeven point beyond which it makes sense to reuse
objects?

A disadvantage to pooling is that it increases the average age of your
objects, resulting in more live objects at any given time, more
objects that live beyond the "nursery", and consequently more work for
the garbage collector.

You also have to manage the object pool.

So in order for pooling to be effective, there has to be a saving
somewhere that outweighs these costs (and possibly others I haven't
mentioned). I don't think the cost depends as much on the *size* of
the object as it depends on the amount of additional *initialization*
that you need to do before you can use it, assuming that you can avoid
that initialization in the pooled case.

/gordon
 
A

Andreas Leitgeb

Chris said:
Does this hold true when your objects are very large, though? What if
your object contains a byte [] of length 100K? Or 1Mb?

If you can reuse the byte array without resetting it to zeros
(e.g. if the program's logic is such that every element of
the byte array is overwritten anyway), then it is possible
that pooling saves some time. But you take the risk that some
(otherwise minor) bug in your program turns into a nightmare
when old (non-zero) garbage in the array slips through.
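
A minimal sketch of that reuse pattern (a stream copy where only the first n bytes of the buffer are valid on each pass):

static void copy(java.io.InputStream in, java.io.OutputStream out) throws java.io.IOException {
    byte[] buf = new byte[8192]; // reused across iterations, never re-zeroed
    int n;
    while ((n = in.read(buf)) != -1) {
        // read(...) overwrites exactly n bytes; anything past index n - 1 is stale
        // data from an earlier pass, harmless only as long as nobody reads it.
        out.write(buf, 0, n);
    }
}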
 
A

Andy Dingley

I've read recently that object allocation in recent JVMs is so fast that
it doesn't make sense to create object pools.

Most of my pooled objects are some sort of reference to an external
resource (e.g. a DB connection) that's inherently expensive. I don't
care what the cost of the object itself is, they're pooled to
economise on this external cost and that's not changed by any JVM
improvement.

Given the low cost of Java object creation I find it hard to justify
pooling for objects that are simply entirely Java anyway. What might
cause me to want to pool them? A stateless singleton is justifiable
but anything with state attached to it probably means as much effort
to create the distinct state for each pooled use of the object as it
does to create a whole new object.
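
The usual pattern with a pooling DataSource looks roughly like this (sketch only; the DataSource is assumed to come from an app server or a pooling library):

static void withConnection(javax.sql.DataSource ds) throws java.sql.SQLException {
    java.sql.Connection con = ds.getConnection(); // borrowed from the pool
    try {
        // ... use con; the expensive part was opening the physical connection,
        // which the pool did once and keeps alive ...
    } finally {
        con.close(); // with a pooling DataSource this returns the connection rather than tearing it down
    }
}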
 
L

Lew

Andy said:
Most of my pooled objects are some sort of reference to an external
resource (e.g. a DB connection) that's inherently expensive. I don't
care what the cost of the object itself is, they're pooled to
economise on this external cost and that's not changed by any JVM
improvement.

Given the low cost of Java object creation I find it hard to justify
pooling for objects that are simply entirely Java anyway. What might
cause me to want to pool them? A stateless singleton is justifiable
but anything with state attached to it probably means as much effort
to create the distinct state for each pooled use of the object as it
does to create a whole new object.

And remember Gordon's points about pooling increasing the average age of your objects.

That "more work for the garbage collector" relates to the fact that it's
harder to garbage collect out of the tenured generation than out of the nursery.
Except for resource gates, it rarely helps to pool.

- Lew
 
A

Andy Dingley

Except for resource gates, it rarely helps to pool.

One somewhat perverse use I have found for object pooling is to avoid
memory leakage. In the context of refactoring truly nasty code, a pool
of re-used objects was one way to stop coders allocating these things
willy-nilly and never disposing of them afterwards at all! Finite
cruft is (marginally) better than infinitely expanding cruft.

...And it's less cruel than my other technique, with the hot clue-iron.
 
L

Lew

Andy said:
One somewhat perverse use I have found for object pooling is to avoid
memory leakage. In the context of refactoring truly nasty code, a pool
of re-used objects was one way to stop coders allocating these things
willy-nilly and never disposing of them afterwards at all! Finite
cruft is (marginally) better than infinitely expanding cruft.

...And it's less cruel than my other technique, with the hot clue-iron.

Of course, in Java one does not often have to explicitly "dispose of" an
object. The usual difficulty is making sure to pull all references away from
an object, but there are a huge number of situations where that isn't necessary.

Consider:

public void foo()
{
    Bax bax = new Bax(); // assume it does not wrap a resource
    // do some stuff with bax
    // but do NOT create other references to the object
}

No explicit disposal needed for the new Bax if it is never referenced
somewhere else, e.g., in a long-lived Collection. Once the variable "bax" goes
out of scope in that scenario, the programmer can rest easy.

It is actually a good idiom in Java to "allocat[e] ... things willy-nilly".

- Lew
 
J

Joe Seigh

Chris said:
I've read recently that object allocation in recent JVMs is so fast that
it doesn't make sense to create object pools. Just create a new object
when you need it and let the garbage collector do the work.

Does this hold true when your objects are very large, though? What if
your object contains a byte [] of length 100K? Or 1Mb?

What's the breakeven point beyond which it makes sense to reuse objects?


It's not so much that the GC is so much faster; it's that allocation is not
really all that much worse, and the effect of tying up a lot of memory in
object pools is more problematic.
The copying collector relies on having enough memory so you don't run out
between normal GC cycles. If you are running out of memory and forcing GC
to run more often, you could use an object pool and point to it with a
WeakReference. The pool will then be reclaimed on every GC cycle, more or
less, so a pool only exists while it is actively in use. I'm assuming
weak references don't slow down GC all that much.
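
A rough sketch of what that could look like (illustrative only: a trivially simple LIFO pool of byte arrays, held only through a WeakReference so any GC cycle can discard it):

import java.lang.ref.WeakReference;
import java.util.ArrayDeque;
import java.util.Deque;

public final class WeaklyHeldPool {
    private final int bufSize;
    private WeakReference<Deque<byte[]>> ref =
            new WeakReference<Deque<byte[]>>(new ArrayDeque<byte[]>());

    public WeaklyHeldPool(int bufSize) { this.bufSize = bufSize; }

    public synchronized byte[] acquire() {
        Deque<byte[]> pool = ref.get();
        byte[] buf = (pool == null) ? null : pool.pollFirst();
        return (buf != null) ? buf : new byte[bufSize]; // pool empty or already collected
    }

    public synchronized void release(byte[] buf) {
        Deque<byte[]> pool = ref.get();
        if (pool == null) { // the old pool was collected; start a fresh one
            pool = new ArrayDeque<byte[]>();
            ref = new WeakReference<Deque<byte[]>>(pool);
        }
        pool.addFirst(buf);
    }
}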
 
L

Lew

Joe said:
It's not so much that the GC is so much faster; it's that allocation is not
really all that much worse, and the effect of tying up a lot of memory in
object pools is more problematic.

The Java mechanisms and GC tend to be much faster than hand-rolled object
pooling (except for resource guards, but then it's not memory that is the slow
part any more).

According to what I've read.
The copying collector relies on having enough memory so you don't run out
between normal GC cycles.

What do you mean by "between normal GC cycles"? From what I've read, GC runs
when memory runs low, not on some kind of timer.
If you are running out of memory and forcing GC to run more often,
you could just use an object pool and use a WeakReference to
point to the object pool. The pool will be reclaimed on every GC cycle
more or less. The only time a pool will exist is when it's active. I'm
assuming weak references don't slow down GC all that much.

I have not run across any reference that ties weak references to any
performance considerations for garbage collection.

The point is that you do not need to use object pooling. The Java memory
allocator is more efficient, by all accounts, than any local policy can hope
to achieve, and using custom schemes is only likely to interfere with the Java
mechanism.

The whole point of the Java memory mechanism is to remove responsibility for
memory management from the programmer. Why would anyone want to mess with that?

If you want to tune the collector, that's an entirely different matter. You
are much more likely to gain joy from the java -X and -XX parameters than any
in-process approach.
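
For example, with standard HotSpot options (the class name here is made up, and which settings actually help depends entirely on the application):

java -Xms256m -Xmx256m -verbose:gc -XX:+PrintGCDetails MyApp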

There is a common theme running in this newsgroup of people not wanting Java
to be Java. This seems to blind them to the strengths of the language, as they
focus on perceived weaknesses. No language is perfect. Part of the art of
programming involves playing to the strengths of the tools at hand.

Java provides a lot of help with memory management. Use what it offers before
trying to defeat it.

- Lew
 
J

Joe Seigh

Lew said:
The point is that you do not need to use object pooling. The Java memory
allocator is more efficient, by all accounts, than any local policy can
hope to achieve, and using custom schemes is only likely to interfere
with the Java mechanism.

For most Java programmers, that is likely true.
The whole point of the Java memory mechanism is to remove responsibility
for memory management from the programmer. Why would anyone want to mess
with that?

Well, I mess with a whole lot of things that people think can't be improved.
Some things can be improved by quite a bit (orders of magnitude).
 
A

Andy Dingley

Of course, in Java one does not often have to explicitly "dispose of" an object.

One certainly must if it's not garbage collectable!

A problem with Java in general, particularly with newbie coders who've
only learned Java from scratch, is that they pay no attention to gc at
all and just assume that "the maid will clear it up". This works fine
for most pure Java objects, but if you start having some object that
represents another resource and isn't automatically disposable then
you're back to needing to manage these manually (at least to some
degree). Leaving it up to Java's gc means that none of them ever get
disposed, and you've a leak.

Languages that reduce the need to think about everything are great for
fast development, but when this extends into a mindset that no longer
thinks at all, it leads to sloppy development.
 
L

Lew

Andy said:
One certainly must if it's not garbage collectable!

A problem with Java in general, particularly with newbie coders who've
only learned Java from scratch, is that they pay no attention to gc at
all and just assume that "the maid will clear it up". This works fine
for most pure Java objects, but if you start having some object that
represents another resource and isn't automatically disposable then
you're back to needing to manage these manually (at least to some
degree). Leaving it up to Java's gc means that none of them ever get
disposed, and you've a leak.

I completely agree. Note that I disclaimed the case where an object manages an
external resource, but then in that case it is no longer a memory issue. The
original question had to do with memory allocation and deallocation overhead,
and in that context Java's mechanism is entirely satisfactory.

With regard to not thinking and assumptions about the maid's thoroughness you
have the right of it. Playing to the strengths of the Java model does not mean
ignoring it, it means understanding it so well that you can take advantage of it.

If you understand the memory model then, for example, you can exploit things
like the WeakReference class.

WRT external resources, Java gives one the finally block.

- Lew
 
C

Chris Uppal

Chris said:
Does this hold true when your objects are very large, though? What if
your object contains a byte [] of length 100K? Or 1Mb?

It is trivially true that object pooling /can/ make sense. The lifetime cost
of an object is the sum of

the time taken to allocate its store
the time taken to zero that store
the time taken in user-code initialising it
the time taken to reclaim that store

If you use object pooling then the equation changes to

the time taken to reset the object's state
the time taken to manage the pool

So if an object can be "reset" faster than it could be
created+zeroed+initialised then you have a potential saving. If that saving is
great enough to outweigh the costs described by other posters, then you are
making a nett profit.

But it is always possible (with sufficient ingenuity) to devise examples of
objects which take arbitrarily longer to initialise than to reset, so it is
always possible to devise examples of objects which would benefit from pooling.

Such examples don't often come up in real code, though. It would probably
require that the object had an extremely complicated initial internal state but
one to which it could be returned comparatively cheaply. (E.g. some
complicated pre-computed wordlist/dictionary).
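
Something along these lines, say (purely illustrative; the class is made up for the sake of the example):

// Expensive to construct (it reads and indexes an entire word list), but cheap
// to "reset", because the precomputed index is exactly what we want to keep.
final class Dictionary {
    private final java.util.Set<String> words = new java.util.HashSet<String>();
    private int lookups;

    Dictionary(java.io.BufferedReader wordList) throws java.io.IOException {
        String line;
        while ((line = wordList.readLine()) != null) {
            words.add(line); // the expensive part: done once per construction
        }
    }

    boolean contains(String w) { lookups++; return words.contains(w); }

    void reset() { lookups = 0; } // cheap: only the mutable bookkeeping is cleared
}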

Whether you can reach that position just by using larger and larger simple
arrays seems doubtful. Allocating a single object, even a large one, is not an
expensive operation in itself. It's difficult to imagine a scenario where the
initialisation (if any) of the array in user code was significantly cheaper
than resetting it. So that leaves only the time taken by the VM to zero the
store as a potentially worthwhile saving. That is obviously proportional to
the size of the array, so that's looking promising so far. But consider the
overall life of the array -- presumably the code is going to do something with
it, so it seems almost certain that most positions in the array will be written
to, and read, at least once by the application code. Both of those operations
will be at least as expensive as the initial zeroing (per slot), so the time
saved can be no more than 1/3 of the total runtime and, in realistic
applications, almost certainly much less -- which moves the potential saving
into the "it's not worth bothering with" category.

Of course, that last paragraph assumes that the VM/GC implementation is such
that it has no great difficulty with "churning" large objects -- that does
depend on the implementation. It would be interesting to see how the numbers
work out in practice rather than in theory[*].

-- chris

[*] And please, don't anyone at this point produce that tired old cliché about
how "theory and practise are the same in theory [etc]"
 
C

Chris Uppal

Lew said:
What do you mean by "between normal GC cycles"? From what I've read, GC
runs when memory runs low, not on some kind of timer.

Depends on the GC implementation. There are algorithms which collect garbage
on a continuous basis. And it's not an unreasonable technique, for VMs aimed
at desktop applications and the like, to run a low-priority GC task on a timer
(that's a VM implementation technique, mind, not a recommendation that desktop
/applications/ should be coded that way).

And in fact the Sun GC implementations, which use multiple spaces, don't really
have a concept of waiting until memory runs low before running GC. They do
have a comparatively expensive "full" GC which is run if memory does run low,
but the object of the design is to avoid doing that as far as possible.
Ideally to eliminate it altogether.

-- chris
 
A

Andy Dingley

So if an object can be "reset" faster than it could be
created+zeroed+initialised then you have a potential saving.

Of course. Except that zeroed+initialised is trivially fast,
management isn't, and creation is usually pretty speedy too. Your
conclusion here is _far_ from guaranteed.

There's an equally over-simplified proof that garbage collection is
just one form of pool management, so anything that overlays another
layer of pool management over any existing layer of pool management
_must_ be slower than just having a single layer.


It's all Melvyn Bragg's fault. I've been reading Popper this week and
now I can't believe _anything_ anyone tells me!
 
C

Chris Uppal

Andy Dingley wrote:

[me:]
Of course. Except that zeroed+initialised is trivially fast,
management isn't and creation is usually pretty speedy too. Your
conclusion here is _far_ from guaranteed.

I believe that you have misunderstood the whole of my post.

It's all Melvyn Bragg's fault. I've been reading Popper this week and
now I can't believe _anything_ anyone tells me!

Reading it seems to have damaged your ability to follow an argument too ;-)
Maybe your expectations are now set too high...

-- chris
 
C

Chris

Chris said:
I've read recently that object allocation in recent JVMs is so fast that
it doesn't make sense to create object pools. Just create a new object
when you need it and let the garbage collector do the work.

Does this hold true when your objects are very large, though? What if
your object contains a byte [] of length 100K? Or 1Mb?

What's the breakeven point beyond which it makes sense to reuse objects?

Thanks, everybody, for the insights, but nobody really shed any light on
the original question, which was: what's the breakeven point?

So I wrote a little code to test the question, pasted below. The code
simply allocates byte [] objects of varying sizes. Here are the results,
for 100,000 iterations, elapsed time in milliseconds:

bufsize 10 elapsed = 16
bufsize 1024 elapsed = 47
bufsize 10240 elapsed = 313
bufsize 102400 elapsed = 3078
bufsize 1048576 elapsed = 316540
bufsize 10Mb, terminated because it took too long

JDK 1.6, JVM memory = the default 64 MB (increasing JVM memory did not
alter the results).

Contrary to some of the earlier advice in this thread, it is *not*
trivially fast to allocate larger objects. Allocation time increases
roughly linearly as object sizes increase.

Given that fetching an object from a pool 100,000 times should generally
not take more than a few milliseconds (locking & sync included), it
looks like object pooling is a necessity when speed is important and
objects get larger than a few dozen KB in size.

public static void main(String[] argv) throws Exception {

    // 10, 1K, 10K, 100K, 1Mb, 10Mb
    int[] BUFSIZES = {10, 1024, 10 * 1024, 100 * 1024, 1024 * 1024, 10 * 1024 * 1024};
    int ITERATIONS = 100000;

    for (int bufSizePtr = 0; bufSizePtr < BUFSIZES.length; bufSizePtr++) {
        int bufSize = BUFSIZES[bufSizePtr];
        long start = System.currentTimeMillis();
        for (int i = 0; i < ITERATIONS; i++) {
            byte[] buf = new byte[bufSize];
            buf[0] = 1; // touch the buffer so the allocation is actually used
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("bufsize " + bufSize + " elapsed = " + elapsed);
    }
}
 
C

Chris

Contrary to some of the earlier advice in this thread, it is *not*
trivially fast to allocate larger objects. Allocation time increases
roughly linearly as object sizes increase.

Here's more evidence. The code was rewritten to use System.nanoTime()
and to double the amount of memory allocated on each cycle. Times are
still in millisec:

bufsize 1 elapsed = 61
bufsize 2 elapsed = 23
bufsize 4 elapsed = 22
bufsize 8 elapsed = 23
bufsize 16 elapsed = 24
bufsize 32 elapsed = 29
bufsize 64 elapsed = 41
bufsize 128 elapsed = 60
bufsize 256 elapsed = 107
bufsize 512 elapsed = 192
bufsize 1024 elapsed = 344
bufsize 2048 elapsed = 669
bufsize 4096 elapsed = 1308
bufsize 8192 elapsed = 2519
bufsize 16384 elapsed = 4970
bufsize 32768 elapsed = 9934
bufsize 65536 elapsed = 19732
bufsize 131072 elapsed = 39455
bufsize 262144 elapsed = 73419

For very small objects, allocation time is roughly constant. As objects get
larger, allocation time doubles each time the size doubles; above ~1K it is
close to linear in the object size.

Same pattern applies for JDK 1.4, 1.5, and 1.6.
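
The revised code wasn't included above; a reconstruction from the description would look roughly like this:

public static void main(String[] argv) {
    final int ITERATIONS = 100000;
    // Buffer size doubles each cycle; timing uses System.nanoTime(),
    // converted to milliseconds to match the figures above.
    for (int bufSize = 1; bufSize <= 256 * 1024; bufSize *= 2) {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            byte[] buf = new byte[bufSize];
            buf[0] = 1; // touch the buffer so the allocation is actually used
        }
        long elapsed = (System.nanoTime() - start) / 1000000L;
        System.out.println("bufsize " + bufSize + " elapsed = " + elapsed);
    }
}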
 
