Are there any Garbage Collector improvements?

Mark Sizzler

We have problems with the Java Garbage Collector.
It is very slow when we hold large tables in memory and perform many inserts and updates.

As far as I remember, there are third-party improved garbage collector products.
Does someone have a recommendation?

Are there any best practice hints on how to improve the Java built-in GC?

Mark
 
John B. Matthews

Mark said:
We have problems with the Java Garbage Collector. It is very slow
when we hold large tables in memory and perform many inserts and
updates.

This makes me wonder if you are unintentionally retaining objects in a
collection that implements the Map interface.

[...]
Are there any best practice hints on how to improve the Java built-in GC?

Avoid unintentional object retention and consider WeakHashMap:

<http://java.sun.com/javase/6/docs/api/java/util/WeakHashMap.html>
<http://www-128.ibm.com/developerworks/java/library/j-jtp11225/>
<http://www.ibm.com/developerworks/java/library/j-perf08273.html>
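A minimal sketch of the WeakHashMap idea in those links (class and key names here are invented for illustration): entries whose keys are no longer strongly referenced become eligible for collection, so the map cannot silently pin your table rows the way a HashMap would.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakCacheDemo {
    public static void main(String[] args) {
        // Unlike HashMap, WeakHashMap holds its keys weakly: once no other
        // strong reference to a key exists, the entry may be collected.
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "row data");
        System.out.println(cache.size());   // 1 while 'key' is reachable

        key = null;      // drop the only strong reference to the key
        System.gc();     // only a hint; collection timing is not guaranteed
        // The entry can now disappear on its own; in a HashMap it never would.
    }
}
```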
 
Lew

John said:
This makes me wonder if you are unintentionally retaining objects in a
collection that implements the Map interface.

Other possibilities are references from long-lived objects to short-
lived ones, and intentional retention of objects in the mistaken
belief that it will reduce GC overhead.

John said:
Avoid unintentional object retention and consider WeakHashMap:

Also avoid intentional retention of references.

If the algorithm requires that an object live a long time, let it
live. But an antipattern in Java is to keep an object and reuse it
for different values over and over, e.g.,

public class GcAntiPattern
{
    public static void main( String [] args )
    {
        Foo foo = new Foo();
        for ( int ix = 0; ix < 10000; ++ix )
        {
            fillWithValues( foo );
            doSomethingWithFoo( foo );
        }
    }
}

Usually in such cases it is better to allocate the 'Foo' inside the
loop so that it can be GCed by a minor collection instead of a major
one.
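The fix described above might look like the sketch below; 'Foo', 'fillWithValues' and 'doSomethingWithFoo' are just the placeholder names from the example, stubbed out here so the code compiles:

```java
public class GcFriendlyPattern {
    // Stand-ins for the hypothetical names in the anti-pattern example.
    static final class Foo { int value; }
    static void fillWithValues(Foo foo) { foo.value = 42; }
    static int total = 0;
    static void doSomethingWithFoo(Foo foo) { total += foo.value; }

    public static void main(String[] args) {
        for (int ix = 0; ix < 10000; ++ix) {
            Foo foo = new Foo();       // allocated fresh each iteration
            fillWithValues(foo);
            doSomethingWithFoo(foo);
        }   // each 'foo' becomes unreachable here and dies young, in the
            // nursery, where collection is cheap
        System.out.println(total);     // prints 420000
    }
}
```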

Also:
<http://www.ibm.com/developerworks/java/library/j-jtp09275.html>
<http://www.ibm.com/developerworks/java/library/j-jtp01274.html>
 
Daniel Pitts

Mark said:
We have problems with the Java Garbage Collector.
It is very slow when we hold large tables in memory and perform many inserts and updates.

As far as I remember, there are third-party improved garbage collector products.
Does someone have a recommendation?

Are there any best practice hints on how to improve the Java built-in GC?

Mark
Don't hold large tables in memory :)
Are you sure it's the garbage collector? Have you tried tuning the
performance parameters?
<http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html>
 
Lew

Lew said:
If the algorithm requires that an object live a long time, let it
live. But an antipattern in Java is to keep an object and reuse it
for different values over and over, e.g.,

public class GcAntiPattern
{
    public static void main( String [] args )
    {
        Foo foo = new Foo();
        for ( int ix = 0; ix < 10000; ++ix )
        {
            fillWithValues( foo );
            doSomethingWithFoo( foo );
        }
    }
}

Usually in such cases it is better to allocate the 'Foo' inside the
loop so that it can be GCed by a minor collection instead of a major
one.

Avinash said:
In the case of that anti-pattern...

Can you elaborate a little more on why the anti-pattern is bad? Why
is reusing an object bad? Is it worse than creating an object in a
loop like that?

The referenced links explain it better than I probably can, but the key is
that Java uses a generational garbage collector. Young generation collections
are very fast, and object creation is blazingly fast. Tenured generation
collections take much longer. Also, the JVM can often optimize away object
creation altogether with temporary short-lived objects, depending on their
shape and usage. Basically, as one of the cited articles points out, the
programmer usually cannot do better than the compiler in Java.

This is only a rule of thumb. It could be that creation of umpty-gazillion
objects inside the loop would trigger so many young-generation GC cycles that
it would help to create one outside the loop. Or, Hotspot might figure that
out for you. It's hard to tell.

That is the crux. We as programmers really don't know. It's better to scope
a variable for its natural life - if an object is only used inside the loop,
declare it inside the loop. This will prevent bugs that would be much, much
worse than a putative, unprovable slowdown due to GC.
 
Tom Anderson

Mark said:
We have problems with the Java Garbage Collector. It is very slow when
we hold large tables in memory and perform many inserts and updates.

As far as I remember, there are third-party improved garbage collector
products. Does someone have a recommendation?

Are there any best practice hints on how to improve the Java built-in
GC?

I'm not aware of any way to plug a third-party GC into Sun's JVM. You
could switch to using another JVM - IBM makes a good one, but I'm not aware
of any others that are anywhere near as good. They're mostly research VMs,
or fairly basic open-source ones. AFAIK, anyway.

However, there are a lot of flags you can use to tune the way Sun's GC
works. This guide discusses the most important stuff:

http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html

Googling for things like 'java garbage collection tuning' will find you
more.

tom
 
Arne Vajhøj

Tom said:
I'm not aware of any way to plug a third-party GC into Sun's JVM. You
could switch to using another JVM - IBM makes a good one, but I'm not
aware of any others that are anywhere near as good. They're mostly
research VMs, or fairly basic open-source ones. AFAIK, anyway.

BEA and Oracle have their own JVMs too (Oracle owns BEA now, but
I don't think they have merged product lines yet).

BEA JRockIt has a very good reputation.

Arne
 
J. Davidson

Lew said:
Other possibilities are references from long-lived objects to short-
lived ones, and intentional retention of objects in the mistaken
belief that it will reduce GC overhead.

The garbage collector used to be very different.

Old Java garbage collectors like mark-sweep had to do work in proportion
to the number of dead objects. So to make code run fast you reused
objects and avoided discarding too many.

The relatively new generational garbage collector has to do work in
proportion to the number of surviving objects instead. So to make code
run fast you now should discard objects rather than retain them.

GC optimization has basically been turned 180 degrees and stood up on
its head by this. The advice to get the fastest GC performance now is
diametrically opposite what was best with the older GC.

If Mark is working with a Java project with very many tree-rings in it,
it's quite likely the thing was coded exactly as horribly as possible
from the stand-point of GC optimization, because of GC-optimization.

Unfortunately, making it run fast with the newer GC means a lot of work
or even a total rewrite of big chunks of it in that case. On the plus
side, the results will be well worth it, making everything far faster
than the old code used with the older GC. And it's very unlikely that
the next big revolution in GC will flip everything over again, so the
new code will be future-proofed in one respect that the old code
(obviously) wasn't.

Another good reason is that treating most objects as disposable can
greatly simplify a lot of the logic, getting rid of pools and other
scaffolding and allowing some of them to be made immutable, which lets
you get rid of setters and possibly a lot of other code. Code that had
to be maintained, and might contain bugs, as well as was bloating up the
size of the running image and the jar files, bloating up the
documentation, and cluttering up the IDE's method listing and code view.

You can also afford to make classes that encapsulate a few primitives
where you might have once just used the primitives directly to cut down
on object creation. To use a worn-out old example, you can get rid of
all those xs and ys worked on in tandem and use Point2D or Complex or
whatever fits the situation without all those used-up Point2Ds or
Complexes gumming up the garbage collector the way they would have, five
or ten years ago. The code might get more readable, simpler, and less
error-prone.

Or you might just end up wishing for operator overloading to be in Java
7. :)

- jenny
 
Tom Anderson

Jenny said:
The garbage collector used to be very different.

Old Java garbage collectors like mark-sweep had to do work in proportion to
the number of dead objects.

I don't think that was ever true. Mark-and-sweep and stop-and-copy both do
work proportional to the number of live objects - dead objects are never
reached during traversal of the object graph, and never touched.

I think only C-style memory managers, which put each deleted block on a
free list, do work proportional to dead objects.

Rather, what happened was that the pre-generational collectors would do
work proportional to the *total* number of live objects on every collection,
whereas generational collectors do work proportional to the number of live
objects *in the nursery* (roughly). That meant that if you had a
significant amount of long-lived objects (which most apps do), you wanted
to avoid frequent collections, because each one would walk all your
objects, and thus ...

So to make code run fast you reused objects and avoided discarding too
many.

Bingo.

GC optimization has basically been turned 180 degrees and stood up on
its head by this. The advice to get the fastest GC performance now is
diametrically opposite what was best with the older GC.

Also bingo. This is why I shudder when I read things like this:

http://lab.polygonal.de/2008/06/18/using-object-pools/

Right now, Flash has a pretty basic collector, and object pooling is
apparently a very big win. At some point in the next few years, it'll get
a more sophisticated collector (along with a JIT and all the other goodies
needed to keep up with javascript), and then there's an excellent chance
that all this stuff will end up being a big smoking hole in everyone's
feet.

With any luck, there will be a small number of pool libraries in use, and
they'll all be written so that they can be replaced with no-op non-pooling
versions when the time comes. Fingers crossed.
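The kind of swappable pool library hoped for above might look something like this sketch (the interface and class names are invented for illustration): callers only see `Pool`, so the pooling implementation can later be replaced by a no-op one that just lets the collector do its job.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical sketch: a pool hidden behind an interface so that the
// pooling strategy can be swapped without touching calling code.
interface Pool<T> {
    T acquire();
    void release(T object);
}

// Real pooling: keeps released objects around for reuse.
class ReusingPool<T> implements Pool<T> {
    private final Deque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    ReusingPool(Supplier<T> factory) { this.factory = factory; }
    public T acquire() { return free.isEmpty() ? factory.get() : free.pop(); }
    public void release(T object) { free.push(object); }
}

// No-op "pooling": always allocates fresh, discards on release, and
// leaves reclamation to the garbage collector.
class NonPool<T> implements Pool<T> {
    private final Supplier<T> factory;
    NonPool(Supplier<T> factory) { this.factory = factory; }
    public T acquire() { return factory.get(); }
    public void release(T object) { /* nothing: the collector reclaims it */ }
}
```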

tom
 
J. Davidson

Tom said:
I think only C-style memory managers, which put each deleted block on a
free list, do work proportional to dead objects.

I recall reading about garbage collectors that did so as well. I
definitely recall old Java versions performing better if you reused
objects rather than discarded them -- exactly opposite to current Java.
About the latter you seem to be in agreement.
 
Tom Anderson

Jenny said:
I recall reading about garbage collectors that did so as well. I
definitely recall old Java versions performing better if you reused
objects rather than discarded them -- exactly opposite to current Java.
About the latter you seem to be in agreement.

Yes, absolutely. I think this was due to their slow implementation of
allocation, rather than their doing per-dead-object work. I think.

tom
 
Roedy Green

Mark said:
We have problems with the Java Garbage Collector.
It is very slow when we hold large tables in memory and perform many inserts and updates.

Others will tackle your problem directly. Here are some things to
check before you invest big bucks in a new GC package.

Have you instrumented to be sure the problem is GC, and not the
operations on the tables themselves?

Does your table technique keep allocating new objects frequently? It
should be doing something like ArrayList does, using a buffer bigger
than needed and only growing it when it overflows.
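The ArrayList-style technique suggested above, sketched as a simplified int buffer (not any real class): allocate spare capacity up front and reallocate only on overflow, so most inserts generate no garbage at all.

```java
import java.util.Arrays;

// Simplified sketch of ArrayList-style growth: keep a buffer bigger than
// needed and only grow it when it overflows.
public class GrowableIntBuffer {
    private int[] data = new int[16];  // start bigger than strictly needed
    private int size = 0;

    public void add(int value) {
        if (size == data.length) {
            // Grow by ~50%, like ArrayList, instead of allocating per insert.
            data = Arrays.copyOf(data, data.length + (data.length >> 1));
        }
        data[size++] = value;
    }

    public int get(int index) { return data[index]; }
    public int size() { return size; }
}
```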

Have you done a study of the objects to make sure there is no
packratting? No GC is going to work well if you accidentally hold on
to objects you don't really need. See
http://mindprod.com/jgloss/packratting.html

Finally, there is the ballerina-in-a-phone-booth problem. What is your
ratio of live object space to heap space? No GC will work well when
that ratio gets too large.

--
Roedy Green Canadian Mind Products
http://mindprod.com
Your old road is
Rapidly agin'.
Please get out of the new one
If you can't lend your hand
For the times they are a-changin'.
 
