Garbage collection problem

dontrango

Hi,

I have a garbage collection problem below. After line 15, why would the
object referenced by a be eligible for garbage collection whereas that
referenced by b is not?

Thanks for the help.

1 class TestA {
2 TestB b;
3 TestA ( ) { b = new TestB (this); }
4 }
5
6 class TestB {
7 TestA a;
8 TestB (TestA a) { this.a = a; }
9 }
10
11 class TestAll {
12 public static void main (String [ ] args) {
13 new TestAll.makeThings ( );
14 // ... code
15 }
16
17 void makeThings ( ) { testA test = new TestA ( ); }
18 }
 
John C. Bollinger

dontrango said:
I have a garbage collection problem below. After line 15, why would the
object referenced by a is eligible for garbage collection whereas that
referenced by b is not?

Thanks for the help.

If you want homework help then it is to your benefit to show at least
some evidence of trying to solve the problem yourself. Some reasoning
supporting at least a partial position on the question, something to
show that you have at least read the relevant part of your text or (even
better) the actual specification -- give us something to work with.
Neither you nor any of the rest of us is well served if you manage to
pass your class without actually knowing the material.


John Bollinger
(e-mail address removed)
 
dontrango

1 class TestA {
2 TestB b;
3 TestA ( ) { b = new TestB (this); }
4 }
5
6 class TestB {
7 TestA a;
8 TestB (TestA a) { this.a = a; }
9 }
10
11 class TestAll {
12 public static void main (String [ ] args) {
13 new TestAll.makeThings ( );
14 // ... code
15 }
16
17 void makeThings ( ) { testA test = new TestA ( ); }
18 }

Here is my train of thought:

Line 13 creates an instance of the TestAll class and calls its makeThings
method, without keeping a reference to that instance.
Line 17 instantiates a TestA object by calling its constructor. In the
process, the TestA constructor calls the TestB constructor, which sets
the TestB object's instance variable a to the TestA object.

This looks like an island of isolation, but not quite, since they are
instances of different classes.

So I disagree with the statement 'After line 15, the object referenced by a
is eligible for garbage collection whereas that referenced by b is not',
since both should be eligible (the current thread has no access to
either object).

What is your opinion on that?
 
xarax

dontrango said:
Hi,

I have a garbage collection problem below. After line 15, why would the
object referenced by a is eligible for garbage collection whereas that
referenced by b is not?

Thanks for the help.

1 class TestA {
2 TestB b;
3 TestA ( ) { b = new TestB (this); }
4 }
5
6 class TestB {
7 TestA a;
8 TestB (TestA a) { this.a = a; }
9 }
10
11 class TestAll {
12 public static void main (String [ ] args) {
13 new TestAll.makeThings ( );

You meant: new TestAll().makeThings();
14 // ... code
15 }
16
17 void makeThings ( ) { testA test = new TestA ( ); }

You meant: void makeThings() {TestA test = new TestA();}


1. If you must use line numbers, make them /*...*/ comments, so
that others can simply copy&paste your code.

2. Only post code that you know will compile. The above
code will not compile (even after removing the line numbers).

3. If your code has comments, be sure they are /*...*/ comments
instead of // comments, because sometimes the code wraps and
screws up the // comments.

4. The reasoning that "a" is eligible for garbage collection
and "b" is not: "b" is not eligible until all of its strong
references are nullified, and "a" still has a strong reference
to "b". "b" is ineligible for GC until after "a" has been
reclaimed. "a" and "b" are not both simultaneously eligible,
but rather incrementally eligible. GC won't know that "b"
is eligible until sometime after it determines that "a"
is eligible. However, depending on the JVM implementation, GC
may defer reclaiming "a" until "b" is also eligible
for reclaim. Eligibility for reclaim and actual reclaiming
are two completely different phases of GC.
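
Applying points 1 to 3 to the posted code might give something like the following sketch. The class and method names are unchanged from the question; the line numbers are kept as /*...*/ comments only because the thread refers to them, and the two "You meant" corrections are applied on lines 13 and 17.

```java
/*  1 */ class TestA {
/*  2 */     TestB b;
/*  3 */     TestA() { b = new TestB(this); }
/*  4 */ }
/*  5 */
/*  6 */ class TestB {
/*  7 */     TestA a;
/*  8 */     TestB(TestA a) { this.a = a; }
/*  9 */ }
/* 10 */
/* 11 */ class TestAll {
/* 12 */     public static void main(String[] args) {
/* 13 */         new TestAll().makeThings();
/* 14 */         /* ... code */
/* 15 */     }
/* 16 */
/* 17 */     void makeThings() { TestA test = new TestA(); }
/* 18 */ }
```

Because the numbers are comments, the listing can be pasted into a single source file and compiled as it stands.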

It may be easier to understand by examining the "reachability"
of the objects:

At line 13, the current thread can only reach "b" indirectly,
through the strong reference held in the local variable "test". When the
makeThings() method returns (line 15), its stackframe is popped
(and all local variables on that stackframe are nullified). Thus,
the thread loses its last strong reference to the single instance
of TestA (i.e., the "a" instance). At that time, GC can make a
determination that the TestA instance is eligible for reclaim.
The TestB instance (i.e., the "b" instance) is not yet eligible,
because GC hasn't actually reclaimed the TestA instance (and
nullified its reference fields). Only *after* the TestA instance
becomes eligible for reclaim will GC notice that its TestB instance
field was the last strong reference for the "b" object. The
determination of GC eligibility is an incremental process.

Note, however, that there are theoretical GC models that
can determine precisely when an instance is eligible for
reclaim without a reachability search (via reference tracking
algorithms). Such GC models still require an incremental
approach, but not a search of the heap.


Hope this helps.
 
Chris Uppal

xarax said:
Only *after* the TestA instance
becomes eligible for reclaim will GC notice that its TestB instance
field was the last strong reference for the "b" object. The
determination of GC eligibility is an incremental process.

Are you using the expression "eligible for reclaim" in a technical sense,
defined as part of some Java spec somewhere? I can't find anywhere that does
so, but could very easily have missed something.

If so then I can understand how the necessity for precision for describing the
lifetime of an object in the presence of finalisation could lead to terminology
that makes what you say exactly correct.

But if not, then I think it's wrong. Ignoring finalisation for a moment, in
what I would call "normal" terminology, an object becomes eligible for reclaim
once there is no longer any path leading from a root, such as a thread's stack
frame, to that object. (I'm also ignoring weak/soft/phantom references here).
Hence, at the moment where any one object becomes eligible, all other objects
that are only reachable via that object also become eligible -- by definition.

What may change is whether and how an actual GC algorithm can *detect* that
eligibility. Algorithms in the broad category including mark-and-sweep,
copying, etc will in one sense discover that both objects are unreachable at
the same time. In another sense they never discover that any object is
unreachable -- they are only interested in ones that are reachable, everything
else is just uninitialised RAM. GC algorithms which use some variant of
reference counting *do* have the incremental nature that you describe -- the GC
actively follows trains of unreachability: "aha this is unreachable. Good. So
that means *this* is unreachable too". And so on...
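
The difference between the two families can be sketched on a toy object graph. Everything below is hypothetical illustration: `ToyHeap` and its methods are made-up names, objects are just strings, and no real JVM stores its heap this way. The tracing view marks from the roots and treats everything unmarked as garbage in one go, while a naive count of incoming references shows why plain reference counting needs extra work for a cycle such as TestA/TestB.

```java
import java.util.*;

/** Toy heap: each object name maps to the names it references. */
class ToyHeap {
    private final Map<String, List<String>> edges = new HashMap<>();
    private final Set<String> roots = new HashSet<>();

    public void addObject(String name) { edges.putIfAbsent(name, new ArrayList<>()); }

    public void addReference(String from, String to) {
        addObject(from);
        addObject(to);
        edges.get(from).add(to);
    }

    public void addRoot(String name) { addObject(name); roots.add(name); }

    public void removeRoot(String name) { roots.remove(name); }

    /** Tracing view: mark everything reachable from the roots; the rest is garbage. */
    public Set<String> unreachable() {
        Set<String> marked = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String n = work.pop();
            if (marked.add(n)) {
                work.addAll(edges.get(n));
            }
        }
        Set<String> garbage = new TreeSet<>(edges.keySet());
        garbage.removeAll(marked);
        return garbage;
    }

    /** Naive reference-counting view: incoming references per object (a root counts as one). */
    public Map<String, Integer> referenceCounts() {
        Map<String, Integer> counts = new TreeMap<>();
        for (String n : edges.keySet()) {
            counts.put(n, roots.contains(n) ? 1 : 0);
        }
        for (List<String> targets : edges.values()) {
            for (String t : targets) {
                counts.merge(t, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        heap.addReference("testA", "testB"); // models the field TestA.b
        heap.addReference("testB", "testA"); // models the field TestB.a
        heap.addRoot("testA");               // models the local variable 'test'
        System.out.println(heap.unreachable());     // prints []
        heap.removeRoot("testA");            // models makeThings() returning
        System.out.println(heap.unreachable());     // prints [testA, testB]
        System.out.println(heap.referenceCounts()); // prints {testA=1, testB=1}
    }
}
```

In this model both objects drop out of the reachable set at the same moment the root goes away, yet each still has a non-zero count because of the other, which is the usual reason reference-counting collectors pair the counts with some form of cycle detection.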

When you factor finalisation into the picture then it gets more complicated,
and the terminology doesn't seem to be particularly well established. However,
one way to see it is that finalisation breaks the link between being
"unreachable from a root" and "eligible for reclaim". Another way of seeing it
(which matches the language of the JLS2 rather better) is that the system
automatically moves finalisable objects which are not otherwise reachable into
a state where they are only reachable by other objects in that state and by the
finalisation process (zero, one, or more threads). In either case (as I read
the rather opaque text) finalisation does introduce something rather like the
incremental process that you describe in that no object becomes eligible for
reclaim until it, and all chains of references to it, have been finalized
without making it reachable again.

I suspect, though, that you know all this perfectly well, and what we have here
is a difference in terminology rather than a different understanding of how
real GC algorithms work. Could you clarify please?

-- chris
 
John C. Bollinger

dontrango said:
so I disagree with the statement 'After line 15, the object referenced by a
is eligible for
garbage collection whereas that referenced by b is not?' since both should
be eligible ( the current thread has no access to
both objects ).

What is your opinion on that?

My opinion is that you have judged correctly, at least with regard to
the Java GC model. In Java an object is eligible for GC if it is not
reachable from a live thread via a chain of strong references. In the
example, every TestA instance is paired with a TestB instance such that
each holds a strong reference to the other. Therefore, if one is
strongly reachable then so is the other, and the two always have the
same eligibility for GC. With the precise code above, it would be
possible to break the relationship after construction of the TestA and
TestB instances by directly modifying their instance variables, but no
such thing is actually done. Both instances created during an
invocation of TestAll.makeThings() in fact become eligible for GC as
soon as makeThings() exits.
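
One way to poke at this on a real JVM is with weak references, which do not keep their referents alive. The sketch below is an assumption-laden probe rather than a proof: `IslandProbe`, `probeA`, `probeB` and `mutuallyLinked` are hypothetical names, TestA and TestB are nested here only to keep the file self-contained, and `System.gc()` is merely a request, so the printed values can differ between runs and JVMs.

```java
import java.lang.ref.WeakReference;

class IslandProbe {
    static class TestA {
        TestB b;
        TestA() { b = new TestB(this); }
    }

    static class TestB {
        TestA a;
        TestB(TestA a) { this.a = a; }
    }

    /* Weak references let us observe collection without preventing it. */
    static WeakReference<TestA> probeA;
    static WeakReference<TestB> probeB;

    static void makeThings() {
        TestA test = new TestA();
        probeA = new WeakReference<>(test);
        probeB = new WeakReference<>(test.b);
    } /* after this point only the two objects refer to each other */

    /** True while the pair still forms the cycle from the question. */
    static boolean mutuallyLinked(TestA a) {
        return a.b.a == a;
    }

    public static void main(String[] args) {
        makeThings();
        System.gc(); /* a hint only; the JVM may ignore it */
        System.out.println("TestA instance cleared: " + (probeA.get() == null));
        System.out.println("TestB instance cleared: " + (probeB.get() == null));
    }
}
```

If the collector does run, one would expect both lines to report the same thing, since neither object is strongly reachable once makeThings() has returned.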

There are some subtleties involved in determining eligibility for GC in
Java, the most notable being "hidden" local variables. Hidden local
variables arise because Java does not actually have nested local variable
scopes at the VM level -- only at the Java source level. To the VM all
local variables are treated equally. Therefore, a local reference
variable that goes out of scope in the Java sense sticks around until
the method in which it is declared terminates, no matter how long that
may be. Any object it refers to remains strongly reachable until that time.

That said, there are other GC models where the answer might be a bit
different, perhaps including some in which the problem is not a trick
question. For Java, however, the bottom line answer is "there is no
such reason."


John Bollinger
(e-mail address removed)
 
Lee Fesperman

John said:
There are some subtleties involved in determining eligibility for GC in
Java, the most notable being "hidden" local variables. Hidden local
variable arise because Java does not actually have nested local variable
scopes at the VM level -- only at the Java source level. To the VM all
local variables are treated equally. Therefore, a local reference
variable that goes out of scope in the Java sense sticks around until
the method in which it is declared terminates, no matter how long that
may be. Any object it refers to remains strongly reachable until that time.

That is not quite true. At the bytecode level, local variables are 'slots' in the stack
frame. Some compilers will reuse the slots allocated to nested local variables, thus
removing the reachability of references in the reused slots.

This only applies to direct interpretation of bytecodes. Runtime compilers are free to
apply additional optimizations, including reuse of 'unnested' local variables. I have
seen JVMs that perform even more esoteric optimizations.

John said:
That said, there are other GC models where the answer might be a bit
different, perhaps including some in which the problem is not a trick
question. For Java, however, the bottom line answer is "there is no
such reason."

For Java, it is best to assume that (as you said) a local variable remains reachable
until the method terminates. However, it is not guaranteed.
 
John C. Bollinger

Lee Fesperman wrote:

[...]
At the bytecode level, local variables are 'slots' in the stack
frame. Some compilers will reuse the slots allocated to nested local variables thus
removing the reachability of references in the reused slots.

This only applies to direct interpretation of bytecodes. Runtime compilers are free to
apply additional optimizations, including reuse of 'unnested' local variables. I have
seen JVMs that perform even more esoteric optimizations.
[...]

For Java, it is best to assume that (as you said) a local variable remains reachable
until the method terminates. However, it is not guaranteed.

Yes, I suppose I was a bit presumptive. I should have said that an
object referred to by a local variable of some method _may_ remain
reachable via that variable until the completion [abrupt or normal] of
that method's execution, regardless of whether the variable goes out of
scope in the Java source sense. The specs do not require that behavior,
and reasonable compilers might indeed emit bytecode that does not
exhibit it. The specs also do not forbid the behavior, and some
compiler / VM combinations certainly do exhibit it in at least some cases.

It is a separate question whether a VM might reuse a local variable slot
that would otherwise go unused for the remainder of the execution of
some method. Doing so requires some degree of program flow analysis in
order to determine in the first place that the slot is available. I'm
having trouble imagining a scenario where a compliant VM could
reasonably elect to do that outside the scope of JIT compilation, but
once you JIT a piece of bytecode a wide variety of optimizations are
possible.


John Bollinger
(e-mail address removed)
 
Dale King

John C. Bollinger said:
Yes, I suppose I was a bit presumptive. I should have said that an
object referred to by a local variable of some method _may_ remain
reachable via that variable until the completion [abrupt or normal] of
that method's execution, regardless of whether the variable goes out of
scope in the Java source sense. The specs do not require that behavior,
and reasonable compilers might indeed emit bytecode that does not
exhibit it. The specs also do not forbid the behavior, and some
compiler / VM combinations certainly do exhibit it in at least some cases.

It is a separate question whether a VM might reuse a local variable slot
that would otherwise go unused for the remainder of the execution of
some method. Doing so requires some degree of program flow analysis in
order to determine in the first place that the slot is available. I'm
having trouble imagining a scenario where a compliant VM could
reasonably elect to do that outside the scope of JIT compilation, but
once you JIT a piece of bytecode a wide variety of optimizations are
possible.

Chris Smith and I had a long discussion on this a while back and could not
come to an agreement. Consider the following case:

public void method()
{
    { Object o = new FooBar(); }
    someMethodThatExecutesALongTime();
}

I think we would agree that it would not be in error for the object to be
eligible for garbage collection during the call to the other method, since
it is no longer accessible and the variable itself has gone out of scope.

What if we remove that scope:

public void method()
{
    Object o = new FooBar();
    someMethodThatExecutesALongTime();
}

The question is whether the VM is allowed to make the object created
eligible for garbage collection before the called method returns. It is no
longer accessed in this method but it is still in scope.

I say that the VM should not be allowed to do this: although the
variable will not be accessed again, it is technically still visible after
the called method returns and would be accessible even though it isn't
actually accessed.

But the only way to actually see an effect from this is if the finalizer for
the object has a side effect. This type of pattern is used all the time in
C++. It makes less sense in Java, but I'm not sure we can just throw it out
without losing program correctness.

Unfortunately, the spec is not explicit on this point.

The problem is that these two methods generate exactly the same bytecode.

If you agree with me that in the second case it would be wrong for the
object to be garbage collected before the called method returns, then you
have to conclude that the VM is quite limited in what it can do to optimize
garbage collection and it doesn't matter what program flow analysis that you
do. The VM can only tell if the variable *will* be accessed again, but has
no way to know when it ceases to be "accessible".
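
For readers on newer JDKs, two facts bear on this question. Later editions of the JLS (section 12.6.1) say that optimizing transformations may make an object unreachable sooner than a naive reading of the source suggests, for example by treating a variable that will not be used again as if it had been set to null. Java 9 also added `java.lang.ref.Reference.reachabilityFence` for code that needs an object kept reachable up to a given point. The sketch below applies the fence to the second variant above. `FenceDemo` and its method names are hypothetical, a plain `Object` stands in for FooBar, and the long-running method is replaced by a GC request.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class FenceDemo {
    /** Stand-in for someMethodThatExecutesALongTime(); here it only requests a GC. */
    static void someMethodThatExecutesALongTime() {
        System.gc();
    }

    /** The unscoped variant, with a fence that keeps 'o' reachable during the call. */
    static boolean survivesWhileFenced() {
        Object o = new Object();
        WeakReference<Object> probe = new WeakReference<>(o);
        try {
            someMethodThatExecutesALongTime();
            /* 'o' is still strongly reachable here, so the weak reference is intact */
            return probe.get() != null;
        } finally {
            Reference.reachabilityFence(o); /* 'o' counts as in use up to this point */
        }
    }

    /** Same shape without the fence: whether 'o' outlives the call is up to the JVM. */
    static boolean mayOrMayNotSurvive() {
        Object o = new Object();
        WeakReference<Object> probe = new WeakReference<>(o);
        someMethodThatExecutesALongTime();
        return probe.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("with fence, survived: " + survivesWhileFenced());
        System.out.println("without fence, survived: " + mayOrMayNotSurvive());
    }
}
```

Only the fenced variant has a guaranteed answer. The unfenced result depends on whether the code is interpreted or JIT-compiled and on the particular JVM, which is the uncertainty this sub-thread is about.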
 
Lee Fesperman

Dale said:
If you agree with me that in the second case it would be wrong for the
object to be garbage collected before the called method returns, then you
have to conclude that the VM is quite limited in what it can do to optimize
garbage collection and it doesn't matter what program flow analysis that you
do. The VM can only tell if the variable *will* be accessed again, but has
no way to know when it ceases to be "accessible".

I'll vote for better optimization. As you say, your approach would limit VM
optimization. For instance, the runtime compiler could use a register instead of a
variable. On some machines, "accessibility" could force a register save/restore.

I fully expect VMs to do this optimization and much, much more. It would be best to heed
my caveat: "it [accessibility/reachability] is not guaranteed."
 
Stephen Kellett

Dale King <kingd@[at].invalid> said:
public void method()
{
    { Object o = new FooBar(); }
    someMethodThatExecutesALongTime();
}

[...]

The question is whether the VM is allowed to make the object created
eligible for garbage collection before the called method returns. It is no
longer accessed in this method but it is still in scope.

It is *not in scope*. Read the VM spec's definition of the local variable
table, which records the range of the method's bytecode over which each
variable is valid.

Each entry has a StartPC and a Length, i.e., the variable may not be valid
until XX bytes into the method and then is only valid for YY bytes. Params
are of course valid from 0 bytes into the method and for Length bytes from
there.

Given your method, 'this' would inhabit slot zero and object 'o' slot one
(the method is not static), and 'o' would be valid at most up to the offset
at the start of the line with the call to someMethod...

As such, the object is eligible for reclamation. However, most GCs will
take the simple and easy approach and wait for the method to end before
thinking about such things.

Stephen
 
Stephen Kellett

Stephen Kellett said:
It is *not in scope*. Read the VM spec for local variable table
definitions. This defines the locations within the method for which a
variable is valid.

Following myself up. I feel a qualification of my statement is required.

The local variable table is optional - it does not have to be present
for the Java class to load and execute. Its job is to help debuggers and
so forth. I came to this conclusion after writing a prototype Java
tracer that dumped the params and locals to stdout. Many of the Sun
supplied classes don't have a local variable table - even though they
clearly do have local variables and method parameters.

To recast my original statement:
If you have a local variable table for the method and the JVM chooses to
use that information the JVM can determine the variable is out of scope.
The JVM will most likely not use the local variable table information as
it is simpler to simply wait until the end of the method.

For the occasions when the local variable table is absent, you have to
assume the variable is in scope even though you know better. Hence the
JVM won't attempt garbage collection of such local variables until the
method terminates.

Cheers

Stephen
 
