Ross said:
In the same way that when a reference goes out of scope, the reference
you set to null is removed from that object's reference count when you
make the assignment (or just after I think). Periodically, the JVM runs
through all its objects (for example), and every one it finds with a
zero reference count (indicating it's not reachable) is eligible for GC.
Note this doesn't necessarily mean they *will* be collected, especially
when we factor in active threads and whatnot, but you get the idea.
I'm not aware of any JVMs that use reference counting for GC. The
problem with reference counting is that objects that point to each
other in a cycle will always be considered live, even though there is no
way to reach them from 'the outside', so to speak. I'll explain a bit
further about the approach normally taken in JVMs.
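
To make the cycle problem concrete, here is a minimal sketch of a toy reference-counting scheme. The Node class, its refCount field and the setNext helper are all hypothetical and only simulate what such a collector would track; no real JVM exposes anything like this.

```java
public class RefCountCycleDemo {
    // Hypothetical toy object with a manually maintained reference count.
    static final class Node {
        int refCount = 0;
        Node next;
    }

    // Points from.next at 'to', adjusting the counts the way a
    // reference-counting collector would.
    static void setNext(Node from, Node to) {
        if (from.next != null) {
            from.next.refCount--;
        }
        from.next = to;
        if (to != null) {
            to.refCount++;
        }
    }

    // Builds a two-object cycle, drops the outside references and
    // returns the remaining counts of both objects.
    static int[] run() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++; // the local variable 'a' refers to the first node
        b.refCount++; // the local variable 'b' refers to the second node

        setNext(a, b); // a -> b
        setNext(b, a); // b -> a, closing the cycle

        // Simulate both local variables going out of scope.
        a.refCount--;
        b.refCount--;

        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

Both counts end up at 1 rather than 0, so a collector relying on counts alone would never reclaim the pair, even though nothing outside the cycle refers to it.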
The basic idea in most JVMs is to keep track of which objects at time A
could possibly be accessed by the program, with the help of something
called reachability analysis. If we consider a simplified JVM with no
static variables and no JNI, and only one thread running, the program at
a point A looks something like this on the stack:
-------------
| main method
-------------
| method a
-------------
| method b
-------------
| method c <- Currently executing method c.
-------------
Where the method call chain is main -> a -> b -> c. Suppose we suspend
the program at this point, and wish to compute the set of all currently
reachable objects. How do we do this? If we do some preparation, it is
not too hard. Before we run any method, we calculate a garbage
collection map for the method. This will simply be a data structure that
contains information about which local variables in the method can
contain references to objects. For instance, it might look like this (*):
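public class GCMap
{
    // Offsets (relative to the frame pointer) of the local variables
    // that can contain references to objects.
    int[] framePointerOffsets;
}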
where framePointerOffsets contains the addresses (relative to the
frame pointer) of local variables that contain references.
So, now all we need to do is walk the stack backwards and, with the
help of the garbage collection maps, add all the references currently
contained in the local variables of each method to a set, which we will
call the root set. The code could look like this:
Set rootSet = empty
for stackframe in stack
    for offset in framePointerOffsets in GCMap of method at stackframe
        rootSet = rootSet U { value at (framePointer + offset) }
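
For those who prefer something runnable, the pseudocode above can be sketched in Java roughly as follows. This is only a simulation under stated assumptions: the stack is modelled as a long[] of slots, references are plain long 'addresses' with 0 standing in for null, and the StackFrame class and the demo values are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RootSetDemo {
    // Same idea as the GCMap above: which slots of a frame hold references.
    static final class GCMap {
        final int[] framePointerOffsets;
        GCMap(int... framePointerOffsets) {
            this.framePointerOffsets = framePointerOffsets;
        }
    }

    // Hypothetical frame record: where the frame starts on the simulated
    // stack, plus the GC map of the method that owns it.
    static final class StackFrame {
        final int framePointer;
        final GCMap gcMap;
        StackFrame(int framePointer, GCMap gcMap) {
            this.framePointer = framePointer;
            this.gcMap = gcMap;
        }
    }

    // Walks every frame and collects the references found in the slots
    // that the frame's GC map marks as reference-holding.
    static Set<Long> collectRootSet(long[] stack, List<StackFrame> frames) {
        Set<Long> rootSet = new LinkedHashSet<>();
        for (StackFrame frame : frames) {
            for (int offset : frame.gcMap.framePointerOffsets) {
                long reference = stack[frame.framePointer + offset];
                if (reference != 0) { // 0 models a null reference
                    rootSet.add(reference);
                }
            }
        }
        return rootSet;
    }

    static Set<Long> demo() {
        // Slots 0..2 belong to the first frame, slots 3..5 to the second.
        // 42 and 7 are plain ints that the GC maps deliberately skip.
        long[] stack = { 0x10, 42, 0x20, 0, 7, 0x10 };
        List<StackFrame> frames = Arrays.asList(
                new StackFrame(0, new GCMap(0, 2)),
                new StackFrame(3, new GCMap(0, 2)));
        return collectRootSet(stack, frames);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo the root set contains only the two distinct non-null references (0x10 and 0x20); the integer slots are never looked at because the GC maps do not list them.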
So now we have our root set. The last thing we need to do is follow all
of the references in the root set to find other live references and from
those references go to yet others and so on. This can be done with a
standard graph search. If some object is not reachable from the root set
of references, it cannot possibly be accessed by the program anymore,
and as such is garbage which can be collected.
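
The graph search itself could look something like the sketch below. It assumes the heap is given as a map from each object (named by a string here, purely for illustration) to the objects it references; a real collector would of course work on raw memory, not on a Map.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReachabilityDemo {
    // Returns every object reachable from the root set by following
    // references, using an explicit worklist (depth-first search).
    static Set<String> reachable(Set<String> rootSet,
                                 Map<String, List<String>> heap) {
        Set<String> live = new HashSet<>();
        Deque<String> worklist = new ArrayDeque<>(rootSet);
        while (!worklist.isEmpty()) {
            String object = worklist.pop();
            if (!live.add(object)) {
                continue; // already visited
            }
            List<String> targets =
                    heap.getOrDefault(object, Collections.<String>emptyList());
            for (String target : targets) {
                if (!live.contains(target)) {
                    worklist.push(target);
                }
            }
        }
        return live;
    }

    // Builds a small made-up heap and returns the objects that are garbage.
    static Set<String> garbage() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("A", Arrays.asList("B"));
        heap.put("B", Arrays.asList("C"));
        heap.put("C", Collections.<String>emptyList());
        heap.put("X", Arrays.asList("Y")); // X and Y only point at each other
        heap.put("Y", Arrays.asList("X"));

        Set<String> garbage = new TreeSet<>(heap.keySet());
        garbage.removeAll(reachable(Collections.singleton("A"), heap));
        return garbage;
    }

    public static void main(String[] args) {
        System.out.println("garbage: " + garbage());
    }
}
```

With "A" as the only root, X and Y are reported as garbage even though they reference each other, which is exactly the case that reference counting cannot handle.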
This is a simplified explanation of a very simple reachability analysis;
in reality things are more complicated (and there are various
alternative procedures for achieving the same goals). This example
should, however, also explain why references may need to be 'nulled' at
times. When gathering references into the root set, we only consider
the state of the program at point A, and ignore anything
that may happen in the future. For instance, we may have a reference in
a local variable at point A which is never read (never used) again in
the future. But our analyzer does not know this! Hence we (as
programmers) may need to manually 'null' references in certain methods,
when we know that a reference points to an object we will not use anymore.
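
As an illustration of such manual 'nulling', here is a small sketch. The method names and the array size are made up, and whether the assignment makes any difference in practice depends on the JVM, since some implementations work out for themselves which local variables are still in use.

```java
public class NullingDemo {
    static int process() {
        int[] big = new int[1_000_000]; // large temporary working buffer
        big[0] = 41;
        int summary = big[0] + 1;

        // We know the buffer is not needed from here on. Clearing the
        // local variable means a root-set scan taken during the call
        // below no longer finds the array through this stack frame.
        big = null;

        return longRunningPhase(summary);
    }

    // Stand-in for work that runs for a long time while process() is
    // still on the stack.
    static int longRunningPhase(int summary) {
        return summary;
    }

    public static void main(String[] args) {
        System.out.println(process());
    }
}
```

Without the assignment, the simple analyzer described above would treat the array as live for the whole duration of longRunningPhase, because the local variable in process() still holds the reference.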
(*) We assume all local variables are on the stack.