references and garbage collection

jimjim

Hello,

I'm trying to make the transition from C++ to Java, and my curiosity
generated the following question.

The method below aims to return the first element stored in a Vector that
has a particular tag in the Hashtable.
Hashtable tuples = new Hashtable();

private synchronized Object get(String tag)
{
Vector v = (Vector) tuples.get(tag);
//snip
Object o = v.firstElement();
v.removeElementAt(0);
return o;
}

1. The reference to the first element of the Vector is stored in Object o
( = v.firstElement( )). It is a reference that is stored, as there is no
"new" statement, isn't it?
2. Later, what v.removeElementAt(0) does is copy all of the Vector's
elements starting from element[1] one place downwards, to element[0], and
delete element[lastElement]. Should I assume that at this moment memory
is automatically allocated in order to store the "actual" (not the reference)
element[0] of the Vector in Object o, before it is overwritten by the
System.arraycopy( ) that the Vector uses?
3. In addition to this convenience, should I also assume that even though
the Object o goes out of scope, meaning that the function's stack is
released and thus the object is destroyed, it is actually not destroyed, because
after returning from get( ) the object's reference is stored somewhere else in
the program?

Please do correct me if I am wrong, and most importantly share your insight
and knowledge on how things actually work. Responsible answers please. Thank
you in advance.
jimjim
 
jimjim

P.S.: along with any answers, pointers to any documents I should read on the
issue are welcome (opt for short and concrete ones though) :)
 
Anton Spaans

Hi JimJim

Java is very similar to C++ in many respects, but quite different in others.

First of all, something that may answer most of the questions in your post:

Java does not store objects on the stack, like C++ does. Java stores only
references (pointers) to them on the stack. So, when the stack unwinds,
objects are not destroyed (necessarily), only the references are removed.

An object is always stored on the heap and is destroyed only when there are
no references (pointers) to the object left at all. The garbage-collector
figures this out and destroys the object when it sees fit.

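Since objects themselves are never put on the stack, you can never make the
mistake that is possible in C++ of returning a pointer to a local object:

Foo* getFoo()
{
    Foo foo(5);   // local object, destroyed when getFoo() returns
    ...
    return &foo;  // dangling pointer
}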

In short, all variables in Java are either primitives or
object references (pointers), never objects themselves.
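
As a minimal sketch of the Java side of that comparison (the class and method names here are made up for illustration), a method can create an object, return the reference, and the caller can keep using the object after the method's stack frame is gone:

```java
class ReturnLocalDemo {
    // Creates an object inside the method and hands a reference back to the caller.
    static StringBuilder makeGreeting() {
        StringBuilder sb = new StringBuilder("hello"); // the object is allocated on the heap
        return sb; // only the local variable 'sb' disappears when the method returns
    }

    public static void main(String[] args) {
        StringBuilder greeting = makeGreeting();
        greeting.append(", world");   // the object is still alive and usable
        System.out.println(greeting); // prints "hello, world"
    }
}
```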

So, the answers to your questions:
1) Yep, no constructor is called whatsoever. An existing reference is
assigned to the return value.

2) removeElementAt(x) only removes the element at index x from the Vector.
(It returns nothing; the List-style remove(x) returns the removed
reference.) No object is copied or destroyed. Collections (HashMaps,
Vectors, TreeSets, etc.) never hold objects themselves (see the text
above); they only hold references to them.

3) 'o' is a reference to an Object. The method 'private synchronized
Object get(String tag)' returns this reference. No objects are destroyed.
Only when no reference to the returned object is left anywhere in your
application will it be destroyed at some point, and you have no control
over when.
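
A small sketch of answers 2) and 3), assuming a Java 5+ compiler for the generic Vector (the class name and the values stored are hypothetical): the reference obtained from firstElement() still points at the very same object after the element has been removed from the Vector.

```java
import java.util.Vector;

class RemoveKeepsObjectDemo {
    // Returns true if the object fetched before removal is untouched by the removal.
    static boolean sameObjectAfterRemove() {
        Vector<Object> v = new Vector<>();
        Object original = new Object();
        v.add(original);
        v.add("second");

        Object o = v.firstElement(); // copies a reference, not the object
        v.removeElementAt(0);        // drops the Vector's reference only

        // 'o' still refers to the identical object; the Vector no longer does.
        return o == original && v.size() == 1 && !v.contains(original);
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAfterRemove()); // prints "true"
    }
}
```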

-- Anton.

 
Chris Smith

jimjim said:
I'm trying to make the transition from C++ to Java, and my curiosity
generated the following question.

The method below aims to return the first element stored in a Vector that
has a particular tag in the Hashtable.
Hashtable tuples = new Hashtable();

private synchronized Object get(String tag)
{
Vector v = (Vector) tuples.get(tag);
//snip
Object o = v.firstElement();
v.removeElementAt(0);
return o;
}

1. The reference to the first element of the Vector is stored in Object o
( = v.firstElement( )). It is a reference that is stored, as there is no
"new" statement, isn't it?

Yes. All non-primitive variables are references to objects. So when
you use '=' to assign a value to a reference variable, you are creating
another reference to the same object.
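
To illustrate (a sketch with made-up names, not code from the original post), a change made through one variable is visible through the other, because both hold a reference to the same object:

```java
class AliasDemo {
    static String run() {
        StringBuilder a = new StringBuilder("x");
        StringBuilder b = a; // copies the reference; no new object is created
        b.append("y");       // modifies the single shared object
        b = null;            // changes only the variable b, not the object
        return a.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "xy"
    }
}
```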

jimjim said:
2. Later, what v.removeElementAt(0) does is copy all of the Vector's
elements starting from element[1] one place downwards, to element[0], and
delete element[lastElement].

Maybe internally, it does it that way. You shouldn't care. It removes
the first element of the Vector, and that's all you need to know.

jimjim said:
Should I assume that at this moment memory
is automatically allocated in order to store the "actual" (not the reference)
element[0] of the Vector in Object o, before it is overwritten by the
System.arraycopy( ) that the Vector uses?

No. You should realize that objects never appear to move, and that a
variable never contains an object, only a reference to one. If you're
really going to poke around in the implementation of Vector, it keeps an
array of references to objects. It's that array of references that is
being changed. All the objects themselves stay where they were.
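
What such an implementation might look like can be sketched as follows. This is a deliberately simplified, hypothetical class (TinyVector and its field names are assumptions for illustration, not the JDK source): removal shifts references inside an Object[] with System.arraycopy and never touches the objects.

```java
class TinyVector {
    private Object[] elementData = new Object[4]; // array of references, not objects
    private int elementCount = 0;

    void add(Object o) {
        if (elementCount == elementData.length) {
            // Grow the reference array; the referenced objects are not copied.
            Object[] bigger = new Object[elementData.length * 2];
            System.arraycopy(elementData, 0, bigger, 0, elementCount);
            elementData = bigger;
        }
        elementData[elementCount++] = o;
    }

    Object get(int index) {
        if (index < 0 || index >= elementCount) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
        return elementData[index];
    }

    void removeElementAt(int index) {
        if (index < 0 || index >= elementCount) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
        int toMove = elementCount - index - 1;
        if (toMove > 0) {
            // Shift the following references one slot down.
            System.arraycopy(elementData, index + 1, elementData, index, toMove);
        }
        elementData[--elementCount] = null; // clear the now-unused last slot
    }

    int size() {
        return elementCount;
    }
}
```

Any reference the caller copied out before calling removeElementAt stays valid, since only slots in the array are overwritten.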

jimjim said:
3. In addition to this convenience, should I also assume that even though
the Object o goes out of scope, meaning that the function's stack is
released and thus the object is destroyed, it is actually not destroyed, because
after returning from get( ) the object's reference is stored somewhere else in
the program?

The variable 'o' goes out of scope and is lost. That variable is a
reference to the object. The object itself is not lost, because part of
the return statement makes the value (a reference to the object)
available to the calling stack frame. As long as the calling stack
frame keeps a reference to the object, the object itself is going to
stay exactly where it was.

jimjim said:
Please do correct me if I am wrong, and most importantly share your insight
and knowledge on how things actually work.

The bulk of your confusion seems to arise regarding the difference
between a reference and an object. All variables (except primitives)
are references. You can never express an object in Java; only
references to it. So you're imagining objects being copied around and
created and deleted when no such thing is happening; instead, references
are being assigned, but the object sticks around.

In fact, garbage collection means that *except* for concerns about
memory usage, you can assume that once created, an object will exist
forever, and be in the same piece of memory forever. You are guaranteed
that you never have to worry about an object being deleted when you
still need it. That's part of the whole point of garbage collection in
Java, and an essential aspect of the Java security model. So unless
you're troubleshooting memory issues, use that as your mental model of
how things work: you create an object, and it stays where you put it.
Forget about garbage collection.

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Chris Smith

jimjim said:
P.S.: along with any answers, pointers to any documents I should read on the
issue are welcome (opt for short and concrete ones though) :)

Jon's article at http://www.yoda.arachsys.com/java/passing.html isn't
specifically about this topic, but covers some of the same basic
concepts. You might find it useful.

 
John C. Bollinger

Anton said:
An object is always stored on the heap and is destroyed only when there are
no references (pointers) to the object left at all. The garbage-collector
figures this out and destroys the object when it sees fit.

Almost correct. An object becomes eligible for garbage collection when
it is no longer "reachable" from any live thread. There may still be
references to it floating around. For instance, the object might be a
node in a circular data structure. If no external objects have
references to any of the nodes and no thread is executing a method of
any of the nodes then the whole data structure is eligible for garbage
collection, even though there are existing references to each object.


John Bollinger
 
John C. Bollinger

Chris said:
In fact, garbage collection means that *except* for concerns about
memory usage, you can assume that once created, an object will exist
forever, and be in the same piece of memory forever. You are guaranteed
that you never have to worry about an object being deleted when you
still need it. That's part of the whole point of garbage collection in
Java, and an essential aspect of the Java security model. So unless
you're troubleshooting memory issues, use that as your mental model of
how things work: you create an object, and it stays where you put it.
Forget about garbage collection.

That's quite correct, but I had to read it a second time before I
recognized it. My sticking point was on the part about the object
existing forever, which it certainly might not do, but the point was
that an object is guaranteed to continue to exist as long as anyone has
any reason to care about it. Thus the programmer usually might as well
adopt the mindset that the object exists forever, because he has no good
way to tell the difference.

Anyway, just in case anyone else suffered from the same misreading that
I initially did.


John Bollinger
 
Michael Borgwardt

jimjim said:
Hello,

I'm trying to make the transition from C++ to Java, and my curiosity
generated the following question.

The method below aims to return the first element stored in a Vector that
has a particular tag in the Hashtable.
Hashtable tuples = new Hashtable();

private synchronized Object get(String tag)
{
Vector v = (Vector) tuples.get(tag);
//snip
Object o = v.firstElement();
v.removeElementAt(0);
return o;
}

Note that the entire method can be written in one much simpler line:

return ((Vector) tuples.get(tag)).remove(0);

Also, the class Vector is obsolescent and you should use ArrayList instead,
which can do everything Vector can, is not synchronized by default (but can
be wrapped so that it is) and is thus faster, and has a less cluttered
interface. Vector has many redundant methods because it was retrofitted to
the List interface, which is probably the reason why you overlooked the
remove(int) method.
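
A sketch of how that suggestion could look with ArrayList and generics, assuming Java 8 or later. The TupleStore class, the put method, and the choice to return null for a missing or empty tag are all assumptions for illustration; the original post elides that part with "//snip".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TupleStore {
    private final Map<String, List<Object>> tuples = new HashMap<>();

    // Hypothetical helper: appends a value under the given tag.
    synchronized void put(String tag, Object value) {
        tuples.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
    }

    // Removes and returns the first element stored under the tag,
    // or null if there is none (an assumption, see the note above).
    synchronized Object get(String tag) {
        List<Object> list = tuples.get(tag);
        if (list == null || list.isEmpty()) {
            return null;
        }
        return list.remove(0); // remove(int) returns the removed reference
    }
}
```

The methods stay synchronized because ArrayList and HashMap, unlike Vector and Hashtable, do no locking of their own.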
 
Steve Horsley

jimjim said:
Hello,

I'm trying to make the transition from C++ to Java, and my curiosity
generated the following question.

The method below aims to return the first element stored in a Vector that
has a particular tag in the Hashtable.
Hashtable tuples = new Hashtable();

private synchronized Object get(String tag)
{
Vector v = (Vector) tuples.get(tag);
//snip
Object o = v.firstElement();
v.removeElementAt(0);
return o;
}

1. The reference to the first element of the Vector is stored in Object o
( = v.firstElement( )). It is a reference that is stored, as there is no
"new" statement, isn't it?

That's correct.

jimjim said:
2. Later, what v.removeElementAt(0) does is copy all of the Vector's
elements starting from element[1] one place downwards, to element[0], and
delete element[lastElement]. Should I assume that at this moment memory
is automatically allocated in order to store the "actual" (not the reference)
element[0] of the Vector in Object o, before it is overwritten by the
System.arraycopy( ) that the Vector uses?

Wrong. You seem to be under the impression that a Vector can contain
Objects (or descendants). This is not the case. A Vector can only ever
contain _references_ to Objects. A Vector probably contains an Object[],
which is an array of Object references. The nearest C++ construct is an
array of pointers. Remember, Vector.add(Object) does NOT contain a
new(), and thus does not allocate memory for an Object (though it might
have to resize the array to fit a new reference in). Similarly,
Vector.remove(int) does not deallocate memory, it (at most) copies a few
references around an array.

You can NEVER have or pass raw objects in java. new() returns a
reference, and any time you do something that looks like you are dealing
with an object you are in fact dealing with a reference. Apply this
equivalence:

java:
Object o = new Object();
String s = o.toString();
String[] sa = new String[1];
sa[0] = s;

C++:
Object *o = new Object();
String *s = o->toString();
String **sa = new String*[1];
sa[0] = s;

ALL java object and array allocation is heap based.

The use and dereferencing of the pointer (called a reference) is implicit
in java. Actually, the implementation need not be the same as a C++
pointer; a reference might be a handle or index number rather than a memory
address, but the language does not allow you any way to discover a difference.

Java has no equivalent of this C++ code to allocate compound structures
on the stack:
Object o;
Object oa[5];
The nearest java declarations (Object o; and Object[] oa = new Object[5];)
only declare a reference to an Object, and a reference to a heap-allocated
array of 5 references to Objects.
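
A quick way to see this for yourself (a sketch with a made-up class name): allocating an array of five Object references creates no Object instances at all, so every slot starts out as null.

```java
class DeclarationDemo {
    // Counts the null slots in a freshly allocated array of references.
    static int countNulls() {
        Object[] oa = new Object[5]; // one array object holding 5 references
        int nulls = 0;
        for (Object o : oa) {
            if (o == null) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        System.out.println(countNulls()); // prints "5"
    }
}
```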

jimjim said:
3. In addition to this convenience, should I also assume that even though
the Object o goes out of scope, meaning that the function's stack is
released and thus the object is destroyed, it is actually not destroyed, because
after returning from get( ) the object's reference is stored somewhere else in
the program?

Nope.
Again, the only thing on the stack is a _reference_ to an Object. That
reference is deallocated as the method returns.

Again, being pedantic, the JVM is free to deallocate the memory for the
reference whenever it wants, but I guess most JVMs put automatic
variables on the stack and deallocate as part of the return process. A
JVM could conceivably delay the deallocation. It could deallocate before
the method returns (but after the last reference use of course). It may
re-use the space for another variable if it can be sure of no overlap in
usage.

Also, as others pointed out, it can garbage collect the Object at any
time after it becomes unreachable. Although deallocation of the
reference is probably a good clue, the GC is free to use any means to
determine reachability, so it could be GC'd before the method actually
returns, or 6 months later.
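
One way to observe this from inside a program is java.lang.ref.WeakReference, which refers to an object without keeping it reachable. The sketch below (class name is hypothetical) relies only on what is guaranteed: while a strong reference is still in use, the object cannot be collected. What happens after the strong reference is cleared is up to the JVM, so no output is promised for that part.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // While 'strong' is still used after the GC request, the object must survive.
    static boolean aliveWhileReachable() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a hint; the JVM may or may not run a collection
        return weak.get() == strong; // 'strong' is used here, so it was reachable above
    }

    public static void main(String[] args) {
        System.out.println(aliveWhileReachable()); // prints "true"

        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        // May print null or an object; the collector decides when, if ever.
        System.out.println(weak.get());
    }
}
```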

Steve
 
jimjim

Hello,

John C. Bollinger said:
For instance, the object might be a node in a circular data structure.
If no external objects have references to any of the nodes and no
thread is executing a method of any of the nodes then the whole data
structure is eligible for garbage collection, even though there are existing
references to each object.

I'm sorry, but I can't understand. You first say that no external objects
have references to any of the nodes, and then "even though there are existing
references to each object" (by "object", do you mean each node of the
circular data structure?).

An example would be great.

Thank you in advance.
jimjim
 
Chris Smith

jimjim said:
I'm sorry, but I can't understand. You first say that no external objects
have references to any of the nodes, and then "even though there are existing
references to each object" (by "object", do you mean each node of the
circular data structure?).

It's kind of a picky point, but it has historical significance. Here's
an example:

class Node
{
    Node next;
}

class TestApp
{
    // Builds a three-node cycle. The only variables that refer to the
    // nodes are local to this method and vanish when it returns.
    private static void buildObj()
    {
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();

        a.next = b;
        b.next = c;
        c.next = a;
    }

    private static void everythingElse()
    {
        // Some piece of complex code would go here
    }

    public static void main(String[] args)
    {
        buildObj();
        everythingElse();
    }
}

The question is whether the three objects created in buildObj() can be
garbage collected during the execution of everythingElse(). The answer
is that they can.

A straightforward application of the "when there are no more references
left" rule would tell you that they can't. Why? Because there is a
reference to 'a' (in c.next), a reference to 'b' (in a.next), and a
reference to 'c' (in b.next). In reality, though, they can be collected,
because even though these objects reference each other, none of them
can be reached from the running application once buildObj() has returned.
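
The difference between "has references" and "is reachable" can be sketched with a toy mark phase. This is not how any particular JVM is implemented; ToyReachability and reachableFrom are hypothetical names, and the "roots" stand in for the local variables and static fields a real collector would start from.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

class ToyReachability {
    static class Node {
        Node next;
    }

    // Follows 'next' links from the given roots and returns every node visited.
    static Set<Node> reachableFrom(Node... roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>();
        for (Node root : roots) {
            if (root != null) {
                pending.push(root);
            }
        }
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            // add() returns false for nodes already marked, which stops cycles.
            if (marked.add(n) && n.next != null) {
                pending.push(n.next);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();
        a.next = b;
        b.next = c;
        c.next = a; // every node in the cycle is referenced by another node

        Node unrelated = new Node();
        System.out.println(reachableFrom(a).size());         // prints "3"
        System.out.println(reachableFrom(unrelated).size()); // prints "1"
    }
}
```

With 'unrelated' as the only root, none of the three cycle nodes is marked, so a tracing collector would treat them as garbage even though each one is still referenced by its neighbour.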

This may seem somewhat obvious to someone approaching garbage collection
in modern times. However, to someone approaching garbage collection
from some early implementations, it could be surprising. Several early
garbage-collected languages actually couldn't handle this, and so
programmers would be in the habit of manually breaking up any circular
data structures before letting the garbage collector at them.

 
GaryM

<SNIP>

Chris, I find the examples and explanations you provide to this group
very informative and easy to understand. Just wanted to say thanks for
a really nice job that helps neophytes like myself advance all the quicker.

Gary
 
jimjim

Is this from the point of view of what a programmer had to do in the
old days in order not to leak memory through such a circular structure?
(The only way of breaking up the circular data structure in your example is
by deleting each and every one of the objects in it, isn't it?)

Nowadays, should I assume that if I retain a reference to at least one Node
object, the structure won't be eligible for garbage collection? On the other
hand, if I don't retain at least one reference, the collector would be right
to collect the structure, as I can't access it in any way, right?

Thanks to all for the input.
 
Chris Smith

jimjim said:
Is this from the point of view of what a programmer had to do in the
old days in order not to leak memory through such a circular structure?
(The only way of breaking up the circular data structure in your example is
by deleting each and every one of the objects in it, isn't it?)

Yes, it would be how a programmer would have prevented a memory leak in
an early garbage collection implementation (by which I mean before Java
was ever imagined, some time in the 70s and 80s).

No, breaking up the structure would not have required deleting anything;
it would have required changing the next references on at least one
node... and probably setting it to null. The reclaiming of memory is
left to the garbage collector.

jimjim said:
Nowadays, should I assume that if I retain a reference to at least one Node
object, the structure won't be eligible for garbage collection? On the other
hand, if I don't retain at least one reference, the collector would be right
to collect the structure, as I can't access it in any way, right?

Right.

 
