byte arrays fun...

ralph

I am using the java.util.zip data compression package. My problem is that I
need to do it repeatedly, as in the code below, and am working on very
large strings. What is the proper way to "nullify" or release the
memory that can be freed after using the byte arrays? Does setting
them to null actually cause that memory (heap? stack?) to be freed? I
am running out of memory and think this is where it is happening.
Thanks so much for any help...

Inflater i = new Inflater ();
i.setInput ( bufferWork );
byte[] decompressedBuffer = new byte[ maxCountersSize + increaseSize ];
long decompressedReturn = i.inflate ( decompressedBuffer );
if ( i.getTotalOut () > maxCountersSize ) {
    maxCountersSize = i.getTotalOut ();
}
String uncompressedString = new String ( decompressedBuffer, i.getTotalOut () );
i.end ();


then to clean up

decompressedBuffer = null;
 
Thomas G. Marshall

ralph said:
I am using the java.util.zip data compression package. My problem is that I
need to do it repeatedly, as in the code below, and am working on very
large strings. What is the proper way to "nullify" or release the
memory that can be freed after using the byte arrays? Does setting
them to null actually cause that memory (heap? stack?) to be freed? I
am running out of memory and think this is where it is happening.
Thanks so much for any help...

Inflater i = new Inflater ();
i.setInput ( bufferWork );
byte[] decompressedBuffer = new byte[ maxCountersSize +
increaseSize ];
long decompressedReturn = i.inflate ( decompressedBuffer );
if ( i.getTotalOut () > maxCountersSize ) {
maxCountersSize = i.getTotalOut ();
}
String uncompressedString = new String ( decompressedBuffer,
i.getTotalOut () );
i.end ();


then to clean up

decompressedBuffer = null;


Yep. You could urge it to move along by explicitly calling System.gc(), but
the JVM makes no guarantees about precisely /when/ the gc will run.

Java uses a "mark and sweep" style of garbage collection. That means
that the moment you null out the last reference to the array, there is no
longer any way for the array to be marked. If it happened to be an array of
objects, and each object were referenced only by the array, then they too
would be orphaned with no way of marking them. All unmarked objects get
reclaimed by the gc.
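
To make the reachability point concrete, here is a minimal sketch that was not part of the original posts. It assumes a java.lang.ref.WeakReference is an acceptable way to watch an object without keeping it alive; the class and method names are made up for illustration. The second line it prints is deliberately left unstated, because the collector is free to act on the hint or ignore it.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    /** Returns true: the array is still strongly referenced when we look. */
    static boolean aliveWhileReferenced() {
        byte[] buffer = new byte[1024];
        // A weak reference observes the array without keeping it reachable.
        WeakReference<byte[]> watcher = new WeakReference<>(buffer);
        boolean alive = watcher.get() != null;
        // buffer is used after the check, so it was reachable at the check.
        return alive && buffer.length == 1024;
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[1024 * 1024];
        WeakReference<byte[]> watcher = new WeakReference<>(buffer);
        System.out.println("before nulling: " + (watcher.get() != null));

        buffer = null; // drop the only strong reference
        System.gc();   // a hint only; the JVM may or may not collect now

        // May print true or false depending on the JVM and its settings.
        System.out.println("after gc hint:  " + (watcher.get() != null));
    }
}
```

Once the strong reference is gone, nothing can mark the array on the next collection cycle; when that cycle happens is up to the JVM.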

/However/, I would like to point out a curious thing I discovered in the api
help docs at
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/System.html#gc()

The two sentences of the 2nd paragraph for gc say this:

1. Calling the gc method suggests that the Java Virtual
Machine expend effort toward recycling unused objects
in order to make the memory they currently occupy
available for quick reuse.

2. When control returns from the method call, the Java
Virtual Machine has made a best effort to reclaim space
from all discarded objects.

#1 says that only a /suggestion/ is made to the jvm to garbage collect,
which is how I've understood it.

Does #2 say that when control returns from the method call the cleanup has
finished (??), or do they not mean that when they say "has made a best
effort to reclaim"?
 
Adam Maass

Thomas G. Marshall said:
/However/, I would like to point out a curious thing I discovered in the api
help docs at
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/System.html#gc()

The two sentences of the 2nd paragraph for gc say this:

1. Calling the gc method suggests that the Java Virtual
Machine expend effort toward recycling unused objects
in order to make the memory they currently occupy
available for quick reuse.

2. When control returns from the method call, the Java
Virtual Machine has made a best effort to reclaim space
from all discarded objects.

#1 says that only a /suggestion/ is made to the jvm to garbage collect,
which is how I've understood it.

Does #2 say that when control returns from the method call the cleanup has
finished (??), or do they not mean that when they say "has made a best
effort to reclaim"?

It says that a "best effort" has been made to clean up space. What, exactly,
that means is left undefined. In a generational garbage collector, maybe
System.gc() cleans up the first generation only. Maybe it does nothing at
all.

-- Adam Maass
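
A small sketch of what "best effort" leaves open, added for illustration and not taken from the thread. It assumes that totalMemory() minus freeMemory() is a good enough approximation of heap use; GcHintDemo and usedBytes are invented names. The code prints the figures before and after the hint but asserts nothing about them, since the documentation does not promise that the second number is smaller.

```java
class GcHintDemo {
    /** Approximate number of heap bytes currently in use. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        byte[] scratch = new byte[16 * 1024 * 1024];
        scratch[0] = 1; // touch the array so the allocation is really used
        long before = usedBytes();

        scratch = null; // the array is now unreachable
        // The Javadoc describes System.gc() as effectively equivalent to
        // Runtime.getRuntime().gc().
        System.gc();
        long after = usedBytes();

        System.out.println("used before hint: " + before);
        System.out.println("used after hint:  " + after);
        // Deliberately no check that after < before: nothing guarantees it.
    }
}
```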
 
Thomas G. Marshall

Adam Maass said:
It says that a "best effort" has been made to clean up space. What,
exactly, that means is left undefined. In a generational garbage
collector, maybe System.gc() cleans up the first generation only.
Maybe it does nothing at all.

That would earn them a smack in the head from my writing professors in
college, many moons ago...
 
William Brogden

ralph said:
I am using the java.util.zip data compression package. My problem is that I
need to do it repeatedly, as in the code below, and am working on very
large strings. What is the proper way to "nullify" or release the
memory that can be freed after using the byte arrays? Does setting
them to null actually cause that memory (heap? stack?) to be freed? I
am running out of memory and think this is where it is happening.
Thanks so much for any help...

Inflater i = new Inflater ();
i.setInput ( bufferWork );
byte[] decompressedBuffer = new byte[ maxCountersSize +
increaseSize ];
long decompressedReturn = i.inflate ( decompressedBuffer );
if ( i.getTotalOut () > maxCountersSize ) {
maxCountersSize = i.getTotalOut ();
}
String uncompressedString = new String ( decompressedBuffer,
i.getTotalOut () );
i.end ();


then to clean up

decompressedBuffer = null;

Why not reuse the byte[] if there is no dependence on the initial array
being filled with zeros?
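
One way the reuse suggestion could look, as a hedged sketch and not the original poster's code. It assumes the compressed input is a complete zlib stream and that the text is UTF-8; ReusableInflate, inflateToString and compress are hypothetical names. The Inflater is reset between calls and the byte[] is kept, growing only when a message does not fit.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class ReusableInflate {
    private final Inflater inflater = new Inflater();
    private byte[] buffer; // reused across calls; grows only when needed

    ReusableInflate(int initialCapacity) {
        buffer = new byte[Math.max(1, initialCapacity)];
    }

    /** Decompresses one complete zlib stream into a String (UTF-8 assumed). */
    String inflateToString(byte[] compressed) throws DataFormatException {
        inflater.reset(); // make the same Inflater ready for new input
        inflater.setInput(compressed);
        int total = 0;
        while (!inflater.finished()) {
            if (total == buffer.length) {
                buffer = Arrays.copyOf(buffer, buffer.length * 2);
            }
            int n = inflater.inflate(buffer, total, buffer.length - total);
            total += n;
            if (n == 0 && !inflater.finished()) {
                throw new DataFormatException("input is truncated or needs a preset dictionary");
            }
        }
        // Only the first 'total' bytes are valid; stale bytes after that are ignored.
        return new String(buffer, 0, total, StandardCharsets.UTF_8);
    }

    /** Releases the native resources held by the Inflater. */
    void close() {
        inflater.end();
    }

    /** Helper for trying the sketch out: compresses text as a zlib stream. */
    static byte[] compress(String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

Because the decoded length is passed explicitly, leftover bytes from a previous, longer message do no harm, which is why the buffer does not need to be zeroed. Each call still allocates the resulting String, so this removes only the repeated byte[] allocation.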
 
Phil...

I agree that there is no guarantee about when gc runs,
but the JVM is not going to terminate your program
due to insufficient memory without first running gc.
So just set it to null and you should be ok.

 
Jon Skeet

Phil... said:
I agree that there is no guarantee about when gc runs,
but the JVM is not going to terminate your program
due to insufficient memory without first running gc.
So just set it to null and you should be ok.

Why bother setting it to null? It's a local variable - it'll be
eligible for collection at the end of the method anyway.
 
Jon Skeet

Apart from anything else, this line troubles me:
String uncompressedString = new String ( decompressedBuffer,
i.getTotalOut () );

That's almost *certainly* not what you want to do. You *might* have
meant:

String uncompressedString = new String ( decompressedBuffer, 0,
i.getTotalOut () );

which is *slightly* better, but really you should specify a character
encoding, otherwise the platform default one will be used, which isn't
usually a good idea.
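
A minimal sketch of the decoding step with an explicit encoding, added for illustration. UTF-8 is an assumption here; use whatever encoding the data was actually written in. DecodeDemo and decode are invented names. For context, the two-argument String(byte[], int) constructor in the original code is deprecated and treats the int as the high byte of every character, not as a length.

```java
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    /** Decodes only the first validLength bytes of buffer as UTF-8. */
    static String decode(byte[] buffer, int validLength) {
        return new String(buffer, 0, validLength, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // An oversized buffer, like one sized from maxCountersSize + increaseSize.
        byte[] oversized = new byte[32];
        byte[] payload = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(payload, 0, oversized, 0, payload.length);

        // Passing the valid length keeps the trailing zero bytes out of the result.
        String text = decode(oversized, payload.length);
        System.out.println(text.length()); // 5 characters, from 6 encoded bytes
    }
}
```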
 
Thomas G. Marshall

Jon Skeet said:
Why bother setting it to null? It's a local variable - it'll be
eligible for collection at the end of the method anyway.

Understood, but I believe the poster wanted to know the rules, in case his
method continued on from there.
 
John C. Bollinger

Thomas said:
/However/, I would like to point out a curious thing I discovered in the api
help docs at
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/System.html#gc()

The two sentences of the 2nd paragraph for gc say this:

1. Calling the gc method suggests that the Java Virtual
Machine expend effort toward recycling unused objects
in order to make the memory they currently occupy
available for quick reuse.

2. When control returns from the method call, the Java
Virtual Machine has made a best effort to reclaim space
from all discarded objects.

#1 says that only a /suggestion/ is made to the jvm to garbage collect,
which is how I've understood it.

Does #2 say that when control returns from the method call the cleanup has
finished (??), or do they not mean that when they say "has made a best
effort to reclaim"?

We had a discussion of precisely that point several weeks ago. I think
it was Chris Smith who ended up persuading me that no, #2 doesn't really
mean anything at all, because the definition of a "best effort" is
completely implementation dependent, and may even be context dependent.
I had tried to argue that an effort that didn't clean up everything
(give or take), or no effort at all, necessarily was not a "best effort",
but I ended up giving up that position.


John Bollinger
 
Roedy Green

Why bother setting it to null? It's a local variable - it'll be
eligible for collection at the end of the method anyway.

Some methods run for hours. It can sometimes pay to null local
variables early.
 
Chris Smith

John said:
We had a discussion of precisely that point several weeks ago. I think
it was Chris Smith who ended up persuading me that no, #2 doesn't really
mean anything at all, because the definition of a "best effort" is
completely implementation dependent, and may even be context dependent.

My specific feelings, for the record, are that this is ultimately
meaningless to language lawyers, but presents a definite intention that
it's somewhat weasel-ish to try and get out of. Specifically, I think
to understand it you have to divide the required implementation code
into two pieces:

1. Real garbage collection code, which is used for garbage collection
whether or not the Runtime.gc() method is ever called.

2. Implementation code in the java.lang.Runtime class, including native
code. This is code that is specific to the implementation of the gc()
method, and would never be called from the remainder of the application
if Runtime.gc() were never used. This code should not have privileged
knowledge about heap layout and concurrency policies and the like.

I interpret "best effort" to be addressed to the implementor of the
Runtime.gc() method, and *not* to the implementor of the garbage
collector itself. Thus, I think it constrains the implementor of
Runtime.gc to use any available entry points to the garbage collection
module that may be defined. Presumably, most garbage collectors would
have an entry point to start garbage collection of some kind, and
Runtime.gc() would be constrained by the second sentence of the API docs
to call it if possible. Presumably, also, because a "best" effort is
specified, Runtime.gc() would be constrained to choose the entry point
and parameters to accomplish the most thorough garbage collection
possible; that is, if an entry point to the garbage collector for a full
collection is possible, then a first-generation collection doesn't
fulfill the requirement.

However, there are cases where such an entry point isn't defined in the
garbage collector implementation. Such cases might include a system
where there simply IS no deferred garbage collection work (such as a
reference counting system). In such cases, no effort would be required,
so Thomas's #1 and #2 are reconciled, and there is simultaneously a
"suggestion" being ignored, and a "best effort" being made.

The fuzzy part comes in here: who's to decide the line between JNI
implementation of the gc() method and the garbage collector itself,
especially when the code isn't the cleanest in the world? Who's to
determine what is or is not an "exposed" API? If the VM implementor
wants to avoid doing any garbage collection work at all in Runtime.gc(),
they could just declare that the garbage collector and memory allocator
are one and the same module of the application, and that their only
exposed APIs are the memory allocation APIs. Suddenly, the Runtime.gc
implementation can just claim that "our hands are tied" and do nothing.

There are even concurrency policies I could conceive of that may require
that some thread be at an allocation point to do a garbage collection.
In that case, some extra work within the garbage collector might be
required in order to make it possible to order a collection on demand.
In that case, is Runtime.gc required to do anything? What if the extra
work in the garbage collector would only take fifteen minutes? What if
it would take a month? What if it would take four days?

Unfortunately, Sun is caught here trying to write a spec that specifies
intent, without making any actual rules because they are afraid the
rules will rule out the next big innovation in garbage collection.
That's an inherently contradictory process, so the result is never going
to be perfect.

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Jon Skeet

Roedy Green said:
Some methods run for hours. It can sometimes pay to null local
variables early.

Those are very much the exceptions though - and I would almost always
make sure that such methods did very little else than call other
methods. It's a bad idea to get people into the habit of nulling out
variables just for the sake of it, IMO.

I don't know whether any Java JITs do what the .NET CLR does, which is
keep track of which local variables are never referred to after a
certain point, so that they don't act as roots when the garbage
collector kicks in. It doesn't *always* work (for instance, with a loop
where the variable is only read the first time due to an if block, the
JIT won't spot that) but it's a good start.
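
The two positions in this sub-thread can be put side by side in a short sketch. This is an illustration only; LocalLifetimeDemo and its methods are invented, and pretendLongPhase stands in for the hours of unrelated work. The first variant nulls the local before the long phase, the second confines the large array to a helper method so it goes out of scope on its own.

```java
class LocalLifetimeDemo {
    /** Builds a large array filled with a simple repeating pattern. */
    static byte[] fill(int size) {
        byte[] data = new byte[size];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        return data;
    }

    /** Sums the bytes as unsigned values. */
    static int sumOf(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum += b & 0xFF;
        }
        return sum;
    }

    /** Placeholder for a long phase that never touches the array. */
    static void pretendLongPhase() {
    }

    /** Variant 1: null the local as soon as it is no longer needed. */
    static int nullEarly(int size) {
        byte[] big = fill(size);
        int sum = sumOf(big);
        big = null; // nothing below needs the array
        pretendLongPhase();
        return sum;
    }

    /** Variant 2: keep the array inside a helper, so no nulling is needed. */
    static int viaHelper(int size) {
        int sum = sumOfFresh(size);
        pretendLongPhase();
        return sum;
    }

    /** The array is unreachable as soon as this method returns. */
    static int sumOfFresh(int size) {
        return sumOf(fill(size));
    }
}
```

Both variants compute the same result; the difference is only in how long the array can stay reachable from the long-running method's frame.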
 
