Reading Large Files


freesoft_2000

Hi everyone,

I am trying to read and write large files of about 600MB.
The thing is, when I use the normal FileOutputStream methods an exception
gets thrown saying

java.lang.OutOfMemoryError

This is how I am trying to read the file

FileOutputStream out = new FileOutputStream("C:/my_life_story.zip");
InputStream in = //an input stream from a socket, etc

int len = 0;
byte[] buffer1 = new byte[1024];

while ((len = in.read(buffer1)) > 0)
{
out.write(buffer1, 0, len);
}

in.close();
out.close();

My computer does have sufficient memory.

I tried adding this in the while loop, but it does not work and I got the
same exception thrown by the JVM again:

System.gc();

My question is basically: how do I increase the memory programmatically,
or is there a way in which I can read the file without having the above
exception thrown by the JVM?

Any help is greatly appreciated.

Thank You

Yours Sincerely

Richard West
 

Skip

freesoft_2000 said:
Hi everyone,

I am trying to read and write large files of about 600MB.
The thing is, when I use the normal FileOutputStream methods an exception
gets thrown saying

java.lang.OutOfMemoryError

This is how I am trying to read the file

FileOutputStream out = new FileOutputStream("C:/my_life_story.zip");
InputStream in = //an input stream from a socket, etc

You are not reading but writing to the file here.
int len = 0;
byte[] buffer1 = new byte[1024];

while ((len = in.read(buffer1)) > 0)

This should be:
while ((len = in.read(buffer1)) != -1)
as 0 is perfectly valid and does *not* say you reached the end of the
stream, just that there is nothing available at the moment.
{
out.write(buffer1, 0, len);
}

in.close();
out.close();

This should take something like 1 KB of RAM, and should certainly not cause you
to run out of memory. Maybe the underlying implementation of writing to that
stream requires you to flush() it before it's really written. That would be a
*very crappy* OS, but well, who knows...

while ((len = in.read(buffer1)) != -1)
{
out.write(buffer1, 0, len);
out.flush();
}
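
For completeness, here is a minimal self-contained sketch of that corrected copy
loop with the streams closed in a finally block; the socket host and port below
are only placeholders for wherever the InputStream really comes from:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class StreamCopy
{
    public static void main(String[] args) throws IOException
    {
        Socket socket = new Socket("example.com", 80); // placeholder source
        InputStream in = socket.getInputStream();
        OutputStream out = new FileOutputStream("C:/my_life_story.zip");
        try
        {
            byte[] buffer = new byte[1024];
            int len;
            // read() returns -1 only at end of stream
            while ((len = in.read(buffer)) != -1)
            {
                out.write(buffer, 0, len);
            }
            out.flush();
        }
        finally
        {
            out.close();
            in.close();
            socket.close();
        }
    }
}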

HTH
 

Roedy Green

I am trying to read and write large files of about 600MB.
The thing is, when I use the normal FileOutputStream methods an exception
gets thrown saying

There are several ways to read and write files without needing enough
RAM to hold the whole thing in memory.

See http://mindprod.com/products1.html#FILETRANSFER
for some source code that reads a buffer at a time using unbuffered files.

You can also read a line or even a byte at a time if you use buffered I/O.

See http://mindprod.com/applets/fileio.html
for how to do various types of I/O.
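
A minimal sketch of the buffered, chunk-at-a-time copy described above (the file
names are only placeholders); only one small buffer is ever held in memory, no
matter how large the file is:

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class BufferedCopy
{
    public static void main(String[] args) throws IOException
    {
        // Placeholder file names
        BufferedInputStream in =
            new BufferedInputStream(new FileInputStream("big_input.dat"));
        BufferedOutputStream out =
            new BufferedOutputStream(new FileOutputStream("big_output.dat"));
        try
        {
            byte[] chunk = new byte[8192];
            int len;
            while ((len = in.read(chunk)) != -1)
            {
                out.write(chunk, 0, len);
            }
            out.flush();
        }
        finally
        {
            out.close();
            in.close();
        }
    }
}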
 

freesoft_2000

Hi everyone,

This is usually what I do when running a program

java -Xmx650m MyProgram

Unfortunately I can't use the above command, because the application is in
a jar file on the Windows platform, and users usually double-click on the jar
file to run it. Is there a way to do what you suggested programmatically,
maybe by the use of properties?

Skip, I think you could be right about the OS being crap.

Hoping to hear from you

Richard West
 

tam

Skip wrote:
....
while ((len = in.read(buffer1)) > 0)

This should be:
while ((len = in.read(buffer1)) != -1)
as 0 is perfectly valid and does *not* say you reached the end of the
stream, just that there is nothing available at the moment.
.....

Not according to the Javadocs... For java.io.InputStream the read(b) method
behavior is:

Reads some number of bytes from the input stream and stores them into the
buffer array b. The number of bytes actually read is returned as an integer.
This method blocks until input data is available, end of file is detected,
or an exception is thrown.

If b is null, a NullPointerException is thrown. If the length of b is zero,
then no bytes are read and 0 is returned; otherwise, there is an attempt to
read at least one byte. If no byte is available because the stream is at end
of file, the value -1 is returned; otherwise, at least one byte is read and
stored into b.

So the only way to get a 0 in the return is to give an input buffer
of length 0. If there is no data available the method will block
till there is.

Regards,
Tom McGlynn
 

Andrew Thompson

java -Xmx650m MyProgram

Unfortunately I can't use the above command, because the application is in
a jar file on the Windows platform, and users usually double-click on the jar
file to run it.

Give them a little 'launcher' jar to double click
(if they insist), then..

Runtime.exec( "java -Xmx650m MyProgram" );
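
A rough, untested sketch of such a launcher (the jar name MyProgram.jar and the
650m figure are only placeholders); it just starts the real application in a
second JVM with a bigger maximum heap:

import java.io.IOException;

// Users double-click the launcher jar; it re-launches the real
// application in a new JVM with -Xmx set.
public class Launcher
{
    public static void main(String[] args) throws IOException, InterruptedException
    {
        String[] cmd = { "java", "-Xmx650m", "-jar", "MyProgram.jar" };
        Process p = Runtime.getRuntime().exec(cmd);
        p.waitFor(); // optional: block until the real application exits
    }
}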
...Is
there a way to do what you suggested programatically maybe by the use of
properties?

JWS can set memory..
<http://java.sun.com/j2se/1.4.2/docs/guide/jws/developersguide/syntax.html#resources>
....
<j2se version="1.3" initial-heap-size="64m"/>
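
If memory serves, the same resources element also accepts a max-heap-size
attribute, so a JNLP file for this case might contain something like:

<j2se version="1.3" initial-heap-size="64m" max-heap-size="650m"/>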

HTH
 

Skip

freesoft_2000 said:
Hi everyone,

This is usually what I do when running a program

java -Xmx650m MyProgram

Unfortunately I can't use the above command, because the application is in
a jar file on the Windows platform, and users usually double-click on the jar
file to run it. Is there a way to do what you suggested programmatically,
maybe by the use of properties?

Skip, I think you could be right about the OS being crap.

Chances of that are 0.000001% if you are using MacOS, Linux, Unix, Solaris,
Windows... Stuff like this only happens on an OS that you either built
yourself or screwed up. So there remains a 99.999999% chance that the bug is
somewhere else in your code.

Have fun debugging :-)
 

freesoft_2000

Hi everyone,

I tried searching somewhere else in the program, and you guys
were right, as there seems to be a memory leak, but it seems to come from
this part:

FileOutputStream fStream = new FileOutputStream("some file");
ObjectOutput stream = new ObjectOutputStream(fStream);

//The memory exception occurs at the line below

stream.writeObject(TextPane1.getDocument());

stream.flush();
stream.close();
fStream.close();

I know for a fact that the document in the JTextPane spans a couple of
thousand pages, and it seems that the exception is coming from that area.

Could it be because the stream is writing the document in one go? If
it is, then is there a better way to do it?

On an unrelated issue, is there a way to increase the heap area
programmatically, maybe by use of properties?

Richard West
 

jan V

I tried searching somewhere else in the program, and you guys
were right, as there seems to be a memory leak, but it seems to come from
this part:

This is a handy rule: when you have a bug, don't blame others, and suspect
something in your own code FIRST.
//The memory exception occurs at the below command line

stream.writeObject(TextPane1.getDocument());

I'm not surprised.
I know for a fact that the document in the JTextPane spans a couple of
thousand pages, and it seems that the exception is coming from that area.

I'm even less surprised ;-)
Could it be because the stream is writing the document in one go? If
it is, then is there a better way to do it?

Don't write the Document object. Write its content.
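
A minimal sketch of what that could look like, assuming a plain-text save is
acceptable (any styling in the Document is lost, and the file name is only a
placeholder); it pulls the text out of the Document in modest chunks instead
of serializing the whole object graph:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import javax.swing.JTextPane;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;

public class SaveDocumentText
{
    // Write the text held by the JTextPane's Document, not the Document itself
    static void save(JTextPane textPane, String fileName)
        throws IOException, BadLocationException
    {
        Document doc = textPane.getDocument();
        BufferedWriter out = new BufferedWriter(new FileWriter(fileName));
        try
        {
            int length = doc.getLength();
            int chunk = 8192;
            // Pull the text out in chunks so we never build one huge String
            for (int offset = 0; offset < length; offset += chunk)
            {
                int n = Math.min(chunk, length - offset);
                out.write(doc.getText(offset, n));
            }
        }
        finally
        {
            out.close();
        }
    }
}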
On an unrelated issue is there a way to increase the heap area
programmatically maybe by use of properties?

Nope. This is one of the more annoying features of Java.
 

Thomas Hawtin

Skip said:
This should be:
while ((len = in.read(buffer1)) != -1)
as 0 is perfectly valid and does *not* say you reached the end of the
stream, just that there is nothing available at the moment.

[...]

So the only way to get a 0 in the return is to give an input buffer
of length 0. If there is no data available the method will block
till there is.

That assumes every InputStream obeys the contract, which is not always the case.
Nor is it the case that second closes of streams always behave as specified.

However, I would strongly suggest that implementations always return
with something. Many years ago I had an InputStream used in a proxy,
running at max priority, return from read with nothing. That was
alright until, at a customer site, a beta received a keep-alive packet
from the server which sent the proxy into a tight loop. (I can't
remember if the keep-alive was an empty TCP packet causing the socket
input stream to return with nothing, a telnet IAC NOP, or an IAC EOR.)

Tom Hawtin
 

Thomas Hawtin

freesoft_2000 said:
I tried searching somewhere else in the program, and you guys
were right, as there seems to be a memory leak, but it seems to come from
this part:

FileOutputStream fStream = new FileOutputStream("some file");
ObjectOutput stream = new ObjectOutputStream(fStream);

What do people say about posting a simple complete example that
demonstrates the problem...

If, as is often the case, an object (or array) can be reached through more
than one reference, then the object is transferred only once. This means that
the object streams must keep references to the objects they have transferred.
You can flush these references at opportune moments.
Could it be because the stream is writing the document in one go? If
it is, then is there a better way to do it?

If it's a really large amount of data, then split it.
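
Presumably "flush these references" means ObjectOutputStream.reset(). A rough
sketch of combining that with splitting the data into pieces (the chunk size and
the idea of writing the text as a series of Strings are illustrative only):

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class ChunkedObjectWrite
{
    // Write a very large String as a series of smaller chunks, calling reset()
    // so the stream does not keep back-references to everything already written.
    static void writeInChunks(String text, String fileName) throws IOException
    {
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(fileName));
        try
        {
            int chunk = 64 * 1024;
            for (int offset = 0; offset < text.length(); offset += chunk)
            {
                int end = Math.min(offset + chunk, text.length());
                out.writeObject(text.substring(offset, end));
                out.reset(); // drop the stream's handle table
            }
        }
        finally
        {
            out.close();
        }
    }
}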
On an unrelated issue is there a way to increase the heap area
programmatically maybe by use of properties?

No, but you can allocate NIO buffers 'directly' which should not be
counted as part of the heap allocation.

Tom Hawtin
 

jan V

On an unrelated issue is there a way to increase the heap area
No, but you can allocate NIO buffers 'directly' which should not be
counted as part of the heap allocation.

Hmm... this is an interesting concept. Is this mentioned in the docs
anywhere? A pointer would be appreciated.
 

Thomas Hawtin

jan said:
Hmm... this is an interesting concept. Is this mentioned in the docs
anywhere? A pointer would be appreciated.

Top of the java.nio.ByteBuffer docs "Direct vs. non-direct buffers".

"A direct byte buffer may be created by invoking the allocateDirect
factory method of this class. The buffers returned by this method
typically have somewhat higher allocation and deallocation costs than
non-direct buffers. The contents of direct buffers may reside outside of
the normal garbage-collected heap, and so their impact upon the memory
footprint of an application might not be obvious. It is therefore
recommended that direct buffers be allocated primarily for large,
long-lived buffers that are subject to the underlying system's native
I/O operations. In general it is best to allocate direct buffers only
when they yield a measureable gain in program performance."

As with everything like this, the actual implementation is
implementation dependent. A low quality implementation could just wrap a
byte array for you.

When it says "factory method" it does not mean the Factory Method pattern.
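
For what it's worth, a rough sketch of using a direct buffer with NIO channels to
copy a large file (file names are placeholders); the buffer itself stays small, so
this illustrates allocateDirect rather than a way around the heap limit:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class DirectBufferCopy
{
    public static void main(String[] args) throws IOException
    {
        FileChannel in = new FileInputStream("big_input.dat").getChannel();
        FileChannel out = new FileOutputStream("big_output.dat").getChannel();
        try
        {
            // A direct buffer's contents may live outside the garbage-collected heap
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
            while (in.read(buf) != -1)
            {
                buf.flip();
                while (buf.hasRemaining())
                {
                    out.write(buf);
                }
                buf.clear();
            }
        }
        finally
        {
            out.close();
            in.close();
        }
    }
}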

Tom Hawtin
 

jan V

The contents of direct buffers may reside outside of the normal
garbage-collected heap,

There we go: that's what I was worried about, "may reside". So I would think
very hard before using this as any kind of solution to get around the fixed
heap size. If Sun could just have given us "will reside", then your idea
would have been pretty cool.
 

freesoft_2000

Hi everyone,

Thomas, you said something about splitting the file. If it's
of no inconvenience to you, could you list a simple example of splitting
and joining an object (assume the object is a document)?

Thomas, you also mentioned something about
java.nio.ByteBuffer. How would you use that class to actually avoid the
problem I am having? Some sample code would be helpful.

Andrew Thompson, you said "give them a little 'launcher'
jar to double click
(if they insist), then..

Runtime.exec( "java -Xmx650m MyProgram" );"

Sorry to ask you this, Andrew, but after searching on Yahoo I could not find
any information on -Xmx.
If it is of no inconvenience to you, do you have a link on how to use the above
-Xmx?

Hoping to hear from you guys

Thank You

Yours Sincerely

Richard West
 

freesoft_2000

Hi everyone,

Sorry I made your head hurt, Andrew.

One more thing: when I use this

Runtime.exec( "java -Xmx650m MyProgram" );

I don't have to include the initial value as well, right, which is -Xms?
Simply adding the above -Xmx value should be sufficient, right?

On another note, will there be a trickle-down effect where the lack of
memory in the main application affects any Java programs running at the
same time?

Hoping to hear from you

Richard West
 

Andrew Thompson

Sorry I made your head hurt, Andrew.

LOL! Not this time! (I was just confused, which
is quite common for me)
One more thing: when I use this

Runtime.exec( "java -Xmx650m MyProgram" );

Now, I should have pointed out that this
"..is a crudely hacked out, completely untested, code(ish)
snippet that might point you along the right path, vis.
<http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Runtime.html#exec(java.lang.String)>"

Note
a) there are other overloaded variants that might
better suit your need.
b) I have never had need to Runtime.exec() anything
c) ...
I don't have to include the initial value as well, right, which is -Xms?
Simply adding the above -Xmx value should be sufficient, right?

I do not know.
On another note, will there be a trickle-down effect where the lack of
memory in the main application affects any Java programs running at the
same time?

c) .."Executes the specified string command in a separate process."

Suggests to me, 'no'. No trickle down effect. If I read
that correctly, it means 'new JVM' (with it's own, separate,
memory constraints).

HTH
 
