Reading strings using BufferedReader.readLine() is too slow

wicli_pucli

I have the following code for reading a string from an input stream:

BufferedReader in = new BufferedReader(
        new InputStreamReader(
                connection.getInputStream()));

reply = new String();
String reply_now;

while ((reply_now = in.readLine()) != null) {
    reply += reply_now;
}
in.close();

The problem is that it's too slow and takes too much CPU time because, as I
guess, it performs too many string concatenations, and I can't find
any alternative way of reading streams that is reliable and doesn't use
deprecated methods. How can I read text from a stream in bigger
portions than lines?
 
Gordon Beaton

> The problem is that it's too slow and takes too much CPU time because, as I
> guess, it performs too many string concatenations, and I can't find
> any alternative way of reading streams that is reliable and doesn't use
> deprecated methods. How can I read text from a stream in bigger
> portions than lines?

Your subject line is wrong, it isn't BufferedReader.readLine() which
is slow. The real problem is, as you correctly point out in the text,
string concatenation.

Use StringBuilder and create the String only once you've got all of
the data:

StringBuilder reply = new StringBuilder();
String reply_now;

while ((reply_now = in.readLine()) != null) {
    reply.append(reply_now);
}
in.close();

return reply.toString();

(If you don't have StringBuilder, you can use StringBuffer in
exactly the same manner.)

/gordon

--
 
wicli_pucli

> Your subject line is wrong, it isn't BufferedReader.readLine() which
> is slow. The real problem is, as you correctly point out in the text,
> string concatenation.
>
> Use StringBuilder and create the String only once you've got all of
> the data:
>
> StringBuilder reply = new StringBuilder();
>
> while ((reply_now = in.readLine()) != null) {
>     reply.append(reply_now);
> }
> in.close();
>
> return reply.toString();
>
> (If you don't have StringBuilder, you can use StringBuffer in
> exactly the same manner.)
>
> /gordon


Gordon, brilliant. Thanks very much, I wish I wasn't that stupid
sometimes...
 
Tom Hawtin

Gordon said:
> Your subject line is wrong, it isn't BufferedReader.readLine() which
> is slow. The real problem is, as you correctly point out in the text,
> string concatenation.

Also, if you don't want to create a String per line, blocks can be read
into a char array with Reader.read(char[]) (note: read does not
necessarily fill the entire array, even if not at the end of file). You
can then drop the BufferedReader, as all that will be doing is an
additional copy of the data.
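Tom's approach can be sketched like this (a minimal example; the readAll name and the 8192-char buffer size are my choices, not from the thread):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadAll {
    // Read the whole stream in blocks, without creating one String per line.
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];
        int n;
        // read() may return fewer chars than buf.length even before the end
        // of the stream, so always use the returned count, not buf.length.
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(readAll(new StringReader("hello\nworld\n")));
    }
}
```

Note that, unlike readLine(), this keeps the line terminators in the result.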

Tom Hawtin
 
Roedy Green

> reply = new String();
> String reply_now;
>
> while ((reply_now = in.readLine()) != null) {
>     reply += reply_now;
> }

You are allocating objects like mad here, causing excess GC.

First, use:

reply = "";

rather than new String();

new String() should almost never be used.

see http://mindprod.com/jgloss/string.html

Use code like this instead for much greater efficiency:

StringBuilder accum = new StringBuilder( ESTIMATEDTOTALSIZE );
String line;
while ((line = in.readLine()) != null) {
    accum.append(line);
}

String result = accum.toString();

However, even that improved code is silly. Why do all the work of
splitting the file into lines, then glue them all back together again?
Why not read the entire file in one fell swoop (one physical I/O)?
Read the entire file into a byte[] array, then convert it to a String,
which is what com.mindprod.hunkio.HunkIO.readEntireFile does for a
living.

see http://mindprod.com/products1.html#HUNKIO

/**
* Get all text in a file.
*
* @param fromFile file where to get the text.
*
* @return all the text in the File.
*
* @throws IOException when cannot access file.
* @noinspection WeakerAccess
*/
public static String readEntireFile( File fromFile ) throws IOException
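The HunkIO source itself isn't shown in the thread, but a body matching that signature might look roughly like this (a sketch, not Roedy's actual implementation; it assumes the platform default encoding):

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class Slurp {
    /** Get all text in a file with a single bulk read. */
    public static String readEntireFile(File fromFile) throws IOException {
        long len = fromFile.length();
        if (len > Integer.MAX_VALUE) {
            throw new IOException("file too long to read into a byte array");
        }
        byte[] buf = new byte[(int) len];
        DataInputStream in = new DataInputStream(new FileInputStream(fromFile));
        try {
            in.readFully(buf); // one call; throws EOFException if the file shrinks
        } finally {
            in.close();
        }
        return new String(buf); // platform default encoding
    }
}
```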
 
Stefan Ram

Roedy Green said:

I am writing something like this, but it seems I am more
paranoid about the long file size. My code contains something
like this:

if( running )
{ if(( length = file.length() )>( long )java.lang.Integer.MAX_VALUE )
  { this.report = new de.dclj.ram.system.environment.DefaultReport
    ( "file is too long to be read into an octet array.", this.report );
    running = false; }}

if( running )
{ result = this.'fill new array from small file'(); }
 
Twisted

> I am writing something like this, but it seems I am more
> paranoid about the long file size. My code contains something
> like this:
>
> if( running )
> { if(( length = file.length() )>( long )java.lang.Integer.MAX_VALUE )
>   { this.report = new de.dclj.ram.system.environment.DefaultReport
>     ( "file is too long to be read into an octet array.", this.report );
>     running = false; }}
>
> if( running )
> { result = this.'fill new array from small file'(); }

Ick. The real object-oriented way to handle this is to have a class
implement CharSequence and provide a view of the file. Possibly a
read/write view, with buffering handled more or less transparently
inside, aside from maybe an explicit flush() method if it's not
read-only. (The close() method would flush unwritten output
regardless.) It could encapsulate whatever buffering strategy, e.g.
reading small files in their entirety but only chunks at a time of
large files, reading none of a large file until it is operated upon
and then reading and caching a chunk around the requested character.

For operations that manipulate chunks of the thing, or the whole
thing, it would usually return more CharSequence-implementing objects
that represent the result, without doing anything else; their charAt
and similar methods would result in some computations and the
underlying thing's charAt being called in one or more places. Of
course, stuff like hashCode and equals would necessitate groveling
over the whole file, and closing the parent FileCharSequence of an
anonymous lazily-computed CharSequence could present problems...
firstIndexOf would require reading from the beginning of the file to
the first occurrence... the whole file if not found... but still.

It wouldn't try to hold all of a 6GB file in memory at once, it
wouldn't necessarily even read the whole thing in some situations, and
it would generally play a lot nicer with files that size than anything
that tried to slurp the whole thing into core at once.
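A minimal read-only sketch of that idea, assuming a single-byte encoding (one byte per char). The FileCharSequence name comes from the post, but this implementation is only illustrative and does none of the chunk caching described above:

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileCharSequence implements CharSequence, Closeable {
    private final RandomAccessFile file;
    private final int length;

    public FileCharSequence(File f) throws IOException {
        file = new RandomAccessFile(f, "r");
        long len = file.length();
        if (len > Integer.MAX_VALUE) throw new IOException("file too long");
        length = (int) len;
    }

    public int length() { return length; }

    public char charAt(int index) {
        try {
            file.seek(index);          // a real version would cache a chunk
            return (char) file.read(); // around index instead of seeking
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public CharSequence subSequence(int start, int end) {
        // This sketch copies; the lazy version described in the post would
        // return another view over the same file instead.
        StringBuilder sb = new StringBuilder(end - start);
        for (int i = start; i < end; i++) sb.append(charAt(i));
        return sb;
    }

    @Override public String toString() { return subSequence(0, length).toString(); }

    public void close() throws IOException { file.close(); }
}
```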
 
Roedy Green

> I am writing something like this, but it seems I am more
> paranoid about the long file size. My code contains something
> like this:

Then trap the out of memory error and apologise, or test the length of
the file before you start.
 
Stefan Ram

Roedy Green said:
> Then trap the out of memory error

I did so, indeed:

try
{ array = new byte[ length ]; }
catch( final java.lang.OutOfMemoryError outOfMemoryError )
{ this.report = new de.dclj.ram.system.environment.DefaultReport
  ( outOfMemoryError, this.report );
  running = false; }

> and apologise,

I report it to the caller.

> or test the length of the file before you start.

This is what the code in the post you replied to
was intended to do.
 
Stefan Ram

Roedy Green said:
> Then trap the out of memory error and apologise, or test the length of
> the file before you start.

Actually, these two conditions are independent in one
direction:

If a file has more octets than java.lang.Integer.MAX_VALUE,
it cannot be read into an octet array, because in Java
array indices are int values, not long values.

But if the file does not have more octets than
java.lang.Integer.MAX_VALUE, there still might not be enough
memory for an array of the required size.

So I have to test both (if subsequent action depends
on the success of the read operation).
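Both checks together might look like this (a sketch; the class and method names are mine, not from the thread):

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class SafeSlurp {
    // The length test alone is not sufficient: allocation can still fail
    // for files shorter than Integer.MAX_VALUE octets.
    public static byte[] readSmallFile(File f) throws IOException {
        long len = f.length();
        if (len > Integer.MAX_VALUE) {
            throw new IOException("file is too long to be read into an octet array");
        }
        byte[] buf;
        try {
            buf = new byte[(int) len]; // may still throw OutOfMemoryError
        } catch (OutOfMemoryError e) {
            throw new IOException("not enough memory for " + len + " octets");
        }
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        try {
            in.readFully(buf);
        } finally {
            in.close();
        }
        return buf;
    }
}
```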
 
Roedy Green

> This is what the code in the post you replied to
> was intended to do.

I was thinking more along the lines of:

File file = new File( "C:/inputrecords.txt" );
if ( file.length() > MAXFILESIZE )
{
    System.err.println( "sorry, file too big to process." );
    System.exit( 1 );
}
 
Roedy Green

> But if the file does not have more octets than
> java.lang.Integer.MAX_VALUE, there still might not be enough
> memory for an array of the required size.

You are going to run out of RAM long before that in a 32-bit JVM.
 
