Why is Java so slow????

  • Thread starter Java Performance Expert

Lew

Java said:
The useful conclusion that can be drawn is that a test generator
written in "C" will run faster, and so be more desirable to use, than
one written in Java, to the extent that it is desirable for each and
every component to run as fast as possible, and to the extent that it
is useful to identify those aspects of our decision making that
produce desirable results.

A more useful conclusion is that a Java test harness should perhaps just be
kept running in order to amortize the startup time.
 
Lew

Ming said:
I think it is because Java is designed to be slow :p

What a load of hooey.

In my version of the benchmark, Java took about 94 seconds to do what C did in 88.
 
Java Performance Expert

Lew said:
I modified the Java benchmark in accordance with others' suggestions and for
five million lines of output came up with:

Java:
$ java -server -cp build/classes testit.TimePrin
Elapsed: 93.966 secs.

C program:
$ ./timepr
Elapsed: 88.000000 secs.

A far cry from 60:1, eh?

Lew,
On my system, it takes 9.21 seconds (Java) and 1.1 seconds (C)

Any ideas?

thanks
JH
 
Lew

Java said:
Lew,
On my system, it takes 9.21 seconds (Java) and 1.1 seconds (C)

Running the code that I posted?

If so, I am mystified.

I only have access to my machine, which gave the results from the command line
as posted. The numbers are fairly consistent from run to run, too, about 94
s. for Java and about 88 or 89 for C, emitting five million lines of output in
each test loop. As you can see, the test loop was set up to eliminate startup
time in the measurement.

I used GCC 4.1.2 and Java 6u3 (32-bit).
 
Ming

For web applications, FastCGI/C is the fastest, then FastCGI/Perl or
mod_perl. Java is way, way slower than FastCGI/C.
 
Bent C Dalager

Lew said:
I modified the Java benchmark in accordance with others' suggestions and for
five million lines of output came up with:

Java:
$ java -server -cp build/classes testit.TimePrin
Elapsed: 93.966 secs.

C program:
$ ./timepr
Elapsed: 88.000000 secs.

A far cry from 60:1, eh?

AMD-64 ~2 GHz, 1 GB RAM, Linux Fedora 7, the usual mix of other programs
running. 32-bit Java.

On my MacBook Pro 2.33GHz with OSX 10.4.10, using
java version "1.5.0_07"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_07-164)
Java HotSpot(TM) Client VM (build 1.5.0_07-87, mixed mode, sharing)
- and -
gcc version 4.0.1 (Apple Computer, Inc. build 5367)

I get:

$ java -classpath . TimePrin
Elapsed: 86.424 secs.

$ java -server -classpath . TimePrin
Elapsed: 83.093 secs.

$ gcc test.c
$ ./a.out
Elapsed: 52.000000 secs.

$ gcc -O4 test.c
$ ./a.out
Elapsed: 52.000000 secs.

Cheers,
Bent D
 
Bent C Dalager

Java said:
On my system, it takes 9.21 seconds (Java) and 1.1 seconds (C)

Any ideas?

As an experiment, I would try to increase the workload tenfold
(i.e. increase to 50 mill lines) just to see if the ratios remain the
same.

Cheers,
Bent D
 
Steve Wampler

Lew said:
Running the code that I posted?

If so, I am mystified.

I only have access to my machine, which gave the results from the
command line as posted. The numbers are fairly consistent from run to
run, too, about 94 s. for Java and about 88 or 89 for C, emitting five
million lines of output in each test loop. As you can see, the test
loop was set up to eliminate startup time in the measurement.

I used GCC 4.1.2 and Java 6u3 (32-bit).

I made a couple of modifications:

(1) time output to stderr, so I could redirect stdout (I didn't want
to wade through X-zillion output messages!)
(2) I had to import java.util.Date to get Date() - I don't understand
why you don't have to!

With 5,000,000 strings:

Java (-server) : 8.192
C++ (4.1.1 -O4): 2.000

with 50,000,000 strings:

Java (-server) : 80.421
C++ (4.1.1 -O4): 17.000

both have the same ratio. I suspect (without any attempt to profile anything!)
that Java is, in the simple example, paying the price for multiple object creations
in the inner loop.

Your counts for 5,000,000 seem pretty high, but are likely I/O bound costs of
not tossing stdout.
 
Mark Thornton

Java said:
The useful conclusion that can be drawn is that a test generator
written in "C" will run faster, and so be more desirable to use, than
one written in Java, to the extent that it is desirable for each and
every component to run as fast as possible, and to the extent that it
is useful to identify those aspects of our decision making that
produce desirable results.

The main reason your Java code was so slow relative to C is that you had
not taken account of the special characteristics of System.out. It is a
line buffered PrintStream. To avoid this characteristic try writing to a
file instead or creating a new stream on FileDescriptor.out like this:

PrintStream out = new PrintStream(new BufferedOutputStream(new
FileOutputStream(FileDescriptor.out)));

Whenever you see a big difference in text IO performance between Java
and C, differences in buffering are almost always the cause.

In addition, Java is always at a disadvantage in such tests because its
character handling is always Unicode, while the C comparison invariably
uses a single-byte character set (and no conversion of character encoding).

If you want to write worthwhile tests you need a better understanding of
the languages and libraries involved.

Mark Thornton
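
The buffering effect described above can be made visible without timing anything. The following is a minimal sketch, not the benchmark from this thread: CountingSink and FlushCount are hypothetical names, and the sink merely stands in for the real stdout file descriptor. It counts how many write calls reach the underlying stream when the same lines go through an autoflushing PrintStream and through a non-autoflushing one over the same BufferedOutputStream.

```java
import java.io.BufferedOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical helper: counts how many write calls reach the underlying sink.
class CountingSink extends OutputStream {
    int writes = 0;

    @Override
    public void write(int b) {
        writes++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        writes++; // one call, however many bytes it carries
    }
}

class FlushCount {
    // Prints `lines` lines through a PrintStream over a buffered sink and
    // returns how many write calls actually reached the sink.
    static int countWrites(boolean autoFlush, int lines) {
        CountingSink sink = new CountingSink();
        PrintStream out =
            new PrintStream(new BufferedOutputStream(sink, 8192), autoFlush);
        for (int i = 0; i < lines; i++) {
            out.println("This is line " + i);
        }
        out.flush(); // push out whatever is still buffered
        return sink.writes;
    }

    public static void main(String[] args) {
        int lines = 100_000;
        System.out.println("autoflush on : " + countWrites(true, lines) + " writes");
        System.out.println("autoflush off: " + countWrites(false, lines) + " writes");
    }
}
```

With autoflush on, every line should cost at least one write to the sink; with it off, the buffer only spills when it fills up, so the count should be smaller by orders of magnitude. The exact numbers depend on the JDK, and each write to a real file descriptor is a system call, which is where the time goes.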
 
Steve Wampler

Steve said:
Your counts for 5,000,000 seem pretty high, but are likely I/O bound costs of
not tossing stdout.

Yep. In fact, the tests become dominated by I/O costs if stdout isn't
tossed (note the use of fewer lines - just a sign of impatience):

Java (50,000 lines): 3.683 s
C++ (50,000 lines): 3.000 s (hmm, *always* integral?)

CentOS 4, Dual 2GHz Opteron, 32-bit everything.
 
Mark Space

Java said:
On my system, it takes 9.21 seconds (Java) and 1.1 seconds (C)

Any ideas?

It might have something to do with your locale or your system libraries.
Even though you are using a BufferedOutputStream writer, each call to
getBytes() on the string still runs an encoder.

The following is much faster than any of my previous tests on my system.
Give it a shot. It does not (yet) count, but it does spam IO so we
can at least see if it might yield some improvements.

static public void rawTest( String [] args ) throws IOException {

    int lim = Integer.parseInt( args[0] );  // number of lines to write

    String message = "This is line ";
    String sed = "0000000\n";

    OutputStream os = new BufferedOutputStream(System.out);

    // Encode once, outside the loop, so no encoder runs per line.
    byte [] mbuff = message.getBytes();
    byte [] sedBuff = sed.getBytes();
    int mlength = mbuff.length;
    int sedLength = sedBuff.length;

    for( int i = 0; i < lim; i++ ) {
        os.write( mbuff, 0, mlength );
        os.write( sedBuff, 0, sedLength );
    }

    os.flush();  // without this, the tail of the buffer is never written
}

Note: not complete, just paste it in and have main() call it.
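
One possible way to make the raw-byte version count is to overwrite the digit bytes in place on each pass, so the loop still creates no Strings and runs no encoder. This is only a sketch under assumptions: RawCounter and writeLines are hypothetical names, the output is zero-padded to seven digits (unlike "This is line " + i in the original benchmark), and counts above 9,999,999 would lose their leading digits.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class RawCounter {
    // Writes "This is line NNNNNNN\n" for i in [0, lim) to the given sink.
    static void writeLines(OutputStream sink, int lim) {
        byte[] prefix = "This is line ".getBytes(StandardCharsets.US_ASCII);
        byte[] digits = "0000000\n".getBytes(StandardCharsets.US_ASCII);
        OutputStream os = new BufferedOutputStream(sink);
        try {
            for (int i = 0; i < lim; i++) {
                // Fill the seven digit slots from right to left.
                int n = i;
                for (int pos = 6; pos >= 0; pos--) {
                    digits[pos] = (byte) ('0' + n % 10);
                    n /= 10;
                }
                os.write(prefix, 0, prefix.length);
                os.write(digits, 0, digits.length);
            }
            os.flush(); // push out whatever is still buffered
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int lim = args.length > 0 ? Integer.parseInt(args[0]) : 5_000_000;
        writeLines(System.out, lim);
    }
}
```

Because the line format differs from the C program's, timings from this sketch are only comparable to a C version that pads its numbers the same way.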
 
Lew

Steve said:
(1) time output to stderr, so I could redirect stdout (I didn't want
to wade through X-zillion output messages!)
(2) I had to import java.util.Date to get Date() - I don't understand
why you don't have to!

Oops. I did, but forgot to include that line in the post.

Steve said:
With 5,000,000 strings:

Java (-server) : 8.192
C++ (4.1.1 -O4): 2.000

with 50,000,000 strings:

Java (-server) : 80.421
C++ (4.1.1 -O4): 17.000

both have the same ratio. I suspect (without any attempt to profile anything!)
that Java is, in the simple example, paying the price for multiple object
creations in the inner loop.

Much more likely that it's the IO costs alluded to by others. Object creation
in Java tends to be pretty fast (roughly 10 machine instructions, according to
Sun), and often optimized away.

Steve said:
Your counts for 5,000,000 seem pretty high, but are likely I/O bound costs of
not tossing stdout.

No doubt, but I was trying the simple case just to see what resulted. As it
happened, the differences here were on the order of 6-7%, favoring C, without
the optimizations of elimination of stdout or string encoding in Java.

I was also paying the price for String encoding conversion, as others have
mentioned.
 
Bent C Dalager

Mark Thornton said:
The main reason your Java code was so slow relative to C is that you had
not taken account of the special characteristics of System.out.

I tried running the code Lew posted, redirected, with the following
results:

JAVA:

$ time java -server -classpath . TimePrin > out.txt
real 0m31.306s
user 0m15.430s
sys 0m14.705s

$ time java -server -classpath . TimePrin > /dev/null
real 0m20.474s
user 0m14.085s
sys 0m6.233s

C:

$ time ./a.out > out.txt
real 0m4.680s
user 0m1.302s
sys 0m0.650s

$ time ./a.out > /dev/null
real 0m1.366s
user 0m1.248s
sys 0m0.010s

Mark Thornton said:
It is a line buffered PrintStream. To avoid this characteristic try writing to a
file instead or creating a new stream on FileDescriptor.out like this:

PrintStream out = new PrintStream(new BufferedOutputStream(new
FileOutputStream(FileDescriptor.out)));

Trivially rewriting it to use a wrapped FileOutputStream yields:

kandidat:~/tmp bcd$ time java -server -classpath . TimePrin
Elapsed: 5.223 secs.

real 0m5.410s
user 0m4.679s
sys 0m0.521s

This value of 5.41s is reasonably close to C's 4.68s, but the Java
program needed to be written specifically to output to file, whereas C
achieved this performance by simple redirection.

The code used is:

import java.util.Date;
import java.io.*;

public class TimePrin
{
    private static final int LIM = 5000000;

    public static void main( String [] args)
        throws Exception
    {
        int lim;
        if ( args.length < 1 )
        {
            lim = LIM;
        }
        else
        {
            try
            {
                lim = Integer.parseInt( args [0] );
            }
            catch ( NumberFormatException ex )
            {
                lim = LIM;
            }
        }

        long start = new Date().getTime();
        PrintStream out = new PrintStream(
            new BufferedOutputStream(
                new FileOutputStream("out.txt")));
        for ( int i=0; i < lim; i++)
        {
            out.print( "This is line " + i +"\n" );
        }
        out.close();
        long end = new Date().getTime();

        double elapsed = (end - start) / 1000.0;
        System.out.println( "Elapsed: "+ elapsed +" secs." );
    }
}


Cheers,
Bent D
 
Mark Space

Mark said:
String sed = "0000000\n";

This *still* causes the output to be flushed. Remove the \n and I get a
50% speed increase (halves runtime). Is there no way to set the
autoflush for System.out to false?
 
Lew

Bent said:
I tried running the code Lew posted, redirected, with the following
results:

JAVA:

$ time java -server -classpath . TimePrin > out.txt

The benchmark I provided specifically sought to avoid including program load
time in the result. Using the UNIX 'time' utility defeats that.
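
How much of a wall-clock 'time' figure is JVM startup can be estimated from inside the program. The sketch below is an illustration only: StartupCost is a hypothetical name and the summing loop is a placeholder workload, not the benchmark from this thread. It relies on RuntimeMXBean.getStartTime(), which reports the approximate time the JVM started, and prints to stderr so stdout can still be redirected.

```java
import java.lang.management.ManagementFactory;

class StartupCost {
    // Milliseconds between the JVM's recorded start time and now.
    static long millisSinceJvmStart() {
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        return System.currentTimeMillis() - jvmStart;
    }

    public static void main(String[] args) {
        long beforeWork = millisSinceJvmStart();

        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            sum += i; // placeholder workload
        }
        long workMillis = (System.nanoTime() - start) / 1_000_000;

        System.err.println("JVM start to main(): about " + beforeWork + " ms");
        System.err.println("Work loop: " + workMillis + " ms (checksum " + sum + ")");
    }
}
```

If the first figure is small next to the second, the startup cost folded into 'time' matters little for that run; for very short runs it can dominate.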
 
Mark Space

Mark said:
This *still* causes the output to be flushed. Remove the \n and I get a
50% speed increase (halves runtime). Is there no way to set the
autoflush for System.out to false?

Now I'm just kind of making notes as I go.

BufferedOutputStream's write( b[], int, int ) method does a range check
on its arguments, then just does this:

for (int i = 0 ; i < len ; i++) {
write(b[off + i]);
}

That's cut and paste from the source code.

You'd be faster by a bit just to call write(byte) in a for loop.
 
Bent C Dalager

Mark Space said:
This *still* causes the output to be flushed. Remove the \n and I get a
50% speed increase (halves runtime).

This could be because your console window needs to scroll less?

Mark Space said:
Is there no way to set the autoflush for System.out to false?

System.setOut(new PrintStream(System.out));
might work, after a fashion :)

From the APIdoc:

PrintStream

public PrintStream(OutputStream out)

Create a new print stream. This stream will not flush automatically.

Parameters:
out - The output stream to which values and objects will be printed
See Also:
PrintWriter.PrintWriter(java.io.OutputStream)


Cheers,
Bent D
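
Whether wrapping System.out in a second PrintStream really stops the flushing can be checked in memory. This is a hedged sketch: AutoflushCheck and its methods are hypothetical names, and the "inner" stream is only a stand-in that assumes System.out behaves like an autoflushing PrintStream over a buffered stream, as described earlier in the thread. It compares wrapping such a stream against rebuilding a stream from the raw sink with autoflush off.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;

class AutoflushCheck {
    // Bytes that reach the sink after one print() and before any explicit
    // flush, when a non-autoflushing PrintStream wraps an autoflushing one.
    static int bytesSeenWhenWrapping() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream inner =
            new PrintStream(new BufferedOutputStream(sink), true); // stand-in for System.out
        PrintStream outer = new PrintStream(inner); // autoflush off on this layer only
        outer.print("hello");
        return sink.size();
    }

    // Same measurement when the stream is rebuilt on the raw sink instead.
    static int bytesSeenWhenRebuilt() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(new BufferedOutputStream(sink), false);
        out.print("hello");
        return sink.size();
    }

    // Assumed replacement for System.out: buffered, no autoflush.
    // The caller has to flush before the program exits.
    static PrintStream bufferedStdout() {
        return new PrintStream(
            new BufferedOutputStream(new FileOutputStream(FileDescriptor.out)), false);
    }

    public static void main(String[] args) {
        System.out.println("wrapped: " + bytesSeenWhenWrapping() + " bytes before flush");
        System.out.println("rebuilt: " + bytesSeenWhenRebuilt() + " bytes before flush");
        PrintStream out = bufferedStdout();
        System.setOut(out);
        System.out.println("Is autoflush off now?");
        out.flush();
    }
}
```

In the wrapped case the inner stream still flushes every write it is handed, so the bytes arrive at the sink immediately; only the rebuilt stream holds them back until flush(). That suggests building the replacement on FileDescriptor.out rather than on System.out itself.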
 
Bent C Dalager

Lew said:
The benchmark I provided specifically sought to avoid including program load
time in the result. Using the UNIX 'time' utility defeats that.

The impact is negligible in the case at hand.

Cheers,
Bent D
 
Mark Space

Mark said:
BufferedOutputStream's write( b[], int, int ) method does a range check
on its arguments, then just does this:

for (int i = 0 ; i < len ; i++) {
write(b[off + i]);
}

That's cut and paste from the source code.

You'd be faster by a bit just to call write(byte) in a for loop.

But this doesn't work. It's an order of magnitude *slower* to call
write(byte) in a loop.

Also, calling write(byte[]) is about 50% slower than the three-argument
write(). Both write(byte[]) and write(byte) resolve, according to NetBeans,
to the superclass of BufferedOutputStream, OutputStream.

Is this virtual method overhead?

Or has Sun pre-optimized their own libraries so they run faster?
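
Which class actually supplies each write() overload can be checked with reflection instead of relying on where the IDE jumps. The following is a small sketch (WhoDeclaresWrite is a hypothetical name) that asks BufferedOutputStream for each public overload and reports the class that declares it.

```java
import java.io.BufferedOutputStream;

class WhoDeclaresWrite {
    // Simple name of the class that supplies the public write(...) overload
    // with the given parameter types when called on a BufferedOutputStream.
    static String declaringClass(Class<?>... parameterTypes) {
        try {
            return BufferedOutputStream.class
                .getMethod("write", parameterTypes)
                .getDeclaringClass()
                .getSimpleName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write(int)              -> "
            + declaringClass(int.class));
        System.out.println("write(byte[])           -> "
            + declaringClass(byte[].class));
        System.out.println("write(byte[], int, int) -> "
            + declaringClass(byte[].class, int.class, int.class));
    }
}
```

On the standard JDK class library, BufferedOutputStream overrides write(int) and the three-argument write itself, and inherits only write(byte[]) from FilterOutputStream. So the byte-at-a-time loop quoted above looks like a superclass fallback, not the code a BufferedOutputStream runs for the three-argument call, which would fit the timings reported here.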
 
Stefan Ram

Mark Space said:
This *still* causes the output to be flushed. Remove the \n and I get a
50% speed increase (halves runtime). Is there no way to set the
autoflush for System.out to false?

I have written the following attempt.
But I have not tested whether it works.
(The exception handling still can be simplified.)

/* NB: The "false" below is intended to turn autoflush off. */
public class Main
{
public static java.io.PrintStream newOutPrintWithAutoflushTurnedOff()
{ java.io.PrintStream outPrint = null;
java.lang.Exception ex = null;
final java.io.FileDescriptor outDescriptor =
java.io.FileDescriptor.out;
try{ java.io.FileOutputStream outStream =
new java.io.FileOutputStream( outDescriptor );
outPrint = new java.io.PrintStream( outStream, false ); }
catch( final java.lang.SecurityException e ){ ex = e; }
if( ex != null )throw new java.lang.RuntimeException( ex );
return outPrint; }

public static void setOutWithAutoflushTurnedOff()
{ java.io.PrintStream outPrint;
java.lang.Exception ex = null;
try{ outPrint = newOutPrintWithAutoflushTurnedOff();
try{ java.lang.System.setOut( outPrint ); }
catch( final java.lang.SecurityException e ){ ex = e; }}
catch( final java.lang.RuntimeException e ){ ex = e; }
if( ex != null )throw new java.lang.RuntimeException( ex ); }

public static void main( final java.lang.String[] args )
{ setOutWithAutoflushTurnedOff();
java.lang.System.out.println( "Is autoflush\noff now?" ); }}
 
