Disclaimer: I am not a fan of Java.
Olivier said:
As you can see the java program performance is far from the other one. I
was very surprised! Before the tests, I was sure that the winner will be
C followed by java, but it is not the case ...
Here's what I learned from my ray tracer language comparison:
http://www.ffconsultancy.com/languages/ray_tracer/
For a compiled language, Java is slow. However, this effect is compounded in
two important ways:
1. Getting accurate information about how to optimize Java code is almost
impossible.
2. Actually implementing the optimizations is typically unnecessarily slow
and tedious in Java.
In this case, you have already seen the two standard responses to a valid
question from the Java community:
1. This is "only" a microbenchmark.
2. You're measuring startup time.
In reality, IO is a very important aspect of many programs (including my ray
tracer benchmark, which was hopelessly IO bound _only_ in Java until some
kind soul taught me how to work around these deficiencies in Java) and you
are a long way from measuring the startup time, which will be a fraction of
a second in this case.
Olivier also said:
Of course, I can improve the performance of the java version, by using a
StringBuffer, and print this buffer when it is bigger than a given size,
but it is not fair !
Actually the best solution is to use buffered output rather than a string
buffer to accumulate an intermediate result. Don't ask me why the Java
implementors aren't competent enough to buffer automatically but they
aren't.
So, you start by importing a namespace for doing non-sucky IO:
import java.io.*;
Then you create an unnecessarily long-named BufferedOutputStream:
BufferedOutputStream out = new BufferedOutputStream(System.out);
Finally, you write to this instead of System.out:
out.write(("abcdefghijk " + i + "\n").getBytes());
Holy mackerel, your Java code now takes only 1.275s instead of the original
3.135s! That's a lot better than it was before but it is still about 6.5
times slower than any decent language, e.g. C at 0.196s.
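For anyone who wants the three steps above in one piece, here is a minimal sketch. The class name BufferedTest and the helper writeLines are made up for illustration, and the flush() at the end is my own addition (it is not in the snippets above): without it, whatever is still sitting in the buffer when the program exits may never be written.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferedTest {
    /** Writes n lines of the benchmark output to sink through a buffer. */
    static void writeLines(OutputStream sink, int n) throws IOException {
        BufferedOutputStream out = new BufferedOutputStream(sink);
        for (int i = 0; i < n; i++) {
            // Same statement as above: concatenate, encode, write.
            out.write(("abcdefghijk " + i + "\n").getBytes());
        }
        // Assumed addition: push out whatever is still in the buffer.
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        writeLines(System.out, 1000000);
    }
}
```

Taking the sink as a parameter is only there so the loop can be pointed at something other than System.out when checking the output.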
You see, Java is what the functional programming community refer to as
a "low-level language" because it simply fails to convey high-level
information to the compiler for optimization. In this case, the Java
compiler is probably being amazingly wasteful in that it actually performs
two string concatenations in Unicode for absolutely no reason whatsoever.
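To make the concatenation cost concrete, here is a rough sketch of the work hiding behind that one-liner. It assumes the classic javac lowering of string "+" to a StringBuilder chain; newer compilers (Java 9 and later) use a different mechanism, so treat this as an illustration rather than an exact account of what your compiler emits. The class and method names are hypothetical.

```java
import java.util.Arrays;

public class ConcatDesugar {
    /** The one-liner from the post: build a String, then encode it. */
    static byte[] concat(int i) {
        return ("abcdefghijk " + i + "\n").getBytes();
    }

    /** Roughly what older javac versions generate for the same expression. */
    static byte[] desugared(int i) {
        StringBuilder sb = new StringBuilder();
        sb.append("abcdefghijk ");   // first concatenation
        sb.append(i);                // int formatted as decimal characters
        sb.append("\n");             // second concatenation
        String line = sb.toString(); // intermediate String of 16-bit chars
        return line.getBytes();      // encoded into a fresh byte array
    }

    public static void main(String[] args) {
        // Both paths should yield the same bytes for any i.
        System.out.println(Arrays.equals(concat(42), desugared(42)));
    }
}
```

Either way, every line costs a temporary character buffer, a temporary String and a temporary byte array before a single byte reaches the stream.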
So let's try manually unrolling the code:
for (int i = 0; i < 1000000; i++) {
    // Write each piece separately instead of concatenating strings first.
    out.write("abcdefghijk ".getBytes());
    out.write(Integer.toString(i).getBytes());
    out.write("\n".getBytes());
}
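Wow, this is actually slower than it was before at 1.492s. But this is going
in the right direction. The two constant strings are now re-encoded by
getBytes() on every one of the million iterations, and as it happens, Java is
so crap that it can't even hoist loop invariants like these as well as the C
compilers from the 70s.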
Hoisting manually, we finally arrive at:
import java.io.*;

public class Test
{
    public static void main(String[] args) throws java.io.IOException
    {
        BufferedOutputStream out = new BufferedOutputStream(System.out);
        // Encode the loop-invariant pieces once, outside the loop.
        byte[] s1 = "abcdefghijk ".getBytes();
        byte[] s2 = "\n".getBytes();
        for (int i = 0; i < 1000000; i++) {
            out.write(s1);
            out.write(Integer.toString(i).getBytes());
            out.write(s2);
        }
        // Flush, or the tail of the buffer never reaches stdout.
        out.flush();
    }
}
Now this only takes 0.947s! Incredible: the code is several times longer now
and still several times slower (almost 5x behind C at 0.196s) than any decent
language!
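If you want to repeat the comparison yourself, here is a crude harness covering the three loop variants from this post. Everything in it apart from the loop bodies is my own scaffolding (the names LoopTiming, CountingSink and Variant are made up), and it writes to a sink that throws the bytes away, so it measures the Java-side work rather than your terminal or pipe. It makes no attempt at JIT warm-up or repeated runs, so the numbers will not match the wall-clock times quoted above and should only be read as a rough ordering.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class LoopTiming {
    /** Discards bytes but counts them, so the variants can be cross-checked. */
    static final class CountingSink extends OutputStream {
        long count = 0;
        @Override public void write(int b) { count++; }
        @Override public void write(byte[] b, int off, int len) { count += len; }
    }

    interface Variant {
        void run(OutputStream out, int n) throws IOException;
    }

    /** Concatenate, encode, write. */
    static final Variant CONCAT = (out, n) -> {
        for (int i = 0; i < n; i++) {
            out.write(("abcdefghijk " + i + "\n").getBytes());
        }
    };

    /** Pieces written separately, constants re-encoded every iteration. */
    static final Variant UNROLLED = (out, n) -> {
        for (int i = 0; i < n; i++) {
            out.write("abcdefghijk ".getBytes());
            out.write(Integer.toString(i).getBytes());
            out.write("\n".getBytes());
        }
    };

    /** Constants encoded once, outside the loop. */
    static final Variant HOISTED = (out, n) -> {
        byte[] s1 = "abcdefghijk ".getBytes();
        byte[] s2 = "\n".getBytes();
        for (int i = 0; i < n; i++) {
            out.write(s1);
            out.write(Integer.toString(i).getBytes());
            out.write(s2);
        }
    };

    /** Runs one variant through a BufferedOutputStream; returns bytes written. */
    static long bytesWritten(Variant v, int n) throws IOException {
        CountingSink sink = new CountingSink();
        BufferedOutputStream out = new BufferedOutputStream(sink);
        v.run(out, n);
        out.flush();
        return sink.count;
    }

    /** Elapsed wall-clock nanoseconds for a single run of the variant. */
    static long nanos(Variant v, int n) throws IOException {
        long start = System.nanoTime();
        bytesWritten(v, n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int n = 1000000;
        System.out.println("concat   ns: " + nanos(CONCAT, n));
        System.out.println("unrolled ns: " + nanos(UNROLLED, n));
        System.out.println("hoisted  ns: " + nanos(HOISTED, n));
    }
}
```

The byte count is there as a sanity check: all three variants should report the same total for the same n, otherwise the comparison is meaningless.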
In summary, what we have learned is this: for the love of God (TM), don't use
Java if performance is at all important for your work. This doesn't just
apply to IO either: Java is very slow at a wide variety of vitally important
tasks, e.g. allocation.