performance question

  • Thread starter Olivier Scalbert
  • Start date

Lew

Olivier said:
Ok, but how can I do the same thing in Java ?
By the way, I do not want to attack java (that I use daily). I just want
to understand ...

First, add character encoding to the other languages' examples.

Second, change the buffer in the output streams (or Writer, in Java's case) to
be the same size in every example.

Third, don't time the startup time of the Java program. Just time the loops
in each language.

Fourth, since, unlike C, Java is dynamically optimized, not statically, run
the loop enough times for optimization to kick in (which you likely are
already doing). Ideally, you'd run the loop a while without timing it, then
again with timing, letting the Hotspot compiler do its thing.
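
The third and fourth points can be sketched roughly as follows, assuming Java 8 or later. The class name `WarmupTiming`, the helper `timeLoop`, and the choice of 5 warm-up passes are made up for illustration, and the output goes to an in-memory sink so that neither the terminal nor the disk is part of the measurement.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WarmupTiming {
    // Hypothetical helper: writes `lines` benchmark lines to `sink` and
    // returns the elapsed time of the loop alone, in nanoseconds.
    static long timeLoop(OutputStream sink, int lines) {
        try {
            BufferedOutputStream out = new BufferedOutputStream(sink);
            long start = System.nanoTime();
            for (int i = 0; i < lines; i++) {
                out.write(("abcdefghijk " + i + "\n").getBytes(StandardCharsets.US_ASCII));
            }
            out.flush();
            return System.nanoTime() - start;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Untimed passes (count chosen arbitrarily) give HotSpot a chance
        // to compile the loop before anything is measured.
        for (int pass = 0; pass < 5; pass++) {
            timeLoop(new ByteArrayOutputStream(), 1000000);
        }
        // Only this pass is reported, so JVM startup is excluded.
        long nanos = timeLoop(new ByteArrayOutputStream(), 1000000);
        System.err.println("loop took " + nanos / 1000000 + " ms");
    }
}
```

The equivalent C measurement would wrap only the loop in clock() calls, so that both numbers cover the same work.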
 

Joshua Cranmer

Olivier said:
Hello,

I would like to show somebody what different languages look like by
doing a very simple case.
I was very surprised by the poor performance of the Java version!

Your metric is printing out a series of Strings. Note:
1. Java uses Unicode internally and must translate the strings before
outputting them.
2. You are printing out several times. The printing portion invokes the
JNI overhead.
3. The program is "small" -- the JVM startup time comes into account here.
4. The programs are not all equally-optimized.

This seemingly surprising result was brought up in another thread recently.
 

Jon Harrop

Mark said:
Search for the thread "Why is Java so slow???" starting on 19 November,
posted by "java.performance.expert" (who was clearly nothing of the sort).

Any particular response? All I can see is the usual drivel from the Java
community followed by the conjecture that Java isn't suitable for IO.

Oh, and Lew times screen update on his terminal in order to claim that Java
is as fast as C. Sigh...
 

Olivier Scalbert

Thanks for the interesting info Jon !!

Jon said:
Disclaimer: I am not a fan of Java.

Olivier said:
As you can see, the Java program's performance is far from the others. I
was very surprised! Before the tests, I was sure that the winner would be
C followed by Java, but that is not the case ...

Here's what I learned from my ray tracer language comparison:

http://www.ffconsultancy.com/languages/ray_tracer/

For a compiled language, Java is slow. However, this effect is compounded in
two important ways:

1. Getting accurate information about how to optimize Java code is almost
impossible.

2. Actually implementing the optimizations is typically unnecessarily slow
and tedious in Java.

In this case, you have already seen the two standard responses to a valid
question from the Java community:

1. This is "only" a microbenchmark.

2. You're measuring startup time.

In reality, IO is a very important aspect of many programs (including my ray
tracer benchmark, which was hopelessly IO bound _only_ in Java until some
kind soul taught me how to work around these deficiencies in Java) and you
are a long way from measuring the startup time, which will be a fraction of
a second in this case.

Olivier said:
Of course, I can improve the performance of the Java version by using a
StringBuffer and printing this buffer when it is bigger than a given size,
but that is not fair!

Actually the best solution is to use buffered output rather than a string
buffer to accumulate an intermediate result. Don't ask me why the Java
implementors aren't competent enough to buffer automatically but they
aren't.

So, you start by importing a namespace for doing non-sucky IO:

import java.io.*;

Then you create an unnecessarily long-named BufferedOutputStream:

BufferedOutputStream out = new BufferedOutputStream(System.out);

Finally, you write to this instead of System.out:

out.write(("abcdefghijk " + i + "\n").getBytes());

Holy mackerel, your Java code now takes only 1.275s instead of the original
3.135! That's a lot better than it was before but it is still over an
order of magnitude (!) slower than any decent language, e.g. C at 0.196s.

You see, Java is what the functional programming community refer to as
a "low-level language" because it simply fails to convey high-level
information to the compiler for optimization. In this case, the Java
compiler is probably being amazingly retarded in that it actually performs
two string concatenations in unicode for absolutely no reason whatsoever.

So let's try manually unrolling the code:

for (int i = 0; i < 1000000; i++) {
    out.write("abcdefghijk ".getBytes());
    out.write(Integer.toString(i).getBytes());
    out.write("\n".getBytes());
}

Wow, this is actually slower than it was before at 1.492s. But this is going
in the right direction. As it happens, Java is so crap that it can't even
hoist loop invariants as well as the C compilers from the 70s.

Hoisting manually, we finally arrive at:

import java.io.*;

public class Test
{
    public static void main(String[] args) throws java.io.IOException
    {
        BufferedOutputStream out = new BufferedOutputStream(System.out);

        byte[] s1 = "abcdefghijk ".getBytes();
        byte[] s2 = "\n".getBytes();

        for (int i = 0; i < 1000000; i++) {
            out.write(s1);
            out.write(Integer.toString(i).getBytes());
            out.write(s2);
        }
    }
}

Now this only takes 0.947s! Incredible, several times faster now and still
several times slower than any decent language!

In summary, what we have learned is that for the love of God (TM) don't use
Java if performance is at all important for your work. This doesn't just
apply to IO either: Java is very slow at a wide variety of vitally-
important tasks, e.g. allocation.
 

Jon Harrop

Mark said:
1. Modern Java systems monitor the execution of the program for a little
while before compiling. This allows a better choice of optimisation
strategy, but doesn't produce the fastest results on applications that
take much less than a second to run anyway

This benchmark takes several seconds to run in Java on a high-end machine =>
startup time is as irrelevant here as it was the last time.

Mark said:
(but if a program takes less than a second why do you care how much less).

If I'm running a program many times then I'll care about Java's
uniquely-poor performance at starting up.

Mark said:
2. Unicode vs ASCII. Java always works with Unicode and converts the
characters to whatever encoding is in use on the machine. Your C
equivalent assumes ASCII. Try including a £ character in your test; Java
will get this right so long as the character encoding is set correctly
(this can be done at execution time).

Is Java unable to express the efficient solution that we almost always want?

Mark said:
3. Output buffering. By default Java output to System.out is line
buffered. That is, the output buffer is flushed at the end of every line.
The C 'equivalent' usually uses a much larger buffer by default (512
bytes or more). You can change Java to use a larger buffer and not flush
at EOL or you could change C to use line buffered output. Unless you
change one or the other your comparison is flawed.

Making this program run as slowly as Java does in any other language is
actually almost impossible.

Mark said:
4. You could look back a week or two and find that someone else proposed
a similar test, with similar flaws. After seeing similar mistakes on an
almost weekly basis for at least 10 years, it is not surprising that
people get a bit tired of providing detailed explanations of what is
wrong.

Yet nobody has ever managed to give an adequate solution to this
enormously-important problem or many others, which is precisely why Java
deserves its label as a slow language.
 

Jon Harrop

Lew said:
First, add character encoding to the other languages' examples.

Why? We don't want char encoding so there is no logical reason to cripple
all other implementations just because Java can't handle it.

Lew said:
Second, change the buffer in the output streams (or Writer, in Java's
case) to be the same size in every example.

You can't even cripple most other implementations in that way.

Lew said:
Third, don't time the startup time of the Java program. Just time the
loops in each language.

The startup time is only ~4% of the total time anyway on this benchmark,
because Java is so slow.

Lew said:
Fourth, since, unlike C, Java is dynamically optimized, not statically,
run the loop enough times for optimization to kick in (which you likely
are already doing). Ideally, you'd run the loop a while without timing it,
then again with timing, letting the Hotspot compiler do its thing.

If you're not doing that with your normal programs, why do it here? Is
Hotspot too stupid to reuse previous results? How much scope do you think
there is for dynamic optimization here anyway?
 

Lew

Jon said:
Oh, and Lew times screen update on his terminal in order to claim that Java
is as fast as C. Sigh...

Huh? What are you talking about?

I have never done that.
 

Jon Harrop

Lew said:
Huh? What are you talking about?

I have never done that.

I don't think it was intentional but by not piping the output to /dev/null
or somewhere else you were measuring the speed at which it drew on the
screen. That is the only reason your results were so similar for C and
Java.

Someone else replied using a pipe and demonstrated that it should run almost
100x faster if you don't do that.

An honest mistake I'm sure but I did think it was quite funny. ;-)
 

Joshua Cranmer

Jon said:
Why? We don't want char encoding so there is no logical reason to cripple
all other implementations just because Java can't handle it.

Why cripple Java, or any other programming language for that matter, by
having it merely print out one million lines of code? Use sed, awk, or
bash for that matter.

Jon said:
If you're not doing that with your normal programs, why do it here? Is
Hotspot too stupid to reuse previous results? How much scope do you think
there is for dynamic optimization here anyway?

Java's performance gets better in the long run thanks to Hotspot
optimization; you're metering a program with an abnormally short
lifespan that's not "realistic" at all.

The problem with most benchmarks is that almost all of them are poorly
designed. The best benchmark is one that takes full power of the
language and "standard" libraries to mimic real-world applications. Most
benchmarks are far from real-world examples. How many programs actually
print out 1,000,000 lines of output in their entire lifespan?

I believe that there is a published benchmark measuring the FPS of a
Quake implementation in both Java and C; IIRC, the Java actually does
better than the C much of the time.

Don't complain about performance until you have benchmarks of that sort.
 

Mark Thornton

Jon said:
No, char encoding is not an adequate explanation of a 15x slowdown. This is
Java being very poor at buffering.

Java is not poor at buffering it just doesn't do it by default. In the
specific case of System.out it is buffered, but that buffer is flushed
at the end of every line. There are good reasons why this behaviour is
often useful, just not in this context.
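
A minimal sketch of switching that per-line flush off, taking the behaviour described above as given. The class name `BufferedStdout`, the helper `blockBuffered`, and the 64 KB buffer size are illustrative assumptions; the PrintStream constructor with an autoFlush flag and FileDescriptor.out are standard library pieces.

```java
import java.io.BufferedOutputStream;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;

public class BufferedStdout {
    // Hypothetical helper: wraps a raw stream in a PrintStream that does
    // NOT flush on newline (autoFlush = false) and buffers bufferSize bytes.
    static PrintStream blockBuffered(OutputStream raw, int bufferSize) {
        return new PrintStream(new BufferedOutputStream(raw, bufferSize), false);
    }

    public static void main(String[] args) {
        // Write to the process's standard output, bypassing System.out.
        PrintStream out = blockBuffered(new FileOutputStream(FileDescriptor.out), 64 * 1024);
        for (int i = 0; i < 1000000; i++) {
            out.println("abcdefghijk " + i);
        }
        // Nothing flushes automatically any more, so do it before exiting.
        out.flush();
    }
}
```

The trade-off is the one hinted at above: interactive output no longer appears line by line, which is why this is not the default.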

Mark Thornton
 

Olivier Scalbert

Jon said:
public class Test
{
    public static void main(String[] args) throws java.io.IOException
    {
        BufferedOutputStream out = new BufferedOutputStream(System.out);

        byte[] s1 = "abcdefghijk ".getBytes();
        byte[] s2 = "\n".getBytes();

        for (int i = 0; i < 1000000; i++) {
            out.write(s1);
            out.write(Integer.toString(i).getBytes());
            out.write(s2);
        }
    }
}


Oops, do not forget to call:
out.close();

otherwise, you can lose characters!
;-)
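
One way to make the close() impossible to forget, sketched under the assumption of Java 8 or later (newer than the 1.6 JVM used in this thread): try-with-resources closes, and therefore flushes, the stream when the block exits. The class name `CloseDemo` and the helper `writeLines` are made up for this example.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CloseDemo {
    // Hypothetical helper: the same loop as in the quoted program, but the
    // buffered stream is closed automatically at the end of the try block.
    static void writeLines(OutputStream sink, int lines) {
        byte[] s1 = "abcdefghijk ".getBytes(StandardCharsets.US_ASCII);
        byte[] s2 = "\n".getBytes(StandardCharsets.US_ASCII);
        try (BufferedOutputStream out = new BufferedOutputStream(sink)) {
            for (int i = 0; i < lines; i++) {
                out.write(s1);
                out.write(Integer.toString(i).getBytes(StandardCharsets.US_ASCII));
                out.write(s2);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Closing System.out here is harmless because the program ends.
        writeLines(System.out, 1000000);
    }
}
```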
 

Mark Thornton

Jon said:
Why? We don't want char encoding so there is no logical reason to cripple
all other implementations just because Java can't handle it.

If you set the output encoding to ISO-8859-1, then the overhead is
usually insignificant. However people who mishandle text containing
characters outside US ASCII, in my opinion, deserve a special place in
hell. It is really unpleasant trying to recover usable data that has
been mangled in this way.

Jon said:
You can't even cripple most other implementations in that way.

Of course you can. You can set buffers of any size you want in both Java and
C, and select flushing policies in both too. If you want a fair
comparison, just make sure that you use the same conditions in both cases.

Jon said:
The startup time is only ~4% of the total time anyway on this benchmark,
because Java is so slow.

Until you equalise the buffering policies, you aren't comparing like
with like.

Jon said:
If you're not doing that with your normal programs, why do it here? Is
Hotspot too stupid to reuse previous results? How much scope do you think
there is for dynamic optimization here anyway?

If your benchmark is too simple to allow scope for dynamic optimisation
it is likely to be a poor representation of real use. If your
application is computationally intensive, the results of previous runs
can be a poor guide to optimising the current run. Therefore on real
work the benefit of retaining previous HotSpot data is not as great as a
simplistic test might suggest.

One downside of techniques like those used in HotSpot is that it makes
writing useful micro benchmarks much harder. Some suggest that a useful
micro benchmark is now all but impossible. HotSpot isn't the only
culprit here; modern CPUs also have very complex performance
characteristics that similarly confound attempts at simple benchmarks.
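
For the encoding and buffering points earlier in this post, a rough sketch of choosing both explicitly, assuming Java 8 or later. The class name `Latin1Output` and the helpers `latin1Writer` and `encode` are illustrative only. Note that the BufferedWriter size is counted in chars and that the encoder underneath keeps a byte buffer of its own, so this only approximates "the same buffer size" as a C program.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Latin1Output {
    // Hypothetical helper: a Writer with an explicit ISO-8859-1 encoding
    // and an explicit buffer size (in chars), with no per-line flushing.
    static Writer latin1Writer(OutputStream raw, int bufferSize) {
        return new BufferedWriter(
                new OutputStreamWriter(raw, StandardCharsets.ISO_8859_1), bufferSize);
    }

    // Encodes one string through the writer above and returns the bytes.
    static byte[] encode(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = latin1Writer(sink, 8192)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Writer out = latin1Writer(System.out, 8192);
        for (int i = 0; i < 1000000; i++) {
            out.write("abcdefghijk ");
            out.write(Integer.toString(i));
            out.write('\n');
        }
        out.flush();
    }
}
```

With this encoding a pound sign ("\u00A3") comes out as the single byte 0xA3, which is the kind of non-ASCII case raised earlier in the thread.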

Mark Thornton
 

Michael Jung

Jon Harrop said:
Hoisting manually, we finally arrive at:

import java.io.*;

public class Test
{
    public static void main(String[] args) throws java.io.IOException
    {
        BufferedOutputStream out = new BufferedOutputStream(System.out);

        byte[] s1 = "abcdefghijk ".getBytes();
        byte[] s2 = "\n".getBytes();

        for (int i = 0; i < 1000000; i++) {
            out.write(s1);
            out.write(Integer.toString(i).getBytes());
            out.write(s2);
        }
    }
}

There's still some optimization possible, since "Integer.toString(i)"
still carries encodings.
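
A sketch of what removing that last conversion might look like: the digits are written straight into a byte array, so no String and no charset encoder is involved per line. The class name `AsciiInt` and the helper `putDigits` are hypothetical, the helper handles non-negative values only, and whether this is actually faster would need measuring.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class AsciiInt {
    // Hypothetical helper: writes the decimal digits of a non-negative int
    // into buf so that they end just before index `end`, and returns the
    // index of the first digit.
    static int putDigits(int value, byte[] buf, int end) {
        int pos = end;
        do {
            buf[--pos] = (byte) ('0' + value % 10);
            value /= 10;
        } while (value != 0);
        return pos;
    }

    public static void main(String[] args) throws IOException {
        BufferedOutputStream out = new BufferedOutputStream(System.out);
        byte[] s1 = "abcdefghijk ".getBytes(StandardCharsets.US_ASCII);
        byte[] digits = new byte[10]; // an int has at most 10 decimal digits
        for (int i = 0; i < 1000000; i++) {
            out.write(s1);
            int start = putDigits(i, digits, digits.length);
            out.write(digits, start, digits.length - start);
            out.write('\n');
        }
        out.flush();
    }
}
```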

Jon Harrop said:
Now this only takes 0.947s! Incredible, several times faster now and still
several times slower than any decent language!

In summary, what we have learned is that for the love of God (TM) don't use
Java if performance is at all important for your work. This doesn't just
apply to IO either: Java is very slow at a wide variety of vitally-
important tasks, e.g. allocation.

The only thing you can learn from such examples is that you should
take a programming language appropriate to your problem domain.
Performance is a part of that domain.

(For what it's worth: people have already provided examples that show
that Java can be at least as performant as C in certain problem
domains.)

Michael
 

Olivier Scalbert

Joshua said:
The problem with most benchmarks is that almost all of them are poorly
designed. The best benchmark is one that takes full power of the
language and "standard" libraries to mimic real-world applications. Most
benchmarks are far from real-world examples. How many programs actually
print out 1,000,000 lines of output in their entire lifespan?

More than 1,000,000 lines of output? Trust me, a lot of programs!
It depends on your work!
 

Mark Thornton

Jon said:
In summary, what we have learned is that for the love of God (TM) don't use
Java if performance is at all important for your work. This doesn't just
apply to IO either: Java is very slow at a wide variety of vitally-
important tasks, e.g. allocation.

Oddly you mention allocation, which is often extremely fast in Java
relative to C, especially in multithreaded systems. Despite a
superficial resemblance to C++, Java's performance characteristics are
significantly different (and that does not mean slow). My work is
computationally intensive, yet has not been hampered at all by using Java.
Advice on Java performance is more easily obtained if your posts are
phrased so as not to look like yet another troll out to trash Java.

Mark Thornton
 

Mark Thornton

Olivier said:
More than 1,000,000 lines of output? Trust me, a lot of programs!
It depends on your work!

I've used Java to compute digests of every file on a (large) disk. The
execution time was disk bound.

Mark Thornton
 

Olivier Scalbert

Let me clarify some points.

1- I never wanted to troll or attack the Java platform.

2- I need to create huge CSV files for testing (several hundred million
lines) that will be piped into other Unix programs and then imported into
a database.

3- I started with Java, which is my everyday language. On a medium-size file
(1 million lines) I felt that the performance was not so good. I did
some tests with other languages that I know less well and obtained better
results. So I reduced the problem and presented it to you. That is all.

4- With the BufferedOutputStream tip, I get much better results. If I
generate 100 million lines, C, Perl, and Java have nearly the same
performance for this small test.

For C:
real 1m15.435s
user 0m32.418s
sys 0m8.729s

For Perl:
real 1m27.213s
user 1m6.180s
sys 0m8.805s

For java:
real 2m15.236s
user 1m52.683s
sys 0m14.797s

For java -server:
real 1m33.065s
user 1m12.065s
sys 0m11.365s

5- So now I am reassured and I will continue with Java. Of course,
as there will be much more computation, the gap between compiled and
interpreted languages will increase.


Thanks to everyone.

Olivier
 

Daniel Pitts

Olivier said:
Hello,

I would like to show somebody what different languages look like by
doing a very simple case.
I was very surprised by the poor performance of the Java version!

Here are the programs in different languages:

in C:

#include <stdio.h>

int main()
{
    int i;

    for(i = 0; i < 1000000; i++)
    {
        printf("abcdefghijk %d\n", i);
    }

    return 0;
}

time ./test1 > out.txt

real 0m0.710s
user 0m0.576s
sys 0m0.124s

in java:
public class Test
{
    public static void main(String[] args)
    {
        for (int i = 0; i < 1000000; i++)
        {
            System.out.println("abcdefghijk " + i);
        }
    }
}

time java Test > out.txt

real 0m12.364s
user 0m4.180s
sys 0m7.676s

time java -server Test > out.txt

real 0m10.537s
user 0m3.120s
sys 0m6.544s


That is not good at all !
ols@tatooine:~/projects/ruby$ java -version
java version "1.6.0_02"
Java(TM) SE Runtime Environment (build 1.6.0_02-b05)
Java HotSpot(TM) Client VM (build 1.6.0_02-b05, mixed mode, sharing)

ols@tatooine:~/projects/ruby$ uname -a
Linux tatooine 2.6.22-14-generic #1 SMP Sun Oct 14 23:05:12 GMT 2007
i686 GNU/Linux

In python:
i=0
while i < 1000000:
    print "abcdefghijk", i
    i=i+1

time python test.py > out.txt

real 0m2.292s
user 0m2.064s
sys 0m0.112s

In perl:
for ($count = 0; $count < 1000000; $count++)
{
    print "abcdefghijk $count\n";
}
time perl test.pl > out.txt

real 0m1.243s
user 0m1.060s
sys 0m0.160s

In ruby:
counter = 0
while counter < 1000000
  puts("abcdefghijk #{counter}")
  counter+=1
end

time ruby test.rb > out.txt

real 0m4.731s
user 0m4.452s
sys 0m0.100s

As you can see, the Java program's performance is far from the others. I
was very surprised! Before the tests, I was sure that the winner would be
C followed by Java, but that is not the case ...

Of course, I can improve the performance of the Java version by using a
StringBuffer and printing this buffer when it is bigger than a given size,
but that is not fair!

If you have any ideas, they are welcome!

Thanks,

Olivier

I did my own benchmarks using these files:
bench.c:
#include <stdio.h>
#include <time.h>

void bench() {
    long foo = 0;
    clock_t start = clock();
    /* Count every pair (i, j) with j < i where j divides i. */
    for (long i = 1; i < 5000; ++i) {
        for (long j = 1; j < i; ++j) {
            if ((i % j) == 0) {
                foo ++;
            }
        }
    }
    clock_t end = clock();
    /* foo is a long, so it is printed with %ld rather than %d. */
    printf("%ld %dms\n", foo,
           (int) ((end - start) * 1000 / CLOCKS_PER_SEC));
}


int main() {
    for (long i = 1; i < 10; ++i) {
        printf("%ld: ", i);
        bench();
    }
}

Bench.java:

public class Bench {
    static final long CLOCKS_PER_SEC = 1000;
    static void bench() {
        int foo = 0;
        long start = System.currentTimeMillis();
        for (int i = 1; i < 5000; ++i) {
            for (int j = 1; j < i; ++j) {
                if ((i % j) == 0) {
                    foo ++;
                }
            }
        }
        long end = System.currentTimeMillis();
        System.out.printf("%d %dms\n", foo,
                (int) ((end - start) * 1000 / CLOCKS_PER_SEC));
    }


    public static void main(String[] args) {
        for (int i = 1; i < 10; ++i) {
            System.out.printf("%d: ", i);
            bench();
        }
    }
}

Then ran these:
-bash-3.00$ java -version
java version "1.5.0_09"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_09-b03)
Java HotSpot(TM) Client VM (build 1.5.0_09-b03, mixed mode, sharing)
-bash-3.00$ javac Bench.java
-bash-3.00$ g++ --version
g++ (GCC) 3.3.3 (NetBSD nb3 20040520)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is
NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.
-bash-3.00$ java -server Bench
1: 38357 457ms
2: 38357 416ms
3: 38357 401ms
4: 38357 394ms
5: 38357 394ms
6: 38357 401ms
7: 38357 395ms
8: 38357 401ms
9: 38357 394ms
-bash-3.00$ java -client Bench
1: 38357 421ms
2: 38357 400ms
3: 38357 394ms
4: 38357 400ms
5: 38357 393ms
6: 38357 393ms
7: 38357 400ms
8: 38357 394ms
9: 38357 401ms
-bash-3.00$ ./bench
1: 38357 450ms
2: 38357 440ms
3: 38357 450ms
4: 38357 430ms
5: 38357 450ms
6: 38357 440ms
7: 38357 450ms
8: 38357 440ms
9: 38357 450ms


This looks to me like the c version is slower...
 

Steve Wampler

Jon said:
On the contrary, Olivier's test is simple and flawless. Can you make the
Java nearly as fast as the C? Feel free to not use unicode, or does Java
have poor support for everything else?

Well, if you like such simple and flawless tests, here's another one.
Of course, this one is (on my machine) over 700 times *faster* in
Java than in C. So, why does C have such poor performance - surely
it's been around long enough that optimizers should be able to do as well
as Java on such a simple test.

(By the way, the 700 is pretty arbitrary - I could just as easily
make it 700,000 times faster in Java!)

*This is why such 'simple, flawless' tests are so flawed*...

------------ C version and time ---
#include <stdio.h>
#include <stdlib.h>  /* malloc */
#include <string.h>  /* strlen */

int
main(int ac, char **av) {
    char *s = malloc(1024*1024);
    int i = 0;
    for (i = 0; i < 1024*1023; ++i) {
        s[i] = 'a';
    }
    s[i] = '\0';  /* terminate right after the last 'a' */

    int k = 0;
    int j = 0;
    int n = 0;
    for (n = 0; n < 100; ++n) {
        for (j = 0; j < 1024*1024*1024; ++j) {
            k = strlen(s);
        }
    }
    printf("s has %ul (%u) characters\n", strlen(s), k);
}
->gcc -O4 t1.c -o t1
->time t1
s has 1047552l (1047552) characters
t1 113.74s user 0.25s system 99% cpu 1:54.18 total
->
------------------------------------------------------------

--------------------------------- Java version and time ---
class T1 {

    public static void main(String[] args) {
        StringBuilder s = new StringBuilder(1024*1024);
        for (int i = 0; i < 1024*1023; ++i) {
            s.append('a');
        }
        int k = 0;
        for (int n = 0; n < 100; ++n) {
            for (int i = 0; i < (1024*1024*1024); ++i) {
                k = s.length();
            }
        }
        System.out.println("s has "+s.length()+" ("+k+") characters.");
    }
}
 

Steve Wampler

Steve said:
--------------------------------- Java version and time ---
class T1 {

    public static void main(String[] args) {
        StringBuilder s = new StringBuilder(1024*1024);
        for (int i = 0; i < 1024*1023; ++i) {
            s.append('a');
        }
        int k = 0;
        for (int n = 0; n < 100; ++n) {
            for (int i = 0; i < (1024*1024*1024); ++i) {
                k = s.length();
            }
        }
        System.out.println("s has "+s.length()+" ("+k+") characters.");
    }
}
----------------------------------------------------------------

Sigh. The test may be simple and flawless, but my cutting and pasting
wasn't. Here's the missing part:
---------------------------------------------------------
->javac T1.java
->time java -server T1
s has 1047552 (1047552) characters.
java -server T1 0.12s user 0.04s system 101% cpu 0.158 total
->
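
One plausible reading of the gap (an assumption, not something measured here): StringBuilder stores its length, so length() is a cheap field read that the JIT is free to hoist out of the loop, whereas a C-style string has to be scanned to its terminator on every strlen call unless the compiler can prove the call is loop-invariant. The sketch below contrasts the two in Java; the class name `LengthCost` and the helper `scanLength` are invented for illustration.

```java
import java.util.Arrays;

public class LengthCost {
    // C-style length: scan for the terminating 0 byte, so the cost grows
    // with the length of the string (this mirrors what strlen has to do).
    static int scanLength(byte[] s) {
        int n = 0;
        while (s[n] != 0) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        int size = 1024 * 1023;
        // One extra byte is left at 0 to act as the terminator.
        byte[] cString = new byte[size + 1];
        Arrays.fill(cString, 0, size, (byte) 'a');

        StringBuilder sb = new StringBuilder(size);
        for (int i = 0; i < size; i++) {
            sb.append('a');
        }
        // Same answer, very different cost per call.
        System.out.println(scanLength(cString) + " " + sb.length());
        // prints "1047552 1047552"
    }
}
```

Either way, the two programs are not doing the same work per iteration, which is the point being made about "simple, flawless" tests.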
 
