Java vs C++ speed (IO & Sorting)

Joe Greer

server than it is to write CGI programs in C++. (Curiously, one
of the application domains where I think Java would have the
edge would be light weight graphic clients---a very good, fully
integrated GUI library and portability of the compiled code
would seem to be major trump cards for that. But it doesn't
seem to be widely used there.)

In my opinion, there are several things that stand in the way of this.
First, there is more to look and feel than properly drawn controls. This
makes a generic GUI feel just a little wrong everywhere. Second,
interfacing to native code from Java, while doable, is a pain. Third,
there aren't as many folks worried about GUI portability as you might
imagine. There just isn't the bang for the buck you want in order to go
through the porting effort.

joe
 
Razii

Excuse me? Those numbers show a 32% variance while yours show a 38%
variance in the other direction. If 629 and 429 aren't much
difference, then neither are your numbers.

I never claimed Java is faster. People in this newsgroup have claimed
that java is *much* slower. In other words, you must show significant
speed advantage in this test case.
 
Lew

James said:
Modern Java has templates, and even back when I was using it
regularly, before it had templates, we didn't have to cast that
much.

Sorry, but that is not accurate. Java does not have templates. Java has
generics, which use a typographical notation similar to templates, but are
neither called "templates" nor work the way C++ templates do.

Generics were introduced in Java 5, some three-plus years ago. Unlike C#
generics they pertain only to the compile phase, not to run-time types. Java
generics are quite powerful, but the lack of reification is a major complaint
in the Java universe. Java programmers are used to run-time typing and seem
to resent that generics aren't part of it.

Personally, while I agree that generic type erasure is annoying, I find the
discipline of separating run-time and compile-time notions to be a powerful
aid to reasoning about my programs, especially in a dynamic language such as
Java or C#.
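
To make the templates-versus-generics distinction concrete, here is a minimal C++ sketch. It is not from the thread, and the helper name `elementKind` is made up for illustration. It shows that each template instantiation is a separate type that the compiler can tell apart:

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Each instantiation of a C++ template is a distinct type; checked at compile time.
static_assert(!std::is_same<std::vector<int>, std::vector<std::string> >::value,
              "vector<int> and vector<string> are different types");

// Hypothetical helper: two overloads that differ only in the template argument.
// This is legal C++ because the two parameter types are unrelated.
inline const char* elementKind(const std::vector<int>&) { return "int"; }
inline const char* elementKind(const std::vector<std::string>&) { return "string"; }
```

The equivalent pair of Java overloads, taking `List<Integer>` and `List<String>`, is rejected by the Java compiler because both parameter types erase to `List`.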
 
Razii

Which you did not do in your test.

I am compiling with JET now..
Complete nonsense. If you are using the VM to load and run your class
file, the time must count.

Nope. VM load time has nothing to do with what we are testing. The C++
code also times only reading the file, sorting, and writing
the file.
 
witmer

I never claimed Java is faster. People in this newsgroup have claimed
that java is *much* slower. In other words, you must show significant
speed advantage in this test case.

32% is extremely significant.
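
For reference, the percentage depends on which number is taken as the base. A small sketch of the arithmetic, using the 629 and 429 figures quoted above (the function name is made up, and which base the posters had in mind is an assumption):

```cpp
#include <cmath>

// Difference between two timings, expressed as a percentage of `base`.
inline double percentDifference(double base, double other) {
    return std::fabs(base - other) / base * 100.0;
}
```

With 629 as the base, `percentDifference(629, 429)` gives about 31.8, which matches the "32%" figure. With 429 as the base, the same 200-unit gap is about 46.6%.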
 
Lew

Razii said:
You are wasting time and trolling. Memory allocation is no issue in
the part we tested, as someone else also noted.

Perhaps I am wasting time, indeed even by participating in Benchmark Wars at
all, but I suspect you misconstrue. My point was that any memory
de-allocation in Java is pretty much unavoidable, as it's built into the JVM,
and therefore that all Java benchmarks will be affected by it.

I was not disagreeing with your point at all. I didn't say that the C++
version de-allocated memory, only that de-allocation in the Java version,
*should it occur*, would perforce be measured.

So settle down, and get back to your Benchmark Wars already in progress. No
need to go all /ad hominem/ on me.
 
Michael DOUBEZ

Razii wrote:
I never claimed Java is faster. People in this newsgroup have claimed
that java is *much* slower. In other words, you must show significant
speed advantage in this test case.

On my system, C++ pays a penalty when writing to the disk.
If you remove the writing, you lower the noise in the measurement. I guess
the STL I use could benefit from some optimisation in this part.

I for one never claimed Java is *much* slower; JITs do wonders, truly, and
I expect increased processor power will narrow the gap. Nowadays it is
more often I/O access that is the bottleneck.

Here are the results of the modified versions (removing the write part):

D:\gnuwin32>java IOSort && IOSort.exe
Time for reading, sorting, writing: 1448 ms
Time for reading, sorting, writing: 1167 ms

D:\gnuwin32>java IOSort && IOSort.exe
Time for reading, sorting, writing: 1402 ms
Time for reading, sorting, writing: 1183 ms

D:\gnuwin32>java IOSort && IOSort.exe
Time for reading, sorting, writing: 1417 ms
Time for reading, sorting, writing: 1183 ms

--- Inverting order -------

D:\gnuwin32>IOSort.exe && java IOSort
Time for reading, sorting, writing: 1152 ms
Time for reading, sorting, writing: 1183 ms

D:\gnuwin32>IOSort.exe && java IOSort
Time for reading, sorting, writing: 918 ms
Time for reading, sorting, writing: 1354 ms

D:\gnuwin32>IOSort.exe && java IOSort
Time for reading, sorting, writing: 1074 ms
Time for reading, sorting, writing: 1386 ms


Michael
 
dave_mikesell

Nope. VM load time has nothing to do with what we are testing. The C++
code also times only reading the file, sorting, and writing
the file.

If you're testing the Java solution to this problem, you are by
definition testing the performance of the runtime.

It's like you saying you can get a kite to 200 feet faster from the
top of a 15-story building, than I can from the ground, but we can't
count the time it took you to take the elevator.
 
Lew

You can't run the Java version without the VM. It's perfectly
reasonable (and intellectually honest) to count the VM startup time.

That depends on what you're measuring. For many server applications, startup
time is a negligible fraction of the application's uptime. If you want to
know how fast the Hotspot-optimized version of an algorithm is, you'll not
only factor out the JVM startup time, but you'll run the loop a zillion times
before starting the clock in order to give Hotspot a chance to care. If
you're intellectually honest.
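
The "warm up first, then start the clock" idea can be sketched as a small harness. It is written in C++ to match the code in this thread; the function name and parameters are made up. In C++ the warm-up mostly affects caches, whereas on a JVM it also gives the JIT time to compile hot code.

```cpp
#include <chrono>
#include <functional>

// Calls `work` `warmups` times untimed, then returns the mean wall-clock
// time in milliseconds over `runs` timed calls (`runs` must be > 0).
inline double timeAfterWarmup(const std::function<void()>& work,
                              int warmups, int runs) {
    for (int i = 0; i < warmups; ++i) work();   // not measured

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) work();      // measured
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / runs;
}
```

How many warm-up iterations are enough depends on the runtime and the workload, so the counts are left to the caller.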
 
Razii

More results, this time with the Java class compiled to a native Windows
executable (instead of using the VM) with the JET compiler.

http://www.excelsior-usa.com/ (JET compiler can be found here)


Time for reading, sorting, writing: 203 ms (Java with JET)
Time for reading, sorting, writing: 203 ms (Java with JET)
Time for reading, sorting, writing: 188 ms (Java with JET)

(c++ using multiset)
Time for reading, sorting, writing: 328 ms (c++)
Time for reading, sorting, writing: 312 ms (c++)
Time for reading, sorting, writing: 312 ms (c++)


10 bibles (43 meg file)

Time for reading, sorting, writing: 2453 ms (Java with JET)
Time for reading, sorting, writing: 2391 ms (Java with JET)
Time for reading, sorting, writing: 2344 ms (Java with JET)
Time for reading, sorting, writing: 2437 ms (Java with JET)

Time for reading, sorting, writing: 5281 ms (c++)
Time for reading, sorting, writing: 5703 ms (c++)
Time for reading, sorting, writing: 3921 ms (c++)
Time for reading, sorting, writing: 3718 ms (c++)
 
Lew

If you're testing the Java solution to this problem, you are by
definition testing the performance of the runtime.

It's like you saying you can get a kite to 200 feet faster from the
top of a 15-story building, than I can from the ground, but we can't
count the time it took you to take the elevator.

Apples and oranges. It's all about what you intend to measure. Demanding
that he intend to measure something different isn't fair. He's disclosed the
limits of his testing, that's all that's required of intellectual
responsibility. It's up to you to decide the relevance of that limit, not him
to change it.

As I mention elsewhere, for many applications and architectures JVM startup
time is irrelevant. What you want is granular predictability - know how much
JVM overhead contributes, and separately how long the algorithm will take.
Add them together if that's relevant to your analysis, don't if it isn't.
 
Razii

Which you did not do in your test.

I hope the following makes you really happy

http://www.excelsior-usa.com/ (JET compiler can be found here)


Time for reading, sorting, writing: 203 ms (Java with JET)
Time for reading, sorting, writing: 203 ms (Java with JET)
Time for reading, sorting, writing: 188 ms (Java with JET)

(c++ using multiset)
Time for reading, sorting, writing: 328 ms (c++)
Time for reading, sorting, writing: 312 ms (c++)
Time for reading, sorting, writing: 312 ms (c++)


10 bibles (43 meg file)

Time for reading, sorting, writing: 2453 ms (Java with JET)
Time for reading, sorting, writing: 2391 ms (Java with JET)
Time for reading, sorting, writing: 2344 ms (Java with JET)
Time for reading, sorting, writing: 2437 ms (Java with JET)

Time for reading, sorting, writing: 5281 ms (c++)
Time for reading, sorting, writing: 5703 ms (c++)
Time for reading, sorting, writing: 3921 ms (c++)
Time for reading, sorting, writing: 3718 ms (c++)
 
dave_mikesell

That depends on what you're measuring.  For many server applications, startup
time is a negligible fraction of the application's uptime.  If you want to
know how fast the Hotspot-optimized version of an algorithm is, you'll not
only factor out the JVM startup time, but you'll run the loop a zillion times
before starting the clock in order to give Hotspot a chance to care.  If
you're intellectually honest.

So are you also discounting the time it takes the larger executable to
be loaded by the OS?

And by all means, I encourage the OP to write a more real-world
application to benchmark where startup time is negligible.
 
dave_mikesell

I hope the following makes you really happy

Very good. You've proven beyond a shadow of a doubt that Java is the
better language for this particular toy benchmark, at least on your
machine.
 
Jerry Coffin

This topic was on these newsgroups 7 years ago :)

[ ... ]
Back to see if anything has changed

Not much -- you're still a troll, and people still respond to trolls.

[ ... ]
The question still is (7 years later), where is great speed advantage
you guys were claiming for c++?

Anybody who claims a major speed advantage for C++ (or much of anything
else) on an application that's mostly I/O bound is nuts. I doubt anybody
has claimed any such thing.

While C++ iostreams are extremely versatile, they're not necessarily the
most efficient way to do I/O. This has little to do with the language,
per se, and a great deal to do with conscious tradeoffs. While it's
theoretically possible to design around most of those tradeoffs to
improve speed, most vendors seem uninterested.

I don't really care all that much about your Java code (which won't even
run for me) but just for fun, let's take a look at what happens to the
C++ version with a minor modification:

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <ctime>
#include <iterator>
#include <set>

// the main modification is here
#ifdef CSTDIO
#include "cstdio.h"
namespace s = JVC;
#else
namespace s = std;
#endif

// I've also gotten rid of the "using namespace std;" and explicitly
// qualified the names below.
//

int main() {

    typedef std::multiset<std::string> mss;
    mss buf;
    std::string linBuf;

    s::ifstream inFile("bible.txt");

    clock_t start=clock();

    while(s::getline(inFile,linBuf)) buf.insert(buf.end(), linBuf);

    s::ofstream outFile("output.txt");

    std::copy(buf.begin(),buf.end(),
        s::ostream_iterator<std::string>(outFile,"\n"));

    clock_t endt=clock();
    std::cout <<"Time for reading, sorting, writing: " <<
        double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
    return 0;
}

Now a few numbers, generated with VC++ 7.1:

compiled with: cl /O2b2 /G7ry /arch:SSE2 sort_bible.cpp
average speed for five runs: 230 ms

compiled with: cl /O2b2 /G7ry /arch:SSE2 /DCSTDIO sort_bible.cpp
average speed for five runs: 93 ms

As you can see, with a relatively trivial change, we've improved speed
by almost 2.5:1. That doesn't require a great deal of tricky coding or
anything like that either. The cstdio.h that's being included above
looks like this:

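#include <stdio.h>
#include <iterator>
#include <string>

#ifndef JVC_STREAM
#define JVC_STREAM

namespace JVC {

// Minimal output file wrapper over C stdio; only what the benchmark needs.
class ofstream {
    FILE *file;
public:
    ofstream(char const *name) { file = fopen(name, "w"); }

    ofstream &write(std::string const &s) {
        fputs(s.c_str(), file);
        return *this;
    }
};

inline ofstream &operator<<(ofstream &os, std::string const &s) {
    return os.write(s);
}

// Minimal input file wrapper; read() fetches one line per call.
class ifstream {
    FILE *file;
    bool good;
public:
    ifstream(char const *name) : file(fopen(name, "r")), good(file != 0) { }

    ifstream &read(std::string &s) {
        static char buffer[1024*1024];

        // fgets returns a null pointer at end of file or on error.
        good = file != 0 && fgets(buffer, sizeof(buffer), file) != 0;
        if (good) {
            s = buffer;
            // fgets keeps the trailing newline; std::getline does not.
            if (!s.empty() && s[s.size() - 1] == '\n')
                s.erase(s.size() - 1);
        }
        return *this;
    }
    // Non-null while the last read succeeded, so while(getline(...)) works.
    operator void *() { return good ? this : 0; }
};

inline ifstream &getline(ifstream &is, std::string &s) {
    return is.read(s);
}

template<class T>
class ostream_iterator {
    ofstream &os_;
    bool has_delim;
    std::string delim_;
public:
    // Nested types that some std::copy implementations look up.
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef void difference_type;
    typedef void pointer;
    typedef void reference;

    ostream_iterator(ofstream &os) :
        os_(os), has_delim(false)
    { }
    ostream_iterator(ofstream &os, std::string const &delim) :
        os_(os), has_delim(true), delim_(delim)
    { }

    // Assigning through the iterator writes the value, then the delimiter.
    ostream_iterator &operator=(T const &t) {
        os_ << t;
        if (has_delim)
            os_ << delim_;
        return *this;
    }
    // Dereference and increment are no-ops, as for std::ostream_iterator.
    ostream_iterator &operator*() { return *this; }
    ostream_iterator &operator++() { return *this; }
    ostream_iterator operator++(int) { return *this; }
};

}

#endif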

Of course, the benefit of this (if any) depends heavily upon the
compiler and standard library implementation you're using. With a really
efficient implementation of iostreams, this could reduce speed. With the
iostreams included with the versions of VC++ I've tried, the difference
is substantial enough to justify its use in quite a few cases.

The speed of this code depends almost entirely upon the implementation
of the standard library. For example, going from gcc 3.4 to gcc 4.3
roughly doubles the speed of the code (on my machine it's about 175-300
ms with gcc 3.4 and about 150-175 ms with gcc 4.3).

All in all, you've managed to do a better job than most: you're
obviously a troll. While many trolls are fond of meaningless benchmarks,
you've nearly set a new record for the worst benchmark ever!
 
Jerry Coffin

[ ... ]
I have a hard time imagining any simple way NOT to include GC time in the Java
timings.

You simply write code (like this) that almost certainly never does GC.
All memory it ever allocates remains allocated until the program
finishes execution, therefore there's never any garbage to collect. Yes,
it's theoretically possible that the garbage collector may run -- but
with nothing for it to really DO, the time taken is negligible.

By contrast, with a program that allocates and deletes memory on a
regular basis, the garbage collector actually DOES something when it
runs, and (no great surprise) the time taken to do something is greater
than the time taken to do nothing.
 
Steve Wampler

Complete nonsense. If you are using the VM to load and run your class
file, the time must count.

Do you count the time to boot the computer when running C++? :)
[Yes, I agree that if you did you'd have to do the same for Java!]

I have an environment where I can leave the JVM running and
load/run/unload class files. Why should the time to start
up that JVM matter in this test?
 
kasthurirangan.balaji

Very good. You've proven beyond a shadow of a doubt that Java is the
better language for this particular toy benchmark, at least on your
machine.


I understand the JVM makes Java platform independent, but the JVM itself
should definitely be platform dependent (and also optimized). I would say
it would be fair to compile the C++ with optimization enabled (at least
-O3 in gcc) and then compare, sticking strictly to the topic of Java vs
C++ speed (I/O and sorting). I would also request adding the unitbuf
option to the ofstream.

Thanks,
Balaji.
 
Razii

Anybody who claims a major speed advantage for C++ (or much of anything
else) on an application that's mostly I/O bound is nuts. I doubt anybody
has claimed any such thing.

This shows that you are either lying or have a bad memory. The last
time I checked the group, many here were claiming that Java IO is much
slower.

Have a look at this post by Pete Becker (Dinkumware, Ltd.,
http://www.dinkumware.com) posted to this very newsgroup...

http://groups.google.com/group/comp.lang.java.programmer/msg/1313c62be872ba7c?dmode=source


What is he trying to prove with these fake benchmarks?

Since you have been proven a liar by claiming no one has ever said IO
is slow in Java, I will ignore the rest of your post without bothering
to read.
 
dave_mikesell

Since you have been proven a liar by claiming no one has ever said IO
is slow in Java, I will ignore the rest of your post without bothering
to read.

OK, not only is this a troll, I'm starting to think it's a joke. Is
that Phil Hendrie behind the keyboard?
 
