Java vs C++ speed (IO & Sorting)

Discussion in 'C++' started by Razii, Mar 20, 2008.

  1. Razii

    Razii Guest

    This topic was on these newsgroups 7 years ago :)

    http://groups.google.com/group/comp.lang.c++/msg/695ebf877e25b287

    I said then: "How about reading the whole Bible, sorting by lines, and
    writing the sorted book to a file?"

    Who remembers that from 7 years ago? It was one of the longest threads
    on this newsgroup :)

    The text file used for the bible is here
    ftp://ftp.cs.princeton.edu/pub/cs126/markov/textfiles/bible.txt

    Back to see if anything has changed

    (downloaded whatever is latest version from sun.java.com)

    Time for reading, sorting, writing: 359 ms (Java)
    Time for reading, sorting, writing: 375 ms (Java)
    Time for reading, sorting, writing: 375 ms (Java)

    For C++ I used Visual C++ Express; the command was cl IOSort.cpp /O2

    Time for reading, sorting, writing: 375 ms (c++)
    Time for reading, sorting, writing: 390 ms (c++)
    Time for reading, sorting, writing: 359 ms (c++)

    The question still is (7 years later), where is the great speed
    advantage you guys were claiming for C++?

    ------------------- Java Code -------------- (same as 7 years ago :)

    import java.io.*;
    import java.util.*;

    public class IOSort
    {
        public static void main(String[] arg) throws Exception
        {
            ArrayList ar = new ArrayList(5000);
            String line = "";

            BufferedReader in = new BufferedReader(
                new FileReader("bible.txt"));
            PrintWriter out = new PrintWriter(new BufferedWriter(
                new FileWriter("output.txt")));

            long start = System.currentTimeMillis();
            while (true)
            {
                line = in.readLine();
                if (line == null)
                    break;
                if (line.length() == 0)
                    continue;
                ar.add(line);
            }

            Collections.sort(ar);
            int size = ar.size();
            for (int i = 0; i < size; i++)
            {
                out.println(ar.get(i));
            }
            out.close();
            long end = System.currentTimeMillis();
            System.out.println("Time for reading, sorting, writing: " +
                (end - start) + " ms");
        }
    }

    --------- C++ Code ---------------

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    #include <algorithm>
    #include <ctime>
    using namespace ::std;

    int main()
    {
        vector<string> buf;
        string linBuf;
        ifstream inFile("bible.txt");
        clock_t start = clock();
        buf.reserve(50000);

        while (getline(inFile, linBuf)) buf.insert(buf.end(), linBuf);
        sort(buf.begin(), buf.end());
        ofstream outFile("output.txt");
        copy(buf.begin(), buf.end(), ostream_iterator<string>(outFile, "\n"));
        clock_t endt = clock();
        cout << "Time for reading, sorting, writing: " << endt - start << " ms\n";
        return 0;
    }
    Razii, Mar 20, 2008
    #1

  2. Razii

    Tim H Guest

    Did this include JVM startup time? What were the memory footprints?
    Tim H, Mar 20, 2008
    #2

  3. Razii

    Razii Guest

    On Wed, 19 Mar 2008 23:39:01 -0700 (PDT), Tim H <>
    wrote:

    >Did this include JVM startup time? What were the memory footprints?



    Read the code.. you will see where the time comes from. What does
    the start time of the virtual machine have to do with the time for
    reading, sorting and writing the file?
    Razii, Mar 20, 2008
    #3
  4. Razii

    Guest

    On Mar 20, 2:10 am, Razii <> wrote:
    > The question still is (7 years later), where is great speed advantage
    > you guys were claiming for c++?


    Well I was not involved in that original topic, but I can tell you
    that Java has improved a lot over the years. VM startup times aside,
    there are VMs that will compile to native code on the fly, the byte
    code optimizers have been greatly improved, and there are even CPUs
    that execute Java byte code directly (not on your test platform, but
    you'll find these on devices like PDAs and mobile phones).

    C++ will bring you closer to the hardware you are developing for, that
    is one of the strengths of the language, but Java can be just as
    respectable as far as performance goes. It really just depends on what
    you are using the language for. Use the most appropriate tool for the
    job.

    Also, comparing to your results 7 years ago, it looks like Java has
    slowed down a bit, relatively. :-D

    Jason
    , Mar 20, 2008
    #4
  5. Razii

    Razii Guest

    On Wed, 19 Mar 2008 23:57:25 -0700 (PDT), ""
    <> wrote:


    >Also, comparing to your results 7 years ago, it looks like Java has
    >slowed down a bit, relatively. :-D


    Slowed down? It was 2080 ms in the google link that I posted. It's 359
    ms this time (however, the bible.txt file was different back then. So
    there can't be any comparison with the old times).
    Razii, Mar 20, 2008
    #5
  6. Razii

    Guest

    On Mar 20, 3:03 am, Razii <> wrote:
    > On Wed, 19 Mar 2008 23:57:25 -0700 (PDT), ""
    >
    > <> wrote:
    > >Also, comparing to your results 7 years ago, it looks like Java has
    > >slowed down a bit, relatively. :-D

    >
    > Slowed down? It was 2080 ms in the google link that I posted. It's 359
    > ms this time (however, the bible.txt file was different back then. So
    > there can't be any comparison with the old times).


    Key word: relatively. I was making a joke that the old C++:Java ratio
    was 3400:2080 (1.0:0.6), and the new one is 375:370 (1.0:1.0).

    Jason
    , Mar 20, 2008
    #6
  7. Razii

    Ian Collins Guest

    Razii wrote:
    >
    > --------- C++ Code ---------------
    >
    > #include <fstream>
    > #include<iostream>
    > #include <string>
    > #include <vector>
    > #include <algorithm>
    > #include <ctime>


    #include <iterator>

    Is required for ostream_iterator.

    > using namespace ::std;
    >
    >
    > int main()
    > {
    > vector<string> buf;
    > string linBuf;
    > ifstream inFile("bible.txt");
    > clock_t start=clock();
    > buf.reserve(50000);
    >
    >
    > while(getline(inFile,linBuf)) buf.insert(buf.end(), linBuf);
    > sort(buf.begin(), buf.end());


    Why not use a sorted container? Your example takes 120mS on my box,
    using std::multiset reduces this to 90.

    > ofstream outFile("output.txt");
    > copy(buf.begin(),buf.end(),ostream_iterator<string>(outFile,"\n"));
    > clock_t endt=clock();
    > cout <<"Time for reading, sorting, writing: " << endt-start << "
    > ms\n";


    endt-start is in whatever unit the system returns from clock(); it
    should be scaled by CLOCKS_PER_SEC.
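
The scaling mentioned above could look roughly like this. It is only a sketch, assuming the result is wanted in milliseconds; `ticks_to_ms` is a hypothetical helper name and is not part of the posted program.

```cpp
#include <ctime>

// Convert a difference between two clock() readings to milliseconds.
// CLOCKS_PER_SEC is the number of clock() ticks per second on this platform.
double ticks_to_ms(std::clock_t start, std::clock_t end)
{
    return 1000.0 * static_cast<double>(end - start) / CLOCKS_PER_SEC;
}
```

In the posted program this would be called as `ticks_to_ms(start, endt)`. With Visual C++, CLOCKS_PER_SEC is 1000, so the unscaled difference happens to read as milliseconds there; on POSIX systems the constant is 1000000, and the unscaled number would be off by a factor of 1000.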

    --
    Ian Collins.
    Ian Collins, Mar 20, 2008
    #7
  8. Razii

    peter koch Guest

    On 20 Mar., 07:10, Razii <> wrote:
    > This topic was on these newsgroups 7 years ago :)
    >
    > http://groups.google.com/group/comp.lang.c++/msg/695ebf877e25b287
    >
    > I said then: "How about reading the whole Bible, sorting by lines, and
    > writing the sorted book to a file?"
    >
    > Who remember that from 7 years ago, one of the longest thread on this
    > newsgroup :)
    >

    [snip]

    First of all, I believe this is a bad test. A lot of the time will be
    spent on I/O, which the compilers can't really affect. I also
    notice that the time measured does not include releasing the memory
    used by the Java program, which is unfair as this time was measured
    in the C++ version.
    Be that as it may, I notice that the C++ version is fifty percent
    shorter, which suggests that developing with C++ will be quite a lot
    faster.
    I also wonder what happens in the hypothetical case where you were
    told that the solution produced was simply too slow. I know that C++
    offers you lots of flexibility where you could program towards a
    certain environment, using e.g. memory-mapped I/O. (*)
    So all in all, the above benchmark could never make me consider
    switching languages.

    /Peter

    (*) Simpler measures such as adjusting the buffers of the streams
    could also have an effect.
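
As an illustration of the stream-buffer adjustment in the footnote, here is a minimal sketch. The function name `read_lines_buffered` and the buffer size are assumptions made for the example, and the effect of `pubsetbuf` on a file stream is implementation-defined, so any speed-up would have to be measured on the target platform.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Read all lines of a file through a caller-sized stream buffer.
std::vector<std::string> read_lines_buffered(const std::string& path,
                                             std::size_t buffer_size)
{
    // Declared before the stream so it outlives the stream that uses it.
    std::vector<char> buffer(buffer_size);
    std::ifstream in;
    // Request the buffer before opening the file; some implementations
    // ignore the request if it is made after the file is open.
    in.rdbuf()->pubsetbuf(buffer.data(),
                          static_cast<std::streamsize>(buffer.size()));
    in.open(path.c_str());

    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}
```

A call such as `read_lines_buffered("bible.txt", 1 << 20)` would request a 1 MiB buffer; whether that beats the default is exactly the kind of thing a benchmark would need to show.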
    peter koch, Mar 20, 2008
    #8
  9. Razii

    Guest

    On Mar 20, 4:20 am, Ian Collins <> wrote:
    > Why not use a sorted container? Your example takes 120mS on my box,
    > using std::multiset reduces this to 90.


    What kind of super computers are you guys using? I mean my machine is
    a little over a year old but... took me 790ms on a 2.16GHz Core Duo,
    7200 RPM SATA something or other hard drive, with GCC -O2 (MinGW, GCC
    3.4.5), using QueryPerformanceCounter for timings. Multiset reduced it
    to about 720; with 380 for read + sort and 340 for write.

    Jason
    , Mar 20, 2008
    #9
  10. Razii

    Razii Guest

    On Thu, 20 Mar 2008 01:16:58 -0700 (PDT), ""
    <> wrote:

    >Key word: relatively. I was making a joke that the old C++:Java ratio
    >was 3400:2080 (1.0:0.6), and the new one is 375:370 (1.0:1.0).


    Read the whole thread.. (700+ posts .. is that still a record in this
    group?) :)

    In the end they whined, made me change compilers, then after I got
    VC++, claimed there is a bug in the VC++ 5.0 library. I had to fix the
    bug. C++ ended up slightly faster after all that. Even then there was
    nothing to brag about.
    Razii, Mar 20, 2008
    #10
  11. Razii

    Ian Collins Guest

    wrote:
    > On Mar 20, 4:20 am, Ian Collins <> wrote:
    >> Why not use a sorted container? Your example takes 120mS on my box,
    >> using std::multiset reduces this to 90.

    >
    > What kind of super computers are you guys using? I mean my machine is
    > a little over a year old but... took me 790ms on a 2.16GHz Core Duo,
    > 7200 RPM SATA something or other hard drive, with GCC -O2 (MinGW, GCC
    > 3.4.5), using QueryPerformanceCounter for timings. Multiset reduced it
    > to about 720; with 380 for read + sort and 340 for write.
    >

    Super computer? Just an AMD FX74 3Ghz, Sun CC. 70mS reading to
    multiset, 20mS writing.

    --
    Ian Collins.
    Ian Collins, Mar 20, 2008
    #11
  12. Razii

    Razii Guest

    On Thu, 20 Mar 2008 21:20:19 +1300, Ian Collins <>
    wrote:

    >Why not use a sorted container? Your example takes 120mS on my box,
    >using std::multiset reduces this to 90.


    Two chapters in the bible are identical. If you used set, that won't
    include duplicates.

    Both java and c++ used the same code, so what's the problem?

    Funny that in 2001 when I first posted this I used set. Some guy, Pete
    Becker, claimed I was comparing apples and oranges and must use
    vector.
    Razii, Mar 20, 2008
    #12
  13. Razii

    Ian Collins Guest

    Razii wrote:
    > On Thu, 20 Mar 2008 21:20:19 +1300, Ian Collins <>
    > wrote:
    >
    >> Why not use a sorted container? Your example takes 120mS on my box,
    >> using std::multiset reduces this to 90.

    >
    > Two chapters in the bible are identical. If you used set, that won't
    > include duplicates.
    >

    I said multiset.

    Your requirement was "How about reading the whole Bible, sorting by
    lines, and writing the sorted book to a file?"

    Reading into a multiset and then writing out meets those requirements.

    --
    Ian Collins.
    Ian Collins, Mar 20, 2008
    #13
  14. Razii

    Lionel B Guest

    On Thu, 20 Mar 2008 05:22:15 -0500, Razii wrote:

    > On Thu, 20 Mar 2008 21:20:19 +1300, Ian Collins <>
    > wrote:
    >
    >>Why not use a sorted container? Your example takes 120mS on my box,
    >>using std::multiset reduces this to 90.

    >
    > Two chapters in the bible are identical. If you used set, that won't
    > include duplicates.


    That's probably why he suggested using `multiset'.

    > Both java and c++ used the same code, so what's the problem?


    "Same code" seems like stretching it a bit to me...

    > Funny that in 2001 when I first posted this I used set. Some guy, Pete
    > Becker, claimed I was comparing apples and oranges and must use vector.


    Does Java have a `multiset' equivalent? If so, maybe try a comparison
    using that.

    --
    Lionel B
    Lionel B, Mar 20, 2008
    #14
  15. Razii wrote:
    > The question still is (7 years later), where is great speed advantage
    > you guys were claiming for c++?


    1) 300 ms is too short of a time for any reliable comparison.

    2) With heavy I/O, as in this case, the bottleneck is not in the
    language but in the I/O system, which is often independent of the
    language (and more dependent on the hardware and somewhat on the
    operating system).
    Just because Java can read and write files at the same speed as C++
    doesn't mean that it's equally fast in general. (OTOH, it doesn't mean
    the contrary either, of course.)
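
One common way to make a short measurement more trustworthy is to repeat it and look at the spread rather than a single number. The sketch below assumes a C++11 compiler; `time_repeated` and `TimingSummary` are hypothetical names, and `runs` must be at least 1.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

struct TimingSummary {
    double min_ms;     // fastest run
    double median_ms;  // middle run (upper middle for an even count)
};

// Run `work` `runs` times and summarize the wall-clock time per run.
template <typename Work>
TimingSummary time_repeated(Work work, std::size_t runs)
{
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    return TimingSummary{samples.front(), samples[samples.size() / 2]};
}
```

Passing the whole read, sort and write step as `work` and running it, say, 20 times would show how much of a 15 ms difference between runs is just noise. Note that after the first run the input file is likely to be served from the operating system's cache, which is itself part of what is being measured.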
    Juha Nieminen, Mar 20, 2008
    #15
  16. Razii

    Razii Guest

    On Thu, 20 Mar 2008 03:01:28 -0700 (PDT), peter koch
    <> wrote:

    >I also
    >notice that the time included does not involve releasing memory used
    >by the Java-program which is unfair as this time was measured in the
    >C++ version.


    You are not making sense. Where on earth is C++ releasing memory in
    the code that I posted?

    >Be that as it is, I notice that the C++ version is fifty percent
    >shorter which suggests that developing with C++ will be quite a lot
    >faster.


    No, it's generally accepted that developing in C++ is much slower and
    more difficult due to the pathetic C++ library, no thread support, no
    network library. As for the length of code I posted, I can jumble
    everything together and make Java code look short :)

    import java.io.*; import java.util.*; public class IOSort
    {public static void main(String[] arg) throws Exception {
    ArrayList<String> ar = new ArrayList<String>(50000); String line = "";
    BufferedReader in = new BufferedReader( new FileReader("bible.txt"));
    PrintWriter out = new PrintWriter(new BufferedWriter(new
    FileWriter("output.txt"))); long start = System.currentTimeMillis();
    while (true) { line = in.readLine(); if (line == null) break;
    ar.add(line); } Collections.sort(ar); int size = ar.size();
    for (int i = 0; i < size; i++) { out.println(ar.get(i));}
    out.close(); long end = System.currentTimeMillis();
    System.out.println("Time for reading, sorting, writing: "+ (end -
    start) + " ms"); } }

    I hope you are satisfied :))

    On a serious note, I also removed an unneeded line, (if (line.length()
    ==0) continue;) that was in the loop. That probably helped in speed.

    >So all in all, the above benchmark could never make me consider
    >switching languages.


    Yawn. I really care what language you use (NOT).
    Razii, Mar 20, 2008
    #16
  17. Razii

    Guest

    On Mar 20, 6:16 am, Razii <> wrote:
    > On Thu, 20 Mar 2008 01:16:58 -0700 (PDT), ""
    >
    > <> wrote:
    > >Key word: relatively. I was making a joke that the old C++:Java ratio
    > >was 3400:2080 (1.0:0.6), and the new one is 375:370 (1.0:1.0).

    >
    > Read the whole thread.. (700+ posts .. is that still a record in this
    > group?) :)


    I most certainly will not read the whole thread; because I do not
    care. Also I'm happy for your record. Makes for a good resume, I
    guess...? At least you'll be able to have fun trolling with the rest
    of the people here that will be taking the bait for the next few days.
    Good luck, I'll pop in and say hi when this thread beats the old
    one.

    Jason



    >
    > In the end they whined, made me change compilers, then after I got
    > VC++, claimed there is a bug in VC++ 5.0 library, . I had to fix the
    > bug. c++ ended up slightly faster after all that. even then there was
    > nothing to brag about.
    , Mar 20, 2008
    #17
  18. Razii

    Guest

    On Mar 20, 6:16 am, Razii <> wrote:
    > On Thu, 20 Mar 2008 01:16:58 -0700 (PDT), ""
    >
    > <> wrote:
    > >Key word: relatively. I was making a joke that the old C++:Java ratio
    > >was 3400:2080 (1.0:0.6), and the new one is 375:370 (1.0:1.0).

    >
    > Read the whole thread.. (700+ posts .. is that still a record in this
    > group?) :)
    >
    > In the end they whined, made me change compilers, then after I got
    > VC++, claimed there is a bug in VC++ 5.0 library, . I had to fix the
    > bug. c++ ended up slightly faster after all that. even then there was
    > nothing to brag about.



    Anyways, if you use Java, it should be because of its rich component
    library, cross-platformness, and other great strengths -- not because
    some strange test case out-performed another language by a few
    milliseconds. You are testing all the wrong things. What you really
    need to do is use whatever tool is most appropriate for the job at
    hand, not whatever tool sorts the bible 4 milliseconds faster than the
    other one...
    , Mar 20, 2008
    #18
  19. Razii

    James Kanze Guest

    On Mar 20, 7:10 am, Razii <> wrote:
    > The question still is (7 years later), where is great speed advantage
    > you guys were claiming for c++?


    Who ever claimed a speed advantage for C++? I've said it more
    than once, I can write a benchmark in which C++ will beat Java
    hands down. Or vice versa. It happens that C++ will beat Java
    in the type of code I'm working on now, but the real reason I
    use C++ is because my applications have to be robust, and it's
    easier to develop correct code with C++ than with Java.

    For those who want to prove C++ faster, just do something with
    large arrays of user defined types having value semantics. For
    those who want to prove Java faster, use large arrays of basic
    types, or where you can swap the pointers, rather than the
    values. Like this particular example:)---I'm really surprised
    that Java didn't do a lot better.

    For those who are concerned with performance in your actual
    work, of course, write a benchmark which simulates your actual
    work (I don't know of too many people just sorting lines in a
    single large text corpus), and benchmark it, on the machine
    you'll actually be running on. (The quality of Java---and
    C++---implementations varies a lot.) In theory, Java has the
    advantage in array accesses (because of the lack of aliasing);
    C++ when it comes to handling user defined value types (no
    allocation is even cheaper than garbage collected allocation).
    In practice, however, it will depend on the implementation.
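
To make the "swap the pointers, rather than the values" case concrete in C++ terms, here is a small sketch. The function name `sorted_view` is hypothetical, and whether this is actually faster than sorting the strings directly depends on the implementation, as noted above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sort lines by rearranging pointers to them; the strings themselves are
// never moved or copied. This is roughly what sorting a Java list of
// String references does. The result is only valid while `lines` is alive
// and unmodified.
std::vector<const std::string*> sorted_view(const std::vector<std::string>& lines)
{
    std::vector<const std::string*> view;
    view.reserve(lines.size());
    for (const std::string& s : lines)
        view.push_back(&s);

    std::sort(view.begin(), view.end(),
              [](const std::string* a, const std::string* b) { return *a < *b; });
    return view;
}
```

Writing the sorted output is then a loop over the view that dereferences each pointer, while the original vector keeps its input order.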

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, Mar 20, 2008
    #19
  20. Razii

    James Kanze Guest

    On Mar 20, 9:20 am, Ian Collins <> wrote:
    > Razii wrote:


    > > --------- C++ Code ---------------


    > > #include <fstream>
    > > #include<iostream>
    > > #include <string>
    > > #include <vector>
    > > #include <algorithm>
    > > #include <ctime>


    > #include <iterator>


    > Is required for ostream_iterator.


    Nothing to do with this thread, but I'll bet you caught that
    error with code review, and not with a unit test:).

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, Mar 20, 2008
    #20
