Benchmarking

Discussion in 'C++' started by Jerry Coffin, Jun 20, 2004.

  1. Jerry Coffin

    Jerry Coffin Guest

    This is more or less related to the (now huge) thread about Java and
    C++ performance.

    It's been noted that the C++ involved in that benchmark is hardly how
    most people would actually write code for such a problem. Since most
    of it is pretty trivial anyway, I've decided to rewrite some of it
    into something closer to normal C++ to see how things work out. I
    haven't made any particular attempt at optimizing the C++ code -- I've
    simply written code that's simple and obvious and doesn't contain any
    obvious pessimizing.

    One other area that bothered me was the inability to check or
    reproduce the results of the cited benchmark. For this series of
    posts, I'm doing timing on a 2.8 GHz Pentium IV running Windows XP
    with most current patches (I won't try to list them all). While it's
    impossible to kill ALL other tasks, prior to each run I'm verifying
    that the machine is basically quiescent. On the Java side of things,
    I'm using the Sun Java SDK, version 1.4.2_04. On the C++ side, I'm
    using Visual C++ 7.1, compiling with "-Oxb2 -G7ry" unless otherwise
    noted. At least for now, I'm simply posting the raw run times for
    each test. When I've posted a number of results, I may add on some
    attempts at an overall score.

    First let's look at hash.cpp. As originally written, it used a
    non-standard hashing-based container, converted a number to a string
    with sprintf, and generally just didn't look much like anything I can
    imagine a C++ programmer ever writing. I started over nearly from the
    beginning using std::map, std::eek:stringstream, etc. The code looks like
    this:

    #include <stdio.h>
    #include <iostream>
    #include <map>
    #include <sstream>

    int main(int argc, char *argv[]) {
    int n = 1+((argc == 2) ? atoi(argv[1]) : 200000);

    std::map<std::string, int> X;

    for (int i=0; i<n; i++) {
    std::eek:stringstream buf;
    buf << std::hex << i;
    X[buf.str()] = i;
    }

    int c = 0;
    for (int i=0; i<n; i++) {
    std::eek:stringstream buf;
    buf << i;
    if (X.find(buf.str())!=X.end())
    c++;
    }
    std::cout << c << std::endl;
    return 0;
    }

    These changes _should_ favor Java considerably -- especially the
    change from an O(K) to O(lg N) container.

    Results:

    Java client: 1.4 seconds
    Java server: 1.2 seconds
    C++: 1.2 seconds

    As can be seen, this particular test gave a dead tie between the C++
    version and the Java server, with the Java client version slightly
    behind.

    Our next target will be the matrix test. In this case, the original
    code looked a lot like C code -- malloc and free with raw pointers
    everywhere, a "matmult" to do matrix multiplication, an explicit
    function call to free the matrices, etc.

    Again, I rewrote the code into a class on the order that a C++
    programmer would typically expect. Here, instead of repeatedly
    multiplying small matrices, I decided to simply increase the matrix
    size until it took long enough to time reasonably. The code looks
    like this:

    #include <iostream>
    #include <stdlib.h>
    #include <vector>

    const int SIZE = 700;

    template <class T>
    class matrix {
    int rows, cols;
    std::vector<T> data;
    public:
    matrix() {}

    matrix(int r, int c) : rows(r), cols(c), data(r*c)
    {
    for (int i=0; i<data.size(); i++)
    data = i+1;
    }

    matrix operator*(matrix const &m2) const {
    matrix<T> result(rows, m2.cols);

    for (int i=0; i<rows; i++)
    for (int j=0; j<m2.cols; j++) {
    int val =0;
    for (int k=0; k<m2.cols; k++)
    val += (*this)(i,k) * m2(k, j);
    result(i,j) = val;
    }
    return result;
    }

    T const &operator()(int row, int col) const {
    return data[col*rows+row];
    }

    T &operator()(int row, int col) {
    return data[col*rows+row];
    }
    };

    int main(int argc, char *argv[]) {
    int i, n = ((argc == 2) ? atoi(argv[1]) : 1);

    matrix<int> m1(SIZE, SIZE);
    matrix<int> m2(SIZE, SIZE);
    matrix<int> mm;

    for (i=0; i<n; i++)
    mm = m1 * m2;

    std::cout << mm(0,0) << " " << mm(2,3) << " "
    << mm(3,2) << " " << mm(4,4) << std::endl;

    return 0;
    }

    Again, I've taken no pains to optimize this code -- for example, the
    result of the multiplication is passed by value, so unless the
    compiler includes the NRVO, the return will be quite slow. Anybody
    doing serious matrix work would likely do something considerably more
    effcient. Since Java uses GC, this issue doesn't arise. The times came
    out like this:

    Java Client: 5.1 seconds
    Java Server: 5.2 seconds
    C++: 1.9 seconds

    If we reduce SIZE to 500, the Java positions swap:

    Java Client: 2.6 seconds
    Java Server: 2.0 seconds
    C++: 0.6 seconds

    but C++ wins by about the same (wide) margin either way.

    --
    Later,
    Jerry.

    The universe is a figment of its own imagination.
    Jerry Coffin, Jun 20, 2004
    #1
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Martin Schoeberl

    Benchmarking embedded Java

    Martin Schoeberl, Sep 24, 2004, in forum: Java
    Replies:
    1
    Views:
    482
    Martin Schoeberl
    Sep 25, 2004
  2. Andy Dingley
    Replies:
    3
    Views:
    880
    Andy Dingley
    Feb 11, 2004
  3. Jerry Coffin

    Benchmarking, part 2

    Jerry Coffin, Jun 21, 2004, in forum: C++
    Replies:
    8
    Views:
    405
    David Harmon
    Jun 26, 2004
  4. Carlo Razzeto

    Benchmarking

    Carlo Razzeto, Nov 19, 2004, in forum: C++
    Replies:
    1
    Views:
    423
    Victor Bazarov
    Nov 19, 2004
  5. Project2501
    Replies:
    1
    Views:
    380
Loading...

Share This Page