g++ loop unrolling performance

Discussion in 'C++' started by Per Nordlöw, Aug 31, 2004.

  1. Hi all

    I am using the boost::array template class to try to generalize my
    handcrafted vector specializations for dimensions 2 (class vec2), 3
    (class vec3), etc.

    As performance is of the greatest importance, I have written an initial
    benchmark that tests how well g++ can unroll loops whose number of
    iterations can be determined at compile time or upon entry to the loop.
    The gcc switch "-funroll-loops" should do just that. The test program
    calculates the dot product of two four-dimensional arrays of int 10
    million times and looks as follows:

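    #include <cstddef>   // std::size_t
    #include <iostream>  // std::cout, std::endl

    #include "../array.hh"  // local headers, not shown in the post
    #include "../Timer.hh"

    using boost::array;
    using std::cout;
    using std::endl;

    // Loop-based dot product; N is known at compile time.
    template <typename T, std::size_t N>
    inline T general_dot(const array<T, N> & a, const array<T, N> & b)
    {
        T c = 0;
        for (std::size_t i = 0; i < N; i++)
        {
            c += a[i] * b[i];
        }
        return c;
    }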

    // Hand-unrolled dot product for N == 4.
    template <typename T>
    inline T special_dot(const array<T, 4> & a, const array<T, 4> & b)
    {
        return (a[0] * b[0] +
                a[1] * b[1] +
                a[2] * b[2] +
                a[3] * b[3]);
    }

    int main(int argc, char * argv[])
    {
        typedef array<int, 4> T;

        T a(3);

        cout << "a: " << a << endl;

        a[0] = 11;
        a[1] = 13;
        a[2] = 17;
        a[3] = 19;

        cout << "a: " << a << endl;

        T b = a;

        Timer t;

        const unsigned int nloops = 10000000;

        unsigned int sum = 0;
        t.reset();
        for (unsigned int i = 0; i < nloops; i++)
        {
            sum += general_dot(a, b);
        }
        t.read();
        cout << "general: " << t << endl;

        unsigned int tum = 0;
        t.reset();
        for (unsigned int i = 0; i < nloops; i++)
        {
            tum += special_dot(a, b);
        }
        t.read();
        cout << "special: " << t << endl;

        if (sum == tum)
        {
            cout << "Checksums are equal. OK" << endl;
        }
        else
        {
            cout << "Checksums are not equal. NOT OK" << endl;
        }

        return 0;
    }

    The calculation is performed with a general and a specialized version of
    the dot product: general_dot() and special_dot() respectively.

    However, the performance of general_dot() is terrible compared to
    special_dot(): it is around 35 times slower when I compile with gcc-3.3.2
    using the switches "-O3 -funroll-all-loops".
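
    The program above depends on the local headers array.hh and Timer.hh,
    which are not shown. A minimal self-contained sketch of the same
    comparison could look like the following. It assumes C++11 std::array
    and std::chrono in place of those headers, and the helpers time_dot()
    and run_benchmark() are hypothetical names introduced only for this
    illustration. The first input element is varied inside the timed loop
    so that the compiler is less likely to hoist the whole dot product out
    of it; the timings it prints will depend on compiler and flags.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iostream>

// Loop-based dot product; N is a compile-time constant.
template <typename T, std::size_t N>
inline T general_dot(const std::array<T, N>& a, const std::array<T, N>& b)
{
    T c = 0;
    for (std::size_t i = 0; i < N; ++i)
        c += a[i] * b[i];
    return c;
}

// Hand-unrolled dot product for N == 4.
template <typename T>
inline T special_dot(const std::array<T, 4>& a, const std::array<T, 4>& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Hypothetical helper: calls `dot` nloops times, returns the checksum
// and stores the elapsed wall-clock time in `seconds`.
template <typename Dot>
unsigned int time_dot(Dot dot, unsigned int nloops, double& seconds)
{
    std::array<int, 4> a = {{11, 13, 17, 19}};
    const std::array<int, 4> b = a;
    unsigned int sum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned int i = 0; i < nloops; ++i) {
        a[0] = 11 + static_cast<int>(i & 1u);  // vary the input per iteration
        sum += static_cast<unsigned int>(dot(a, b));
    }
    const auto stop = std::chrono::steady_clock::now();
    seconds = std::chrono::duration<double>(stop - start).count();
    return sum;
}

// Hypothetical driver: times both versions, prints the timings and
// returns true when the two checksums agree.
bool run_benchmark(unsigned int nloops)
{
    typedef std::array<int, 4> V;

    // Sanity check before timing: 121 + 169 + 289 + 361 == 940.
    const V v = {{11, 13, 17, 19}};
    assert(general_dot(v, v) == 940);

    double tg = 0.0;
    double ts = 0.0;
    const unsigned int sum = time_dot(
        [](const V& a, const V& b) { return general_dot(a, b); }, nloops, tg);
    const unsigned int tum = time_dot(
        [](const V& a, const V& b) { return special_dot(a, b); }, nloops, ts);

    std::cout << "general: " << tg << " s\n";
    std::cout << "special: " << ts << " s\n";
    return sum == tum;
}
```

    Calling run_benchmark(10000000) from main() uses the same loop count as
    the program above.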

    Is gcc really that lame or have I forgotten something?
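
    For comparison, one technique sometimes used when a loop over a
    compile-time bound is not unrolled is to replace the loop with template
    recursion, so that every term is written out during instantiation. The
    sketch below is not from this thread: dot_unroller, unrolled_dot() and
    check_unrolled_dot() are hypothetical names, and std::array (C++11)
    stands in for the local array.hh. Whether it actually runs faster than
    general_dot() on a given compiler would have to be measured.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: recursion on the index I replaces the run-time
// loop, adding one product term per instantiation.
template <typename T, std::size_t N, std::size_t I>
struct dot_unroller
{
    static T apply(const std::array<T, N>& a, const std::array<T, N>& b)
    {
        return a[I - 1] * b[I - 1] + dot_unroller<T, N, I - 1>::apply(a, b);
    }
};

// Base case: no terms left, contribute zero.
template <typename T, std::size_t N>
struct dot_unroller<T, N, 0>
{
    static T apply(const std::array<T, N>&, const std::array<T, N>&)
    {
        return T(0);
    }
};

// Same interface as general_dot(), for any dimension N.
template <typename T, std::size_t N>
inline T unrolled_dot(const std::array<T, N>& a, const std::array<T, N>& b)
{
    return dot_unroller<T, N, N>::apply(a, b);
}

// Usage check: 11*11 + 13*13 + 17*17 + 19*19 == 940.
inline void check_unrolled_dot()
{
    const std::array<int, 4> a = {{11, 13, 17, 19}};
    assert(unrolled_dot(a, a) == 940);
}
```

    The same template covers the vec2 and vec3 cases, since N is deduced
    from the array type.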


    Many thanks in advance,

    Per Nordlöw
    Swedish Defence Research Agency
    Linköping
    Sweden
    Per Nordlöw, Aug 31, 2004
    #1

  2. Jack Klein Guest

    On Tue, 31 Aug 2004 08:51:12 +0200, Per Nordlöw wrote in
    comp.lang.c++:

    > Hi all
    >
    > I am using the boost::array template class to try to generalize my
    > handcrafted vector specializations for dimensions 2 (class vec2), 3
    > (class vec3), etc.
    >
    > As performance is of the greatest importance, I have written an initial
    > benchmark that tests how well g++ can unroll loops whose number of
    > iterations can be determined at compile time or upon entry to the loop.
    > The gcc switch "-funroll-loops" should do just that. The test program
    > calculates the dot product of two four-dimensional arrays of int 10
    > million times and looks as follows:


    [snip]

    > The calculation is performed with a general and a specialized version of
    > the dot product: general_dot() and special_dot() respectively.
    >
    > However, the performance of general_dot() is terrible compared to
    > special_dot(): it is around 35 times slower when I compile with gcc-3.3.2
    > using the switches "-O3 -funroll-all-loops".
    >
    > Is gcc really that lame or have I forgotten something?


    Questions about gcc and specific options should be addressed to one of
    the news:gnu.gcc.* groups. The C++ language does not define
    optimization options at all, not to mention those of specific
    compilers.

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
    Jack Klein, Sep 1, 2004
    #2