std::min/max vs own functions

Discussion in 'C++' started by eLVa, Apr 11, 2011.

  1. eLVa

    eLVa Guest

    Hi everyone,

    Below is a test code I made after noticing a big difference in terms
    of running time between the use of std::min or std::max and a
    templated version of it. I'm not sure where it does come from, so any
    comments are welcome.
    The code does not anything usefull, but it is a simplification of the
    real code (the purpose is only to show the difference of timings ..)

    I don't know if there is a proper way of posting a chunk of code, so I
    just drop it here ...

    #include <algorithm>
    #include <iostream>
    #include <sys/time.h>

    typedef unsigned char uchar;
    typedef unsigned int uint;
    using namespace std;

    void fn(const uchar *pL, const uchar *pR, int w, uint *pD) {
    int vlm, vlp;
    for (int i=0; i<w; ++i) {
    vlm = (i>0 ? (pL[i-1]+pL)/2 : pL);
    vlp = (i<w-1 ? (pL+pL[i+1])/2 : pL);

    pD = std::min<int>(vlp,std::max<int>(vlm,pR));
    }
    }

    template <class T> T min2(const T&a, const T&b) { return a<b?a:b; }
    template <class T> T max2(const T&a, const T&b) { return a<b?a:b; }

    void fn2(const uchar *pL, const uchar *pR, int w, uint *pD) {
    int vlm, vlp;
    for (int i=0; i<w; ++i) {
    vlm = (i>0 ? (pL[i-1]+pL)/2 : pL);
    vlp = (i<w-1 ? (pL+pL[i+1])/2 : pL);

    pD = min2<int>(vlp,max2<int>(vlm,pR));
    }
    }

    int main(int argc, char *argv[]) {

    int h = 400, w = 500;
    uchar *T1[h], *T2[h];
    uint *T3[h];

    for (int i=0; i<h; ++i) {
    T1 = new uchar[w];
    T2 = new uchar[w];
    T3 = new uint[w];
    }

    for (int j=0; j<h; ++j) {
    for (int i=0; i<w; ++i) {
    T1[j] = rand()%256;
    T2[j] = rand()%256;
    }
    }

    struct timeval t1, t2;
    gettimeofday(&t1, NULL);
    for (int z=0; z<1000; ++z) {
    for (int j=0; j<h; ++j) {
    uchar *p1 = T1[j];
    uchar *p2 = T2[j];
    uint *p3 = T3[j];
    fn(p1, p2, w, p3);
    }
    }
    gettimeofday(&t2, NULL);
    cout << "Took " << (t2.tv_sec-t1.tv_sec)+(t2.tv_usec-t1.tv_usec)/
    1.e6 << endl;

    gettimeofday(&t1, NULL);
    for (int z=0; z<1000; ++z) {
    for (int j=0; j<h; ++j) {
    uchar *p1 = T1[j];
    uchar *p2 = T2[j];
    uint *p3 = T3[j];
    fn2(p1, p2, w, p3);
    }
    }
    gettimeofday(&t2, NULL);
    cout << "Took " << (t2.tv_sec-t1.tv_sec)+(t2.tv_usec-t1.tv_usec)/
    1.e6 << endl;
    }

    Then I compiled it with :
    g++ test.cpp -O3 -o test

    And this are the results :
    Took 1.58356
    Took 0.789235

    So the second version which use templated functions is approx 2times
    faster. Is it normal ?

    I tested it on MacosX (10.5) (the timings shown above), and on Linux
    where timings are (5.45 for the first version and 4.71 for the second
    one : the machine is older, thus the big difference, but still the
    second is faster).

    Can anyone reproduce this and tell me if this is normal.

    Thanks
    eLVa, Apr 11, 2011
    #1
    1. Advertising

  2. eLVa

    eLVa Guest

    On Apr 11, 1:06 pm, Pete Becker <> wrote:
    > On 2011-04-11 12:10:55 -0400, eLVa said:
    >
    >
    >
    > > template <class T> T min2(const T&a, const T&b) { return a<b?a:b; }
    > > template <class T> T max2(const T&a, const T&b) { return a<b?a:b; }

    >
    > Different isn't the same. Since max2 doesn't do the same thing as
    > std::max, despite the misleading name, the measurements don't mean much.
    >
    > --
    >   Pete
    > Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
    > Standard C++ Library Extensions: a Tutorial and Reference
    > (www.petebecker.com/tr1book)


    Sorry for the mistake, it's a copy paste error, of course you should
    read

    template <class T> T max2(const T&a, const T&b) { return a>b?a:b; }

    But the timings stay the same ...
    So where could this come from ?
    eLVa, Apr 11, 2011
    #2
    1. Advertising

  3. * eLVa, on 11.04.2011 19:58:
    > On Apr 11, 1:06 pm, Pete Becker<> wrote:
    >> On 2011-04-11 12:10:55 -0400, eLVa said:
    >>
    >>
    >>
    >>> template<class T> T min2(const T&a, const T&b) { return a<b?a:b; }
    >>> template<class T> T max2(const T&a, const T&b) { return a<b?a:b; }

    >>
    >> Different isn't the same. Since max2 doesn't do the same thing as
    >> std::max, despite the misleading name, the measurements don't mean much.
    >>
    >> --
    >> Pete
    >> Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
    >> Standard C++ Library Extensions: a Tutorial and Reference
    >> (www.petebecker.com/tr1book)

    >
    > Sorry for the mistake, it's a copy paste error, of course you should
    > read
    >
    > template<class T> T max2(const T&a, const T&b) { return a>b?a:b; }
    >
    > But the timings stay the same ...
    > So where could this come from ?


    I think your testing code is pretty obscure.

    Can you reproduce in small simple test program, regardless of testing std first
    or second?

    If so, might be that std::max is instrumented in some way. Check your compiler's
    documentation about switches and preprocessor symbols. Please post your results
    -- however it turns out!


    Cheers,

    - Alf

    --
    blog at <url: http://alfps.wordpress.com>
    Alf P. Steinbach /Usenet, Apr 11, 2011
    #3
  4. eLVa

    Puppet_Sock Guest

    On Apr 11, 12:10 pm, eLVa <> wrote:
    > Hi everyone,
    >
    > Below is a test code I made after noticing a big difference in terms
    > of running time between the use of std::min or std::max and a
    > templated version of it. I'm not sure where it does come from, so any
    > comments are welcome.
    > The code does not anything usefull, but it is a simplification of the
    > real code (the purpose is only to show the difference of timings ..)
    >
    > I don't know if there is a proper way of posting a chunk of code, so I
    > just drop it here ...
    >
    > #include <algorithm>
    > #include <iostream>
    > #include <sys/time.h>
    >
    > typedef unsigned char uchar;
    > typedef unsigned int uint;
    > using namespace std;
    >
    > void fn(const uchar *pL, const uchar *pR, int w, uint *pD) {
    >     int vlm, vlp;
    >     for (int i=0; i<w; ++i) {
    >         vlm = (i>0 ? (pL[i-1]+pL)/2 : pL);
    >         vlp = (i<w-1 ? (pL+pL[i+1])/2 : pL);
    >
    >         pD = std::min<int>(vlp,std::max<int>(vlm,pR));
    >     }
    >
    > }


    Hmmm... vlm and vlp are int, getting assigned the
    results of uchar arithmetic.

    Hmmm... pR is a uchar, but the max that is getting
    called is max<int>. Hmmm...

    Hmmm... pD is a unit getting assigned the result
    of std::min<int> Hmmm...

    Also, you are doing a compare of i to 0 and a compare
    of i to w-1 for each call of std::max and std::min.
    And you are doing two averages for most cases.
    So, I'm expecting that calls to min and max do not
    dominate this function.

    I wonder if you can't make a function that is
    dominated by the max/min calls. I don't know just
    what is happening to produce the reported values
    you gave, but I'm thinking they don't show what
    you think they show.
    Socks
    Puppet_Sock, Apr 11, 2011
    #4
  5. eLVa

    Jeff Flinn Guest

    eLVa wrote:
    > On Apr 11, 1:06 pm, Pete Becker <> wrote:
    >> On 2011-04-11 12:10:55 -0400, eLVa said:
    >>
    >>
    >>
    >>> template <class T> T min2(const T&a, const T&b) { return a<b?a:b; }
    >>> template <class T> T max2(const T&a, const T&b) { return a<b?a:b; }

    >> Different isn't the same. Since max2 doesn't do the same thing as
    >> std::max, despite the misleading name, the measurements don't mean much.
    >>
    >> --
    >> Pete
    >> Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
    >> Standard C++ Library Extensions: a Tutorial and Reference
    >> (www.petebecker.com/tr1book)

    >
    > Sorry for the mistake, it's a copy paste error, of course you should
    > read
    >
    > template <class T> T max2(const T&a, const T&b) { return a>b?a:b; }
    >
    > But the timings stay the same ...
    > So where could this come from ?


    What happens if you call fn2 first, then fn?

    Jeff
    Jeff Flinn, Apr 11, 2011
    #5
  6. eLVa

    eLVa Guest

    On Apr 11, 3:06 pm, Puppet_Sock <> wrote:
    > On Apr 11, 12:10 pm, eLVa <> wrote:
    >
    >
    >
    > > Hi everyone,

    >
    > > Below is a test code I made after noticing a big difference in terms
    > > of running time between the use of std::min or std::max and a
    > > templated version of it. I'm not sure where it does come from, so any
    > > comments are welcome.
    > > The code does not anything usefull, but it is a simplification of the
    > > real code (the purpose is only to show the difference of timings ..)

    >
    > > I don't know if there is a proper way of posting a chunk of code, so I
    > > just drop it here ...

    >
    > > #include <algorithm>
    > > #include <iostream>
    > > #include <sys/time.h>

    >
    > > typedef unsigned char uchar;
    > > typedef unsigned int uint;
    > > using namespace std;

    >
    > > void fn(const uchar *pL, const uchar *pR, int w, uint *pD) {
    > >     int vlm, vlp;
    > >     for (int i=0; i<w; ++i) {
    > >         vlm = (i>0 ? (pL[i-1]+pL)/2 : pL);
    > >         vlp = (i<w-1 ? (pL+pL[i+1])/2 : pL);

    >
    > >         pD = std::min<int>(vlp,std::max<int>(vlm,pR));
    > >     }

    >
    > > }

    >
    > Hmmm... vlm and vlp are int, getting assigned the
    > results of uchar arithmetic.


    Is that a problem ?

    >
    > Hmmm... pR is a uchar, but the max that is getting
    > called is max<int>. Hmmm...


    Again, is it a problem ?
    It's a subsample of my code, where later I use int calculations ...

    >
    > Hmmm... pD is a unit getting assigned the result
    > of std::min<int> Hmmm...
    >
    > Also, you are doing a compare of i to 0 and a compare
    > of i to w-1 for each call of std::max and std::min.
    > And you are doing two averages for most cases.
    > So, I'm expecting that calls to min and max do not
    > dominate this function.


    The other function (fn2) is doing the same so why would it be
    dominated by that time in one case, not in the other ?

    >
    > I wonder if you can't make a function that is
    > dominated by the max/min calls. I don't know just
    > what is happening to produce the reported values
    > you gave, but I'm thinking they don't show what
    > you think they show.
    > Socks


    I tried to but depending on what I do, the results are not showing
    those differences ...
    I'm not sure what to test, and how to diagnose the real problem!
    eLVa, Apr 11, 2011
    #6
  7. eLVa

    eLVa Guest

    On Apr 11, 3:09 pm, Jeff Flinn <> wrote:
    > eLVa wrote:
    > > On Apr 11, 1:06 pm, Pete Becker <> wrote:
    > >> On 2011-04-11 12:10:55 -0400, eLVa said:

    >
    > >>> template <class T> T min2(const T&a, const T&b) { return a<b?a:b; }
    > >>> template <class T> T max2(const T&a, const T&b) { return a<b?a:b; }
    > >> Different isn't the same. Since max2 doesn't do the same thing as
    > >> std::max, despite the misleading name, the measurements don't mean much.

    >
    > >> --
    > >>   Pete
    > >> Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
    > >> Standard C++ Library Extensions: a Tutorial and Reference
    > >> (www.petebecker.com/tr1book)

    >
    > > Sorry for the mistake, it's a copy paste error, of course you should
    > > read

    >
    > > template <class T> T max2(const T&a, const T&b) { return a>b?a:b; }

    >
    > > But the timings stay the same ...
    > > So where could this come from ?

    >
    > What happens if you call fn2 first, then fn?


    I get the same results (fn2 is two times faster than fn).

    >
    > Jeff
    eLVa, Apr 11, 2011
    #7
  8. eLVa

    eLVa Guest

    On Apr 11, 3:18 pm, Paavo Helde <> wrote:
    > eLVa <> wrote in news:fd844d48-c7d9-4356-9ffc-
    > :
    >
    >
    >
    >
    >
    > > Hi everyone,

    >
    > > Below is a test code I made after noticing a big difference in terms
    > > of running time between the use of std::min or std::max and a
    > > templated version of it. I'm not sure where it does come from, so any
    > > comments are welcome.

    > ...
    > >     int h = 400, w = 500;
    > >     uchar *T1[h], *T2[h];
    > >     uint *T3[h];

    >
    > >     for (int i=0; i<h; ++i) {
    > >         T1 = new uchar[w];
    > >         T2 = new uchar[w];
    > >         T3 = new uint[w];
    > >     }

    >
    > If you are so keen on performance then note that allocating each row
    > separately may take quite a lot of time, and accessing the data later may 
    > also be slower because of worse locality and double indirection (the
    > latter is not a problem in this concrete example as you take out the row
    > pointers beforehand). In general it might be faster to allocate the whole
    > array as new uchar[w*h] and access as T1[j*w+i]. YMMV, of course.


    This is a sample of my code where I use a function that's applied on
    two "lines".
    In my real code, the array are allocated continuously, but for the
    test, I found it simpler
    to juste do it quickly and allocating each row separately.

    The purpose of this sample was to show the problem, and I don't think
    allocating data continuously might change it.


    >
    > > Then I compiled it with :
    > >  g++ test.cpp -O3 -o test

    >
    > > And this are the results :
    > > Took 1.58356
    > > Took 0.789235

    >
    > > So the second version which use templated functions is approx 2times
    > > faster. Is it normal ?

    >
    > std::min/max are also template functions.
    >
    > FWIW, there seems to be some effect similar to your findings also in
    > Windows (VS2010 64-bit Release build, with timeofday replaced by clock
    > ()). Seems to be due to different inlining of functions, but I'm not
    > sure. The difference is not so large, ca 0.93s vs 0.71s. In any case it's
    > quite strange, std::min/max should not have any intrinsic overhead.


    Yes I know that std::min/max are template functions, however I don't
    understand where the overhead is coming from.
    Do any of you experience the same, or is it some kind of compiler
    trick ? (I have no knowledge on this, and that's why I'm eager to know
    about your testings).

    >
    > Cheers
    > Paavo
    eLVa, Apr 11, 2011
    #8
  9. * eLVa:
    > gettimeofday(&t1, NULL);
    > for (int z=0; z<1000; ++z) {
    > for (int j=0; j<h; ++j) {
    > uchar *p1 = T1[j];
    > uchar *p2 = T2[j];
    > uint *p3 = T3[j];
    > fn(p1, p2, w, p3);
    > }
    > }
    > gettimeofday(&t2, NULL);
    > cout << "Took " << (t2.tv_sec-t1.tv_sec)+(t2.tv_usec-t1.tv_usec)/
    > 1.e6 << endl;


    While I doubt it'll explain the differences you're seeing, one word of
    advice:

    You're benchmarking against 'real time' here, which may be skewing your
    results, especially on a loaded system. It might even be getting distorted
    by efforts to keep the system clock synchronised.

    It's usually safer to benchmark against process runtime - On Unix-like
    systems, see getrusage or clock.

    --
    Martijn van Buul -
    Martijn van Buul, Apr 11, 2011
    #9
  10. eLVa

    eLVa Guest

    On Apr 11, 2:48 pm, "Alf P. Steinbach /Usenet" <alf.p.steinbach
    > wrote:
    > * eLVa, on 11.04.2011 19:58:
    >
    >
    >
    > > On Apr 11, 1:06 pm, Pete Becker<>  wrote:
    > >> On 2011-04-11 12:10:55 -0400, eLVa said:

    >
    > >>> template<class T>  T min2(const T&a, const T&b) { return a<b?a:b; }
    > >>> template<class T>  T max2(const T&a, const T&b) { return a<b?a:b; }

    >
    > >> Different isn't the same. Since max2 doesn't do the same thing as
    > >> std::max, despite the misleading name, the measurements don't mean much.

    >
    > >> --
    > >>    Pete
    > >> Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
    > >> Standard C++ Library Extensions: a Tutorial and Reference
    > >> (www.petebecker.com/tr1book)

    >
    > > Sorry for the mistake, it's a copy paste error, of course you should
    > > read

    >
    > > template<class T>  T max2(const T&a, const T&b) { return a>b?a:b; }

    >
    > > But the timings stay the same ...
    > > So where could this come from ?

    >
    > I think your testing code is pretty obscure.
    >
    > Can you reproduce in small simple test program, regardless of testing stdfirst
    > or second?
    >
    > If so, might be that std::max is instrumented in some way. Check your compiler's
    > documentation about switches and preprocessor symbols. Please post your results
    >   --  however it turns out!


    After some testing, I have found that the particular compiler I was
    using (/usr/bin/g++ v4.0.1 on MacOsX) does not give the same timings
    as a recent one (g++ v4.4). In this one, both functions works with the
    same timings.
    As for the Linux version, I have not tested, but I suppose this is the
    same issue.

    I consider my problem fixed (as it seems it was just a problem of
    updating my version of g++)
    Does any of you have the same difference in timings ?

    >
    > Cheers,
    >
    > - Alf
    >
    > --
    > blog at <url:http://alfps.wordpress.com>
    eLVa, Apr 11, 2011
    #10
  11. eLVa

    RaZiel Guest

    On 11.04.2011 18:10, eLVa wrote:
    > Hi everyone,
    >
    > Below is a test code I made after noticing a big difference in terms
    > of running time between the use of std::min or std::max and a
    > templated version of it. I'm not sure where it does come from, so any
    > comments are welcome.
    > The code does not anything usefull, but it is a simplification of the
    > real code (the purpose is only to show the difference of timings ..)
    >
    > I don't know if there is a proper way of posting a chunk of code, so I
    > just drop it here ...
    >
    > Then I compiled it with :
    > g++ test.cpp -O3 -o test
    >
    > And this are the results :
    > Took 1.58356
    > Took 0.789235
    >
    > So the second version which use templated functions is approx 2times
    > faster. Is it normal ?
    >
    > I tested it on MacosX (10.5) (the timings shown above), and on Linux
    > where timings are (5.45 for the first version and 4.71 for the second
    > one : the machine is older, thus the big difference, but still the
    > second is faster).
    >
    > Can anyone reproduce this and tell me if this is normal.
    >
    > Thanks


    MinGW 4.5.2:
    Took 2.4804
    Took 2.574

    msvc-9.0: (GetTickCount)
    Took 3.604
    Took 3.01

    - RaZ
    RaZiel, Apr 11, 2011
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Lois
    Replies:
    1
    Views:
    3,180
    Ryan Stewart
    Dec 27, 2004
  2. juergen
    Replies:
    3
    Views:
    558
    opalinski from opalpaweb
    Sep 20, 2006
  3. Vijai Kalyan
    Replies:
    1
    Views:
    303
    Howard Hinnant
    Mar 21, 2006
  4. puzzlecracker
    Replies:
    3
    Views:
    1,745
    Mike Wahler
    May 8, 2006
  5. Albert Hopkins

    When is min(a, b) != min(b, a)?

    Albert Hopkins, Jan 21, 2008, in forum: Python
    Replies:
    31
    Views:
    810
    Albert van der Horst
    Feb 4, 2008
Loading...

Share This Page