Relative cost of operations

Discussion in 'C++' started by aitorf666@gmail.com, Feb 2, 2008.

  1. aitorf666@gmail.com Guest

    Hi!
    I have been looking for a comparison table of the relative cost of
    operations in C++ because it can be very useful for all. I have found
    this:

    printf/scanf 1000
    new/delete 800
    trig. function 500
    floating point 100
    and so on...

    But it doesn't say anything about some C++ features, like virtual
    functions.
    What is the cost of calling a virtual function compared with an
    ordinary function?
    Could anyone tell me where I can find tables of operation costs for
    C++?

    Thanks.
    PS: Sorry for my English.
     
    aitorf666@gmail.com, Feb 2, 2008
    #1

  2. aitorf666@gmail.com:
    > Hi!
    > I have been looking for a comparison table of the relative cost of
    > operations in C++ because it can be very useful for all. I have found
    > this:
    >
    > printf/scanf 1000
    > new/delete 800
    > trig. function 500
    > floating point 100
    > and so on...
    >
    > But it doesn't say anything about some C++ features, like virtual
    > functions.
    > What is the cost of calling a virtual function compared with an
    > ordinary function?
    > Could anyone tell me where I can find tables of operation costs for
    > C++?
    >
    > Thanks.
    > PS: Sorry for my English.


    Go to <url: http://www.open-std.org/jtc1/sc22/wg21/>, search for the
    word "performance".

    Cheers & hth.,

    - Alf

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
     
    Alf P. Steinbach, Feb 2, 2008
    #2

  3. aitorf666@gmail.com:

    > What is the cost of calling a virtual function compared with an
    > ordinary function?



    One more level of indirection.

    Instead of having:

    PointerToFunction();

    you have:

    PointerToVTable->PointerToFunction();
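
    As a rough C++ illustration (the names are invented, and the vtable
    mechanism is the typical implementation technique, not something the
    standard requires):

    struct Shape {
        virtual double area() const { return 0.0; }   // virtual: normally
                                                       // dispatched through
                                                       // the vtable at run time
        double perimeter() const { return 0.0; }       // non-virtual: bound at
                                                        // compile time
    };

    double f(const Shape& s)
    {
        return s.area();       // roughly: s.vptr->area(s)
    }

    double g(const Shape& s)
    {
        return s.perimeter();  // roughly: a direct call to Shape::perimeter(s)
    }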

    --
    Tomás Ó hÉilidhe
     
    Tomás Ó hÉilidhe, Feb 2, 2008
    #3
  4. Rolf Magnus Guest

    Tomás Ó hÉilidhe wrote:

    > :
    >
    >> What is the cost of calling a virtual function compared with an
    >> ordinary function?

    >
    >
    > One more level of indirection.
    >
    > Instead of having:
    >
    > PointerToFunction();


    Actually, a regular function is often not called through a pointer, but
    directly.

    > you have:
    >
    > PointerToVTable->PointerToFunction();


    Actually, the common implementation is more like
    PointerToObject->PointerToVTable->PointerToFunction(), so it's 3 additional
    levels of indirection.
     
    Rolf Magnus, Feb 2, 2008
    #4
  5. Rolf Magnus:

    >> Instead of having:
    >>
    >> PointerToFunction();

    >
    > Actually, a regular function is often not called through a pointer,
    > but directly.



    I'm not sure what you mean by "called directly". When a function is
    called in a program, the CPU's program counter is set to the address of
    the function. I don't see how you could call the function without using
    its address.
    ..

    >> you have:
    >>
    >> PointerToVTable->PointerToFunction();

    >
    > Actually, the common implementation is more like
    > PointerToObject->PointerToVTable->PointerToFunction(), so it's 3
    > additional levels of indirection.



    If we have an object on the stack, then its address is simply an
    offset from the stack pointer. So instead of having:

    JUMP TO (address_of_function);

    We have:

    JUMP TO ( (stack_pointer + offset)->vtable->address_of_function );

    Or more simply:

    JUMP TO ( address_of_object->vtable->address_of_function );

    I think this equates to two more levels of indirection in total. On
    a CPU whose clock runs at 2 GHz, and where a cycle is about 4 clock
    pulses, and where an indirection takes 2 cycles, I think we're looking at
    a delay of something like 8 nanoseconds.

    --
    Tomás Ó hÉilidhe
     
    Tomás Ó hÉilidhe, Feb 2, 2008
    #5
  6. On 2008-02-03 00:58, Tomás Ó hÉilidhe wrote:
    > Rolf Magnus:
    >
    >>> Instead of having:
    >>>
    >>> PointerToFunction();

    >>
    >> Actually, a regular function is often not called through a pointer,
    >> but directly.

    >
    >
    > I'm not sure what you mean by "called directly". When a function is
    > called in a program, the CPU's program counter is set to the address of
    > the function. I don't see how you could call the function without using
    > its address.
    > .
    >
    >>> you have:
    >>>
    >>> PointerToVTable->PointerToFunction();

    >>
    >> Actually, the common implementation is more like
    >> PointerToObject->PointerToVTable->PointerToFunction(), so it's 3
    >> additional levels of indirection.

    >
    >
    > If we have an object on the stack, then its address is simply an
    > offset from the stack pointer. So instead of having:


    No, if the object is on the stack the type is known at compile-time and
    the correct function is called directly. VTables and such are only
    needed when the exact type is not known at compile-time.
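
    A small sketch of the difference (the class names are invented, and
    whether a given compiler actually makes the direct call is
    implementation-dependent):

    struct Base    { virtual void f() { } };
    struct Derived : Base { virtual void f() { } };

    void calls()
    {
        Derived d;      // exact type known at compile time
        d.f();          // typically a direct call to Derived::f, no vtable lookup

        Base* p = &d;   // static type Base*, dynamic type Derived
        p->f();         // generally dispatched through the vtable, unless the
                        // optimizer can prove the dynamic type
    }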

    --
    Erik Wikström
     
    Erik Wikström, Feb 3, 2008
    #6
  7. Erik Wikström:

    >> If we have an object on the stack, then its address is simply an
    >> offset from the stack pointer. So instead of having:

    >
    > No, if the object is on the stack the type is known at compile-time and
    > the correct function is called directly. VTables and such are only
    > needed when the exact type is not known at compile-time.



    Of course, you're right. So we're left with:

    JUMP TO( address_of_object->vtable->address_of_function );


    My previous guess of 8 nanoseconds was probably way off; it's probably more
    like tens of nanoseconds if you take into account adding the offsets and so
    forth.

    --
    Tomás Ó hÉilidhe
     
    Tomás Ó hÉilidhe, Feb 3, 2008
    #7
  8. Jerry Coffin Guest

    In article <52b46f0e-cbcc-4ccb-bbfa-...>, aitorf666@gmail.com says...
    > Hi!
    > I have been looking for a comparison table of the relative cost of
    > operations in C++ because it can be very useful for all. I have found
    > this:
    >
    > printf/scanf 1000
    > new/delete 800
    > trig. function 500
    > floating point 100
    > and so on...


    While that may be of some use, it's only an extremely rough guideline at
    very best. Just for example, the time for a printf or scanf varies widely
    based upon the number of conversions involved, the amount of output
    produced, and the speed of the device to which the output is written --
    it can easily vary by a factor of thousands.

    > But it doesn't say anything about some C++ features, like virtual
    > functions.
    > What is the cost of calling a virtual function compared with an
    > ordinary function?


    You can't reasonably compare them directly. If you want to compare
    costs, you need to compare a virtual function to something like a switch
    statement that calls any one of a number of functions, or perhaps to
    calling a function via a pointer.

    When you make such a comparison, the time (of course) depends on the
    compiler, but with the modern compilers I've tested, there's essentially
    no overhead. For example, here's some output from Joe Orost's Bench++
    benchmark:

    Test Name:   F000006   Class Name:  Style
    CPU Time:    1.73 nanoseconds plus or minus 0.0866
    Wall/CPU:    1.00 ratio.   Iteration Count: 1677721600
    Test Description:
    Time to test a global using a 10-way switch statement

    Test Name:   F000007   Class Name:  Style
    CPU Time:    2.73 nanoseconds plus or minus 0.136
    Wall/CPU:    1.00 ratio.   Iteration Count: 1677721600
    Test Description:
    Time to test a global using a 10-way sparse switch statement

    Test Name:   F000008   Class Name:  Style
    CPU Time:    1.94 nanoseconds plus or minus 0.0969
    Wall/CPU:    1.00 ratio.   Iteration Count: 1677721600
    Test Description:
    Time to test a global using a 10-way virtual function class

    As you can see, the time for the virtual function version is quite
    competitive -- with dense values, the switch statement is marginally
    faster, but with sparse values, the switch statement is noticeably
    slower.

    Although the values vary somewhat depending on the compiler and options
    I use, that trend has remained roughly constant.

    > Could anyone tell me where I can find tables of operation costs for
    > C++?


    My advice would be to download (or write) and run some code to test the
    operations you care about. Writing good test code is definitely a
    challenge for even the most experienced programmer.
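
    For instance, here is a minimal, self-contained sketch of the kind of test
    you could write yourself (this is not Bench++; the handler classes are
    made up for illustration, only two of the ten cases are shown, and the
    volatile sink merely discourages the compiler from optimizing the work
    away):

    #include <cstdio>
    #include <ctime>

    struct Handler {
        virtual ~Handler() {}
        virtual int run(int x) const = 0;
    };
    struct H0 : Handler { virtual int run(int x) const { return x + 0; } };
    struct H1 : Handler { virtual int run(int x) const { return x + 1; } };
    // ...a real test would define handlers H0 through H9

    int run_switch(int which, int x)
    {
        switch (which) {
        case 0: return x + 0;
        case 1: return x + 1;
        // ...cases 2 through 9
        default: return x;
        }
    }

    int main()
    {
        const long N = 100000000L;
        Handler* h[2] = { new H0, new H1 };
        volatile unsigned long sink = 0;  // unsigned so wrap-around is defined

        std::clock_t t0 = std::clock();
        for (long i = 0; i < N; ++i)
            sink += run_switch(static_cast<int>(i & 1), static_cast<int>(i));

        std::clock_t t1 = std::clock();
        for (long i = 0; i < N; ++i)
            sink += h[i & 1]->run(static_cast<int>(i));

        std::clock_t t2 = std::clock();
        std::printf("switch:  %f s\n", double(t1 - t0) / CLOCKS_PER_SEC);
        std::printf("virtual: %f s\n", double(t2 - t1) / CLOCKS_PER_SEC);

        delete h[0];
        delete h[1];
        return 0;
    }

    Compile with optimization enabled, and expect the numbers to vary
    considerably with compiler, options and hardware, which is exactly why
    measuring on your own platform matters.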

    --
    Later,
    Jerry.

    The universe is a figment of its own imagination.
     
    Jerry Coffin, Feb 3, 2008
    #8
  9. James Kanze Guest

    On Feb 2, 8:10 pm, aitorf666@gmail.com wrote:

    > I have been looking for a comparison table of the relative
    > cost of operations in C++ because it can be very useful for
    > all. I have found this:


    > printf/scanf 1000
    > new/delete 800
    > trig. function 500
    > floating point 100
    > and so on...


    Which is completely irrelevant anyway---a printf( "%f" ) can
    easily take more than 10 times as much time as a printf( "some
    text" ). And the rest will depend on the platform---on my Sun
    Sparc, at work, floating point multiply and divide are faster
    than integer multiply and divide, for example.

    > But it doesn't say anything about some C++ features, like
    > virtual functions. What is the cost of calling a virtual
    > function compared with an ordinary function?


    It varies greatly from one implementation to the next, so you'd
    need information relative to your platform.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Feb 3, 2008
    #9
  10. James Kanze Guest

    On Feb 3, 12:58 am, "Tomás Ó hÉilidhe" wrote:

    [...]
    > I think this equates to two more levels of indirection in
    > total. On a CPU whose clock runs at 2 GHz, and where a cycle
    > is about 4 clock pulses, and where an indirection takes 2
    > cycles, I think we're looking at a delay of something like 8
    > nanoseconds.


    Do you actually know of any machines like that? The machines I
    know of which run at 2 GHz are all capable of executing several
    instructions per clock as well, depending on the instructions,
    and provided the pipeline is full, of course.

    The important point with regards to virtual functions is that
    the call is indirect. On some architectures (HP/PA, for
    example), this fouls up branch prediction completely, and may
    end up resulting in a purge of the pipeline. Virtual functions
    on those machines are fairly expensive. On other machines, this
    doesn't seem to be a problem.

    And of course, the question is always: expensive compared to
    what? You can't compare a virtual function call to a
    non-virtual function call because they don't do the same thing.
    So you end up comparing it to a if/else chain, or a switch. (In
    cases where the if/else chain is faster, the compiler may end up
    converting the virtual function call to an if, if the code is in
    a tight loop, where the profiler says it will make a
    difference.)

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Feb 3, 2008
    #10
  11. Rolf Magnus Guest

    Tomás Ó hÉilidhe wrote:

    > Rolf Magnus:
    >
    >>> Instead of having:
    >>>
    >>> PointerToFunction();

    >>
    >> Actually, a regular function is often not called through a pointer,
    >> but directly.

    >
    >
    > I'm not sure what you mean by "called directly". When a function is
    > called in a program, the CPU's program counter is set to the address of
    > the function. I don't see how you could call the function without using
    > its address.


    The address is part of the call instruction itself, called an immediate
    value in assembly. If you call the function through a pointer, the pointer
    value must first be read from memory, and then the function is called
    through that address, using a different (indirect) call instruction.
    Of course, this may vary depending on the platform.
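
    A small sketch of the difference (the function names are invented, and the
    generated instructions depend entirely on the compiler and platform):

    void work(int) { }              // ordinary function
    void (*fp)(int) = work;         // pointer to that function

    void caller()
    {
        work(1);   // direct call: the target address is typically encoded in
                   // the call instruction itself (an immediate)
        fp(2);     // indirect call: the pointer is first loaded from memory,
                   // then the call goes through that value
    }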

    >>> you have:
    >>>
    >>> PointerToVTable->PointerToFunction();

    >>
    >> Actually, the common implementation is more like
    >> PointerToObject->PointerToVTable->PointerToFunction(), so it's 3
    >> additional levels of indirection.

    >
    >
    > If we have an object on the stack, then its address is simply an
    > offset from the stack pointer. So instead of having:
    >
    > JUMP TO (address_of_function);
    >
    > We have:
    >
    > JUMP TO ( (stack_pointer + offset)->vtable->address_of_function );
    >
    > Or more simply:
    >
    > JUMP TO ( address_of_object->vtable->address_of_function );
    >
    > I think this equates to two more levels of indirection in total. On
    > a CPU whose clock runs at 2 GHz, and where a cycle is about 4 clock
    > pulses, and where an indirection takes 2 cycles, I think we're looking at
    > a delay of something like 8 nanoseconds.


    However, that value doesn't say much. Obviously, you won't notice a
    difference for a function that is not called very often. If that function
    is short, and it's called a million times per second, this might be
    different. Also, not every system has a clock rate of 2 GHz. Not everyone
    is programming PCs and the like.
     
    Rolf Magnus, Feb 3, 2008
    #11
  12. James Kanze:

    > ---on my Sun
    > Sparc, at work, floating point multiply and divide are faster
    > than integer multiply and divide, for example.



    Are you doing some sort of CAD graphic design? I got introduced to it
    for a few days in college; we're using it to design microchips, e.g.
    where to place n-type regions and so forth. What kind of stuff is this
    machine good for?

    --
    Tomás Ó hÉilidhe
     
    Tomás Ó hÉilidhe, Feb 3, 2008
    #12
  13. It's also worth remembering that not only do some costs change between
    compiler versions, but they also vary during the life of the program.
    new/delete and malloc/free can suffer from fragmentation, and, unlike
    trig functions, altering their use by a small amount may significantly
    alter their cost.
    Of course, you might not get any change at all; that's what makes this
    so fun :)
     
    Dominic Connor, Quant Headhunter, Feb 3, 2008
    #13
  14. Rolf Magnus Guest

    Dominic Connor, Quant Headhunter wrote:

    > It's also worth remembering that not only do some costs change between
    > compiler versions, but they also vary during the life of the program.
    > new/delete and malloc/free can suffer from fragmentation, and, unlike
    > trig functions, altering their use by a small amount may significantly
    > alter their cost.


    You also might happen to use multiple threads, where new/delete need some
    thread synchronization, while trig functions don't. OTOH, some memory
    allocators have specific regions for objects of common size, which makes
    allocating those significantly quicker. On a system without an FPU, a trig
    function might in turn easily take more than 10 times as much time as on
    one with an FPU.
     
    Rolf Magnus, Feb 3, 2008
    #14

  15. > You also might happen to use multiple threads, where new/delete need some
    > thread synchronization, while trig functions don't.


    Very good point; the period for which a shared resource is locked is going
    to affect performance in a very non-linear fashion.
    In small programs it can be negligible, but then you suddenly fall off a
    cliff after a trivial modification.
     
    Dominic Connor, Quant Headhunter, Feb 3, 2008
    #15
  16. James Kanze Guest

    On Feb 3, 1:07 pm, "Tomás Ó hÉilidhe" wrote:
    > James Kanze:


    > > ---on my Sun
    > > Sparc, at work, floating point multiply and divide are faster
    > > than integer multiply and divide, for example.


    > Are you doing some sort of CAD graphic design?


    No.

    > I got introduced to it for a few days in college; we're
    > using it to design microchips, e.g. where to place n-type
    > regions and so forth. What kind of stuff is this machine good
    > for?


    It's a good general purpose machine. In our case, it gets the
    preference over a PC with Linux mainly because of its IO
    bandwidth. In general, Solaris is also more stable and more
    reliable than Linux or Windows.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Feb 3, 2008
    #16
