Costs of function calls

Thomas Kowalski · Dec 19, 2006

Hello everybody,
currently I am tweaking a bit on one of my projects and wondered how
high are the costs for calling a (member) function. I tried some
possibilites and come to a suprising result.
Maybe someone can explain to me what I might improve on my test or how
come this results occured?
For testing I commented different lines out. Code see below.

Regards,
Thomas Kowalski

// MS C++ 8.0
// 1 Billion additions:
// simple methode call: 2134 ms, 2123 ms,
// inline call: 2037 ms, 2053 ms, 2073 ms
// simple methode ref param: 2153 ms, 2034 ms, 2093 ms
// addition no methode 2034 ms, 2037 ms, 2032 ms
// virtual funtion call ~2100 ms
// function pointer 7060 ms, 7260 ms, 7260 ms

// Results with gcc 3.3.6 are similar but function pointer are as fast
as normal calls

#include "stdafx.h"
#include <iostream>
#include <cstdlib>
#include <vector>
#include <time.h>

using namespace std;

class Abstract {
public:
virtual size_t add(size_t& a, size_t& b) = 0;
};

class B

ublic Abstract {
public:
virtual size_t add(size_t& a, size_t& b);
};

class C

ublic Abstract {
public:
virtual size_t add(size_t& a, size_t& b);
};

class D

ublic Abstract {
public:
virtual size_t add(size_t& a, size_t& b);
};
/**/

class A {
public:
static size_t add(size_t& a, size_t& b);
};

//inline
inline size_t A::add(size_t& a, size_t& b) {
return a+b;
};

size_t B::add(size_t& a, size_t& b) {
return a+b;
};

size_t C::add(size_t& a, size_t& b) {
return a+b;
};

size_t D::add(size_t& a, size_t& b) {
return a+b;
};

int main(int argc, char *argv[])
{

// Abstract* a = new D();
A a;
size_t (*addition)(size_t&, size_t&);
addition = &A::add;
const size_t REPEAT = 1000000000;
size_t i,j,k;
j=5;

clock_t start = clock();
for (i=0; i<REPEAT; ++i) {
addition(i, k);
// k = a.add(i,j);
// k = i+j;
}
clock_t ende = clock();
cout << "K ist: " << k << endl;

cout << "test took " << ((ende - start)) << " ms" << endl;
cin >> i;
return 0;
}

Victor Bazarov · Dec 19, 2006

Thomas said:
Hello everybody,
currently I am tweaking a bit on one of my projects and wondered how
high are the costs for calling a (member) function. I tried some
possibilites and come to a suprising result.
Maybe someone can explain to me what I might improve on my test or how
come this results occured?
For testing I commented different lines out. Code see below.

Regards,
Thomas Kowalski

// MS C++ 8.0
// 1 Billion additions:
// simple methode call: 2134 ms, 2123 ms,
// inline call: 2037 ms, 2053 ms, 2073 ms
// simple methode ref param: 2153 ms, 2034 ms, 2093 ms
// addition no methode 2034 ms, 2037 ms, 2032 ms
// virtual funtion call ~2100 ms
// function pointer 7060 ms, 7260 ms, 7260 ms

// Results with gcc 3.3.6 are similar but function pointer are as fast
as normal calls

[..]

To make a real case of inline versus non-inline, you _must_ place your
non-inlined member functions in a separate translation unit. Otherwise
any decent compiler should be able to simply inline all your calls if
it can see the actual body.

I _know_ that if a function is not inlined, it's gonna costcha. Build
your project with optimization for size, use 'std::vector' and its
indexing argument, and you'll notice that if you switch to indexing
from a pointer to the first element instead, you'll get plenty of
performance improvement. Up to ten times, as a matter of fact, in
some cases.

V

Thomas Kowalski · Dec 20, 2006

To make a real case of inline versus non-inline, you _must_ place your

non-inlined member functions in a separate translation unit. Otherwise
any decent compiler should be able to simply inline all your calls if
it can see the actual body.

Ok, that makes sense. But how can it inline the virtual funtion call?
Abstract *a = new B();
a->add(i,j);

Can this still be inlined?

I _know_ that if a function is not inlined, it's gonna costcha. Build
your project with optimization for size, use 'std::vector' and its
indexing argument, and you'll notice that if you switch to indexing
from a pointer to the first element instead, you'll get plenty of
performance improvement. Up to ten times, as a matter of fact, in
some cases.

You mean that if you have an std::vector and use a pointer to its
values its several times slower to access a certain element then using
an index of type int or unsigned int?

Thanks a lot,
Thomas Kowalski

=?iso-8859-1?q?Erik_Wikstr=F6m?= · Dec 20, 2006

You mean that if you have an std::vector and use a pointer to its
values its several times slower to access a certain element then using
an index of type int or unsigned int?

No, the other way around:

std::vector<int> vec;
// Push some elements

int& a = vec[5]; // Using index

int* ptr = &(vec[0]);
int& b = *(ptr + 5) // Using pointer

The second (using pointer) will be faster than the first (using index)
provided that you don't count the cost of getting the pointer in the
first place. This can be useful when optimising loops iterating over
all the elements in a vector like:

int* ptr = &(vec[0]);
int size = vec.size();
for (int i = 0; i < size; ++i)
*(ptr + i) += 2; // Instead of vec += 2;

Rolf Magnus · Dec 20, 2006

Thomas said:
Ok, that makes sense. But how can it inline the virtual funtion call?
Abstract *a = new B();
a->add(i,j);

Can this still be inlined?

Basically, it can, because the compiler could know that in that particular
place the object can only be a B, but I have no idea if compilers do such
optimizations.

You mean that if you have an std::vector and use a pointer to its
values its several times slower to access a certain element then using
an index of type int or unsigned int?

I think that he means just the opposite. Last time I measured iterating a
vector, indexing was generally slower than using a pointer or an iterator
and increment it in the loop. No difference between indexing using a
pointer and using vector's operator[].

Ole Nielsby · Dec 20, 2006

Erik Wikström said:
int& a = vec[5]; // Using index

int* ptr = &(vec[0]);
int& b = *(ptr + 5) // Using pointer

The second (using pointer) will be faster than the first (using index)
provided that you don't count the cost of getting the pointer in the
first place. This can be useful when optimising loops iterating over
all the elements in a vector like:

int* ptr = &(vec[0]);
int size = vec.size();
for (int i = 0; i < size; ++i)
*(ptr + i) += 2; // Instead of vec += 2;

Or shorter:

for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
*i += 2;

which does the same and produces similar code on modern
compilers tuned to deal with STL (tested on vc8).

Rolf Magnus · Dec 20, 2006

Ole said:
int* ptr = &(vec[0]);
int size = vec.size();
for (int i = 0; i < size; ++i)
*(ptr + i) += 2; // Instead of vec += 2;

Click to expand...

Or shorter:

for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
*i += 2;

And here you can see what it looks like if you go one step further in using
algorithms:

std::transform(vec.begin(), vec.end(), vec.begin(),
std::bind2nd(std:lus<int>(), 2));

This is harder to write, harder to read and not shorter than the for loop.

Victor Bazarov · Dec 20, 2006

Ole said:
Erik Wikström said:

int& a = vec[5]; // Using index

int* ptr = &(vec[0]);
int& b = *(ptr + 5) // Using pointer

The second (using pointer) will be faster than the first (using
index) provided that you don't count the cost of getting the pointer
in the first place. This can be useful when optimising loops
iterating over all the elements in a vector like:

int* ptr = &(vec[0]);
int size = vec.size();
for (int i = 0; i < size; ++i)
*(ptr + i) += 2; // Instead of vec += 2;

Click to expand...

Or shorter:

for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
*i += 2;

which does the same and produces similar code on modern
compilers tuned to deal with STL (tested on vc8).

Tested with what optimization settings? This code has to instantiate
the vector<int>::iterator vec.size() times (since you used i++ and not
++i). Besides, it has to compare the iterators, it has to dereference
the 'i'. Don't just jump to conclusions. Didn't I say I profiled
that sort of code enough to claim pointers are the fastest?

Bottomline: avoid function calls like the plague!

V

Victor Bazarov · Dec 20, 2006

Rolf said:
[..] Last time I measured
iterating a vector, indexing was generally slower than using a
pointer or an iterator and increment it in the loop. No difference
between indexing using a pointer and using vector's operator[].

Not true. Indexing using a pointer is still faster because it does
not involve a function call. That's the whole point!

V

Lionel B · Dec 20, 2006

Ole said:
Ole said:

Erik WikstrÃ¶m said:

int& a = vec[5]; // Using index

int* ptr = &(vec[0]);
int& b = *(ptr + 5) // Using pointer

The second (using pointer) will be faster than the first (using
index) provided that you don't count the cost of getting the pointer
in the first place. This can be useful when optimising loops
iterating over all the elements in a vector like:

int* ptr = &(vec[0]);
int size = vec.size();
for (int i = 0; i < size; ++i)
*(ptr + i) += 2; // Instead of vec += 2;

Click to expand...

Or shorter:

for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
*i += 2;

which does the same and produces similar code on modern
compilers tuned to deal with STL (tested on vc8).

Click to expand...

Tested with what optimization settings? This code has to instantiate
the vector<int>::iterator vec.size() times (since you used i++ and not
++i). Besides, it has to compare the iterators, it has to dereference
the 'i'.

Surely you have to compare pointers (or ints, or whatever) to detect the
loop termination condition, and you have to dereference a pointer too...?
Ok, the iterator version may involve function calls but these are quite
likely trivial enough to be optimised away by any compiler worth its salt.

Don't just jump to conclusions. Didn't I say I profiled
that sort of code enough to claim pointers are the fastest?

Bottomline: avoid function calls like the plague!

Click to expand...

....except if you are certain they will be magicked away by your
compiler.

Victor Bazarov · Dec 20, 2006

Lionel said:
Ole said:

int& a = vec[5]; // Using index

int* ptr = &(vec[0]);
int& b = *(ptr + 5) // Using pointer

The second (using pointer) will be faster than the first (using
index) provided that you don't count the cost of getting the
pointer in the first place. This can be useful when optimising
loops iterating over all the elements in a vector like:

int* ptr = &(vec[0]);
int size = vec.size();
for (int i = 0; i < size; ++i)
*(ptr + i) += 2; // Instead of vec += 2;

Or shorter:

for (std::vector<int>::iterator i = vec.begin(); i < vec.end(); i++)
*i += 2;

which does the same and produces similar code on modern
compilers tuned to deal with STL (tested on vc8).

Click to expand...

Tested with what optimization settings? This code has to instantiate
the vector<int>::iterator vec.size() times (since you used i++ and
not ++i). Besides, it has to compare the iterators, it has to
dereference the 'i'.

Click to expand...

Surely you have to compare pointers (or ints, or whatever) to detect
the loop termination condition, and you have to dereference a pointer
too...? Ok, the iterator version may involve function calls but these
are quite likely trivial enough to be optimised away by any compiler
worth its salt.

Generally speaking, no. Never assume. Especially about your compiler
(which you may be required to use, instead of picking one "worth its
salt").

...except if you are certain they will be magicked away by your
compiler.

Click to expand...

You can never be certain unless you actually see the assembly code.
You can never have enough time to look through all assembly code if
your project is beyond the simplest few KLOC. Conclusion: never
trust your compiler to do "the right thing" when it comes to your
program's performance. Measure and make appropriate changes.

V

Lionel B · Dec 20, 2006

On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:

/.../

...except if you are certain they will be magicked away by your
compiler.

Ok, here's a small test program:

--- begin code ---

#include <iostream>
#include <vector>
#include <time.h>

int main()
{
const int s = 100000000;
std::vector<int> vec(s,1);

clock_t start,endt;

start = clock();
for (int i=0; i<s; ++i) vec += 2;
endt = clock();
std::cout << "index : time = " << endt-start << '\n';

start = clock();
for (std::vector<int>::iterator i=vec.begin(); i!=vec.end(); ++i) *i += 2;
endt = clock();
std::cout << "iterator : time = " << endt-start << '\n';

int* p = &(vec[0]);
start = clock();
for (int i=0; i<s; ++i) *(p+i) += 2;
endt = clock();
std::cout << "pointer : time = " << endt-start << '\n';

return 0;
}

--- end code ---

Compiled with default optimisation (gcc 4.1.1 on linux x86-64):

index : time = 3240000
iterator : time = 3140000
pointer : time = 640000

Compiled with -O1:

index : time = 270000
iterator : time = 230000
pointer : time = 230000

Compiled with -O3:

index : time = 230000
iterator : time = 220000
pointer : time = 230000

Victor Bazarov · Dec 20, 2006

Lionel said:
On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:

/.../

...except if you are certain they will be magicked away by your
compiler.

Click to expand...

Ok, here's a small test program:

--- begin code ---
[..]
--- end code ---

Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
[..]

I think you're missing the point. If you want to discuss the
capabilities of any particular compiler to perform some function
call optimisations, the newsgroup dedicated to that compiler is
probably a better place. Here we talk C++ *in general*, and *in
general* nothing is "magicked away". That's all.

V

Lionel B · Dec 20, 2006

Lionel said:
Lionel said:

On Wed, 20 Dec 2006 08:48:27 -0500, Victor Bazarov wrote:

/.../

Bottomline: avoid function calls like the plague!

...except if you are certain they will be magicked away by your
compiler.

Click to expand...

Ok, here's a small test program:

--- begin code ---
[..]
--- end code ---

Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
[..]

Click to expand...

I think you're missing the point. If you want to discuss the
capabilities of any particular compiler to perform some function
call optimisations,

I don't. I only included that for information. The point I would like
to make is that I would not wilfully obfuscate my code with pointers where
indexing or iterators are more appropriate, for the sole purpose of
second-guessing compiler optimisation capabilities - and this seems to be
exactly what you are suggesting when you say "avoid function calls like the
plague". To me that is premature optimisation pure and simple.

Mathias Gaunard · Dec 20, 2006

Victor said:
Tested with what optimization settings? This code has to instantiate
the vector<int>::iterator vec.size() times (since you used i++ and not
++i). Besides, it has to compare the iterators, it has to dereference
the 'i'. Don't just jump to conclusions. Didn't I say I profiled
that sort of code enough to claim pointers are the fastest?

Modern compilers generate exactly the same assembler when using vector
iterators or pointers.

Rolf Magnus · Dec 20, 2006

Victor said:
Rolf said:

[..] Last time I measured
iterating a vector, indexing was generally slower than using a
pointer or an iterator and increment it in the loop. No difference
between indexing using a pointer and using vector's operator[].

Click to expand...

Not true.

Well, it's truely what I measured.

Indexing using a pointer is still faster because it does
not involve a function call. That's the whole point!

This happens only if I switch optimizations off completely. Size
optimitation - as you suggested - isn't sufficient. I wouldn't expect
anything different, since the operator usually just contains nothing more
than the pointer indexing, and the code for that is smaller than a function
call, so I would expect a size optimizing compiler to inline the operator,
producing the exact same code as the direct pointer indexing.

peter koch · Dec 20, 2006

Victor Bazarov skrev:
[snip]

To make a real case of inline versus non-inline, you _must_ place your
non-inlined member functions in a separate translation unit. Otherwise
any decent compiler should be able to simply inline all your calls if
it can see the actual body.

I _know_ that if a function is not inlined, it's gonna costcha. Build
your project with optimization for size, use 'std::vector' and its
indexing argument, and you'll notice that if you switch to indexing
from a pointer to the first element instead, you'll get plenty of
performance improvement. Up to ten times, as a matter of fact, in
some cases.

Your argument is valid for a specific compiler and a specific
optimisation setting and not very generic. On most compilers and with
most optimisation-settings, there should be no difference at all.
Actually, some advocate using indexing rather than pointer-iterations
for raw vectors, claiming better optimisation capabilities due to fewer
aliasing issues for the compiler.

/Peter

Rolf Magnus · Dec 20, 2006

Victor said:
Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
[..]

Click to expand...

I think you're missing the point. If you want to discuss the
capabilities of any particular compiler to perform some function
call optimisations, the newsgroup dedicated to that compiler is
probably a better place. Here we talk C++ *in general*, and *in
general* nothing is "magicked away". That's all.

Well, *in general*, you can't do any optimizations, because C++ doesn't
specify how much time the generated code must take for a specific piece of
source code. So if you suggest any optimization techniques, you have to
look at the real world, not just at the standard.

Bo Persson · Dec 20, 2006

Victor said:
Thomas said:

Hello everybody,
currently I am tweaking a bit on one of my projects and wondered
how high are the costs for calling a (member) function. I tried
some possibilites and come to a suprising result.
Maybe someone can explain to me what I might improve on my test or
how come this results occured?
For testing I commented different lines out. Code see below.

Regards,
Thomas Kowalski

// MS C++ 8.0
// 1 Billion additions:
// simple methode call: 2134 ms, 2123 ms,
// inline call: 2037 ms, 2053 ms, 2073 ms
// simple methode ref param: 2153 ms, 2034 ms, 2093 ms
// addition no methode 2034 ms, 2037 ms, 2032 ms
// virtual funtion call ~2100 ms
// function pointer 7060 ms, 7260 ms, 7260 ms

// Results with gcc 3.3.6 are similar but function pointer are as
fast as normal calls

[..]

Click to expand...

To make a real case of inline versus non-inline, you _must_ place
your non-inlined member functions in a separate translation unit.
Otherwise any decent compiler should be able to simply inline all
your calls if it can see the actual body.

With the compiler used here that won't help, as it is perfectly capable of
inlining over module boundaries.

Writing a synthetic benchmark for a global optimizer is extremely difficult.

Bo Persson

Victor Bazarov · Dec 20, 2006

Rolf said:
Victor said:

Compiled with default optimisation (gcc 4.1.1 on linux x86-64):
[..]

Click to expand...

I think you're missing the point. If you want to discuss the
capabilities of any particular compiler to perform some function
call optimisations, the newsgroup dedicated to that compiler is
probably a better place. Here we talk C++ *in general*, and *in
general* nothing is "magicked away". That's all.

Click to expand...

Well, *in general*, you can't do any optimizations, because C++
doesn't specify how much time the generated code must take for a
specific piece of source code. So if you suggest any optimization
techniques, you have to look at the real world, not just at the
standard.

*In general* if one's concerned with overhead of calling a function,
one should inline the function body. Whether your compiler is up
to snuff for that or you do it manually does not matter. In general,
however, you can't rely on your compiler, so doing it manually is
your only *generally* good approach. Are you suggesting that, given

int foo(int a, int b)
{
return a+b;
}
int bar(int a, int b)
{
return foo(a+b);
}

, calling 'bar' *can* be faster than calling 'foo'?

V

RSA implementation issues in public key pem loader function	0	May 21, 2025
Database Manager: A C++ Console Application	14	May 12, 2025
How to multiply two matrices of size in using inline assembly in C++	3	Mar 3, 2024
Rich Text Format (RTF) Document Builder in C++: Code and Features	0	Sep 28, 2025
AES-128 Clipboard Protector: Auto-Encrypt Ctrl+C, Smart-Decrypt Ctrl+V (C++ Windows Hook)	7	Mar 24, 2026
Chatbot	0	Oct 8, 2024
Mini Web Server in C++ (Part One)	4	Oct 2, 2025
Can someone pls help me with a little algorithm script	1	Nov 28, 2024

Costs of function calls

Thomas Kowalski

Victor Bazarov

Thomas Kowalski

=?iso-8859-1?q?Erik_Wikstr=F6m?=

Rolf Magnus

Ole Nielsby

Rolf Magnus

Victor Bazarov

Victor Bazarov

Lionel B

Victor Bazarov

Lionel B

Victor Bazarov

Lionel B

Mathias Gaunard

Rolf Magnus

peter koch

Rolf Magnus

Bo Persson

Victor Bazarov

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads