Costs of function calls

V

Victor Bazarov

Mathias said:
Modern compilers generate exactly the same assembler when using vector
iterators or pointers.

No, they don't. It depends on what you ask them, and it depends on the
implementation of the Standard library. I am not going to go into the
details, try different optimization levels with your favourite compiler
and you will see. And keep in mind that maximum optimization is not
always the right choice.

V
 
V

Victor Bazarov

Rolf said:
Victor said:
Rolf said:
[..] Last time I measured
iterating a vector, indexing was generally slower than using a
pointer or an iterator and increment it in the loop. No difference
between indexing using a pointer and using vector's operator[].

Not true.

Well, it's truely what I measured.

I am not disputing what you measured. I am disagreeing with the
last statement of that paragraph. It is unclear whether it's what
you measured or what you deduced to be the reality.
This happens only if I switch optimizations off completely. Size
optimitation - as you suggested - isn't sufficient.

Isn't sufficient for what? For getting the "correct" results in
indexing a pointer versus calling the vector::eek:perator[]?
I wouldn't expect
anything different, since the operator usually just contains nothing
more than the pointer indexing, and the code for that is smaller than
a function call, so I would expect a size optimizing compiler to
inline the operator, producing the exact same code as the direct
pointer indexing.

Hey, maybe you and many others would _expect_ the world peace next
year. It's not gonna happen, though. Just like _expecting_ the
compiler to do something (or forgo doing something) does not mean
it is so.

In certain situations indexing operator implemenation _can_ contain
other stuff (whatever the library implementors decided to put in
there), which the compiler has to convert into code; it doesn't
have to do that if you just index a pointer. IOW, it's not just
the overhead of preparing the stack frame for the call, it's the
contents of the function that when they are not inlined, which we
have to compare with a plain pointer dereferencing/indexing.

V
 
V

Victor Bazarov

peter said:
Victor Bazarov skrev:
[snip]
To make a real case of inline versus non-inline, you _must_ place
your non-inlined member functions in a separate translation unit.
Otherwise any decent compiler should be able to simply inline all
your calls if it can see the actual body.

I _know_ that if a function is not inlined, it's gonna costcha.
Build your project with optimization for size, use 'std::vector' and
its indexing argument, and you'll notice that if you switch to
indexing from a pointer to the first element instead, you'll get
plenty of performance improvement. Up to ten times, as a matter of
fact, in some cases.
Your argument is valid for a specific compiler and a specific
optimisation setting and not very generic. On most compilers and with
most optimisation-settings, there should be no difference at all.

If among one million outcomes there are ten that are not like the other
nine hundred ninety nine thousand nine hundred and ninety, you cannot
call those others "the general case". If there is at least one compiler
that doesn't perform like the others, we cannot say the general case is
that optimization takes care of that. You can call that "mainly" or
"chiefly" or "usually". Not "generally". *Generally* you cannot rely
on the compiler to do that.
Actually, some advocate using indexing rather than pointer-iterations
for raw vectors,

What are those? Do you mean "arrays"?
claiming better optimisation capabilities due to
fewer aliasing issues for the compiler.

Actually indexing is a better technique for parallel computations, by
far, than dereferencing the pointer (although I've seen indexing code
to be slower about 10% than dereferencing an incrementing pointer).
Indexing the pointer, that is, not calling 'vector::eek:perator[]' of
course.

V
 
P

peter koch

Victor Bazarov skrev:
Rolf said:
Victor said:
Rolf Magnus wrote:
[..] Last time I measured
iterating a vector, indexing was generally slower than using a
pointer or an iterator and increment it in the loop. No difference
between indexing using a pointer and using vector's operator[].

Not true.

Well, it's truely what I measured.

I am not disputing what you measured. I am disagreeing with the
last statement of that paragraph. It is unclear whether it's what
you measured or what you deduced to be the reality.
This happens only if I switch optimizations off completely. Size
optimitation - as you suggested - isn't sufficient.

Isn't sufficient for what? For getting the "correct" results in
indexing a pointer versus calling the vector::eek:perator[]?
I wouldn't expect
anything different, since the operator usually just contains nothing
more than the pointer indexing, and the code for that is smaller than
a function call, so I would expect a size optimizing compiler to
inline the operator, producing the exact same code as the direct
pointer indexing.

Hey, maybe you and many others would _expect_ the world peace next
year. It's not gonna happen, though. Just like _expecting_ the
compiler to do something (or forgo doing something) does not mean
it is so.
I do not expect world peace even if that would be a nice feature, but I
do expect you to know your compiler and e.g. set the appropriate
switches in order to get optimal performance.
In certain situations indexing operator implemenation _can_ contain
other stuff (whatever the library implementors decided to put in
there), which the compiler has to convert into code; it doesn't
have to do that if you just index a pointer. IOW, it's not just
the overhead of preparing the stack frame for the call, it's the
contents of the function that when they are not inlined, which we
have to compare with a plain pointer dereferencing/indexing.

And in that case you'd compare apples to oranges, If the vendor should
decide to e.g put boundary checking into the indexing operator but not
put checking code into a "pointer" version, the two implementations are
not the same. You should - if you could - disable the extra code put in
by the vendor to get a fair comparison.

/Peter
 
P

peter koch

Victor Bazarov skrev:
peter said:
Victor Bazarov skrev:
[snip]
To make a real case of inline versus non-inline, you _must_ place
your non-inlined member functions in a separate translation unit.
Otherwise any decent compiler should be able to simply inline all
your calls if it can see the actual body.

I _know_ that if a function is not inlined, it's gonna costcha.
Build your project with optimization for size, use 'std::vector' and
its indexing argument, and you'll notice that if you switch to
indexing from a pointer to the first element instead, you'll get
plenty of performance improvement. Up to ten times, as a matter of
fact, in some cases.
Your argument is valid for a specific compiler and a specific
optimisation setting and not very generic. On most compilers and with
most optimisation-settings, there should be no difference at all.

If among one million outcomes there are ten that are not like the other
nine hundred ninety nine thousand nine hundred and ninety, you cannot
call those others "the general case". If there is at least one compiler
that doesn't perform like the others, we cannot say the general case is
that optimization takes care of that. You can call that "mainly" or
"chiefly" or "usually". Not "generally". *Generally* you cannot rely
on the compiler to do that.
Actually, some advocate using indexing rather than pointer-iterations
for raw vectors,

What are those? Do you mean "arrays"?
Right - to sloppy and to fast to respond here. I'm slightly surprised
you did not understand what I meant.
claiming better optimisation capabilities due to
fewer aliasing issues for the compiler.

Actually indexing is a better technique for parallel computations, by
far, than dereferencing the pointer (although I've seen indexing code
to be slower about 10% than dereferencing an incrementing pointer).
Indexing the pointer, that is, not calling 'vector::eek:perator[]' of
course.

I believe it is time you come out of the bush. What compilers apart
from MSVC 8.0 does give you an overhead using operator[]? And what
compiler out there gives you a performance where indexing is
significantly (say 10%) worse than pointer/iterator usage that can't be
cured by changing the compiler settings: without changing the
sourcecode?

/Peter
 
R

Rolf Magnus

Victor said:
Rolf said:
Victor said:
Rolf Magnus wrote:
[..] Last time I measured
iterating a vector, indexing was generally slower than using a
pointer or an iterator and increment it in the loop. No difference
between indexing using a pointer and using vector's operator[].

Not true.

Well, it's truely what I measured.

I am not disputing what you measured. I am disagreeing with the
last statement of that paragraph. It is unclear whether it's what
you measured or what you deduced to be the reality.

It's what I measured.
This happens only if I switch optimizations off completely. Size
optimitation - as you suggested - isn't sufficient.

Isn't sufficient for what? For getting the "correct" results in
indexing a pointer versus calling the vector::eek:perator[]?

It isn't sufficient to make my compiler not inline the function. As I
understand, you were suggesting that size optimization would disable
inlining of that operator.
Hey, maybe you and many others would _expect_ the world peace next
year. It's not gonna happen, though. Just like _expecting_ the
compiler to do something (or forgo doing something) does not mean
it is so.

It's actually the result of an observation. I saw the compiler do that, and
then I thought "hey, that's logical. It's what one would expect from a good
compiler".
In certain situations indexing operator implemenation _can_ contain
other stuff (whatever the library implementors decided to put in
there), which the compiler has to convert into code; it doesn't
have to do that if you just index a pointer. IOW, it's not just
the overhead of preparing the stack frame for the call, it's the
contents of the function that when they are not inlined, which we
have to compare with a plain pointer dereferencing/indexing.

Right. But again, this is all implementation specific, and discussing about
whether one way of doing something is faster than another, you have to pick
an implementation and look at what it does. Otherwise, you are just
discussing about what could, might or should happen.
 
V

Victor Bazarov

Rolf said:
[..] this is all implementation specific, and discussing
about whether one way of doing something is faster than another, you
have to pick an implementation and look at what it does. Otherwise,
you are just discussing about what could, might or should happen.

<shrug> The discussion started with the question about the cost of
calling a member function. The results were specific (to VC++ v8),
but the question is generic. The answer is usually "you have to
measure to know". However, I found it that with all other things
different, if you can avoid calling a function, do that. You all
seem to disagree with that, and I for the hell of it cannot figure
out what (if anything) you have to say to prove me wrong, except
"duh, compilers should take care of it".

V
 
V

Victor Bazarov

peter said:
Victor Bazarov skrev:
peter koch wrote: [..]
Actually, some advocate using indexing rather than
pointer-iterations for raw vectors,

What are those? Do you mean "arrays"?
Right - to sloppy and to fast to respond here. I'm slightly surprised
you did not understand what I meant.

I understood (otherwise how would I be able to suggest "arrays"?)
But for the slim chance that there is somebody who is even less
attentive than I am, it was necessary to confirm.
claiming better optimisation capabilities due to
fewer aliasing issues for the compiler.

Actually indexing is a better technique for parallel computations, by
far, than dereferencing the pointer (although I've seen indexing code
to be slower about 10% than dereferencing an incrementing pointer).
Indexing the pointer, that is, not calling 'vector::eek:perator[]' of
course.

I believe it is time you come out of the bush. What compilers apart
from MSVC 8.0 does give you an overhead using operator[]?

I don't know. I (not surprisingly) don't actually have all compilers
in the world available to me to check all their optimization settings.
And what
compiler out there gives you a performance where indexing is
significantly (say 10%) worse than pointer/iterator usage that can't
be cured by changing the compiler settings: without changing the
sourcecode?

I am sure that behaviour of the code can be usually affected/fixed
by tweaking compiler settings (instead of changing the code). But
it is usually is the freedom that only programmers of small projects
enjoy. As soon as you have a larger organization consuming your
product (code), there are always things in the way, like coding
standards, like the need to justify or prove the benefit of changing
compiler settings, and so forth. It is much easier to fix the code
than to fix the project settings, when the large organisations like
that are concerned.

V
 
D

David Harmon

On Wed, 20 Dec 2006 14:19:21 +0100 in comp.lang.c++, Rolf Magnus
std::transform(vec.begin(), vec.end(), vec.begin(),
std::bind2nd(std::plus<int>(), 2));

This is harder to write, harder to read and not shorter than the for loop.

#include <boost/lambda/lambda.hpp>

std::for_each(vec.begin(), vec.end(), _1 += 2);
 
L

Lilburne

Victor said:
Rolf said:
[..] this is all implementation specific, and discussing
about whether one way of doing something is faster than another, you
have to pick an implementation and look at what it does. Otherwise,
you are just discussing about what could, might or should happen.


<shrug> The discussion started with the question about the cost of
calling a member function. The results were specific (to VC++ v8),
but the question is generic. The answer is usually "you have to
measure to know". However, I found it that with all other things
different, if you can avoid calling a function, do that. You all
seem to disagree with that, and I for the hell of it cannot figure
out what (if anything) you have to say to prove me wrong, except
"duh, compilers should take care of it".

Well I for one would agree with you. Avoiding the function call will
speed things up. Mostly you won't notice it won't have any significant
affect, but when it does it does. So as you say you need to measure first.

Being overly concerned about what one compiler does can be
counterproductive, as can the statement the "compile should take care of
it". The GNU compiler rarely generates the same code for a significant
function from one release to the next, if you were to rely on the output
of that you may find yourself stuck with one particular release, bugs
and all.
 
R

Rolf Magnus

Victor said:
Rolf said:
[..] this is all implementation specific, and discussing
about whether one way of doing something is faster than another, you
have to pick an implementation and look at what it does. Otherwise,
you are just discussing about what could, might or should happen.

<shrug> The discussion started with the question about the cost of
calling a member function. The results were specific (to VC++ v8),
but the question is generic. The answer is usually "you have to
measure to know".
Agreed.

However, I found it that with all other things different, if you can avoid
calling a function, do that.

If by "function call" you mean machine level (as opposed to a function call
in C++ that gets inlined).
You all seem to disagree with that, and I for the hell of it cannot figure
out what (if anything) you have to say to prove me wrong, except "duh,
compilers should take care of it".

I wasn't really disagreeing on that, but rather on your suggestions on how
to measure the difference.
Anyway, I actually have heard (but not verified myself) that it can happen
that inlining a function increases execution time because of increased code
size and thus more cache load. If the inlined function contains branches
and is called from many places, branch prediction can also become less
efficient.
 
V

Victor Bazarov

Rolf said:
[..]
Anyway, I actually have heard (but not verified myself) that it can
happen that inlining a function increases execution time because of
increased code size and thus more cache load. If the inlined function
contains branches and is called from many places, branch prediction
can also become less efficient.

I've heard of it too. I also heard that any attempt to improve code
performance by caching more information in objects can have the adverse
effect of ballooning memory to the point where cache misses and page
faults accessing objects in memory starts hindering performance... I
think we already agreed that it can only be discovered by measuring
the code execution, and very difficult to guess. I have measured the
effects of reducing function calls. I haven't seen any direct evidence
of code size affecting performance, as of yet.

V
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,777
Messages
2,569,604
Members
45,212
Latest member
BrennaCaba

Latest Threads

Top