Function pointers: performance penalty?

Gene · Oct 12, 2009

This is theoretically possible, but does it ever happen in a C
compiler? In many implementations, qsort is just a function compiled
long ago that simply has code in it to do an indirect call. Have you
come across a system that does inline or re-write this call?

I don't know specifically about qsort(), but JIT compilers based on
technologies like LLVM would do this as a matter of course. See
llvm.org.

jacob navia · Oct 12, 2009

Seebs wrote:

In the general case, there's no answer. It's not even entirely obvious
that there will be a penalty at all.

Interesting, very interesting.

All the research papers about this subject (and there are hundreds
of research papers in the literature on this subject)
are done by people that missed your point. Interesting really.

For instance:
http://www.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.pdf
The Direct Cost of Virtual Function Calls in C++

That is just nonsense research. All those papers that try to improve
the direct cost of virtual function calls are just wrong...

Incredible how much nonsense can be written here.

There are TWO costs associated with an indirect function call (i.e.
through a function pointer)

(1) The direct cost of pipeline problems as I described in my other
message
(2) The INDIRECT cost because the call graph is not explicit, which makes
many optimizations impossible. This second cost is often neglected but
it is very real.

And I will forget about the nonsense of heathfield that tries to make
a distinction between assembler mnemonics and machine code, or even
kuyper that says that all function calls are pointer function calls
with some citation from the C standard... (even saying that loading the
value of the call into the instruction pointer is a "pointer" call).

jacob navia · Oct 12, 2009

Richard Tobin wrote:

The obvious way to call a function is by using the processor's
function call instruction. In the past, this often corresponded
fairly directly to putting the address into an instruction pointer
register, and on some architectures it might even be the same as a
"move" instruction to that register. This is much less common today:
any such instruction pointer register is likely to be a substantial
abstraction from the real hardware.

-- Richard

heathfield is (as always) playing with words, implying that the
call instruction loads the address of the function into the
instruction pointer...

The fact that the offset is an explicit one in the case of a
call instruction or an implicit offset not readily available
to the speculative instruction execution module goes beyond
his knowledge of assembly... It is the same when he says that
the ascii mnemonics differ from the machine code, or all those
word games that he plays here.

Richard Tobin · Oct 12, 2009

Richard Heathfield said:
The obvious way
to call a function is by loading its address into the instruction
pointer register.

The obvious way to call a function is by using the processor's
function call instruction. In the past, this often corresponded
fairly directly to putting the address into an instruction pointer
register, and on some architectures it might even be the same as a
"move" instruction to that register. This is much less common today:
any such instruction pointer register is likely to be a substantial
abstraction from the real hardware.

-- Richard

Richard Tobin · Oct 12, 2009

The fact that the offset is an explicit one in the case of a
call instruction or an implicit offset not readily available
to the speculative instruction execution module

I think your use of "readily" may be significant here. Depending on
when the value for an indirect call is loaded into the register, the
pipeline may still be able to make use of it.

-- Richard

Richard Tobin · Oct 12, 2009

By the way, I am sure that those interested could have a better-informed
discussion over in comp.arch.

-- Richard

Seebs · Oct 12, 2009

if there isn't a 1-to-1 mapping between assembler and machine code
doesn't that kind of defeat the purpose of assembler?

No.

The purpose of the assembler is to let you pick the instructions you want,
and to handle the fiddly bits. Thus, assemblers typically do things like
translate symbolic names for labels, so you can write "foo:" and later do
a branch to "foo" and expect the right thing to happen. But on processors
where, say, there's a difference between immediate branches within 128
bytes and immediate branches within 32768 bytes, or something similar,
you might not want to have to know -- so you write a generic branch and
the assembler uses the right one once it knows the actual distance.

Etcetera.

Why would you want assembly code that lied to you?

Because otherwise you couldn't write it.

It's not a hugely common thing, but it does come up; the other thing is,
a lot of assembly systems support some degree of macro writing, and often
calling sequences can be among the things for which you emit macro
calls...

-s

Ben Bacarisse · Oct 12, 2009

<snip>

Small point:

*f (i); /* I know I don't need the * */

you need *that* star. You meant (*f)(i), I think, where the * is
indeed redundant but usefully adds clarity to the discussion you were
having.

<snip>

Seebs · Oct 12, 2009

All the research papers about this subject (and there are hundreds
of research papers in the literature on this subject)
are done by people that missed your point. Interesting really.
Uh.

For instance:
http://www.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.pdf
The Direct Cost of Virtual Function Calls in C++

That is just nonsense research. All those papers that try to improve
the direct cost of virtual function calls are just wrong...

No, they're quite right. They're answering, though, a VERY different
question.

Direct call:
puts("hello, world!\n");

Function pointer:

int (*foo)(const char *) = puts;
foo("hello, world!\n");

Virtual function call:

struct funcs {
    struct funcs *next;
    int (*puts_func)(const char *);
};
struct foo {
    struct funcs *functable;
    char *s;
};
struct funcs stdio_functable = { NULL, puts };
struct foo greet_user = { &stdio_functable, "hello, world!\n" };
struct foo *greeter = &greet_user;

/* ... */
if (greeter->functable) {
    struct funcs *vtable = greeter->functable;
    while (vtable) {
        if (vtable->puts_func) {
            return vtable->puts_func(greeter->s);
        }
        vtable = vtable->next;
    }
}

.... Oh, you're right, there's NO DIFFERENCE AT ALL between a virtual
function call using virtual function tables and a call through a function
pointer, I'm sure they're exactly the same thing and have identical
performance characteristics.

(Disclaimer: The above vtable interface was written on the fly
and is untested, but things of roughly this sort have been used in the
past, and are about the right mapping for the behavior of a virtual
function implementation, complete with the dynamic runtime calculation,
not just of what address to call, but where to find the address you plan
to call...)

And I will forget about the nonsense of heathfield that tries to make
a distinction between assembler mnemonics and machine code, or even
kuyper that says that all function calls are pointer function calls
with some citation from the C standard... (even saying that loading the
value of the call into the instruction pointer is a "pointer" call).

Of course you will.

Except that, in both cases, they're right. Assembler mnemonics are not always
1:1 with machine code in modern assemblers, and a call through an external
symbol is not necessarily noticeably different between a symbol which the
linker maps to a function's address and a symbol which the compiler maps to a
variable holding a function's address.

-s

Tim Rentsch · Oct 12, 2009

Seebs said:
True.

I guess I'm more looking at the question "is there a definite answer",
and the answer to that is "no, not in general".

I don't disagree, but I would choose a different phrasing,
because I think there is useful information to give.
Of course, even on a specific platform what the cost
is of calling through a function pointer depends on
lots of different things, often too numerous to know.
But in general, the answer I'd be inclined to give
is "it depends /a lot/ on which processor architecture(s)
the code is expected to run on."

Antoninus Twink · Oct 12, 2009

This is theoretically possible, but does it ever happen in a C
compiler?

You should be careful, Ben.

That's the sort of subversive question that gets you branded as a
"troll" in this group.

bartc · Oct 12, 2009

Richard Heathfield said:
Nick Keighley wrote:

Neither did I. It is not clear to me what he actually means. I took a
guess, and asked him whether that guess was correct. It seems he now
thinks I'm playing games with him, and I don't have the patience for
that kind of discussion right now.

I don't have much patience for those games either.

I said I hadn't seen any evidence, from asm listings, that compilers were
generating indirect call instructions in preference to regular ones, if
calling via function pointers was in fact always faster, as claimed by Mr
Carmody.

The discussion then went off at a tangent, but I still haven't seen such
evidence (and I'm familiar enough with asm to know what's what). Your reply
suggested that if I looked at machine code instead, I would actually see
that evidence.

Now if Mr Carmody had some particular machine in mind where his statement
would be correct, then it would be intriguing to know what it was.
Memory-indirect calls require an extra memory access which usually involves
some performance penalty (everything else being equal...).

Keith Thompson · Oct 12, 2009

Phil Carmody said:
Nope, function pointers are faster.

Could you please expand on this statement? Was it intended as
a joke? If not, what exactly do you mean?

Seebs · Oct 12, 2009

I said I hadn't seen any evidence, from asm listings, that compilers were
generating indirect call instructions in preference to regular ones, if
calling via function pointers was in fact always faster, as claimed by Mr
Carmody.

I don't know about "always" faster. I would not be surprised if there existed
cases in which a call through a function pointer were faster.

Now if Mr Carmody had some particular machine in mind where his statement
would be correct, then it would be intriguing to know what it was.
Memory-indirect calls require an extra memory access which usually involves
some performance penalty (everything else being equal...).

I would guess that at least some cases exist in which the direct call can't
be fully figured out at compile time, so it'd be a direct call to something
else which figures out what to actually call. (On at least some common
implementations, this is nearly always the case for, say, everything in the
standard C library.) And there might then exist cases in which a function
pointer call could be to the "target" function. In which case it'd be faster.

Consider, though, a case like:

for (i = 0; i < 100; ++i) {
foo(a);
}

It might be that, if foo is a function pointer, this ends up being:

load r1,<dereference-of-memory>
...
push a
call r1

and if it's immediate, it ends up being

...
push a
load r1,<immediate value>
call r1

or something comparable. In such a case, the call through a pointer
may be enough of a hint to tell the compiler to keep that address loaded
in a suitable register.

So I think in the real world, the answer is you'd have to test to find out.

On my Mac, I see no difference at all between the two cases.

Unless the function is "printf" and the argument is "", in which case the
direct call gets a free printf-specific optimization that makes more than an
order of magnitude difference... But that's not the call overhead, it's
a special-case optimization.

Otherwise, they seem to be the same. I see:

call *%rax
call _y

-s

user923005 · Oct 12, 2009

I said I hadn't seen any evidence, from asm listings, that compilers were
generating indirect call instructions in preference to regular ones, if
calling via function pointers was in fact always faster, as claimed by Mr
Carmody.

I don't know about "always" faster. I would not be surprised if there existed
cases in which a call through a function pointer were faster.

Now if Mr Carmody had some particular machine in mind where his statement
would be correct, then it would be intriguing to know what it was.
Memory-indirect calls require an extra memory access which usually involves
some performance penalty (everything else being equal...).

I would guess that at least some cases exist in which the direct call can't
be fully figured out at compile time, so it'd be a direct call to something
else which figures out what to actually call. (On at least some common
implementations, this is nearly always the case for, say, everything in the
standard C library.) And there might then exist cases in which a function
pointer call could be to the "target" function. In which case it'd be faster.

Consider, though, a case like:

for (i = 0; i < 100; ++i) {
foo(a);
}

It might be that, if foo is a function pointer, this ends up being:

load r1,<dereference-of-memory>
...
push a
call r1

and if it's immediate, it ends up being

...
push a
load r1,<immediate value>
call r1

or something comparable. In such a case, the call through a pointer
may be enough of a hint to tell the compiler to keep that address loaded
in a suitable register.

So I think in the real world, the answer is you'd have to test to find out.

On my Mac, I see no difference at all between the two cases.

Unless the function is "printf" and the argument is "", in which case the
direct call gets a free printf-specific optimization that makes more than an
order of magnitude difference... But that's not the call overhead, it's
a special-case optimization.

Otherwise, they seem to be the same. I see:

call *%rax
call _y

Given this code:
#include <math.h>
#include <stdio.h>
typedef double (*f_t) (double);
static f_t f[] =
{log, log10, sqrt, cos, cosh, exp, sin, sinh, tan, tanh, 0};
int main(void)
{
    int i;
    f_t *flist = f;
    for (i = 0; f[i]; i++) {
        printf("new style function %d = %g\n", i, f[i] (0.5));
    }
    for (i = 0; f[i]; i++) {
        printf("old style function %d = %g\n", i, (*f[i]) (0.5));
    }
    while (*flist) {
        f_t ff = *flist;
        printf("function pointer = %g\n", ff(0.5));
        flist++;
    }
    printf("direct function call = %g\n", log(0.5));
    printf("direct function call = %g\n", log10(0.5));
    printf("direct function call = %g\n", sqrt(0.5));
    printf("direct function call = %g\n", cos(0.5));
    printf("direct function call = %g\n", cosh(0.5));
    printf("direct function call = %g\n", exp(0.5));
    printf("direct function call = %g\n", sin(0.5));
    printf("direct function call = %g\n", sinh(0.5));
    printf("direct function call = %g\n", tan(0.5));
    printf("direct function call = %g\n", tanh(0.5));
    return 0;
}

The following 64 bit Intel assembly is generated:
; Listing generated by Microsoft (R) Optimizing Compiler Version 15.00.30729.01

include listing.inc

INCLUDELIB OLDNAMES

PUBLIC ??_C@_0BM@CKAFHGKH@new?5style?5function?5?$CFd?5?$DN?5?$CFg?6?$AA@ ; `string'
PUBLIC ??_C@_0BM@NPLLPAPB@old?5style?5function?5?$CFd?5?$DN?5?$CFg?6?$AA@ ; `string'
PUBLIC ??_C@_0BH@LIMEHMAH@function?5pointer?5?$DN?5?$CFg?6?$AA@ ; `string'
PUBLIC ??_C@_0BL@KHOEBBJB@direct?5function?5call?5?$DN?5?$CFg?6?$AA@ ; `string'
EXTRN __imp_printf:PROC
EXTRN cosh:PROC
EXTRN cos:PROC
EXTRN exp:PROC
EXTRN log10:PROC
EXTRN log:PROC
EXTRN sqrt:PROC
EXTRN sinh:PROC
EXTRN sin:PROC
EXTRN tanh:PROC
EXTRN tan:PROC
; COMDAT ??_C@_0BL@KHOEBBJB@direct?5function?5call?5?$DN?5?$CFg?6?$AA@
CONST SEGMENT
??_C@_0BL@KHOEBBJB@direct?5function?5call?5?$DN?5?$CFg?6?$AA@ DB
'direct '
DB 'function call = %g', 0aH, 00H ; `string'
CONST ENDS
; COMDAT ??_C@_0BH@LIMEHMAH@function?5pointer?5?$DN?5?$CFg?6?$AA@
CONST SEGMENT
??_C@_0BH@LIMEHMAH@function?5pointer?5?$DN?5?$CFg?6?$AA@ DB 'function
poi'
DB 'nter = %g', 0aH, 00H ; `string'
CONST ENDS
; COMDAT ??_C@_0BM@NPLLPAPB@old?5style?5function?5?$CFd?5?$DN?5?$CFg?6?
$AA@
CONST SEGMENT
??_C@_0BM@NPLLPAPB@old?5style?5function?5?$CFd?5?$DN?5?$CFg?6?$AA@ DB
'ol'
DB 'd style function %d = %g', 0aH, 00H ; `string'
CONST ENDS
; COMDAT ??_C@_0BM@CKAFHGKH@new?5style?5function?5?$CFd?5?$DN?5?$CFg?6?
$AA@
CONST SEGMENT
??_C@_0BM@CKAFHGKH@new?5style?5function?5?$CFd?5?$DN?5?$CFg?6?$AA@ DB
'ne'
DB 'w style function %d = %g', 0aH, 00H ; `string'
f DQ FLAT:log
DQ FLAT:log10
DQ FLAT:sqrt
DQ FLAT:cos
DQ FLAT:cosh
DQ FLAT:exp
DQ FLAT:sin
DQ FLAT:sinh
DQ FLAT:tan
DQ FLAT:tanh
DQ 0000000000000000H
PUBLIC __real@3fe0000000000000
PUBLIC main
EXTRN _fltused:DWORD
; COMDAT pdata
; File c:\tmp\tt.c
pdata SEGMENT
$pdata$main DD imagerel $LN17
DD imagerel $LN17+571
DD imagerel $unwind$main
pdata ENDS
; COMDAT xdata
xdata SEGMENT
$unwind$main DD 0a2201H
DD 026822H
DD 0a6414H
DD 095414H
DD 083414H
DD 070105214H
xdata ENDS
; COMDAT __real@3fe0000000000000
CONST SEGMENT
__real@3fe0000000000000 DQ 03fe0000000000000r ; 0.5
; Function compile flags: /Ogtpy
CONST ENDS
; COMDAT main
_TEXT SEGMENT
main PROC ; COMDAT

; 7 : {

$LN17:
00000 48 89 5c 24 08 mov QWORD PTR [rsp+8], rbx
00005 48 89 6c 24 10 mov QWORD PTR [rsp+16], rbp
0000a 48 89 74 24 18 mov QWORD PTR [rsp+24], rsi
0000f 57 push rdi
00010 48 83 ec 30 sub rsp, 48 ; 00000030H

; 8 : int i;
; 9 : f_t *flist = f;
; 10 : for (i = 0; f[i]; i++) {

00014 48 8b 05 00 00
00 00 mov rax, QWORD PTR f
0001b 33 ff xor edi, edi
0001d 0f 29 74 24 20 movaps XMMWORD PTR [rsp+32], xmm6
00022 f2 0f 10 35 00
00 00 00 movsdx xmm6, QWORD PTR __real@3fe0000000000000
0002a 48 8d 1d 00 00
00 00 lea rbx, OFFSET FLAT:f
00031 8b f7 mov esi, edi
00033 48 8b ef mov rbp, rdi
00036 48 85 c0 test rax, rax
00039 74 39 je SHORT $LN6@main
0003b 0f 1f 44 00 00 npad 5
$LL8@main:

; 11 : printf("new style function %d = %g\n", i, f[i] (0.5));

00040 66 0f 28 c6 movapd xmm0, xmm6
00044 ff d0 call rax
00046 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BM@CKAFHGKH@new?5style?5function?
5?$CFd?5?$DN?5?$CFg?6?$AA@
0004d 8b d6 mov edx, esi
0004f 66 0f 28 d0 movapd xmm2, xmm0
00053 66 49 0f 7e d0 movd r8, xmm2
00058 ff 15 00 00 00
00 call QWORD PTR __imp_printf
0005e 48 8b 44 eb 08 mov rax, QWORD PTR [rbx+rbp*8+8]
00063 48 ff c5 inc rbp
00066 ff c6 inc esi
00068 48 85 c0 test rax, rax
0006b 75 d3 jne SHORT $LL8@main

; 8 : int i;
; 9 : f_t *flist = f;
; 10 : for (i = 0; f[i]; i++) {

0006d 48 8b 05 00 00
00 00 mov rax, QWORD PTR f
$LN6@main:

; 12 : }
; 13 : for (i = 0; f[i]; i++) {

00074 48 8b f7 mov rsi, rdi
00077 48 85 c0 test rax, rax
0007a 74 6d je SHORT $LN1@main
0007c 0f 1f 40 00 npad 4
$LL5@main:

; 14 : printf("old style function %d = %g\n", i, (*f[i]) (0.5));

00080 66 0f 28 c6 movapd xmm0, xmm6
00084 ff d0 call rax
00086 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BM@NPLLPAPB@old?5style?5function?
5?$CFd?5?$DN?5?$CFg?6?$AA@
0008d 8b d7 mov edx, edi
0008f 66 0f 28 d0 movapd xmm2, xmm0
00093 66 49 0f 7e d0 movd r8, xmm2
00098 ff 15 00 00 00
00 call QWORD PTR __imp_printf
0009e 48 8b 44 f3 08 mov rax, QWORD PTR [rbx+rsi*8+8]
000a3 48 ff c6 inc rsi
000a6 ff c7 inc edi
000a8 48 85 c0 test rax, rax
000ab 75 d3 jne SHORT $LL5@main

; 15 : }
; 16 : while (*flist) {

000ad 48 8b 05 00 00
00 00 mov rax, QWORD PTR f
000b4 48 85 c0 test rax, rax
000b7 74 30 je SHORT $LN1@main
000b9 0f 1f 80 00 00
00 00 npad 7
$LL2@main:

; 17 : f_t ff = *flist;
; 18 : printf("function pointer = %g\n", ff(0.5));

000c0 66 0f 28 c6 movapd xmm0, xmm6
000c4 ff d0 call rax
000c6 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BH@LIMEHMAH@function?5pointer?5?
$DN?5?$CFg?6?$AA@
000cd 66 0f 28 c8 movapd xmm1, xmm0
000d1 66 48 0f 7e ca movd rdx, xmm1
000d6 ff 15 00 00 00
00 call QWORD PTR __imp_printf
000dc 48 8b 43 08 mov rax, QWORD PTR [rbx+8]

; 19 : flist++;

000e0 48 83 c3 08 add rbx, 8
000e4 48 85 c0 test rax, rax
000e7 75 d7 jne SHORT $LL2@main
$LN1@main:

; 20 : }
; 21 : printf("direct function call = %g\n", log(0.5));

000e9 66 0f 28 c6 movapd xmm0, xmm6
000ed e8 00 00 00 00 call log
000f2 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
000f9 66 0f 28 c8 movapd xmm1, xmm0
000fd 66 48 0f 7e ca movd rdx, xmm1
00102 ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 22 : printf("direct function call = %g\n", log10(0.5));

00108 66 0f 28 c6 movapd xmm0, xmm6
0010c e8 00 00 00 00 call log10
00111 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
00118 66 0f 28 c8 movapd xmm1, xmm0
0011c 66 48 0f 7e ca movd rdx, xmm1
00121 ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 23 : printf("direct function call = %g\n", sqrt(0.5));

00127 66 0f 28 c6 movapd xmm0, xmm6
0012b e8 00 00 00 00 call sqrt
00130 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
00137 66 0f 28 c8 movapd xmm1, xmm0
0013b 66 48 0f 7e ca movd rdx, xmm1
00140 ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 24 : printf("direct function call = %g\n", cos(0.5));

00146 66 0f 28 c6 movapd xmm0, xmm6
0014a e8 00 00 00 00 call cos
0014f 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
00156 66 0f 28 c8 movapd xmm1, xmm0
0015a 66 48 0f 7e ca movd rdx, xmm1
0015f ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 25 : printf("direct function call = %g\n", cosh(0.5));

00165 66 0f 28 c6 movapd xmm0, xmm6
00169 e8 00 00 00 00 call cosh
0016e 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
00175 66 0f 28 c8 movapd xmm1, xmm0
00179 66 48 0f 7e ca movd rdx, xmm1
0017e ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 26 : printf("direct function call = %g\n", exp(0.5));

00184 66 0f 28 c6 movapd xmm0, xmm6
00188 e8 00 00 00 00 call exp
0018d 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
00194 66 0f 28 c8 movapd xmm1, xmm0
00198 66 48 0f 7e ca movd rdx, xmm1
0019d ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 27 : printf("direct function call = %g\n", sin(0.5));

001a3 66 0f 28 c6 movapd xmm0, xmm6
001a7 e8 00 00 00 00 call sin
001ac 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
001b3 66 0f 28 c8 movapd xmm1, xmm0
001b7 66 48 0f 7e ca movd rdx, xmm1
001bc ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 28 : printf("direct function call = %g\n", sinh(0.5));

001c2 66 0f 28 c6 movapd xmm0, xmm6
001c6 e8 00 00 00 00 call sinh
001cb 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
001d2 66 0f 28 c8 movapd xmm1, xmm0
001d6 66 48 0f 7e ca movd rdx, xmm1
001db ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 29 : printf("direct function call = %g\n", tan(0.5));

001e1 66 0f 28 c6 movapd xmm0, xmm6
001e5 e8 00 00 00 00 call tan
001ea 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
001f1 66 0f 28 c8 movapd xmm1, xmm0
001f5 66 48 0f 7e ca movd rdx, xmm1
001fa ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 30 : printf("direct function call = %g\n", tanh(0.5));

00200 66 0f 28 c6 movapd xmm0, xmm6
00204 e8 00 00 00 00 call tanh
00209 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:??_C@_0BL@KHOEBBJB@direct?5function?
5call?5?$DN?5?$CFg?6?$AA@
00210 66 0f 28 c8 movapd xmm1, xmm0
00214 66 48 0f 7e ca movd rdx, xmm1
00219 ff 15 00 00 00
00 call QWORD PTR __imp_printf

; 31 : return 0;
; 32 : }

0021f 48 8b 5c 24 40 mov rbx, QWORD PTR [rsp+64]
00224 48 8b 6c 24 48 mov rbp, QWORD PTR [rsp+72]
00229 0f 28 74 24 20 movaps xmm6, XMMWORD PTR [rsp+32]
0022e 48 8b 74 24 50 mov rsi, QWORD PTR [rsp+80]
00233 33 c0 xor eax, eax
00235 48 83 c4 30 add rsp, 48 ; 00000030H
00239 5f pop rdi
0023a c3 ret 0
main ENDP
_TEXT ENDS
END

Showing that the only real differences here are 'spelling' differences
(IOW 'call rax' versus 'call tan').

If we add in virtual functions, we can toss in the cost of a hash
table lookup.

All in all, I think that it is a big waste of time to worry about how
efficient the function calls will be unless you have a deeply nested
loop doing a bazillion function calls where the function execution
time is trivial.

If we choose to use virtual functions, that means it is important to
have replaceable function pointers on the fly. Of course, there is a
small cost to that (essentially a hash table lookup), but it is a cost
we request because of the needed functionality.

Herbert Rosenau · Oct 12, 2009

The three assembly languages I'm familiar with all perform function
calls by loading a special register with the memory address (a pointer
value) of the entry point to the function to be called. I'm curious -
how does the non-pointer way of implementing function calls work?

I know a lot of different assembly languages but none of them has only
a call indirectly.

jsr target, r7 # call subroutine at address given by target,
# use register 7 as stackpointer

jsr r4, r7 # call subroutine whose target address is in register 4,
# use register 7 as stackpointer

jsr (r6(r2)), r7 # call subroutine
# r6 is a base address and r2 an index, so
# r6 + r2 gives the target address and r7 is the
# stackpointer

Another machine:

jsr <subroutine> ; address of subroutine immediate

jsr @:x ; address to call is in accu (the only general purpose register)
; store return address in register x
; beside rx index register 1 and ry index register 2
;

--
Tschau/Bye
Herbert


Phil Carmody · Oct 12, 2009

Seebs said:
Wow, he sure pissed in your anonymous cheerios!

I'm sure the blind sycophantic blind support that I show for Richard
will come as an even greater surprise to Richard than it does to me!

So, are oranges heavier?

Phil

Phil Carmody · Oct 12, 2009

bartc said:
If that was the case then all function calls would be via pointers,
with the compiler pointerfying direct function calls.

Direct function calls through fixed pointers can indeed often be faster
than calls via function pointers.

But where did he mention the former? Message-id and line number please.

But oddly I haven't noticed that when I look at asm output.

Maybe you're too busy making ASSumptions.

Phil

robertwessel2 · Oct 12, 2009

I know a lot of different assembly languages but none of them has only
a call indirectly.

S/360, S/370, etc. until recently. There were two basic flavors of
subroutine call: one was purely indirect, the other was
register-plus-offset. The typical conventions assigned one or more GP
registers as base pointers, each covering 4KB of memory. You could
then directly access (and call or branch to) addresses in one of those
4KB areas by using the reg+offset form of the instructions.

There was no relative or absolute branch or call (well, the reg+offset
form could be used to call any location in the first 4096 bytes of
memory, given the register-0-reads-as-zero implemented on S/360 for
most address computations). Some relative branches and calls did get
added more recently.

James Kuyper · Oct 12, 2009

Herbert said:
I know a lot of different assembly languages but none of them has only
a call indirectly.

My description was intended to cover both direct and indirect calls. The
only difference between them is the way in which the pointer value ends
up in the register.
