When to use std::pow(x, n) instead of multiplying x by itself n times?

Peng Yu

Hi,

I'm wondering if there is any general guideline on when to use
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

Thanks,
Peng
 
jellybean stonerfish

Peng Yu said:
I'm wondering if there is any general guideline on when to use
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

It can be hard to represent fractional powers in x*x notation.

sf
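
To illustrate the point about fractional exponents: a power such as x^2.5 has no plain x * x * ... * x form, although a particular case can sometimes be split into an integer power and a root. A minimal sketch, with function names made up for this example:

```cpp
#include <cmath>

// Hypothetical helper: x^2.5 through the general-purpose library routine.
double pow_2_5_library(double x) {
    return std::pow(x, 2.5);
}

// Hypothetical helper: the same value written as x * x * sqrt(x).
// This only works because 2.5 = 2 + 1/2, and it assumes x >= 0.
double pow_2_5_manual(double x) {
    return x * x * std::sqrt(x);
}
```

For an arbitrary fractional exponent there is no such shortcut, so std::pow is the natural choice there.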
 
Juha Nieminen

Peng said:
I'm wondering if there is any general guideline on when to use
something like
std::pow(x, n)
rather than
x * x * x * ... * x (n x's).

If you need to calculate that function millions of times per second,
then using the latter form can be considerably faster up to a certain n
(after which std::pow() becomes faster). The maximum n for which the
latter form is faster than std::pow() can be surprisingly large,
depending on the code (e.g. something like n=8 might not be far-fetched).

Of course this is heavily system-dependent so there's no rule.
 
gpderetta

Juha Nieminen said:
If you need to calculate that function millions of times per second,
then using the latter form can be considerably faster up to a certain n
(after which std::pow() becomes faster).

Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).
 
Michael DOUBEZ

Juha Nieminen said:
If you need to calculate that function millions of times per second,
then using the latter form can be considerably faster up to a certain n
(after which std::pow() becomes faster). The maximum n for which the
latter form is faster than std::pow() can be surprisingly large,
depending on the code (e.g. something like n=8 might not be far-fetched).

Of course this is heavily system-dependent so there's no rule.

In fact, some algorithms compute natural-number powers faster than
others (for example exponentiation by squaring, which is derived from
Russian peasant multiplication).

I expect C++ libraries have a specialization for a natural number as the
second argument that gives better results in general.

If n is known at compile time, it is not hard to implement a template
with the Russian peasant algorithm (the depth of instantiation recursion
is bounded by the number of bits in n, i.e. about sizeof(long) * CHAR_BIT).
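
A minimal sketch of such a template, assuming a double base and an unsigned compile-time exponent. The name Power is made up for this example and is not taken from any library:

```cpp
// Power<N>::apply(x) computes x^N for a compile-time exponent N using
// exponentiation by squaring. "Power" is a hypothetical name.
template <unsigned long N>
struct Power {
    static double apply(double x) {
        // x^N = (x^(N/2))^2, times one extra x when N is odd.
        const double half = Power<N / 2>::apply(x);
        return (N % 2 == 0) ? half * half : half * half * x;
    }
};

// Base case: x^0 == 1 ends the recursion.
template <>
struct Power<0> {
    static double apply(double) { return 1.0; }
};
```

Power<16>::apply(x) instantiates Power<16>, Power<8>, Power<4>, Power<2>, Power<1> and Power<0>, so the instantiation depth grows with the number of bits in N rather than with N itself. Whether this beats std::pow on a given system still has to be measured.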
 
Juha Nieminen

gpderetta said:
Why? There is no reason for the compiler not to transform pow(x,
<integral-constant>) to the latter form if it were actually faster
(and in fact some compilers do).

Some compilers might be able to do that optimization, others can't.
And if n is a variable, then the compiler cannot optimize it. (At most
the pow() function itself might have optimizations in it, but in my
experience it doesn't: with most compilers it just generates the FPU
opcodes necessary to calculate the result.)

But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.
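
A rough sketch of how such a test could look. All function names here are made up for this example, and the numbers it produces depend entirely on the compiler, flags and machine:

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>

// Hypothetical helper: x^n by n - 1 successive multiplications (needs n >= 1).
double pow_by_loop(double x, unsigned n) {
    double result = x;
    for (unsigned i = 1; i < n; ++i) result *= x;
    return result;
}

// Wrapper so that std::pow has the same signature as pow_by_loop.
double pow_by_library(double x, unsigned n) {
    return std::pow(x, static_cast<double>(n));
}

// Hypothetical helper: run f(x, n) `iterations` times and return the elapsed
// seconds. The running sum is stored in `sink` so that the calls cannot be
// discarded as dead code.
template <typename F>
double time_calls(F f, unsigned n, std::size_t iterations, double& sink) {
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (std::size_t i = 0; i < iterations; ++i) {
        const double x = 1.0 + 1e-9 * static_cast<double>(i);  // vary the input
        sum += f(x, n);
    }
    const auto stop = std::chrono::steady_clock::now();
    sink = sum;
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_calls(pow_by_loop, 8, 10000000, sink) and time_calls(pow_by_library, 8, 10000000, sink) for a range of n, with the optimization level used in production, should show where the crossover lies on a particular system.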
 
gpderetta

Juha Nieminen said:
Some compilers might be able to do that optimization, others can't.
And if n is a variable, then the compiler cannot optimize it.

And if it is a variable, you can't write an explicit expression either.
You could use a for loop.

Juha Nieminen said:
(At most the pow()
function itself might have optimizations in it, but in my experience it
doesn't: with most compilers it just generates the FPU opcodes necessary
to calculate the result.)

Today's hand optimizations are tomorrow's pessimizations. Let the compiler
do its job.

The usual rule applies: use pow, and only if the profiler tells you it is a
bottleneck, try to optimize it by hand.

Juha Nieminen said:
But we don't have to speculate about this as it's trivially easy to
test in practice. Go ahead and try it.

I had already tried. 'pow(x, 16)' is inlined as exactly four
multiplies, at least with a recent gcc.
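
For reference, the four multiplies come from repeated squaring, and the same idea works in a loop when n is only known at run time. A sketch with made-up function names, not the code gcc actually emits:

```cpp
// x^16 with four multiplications: each step squares the previous result.
double pow16(double x) {
    const double x2 = x * x;    // x^2
    const double x4 = x2 * x2;  // x^4
    const double x8 = x4 * x4;  // x^8
    return x8 * x8;             // x^16
}

// Run-time exponent: walk over the bits of n, squaring x at every step and
// multiplying it into the result whenever the current bit is set.
double pow_by_squaring(double x, unsigned n) {
    double result = 1.0;
    while (n != 0) {
        if (n & 1u) result *= x;
        x *= x;
        n >>= 1;
    }
    return result;
}
```

The loop needs a number of multiplications proportional to the number of bits in n, compared with n - 1 for the naive x * x * ... * x form.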
 
Juha Nieminen

gpderetta said:
and if it is variable, you can't write an explicit expression either.
You could use a for loop

In my experience even performing a set of multiplications in a loop
while interpreting bytecode can be faster than a single std::pow() call,
up to a certain exponent.

I have made a function parser/interpreter, and in practice, e.g.,
interpreting the function "x*x*x*x" (which it bytecompiles to three
multiplications) is faster than "x^4" (which it bytecompiles to one
std::pow() call). std::pow() can be incredibly slow.
 
Peng Yu

gpderetta said:
And if it is a variable, you can't write an explicit expression either.
You could use a for loop.

Today's hand optimizations are tomorrow's pessimizations. Let the compiler
do its job.

The usual rule applies: use pow, and only if the profiler tells you it is a
bottleneck, try to optimize it by hand.

I had already tried. 'pow(x, 16)' is inlined as exactly four
multiplies, at least with a recent gcc.

Would you please tell me the procedure you used to figure this out?
Sometimes I want to know what the compiler compiles the code to.

Thanks,
Peng
 
Peng Yu

* Peng Yu:


e.g.

g++ -S -masm=intel x.cpp

It is pretty hard to figure out which part of the assembly code is
associated with a given portion of the source code. For example, take the
following C++ and assembly code. How do I figure out where the pow
calls are in the code?

Thanks,
Peng

$ cat main.cc main.s
#include <cmath>
#include <iostream>

int main() {
    double x = 1;
    std::cout << "pox(x, 1) = " << std::pow(x, 1) << std::endl;
    std::cout << "pox(x, 2) = " << std::pow(x, 2) << std::endl;
    std::cout << "pox(x, 3) = " << std::pow(x, 3) << std::endl;
    std::cout << "pox(x, 4) = " << std::pow(x, 4) << std::endl;
    std::cout << "pox(x, 5) = " << std::pow(x, 5) << std::endl;
    std::cout << "pox(x, 6) = " << std::pow(x, 6) << std::endl;
    std::cout << "pox(x, 7) = " << std::pow(x, 7) << std::endl;
    std::cout << "pox(x, 8) = " << std::pow(x, 8) << std::endl;
    std::cout << "pox(x, 9) = " << std::pow(x, 9) << std::endl;
    std::cout << "pox(x, 10) = " << std::pow(x, 10) << std::endl;
    std::cout << "pox(x, 11) = " << std::pow(x, 11) << std::endl;
    std::cout << "pox(x, 12) = " << std::pow(x, 12) << std::endl;
    std::cout << "pox(x, 13) = " << std::pow(x, 13) << std::endl;
    std::cout << "pox(x, 14) = " << std::pow(x, 14) << std::endl;
    std::cout << "pox(x, 15) = " << std::pow(x, 15) << std::endl;
    std::cout << "pox(x, 16) = " << std::pow(x, 16) << std::endl;
}
    .file "main.cc"
    .intel_syntax
    .section .ctors,"aw",@progbits
    .align 8
    .quad _GLOBAL__I_main
    .text
    .align 2
    .type _Z41__static_initialization_and_destruction_0ii, @function
_Z41__static_initialization_and_destruction_0ii:
.LFB1504:
    push %rbp
.LCFI0:
    mov %rbp, %rsp
.LCFI1:
    sub %rsp, 16
.LCFI2:
    mov DWORD PTR [%rbp-4], %edi
    mov DWORD PTR [%rbp-8], %esi
    cmp DWORD PTR [%rbp-4], 1
    jne .L5
    cmp DWORD PTR [%rbp-8], 65535
    jne .L5
    mov %edi, OFFSET FLAT:_ZSt8__ioinit
    call _ZNSt8ios_base4InitC1Ev
    mov %edx, OFFSET FLAT:__dso_handle
    mov %esi, 0
    mov %edi, OFFSET FLAT:__tcf_0
    call __cxa_atexit
.L5:
    leave
    ret
.LFE1504:
    .size _Z41__static_initialization_and_destruction_0ii, .-_Z41__static_initialization_and_destruction_0ii
.globl __gxx_personality_v0
    .align 2
    .type _GLOBAL__I_main, @function
_GLOBAL__I_main:
.LFB1506:
    push %rbp
.LCFI3:
    mov %rbp, %rsp
.LCFI4:
    mov %esi, 65535
    mov %edi, 1
    call _Z41__static_initialization_and_destruction_0ii
    leave
    ret
.LFE1506:
    .size _GLOBAL__I_main, .-_GLOBAL__I_main
    .align 2
    .type __tcf_0, @function
__tcf_0:
.LFB1505:
    push %rbp
.LCFI5:
    mov %rbp, %rsp
.LCFI6:
    sub %rsp, 16
.LCFI7:
    mov QWORD PTR [%rbp-8], %rdi
    mov %edi, OFFSET FLAT:_ZSt8__ioinit
    call _ZNSt8ios_base4InitD1Ev
    leave
    ret
.LFE1505:
    .size __tcf_0, .-__tcf_0
.globl __powidf2
    .section .text._ZSt3powdi,"axG",@progbits,_ZSt3powdi,comdat
    .align 2
    .weak _ZSt3powdi
    .type _ZSt3powdi, @function
_ZSt3powdi:
.LFB54:
    push %rbp
.LCFI8:
    mov %rbp, %rsp
.LCFI9:
    sub %rsp, 32
.LCFI10:
    movsd QWORD PTR [%rbp-8], %xmm0
    mov DWORD PTR [%rbp-12], %edi
    mov %edi, DWORD PTR [%rbp-12]
    movlpd %xmm0, QWORD PTR [%rbp-8]
    call __powidf2
    movsd QWORD PTR [%rbp-24], %xmm0
    mov %rax, QWORD PTR [%rbp-24]
    mov QWORD PTR [%rbp-24], %rax
    movlpd %xmm0, QWORD PTR [%rbp-24]
    leave
    ret
.LFE54:
    .size _ZSt3powdi, .-_ZSt3powdi
    .section .rodata
.LC1:
    .string "pox(x, 1) = "
.LC2:
    .string "pox(x, 2) = "
.LC3:
    .string "pox(x, 3) = "
.LC4:
    .string "pox(x, 4) = "
.LC5:
    .string "pox(x, 5) = "
.LC6:
    .string "pox(x, 6) = "
.LC7:
    .string "pox(x, 7) = "
.LC8:
    .string "pox(x, 8) = "
.LC9:
    .string "pox(x, 9) = "
.LC10:
    .string "pox(x, 10) = "
.LC11:
    .string "pox(x, 11) = "
.LC12:
    .string "pox(x, 12) = "
.LC13:
    .string "pox(x, 13) = "
.LC14:
    .string "pox(x, 14) = "
.LC15:
    .string "pox(x, 15) = "
.LC16:
    .string "pox(x, 16) = "
    .text
    .align 2
.globl main
    .type main, @function
main:
.LFB1496:
    push %rbp
.LCFI11:
    mov %rbp, %rsp
.LCFI12:
    push %rbx
.LCFI13:
    sub %rsp, 24
.LCFI14:
    movabs %rax, 4607182418800017408
    mov QWORD PTR [%rbp-16], %rax
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 1
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC1
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 2
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC2
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 3
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC3
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 4
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC4
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 5
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC5
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 6
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC6
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 7
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC7
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 8
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC8
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 9
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC9
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 10
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC10
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 11
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC11
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 12
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC12
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 13
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC13
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 14
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC14
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 15
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC15
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 16
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC16
    mov %edi, OFFSET FLAT:_ZSt4cout
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %eax, 0
    add %rsp, 24
    pop %rbx
    leave
    ret
.LFE1496:
    .size main, .-main
    .local _ZSt8__ioinit
    .comm _ZSt8__ioinit,1,1
    .weakref _Z20__gthrw_pthread_oncePiPFvvE,pthread_once
    .weakref _Z27__gthrw_pthread_getspecificj,pthread_getspecific
    .weakref _Z27__gthrw_pthread_setspecificjPKv,pthread_setspecific
    .weakref _Z22__gthrw_pthread_createPmPK14pthread_attr_tPFPvS3_ES3_,pthread_create
    .weakref _Z22__gthrw_pthread_cancelm,pthread_cancel
    .weakref _Z26__gthrw_pthread_mutex_lockP15pthread_mutex_t,pthread_mutex_lock
    .weakref _Z29__gthrw_pthread_mutex_trylockP15pthread_mutex_t,pthread_mutex_trylock
    .weakref _Z28__gthrw_pthread_mutex_unlockP15pthread_mutex_t,pthread_mutex_unlock
    .weakref _Z26__gthrw_pthread_mutex_initP15pthread_mutex_tPK19pthread_mutexattr_t,pthread_mutex_init
    .weakref _Z26__gthrw_pthread_key_createPjPFvPvE,pthread_key_create
    .weakref _Z26__gthrw_pthread_key_deletej,pthread_key_delete
    .weakref _Z30__gthrw_pthread_mutexattr_initP19pthread_mutexattr_t,pthread_mutexattr_init
    .weakref _Z33__gthrw_pthread_mutexattr_settypeP19pthread_mutexattr_ti,pthread_mutexattr_settype
    .weakref _Z33__gthrw_pthread_mutexattr_destroyP19pthread_mutexattr_t,pthread_mutexattr_destroy
    .section .eh_frame,"a",@progbits
.Lframe1:
    .long .LECIE1-.LSCIE1
.LSCIE1:
    .long 0x0
    .byte 0x1
    .string "zPR"
    .uleb128 0x1
    .sleb128 -8
    .byte 0x10
    .uleb128 0x6
    .byte 0x3
    .long __gxx_personality_v0
    .byte 0x3
    .byte 0xc
    .uleb128 0x7
    .uleb128 0x8
    .byte 0x90
    .uleb128 0x1
    .align 8
.LECIE1:
.LSFDE1:
    .long .LEFDE1-.LASFDE1
.LASFDE1:
    .long .LASFDE1-.Lframe1
    .long .LFB1504
    .long .LFE1504-.LFB1504
    .uleb128 0x0
    .byte 0x4
    .long .LCFI0-.LFB1504
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI1-.LCFI0
    .byte 0xd
    .uleb128 0x6
    .align 8
.LEFDE1:
.LSFDE3:
    .long .LEFDE3-.LASFDE3
.LASFDE3:
    .long .LASFDE3-.Lframe1
    .long .LFB1506
    .long .LFE1506-.LFB1506
    .uleb128 0x0
    .byte 0x4
    .long .LCFI3-.LFB1506
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI4-.LCFI3
    .byte 0xd
    .uleb128 0x6
    .align 8
.LEFDE3:
.LSFDE5:
    .long .LEFDE5-.LASFDE5
.LASFDE5:
    .long .LASFDE5-.Lframe1
    .long .LFB1505
    .long .LFE1505-.LFB1505
    .uleb128 0x0
    .byte 0x4
    .long .LCFI5-.LFB1505
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI6-.LCFI5
    .byte 0xd
    .uleb128 0x6
    .align 8
.LEFDE5:
.LSFDE7:
    .long .LEFDE7-.LASFDE7
.LASFDE7:
    .long .LASFDE7-.Lframe1
    .long .LFB54
    .long .LFE54-.LFB54
    .uleb128 0x0
    .byte 0x4
    .long .LCFI8-.LFB54
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI9-.LCFI8
    .byte 0xd
    .uleb128 0x6
    .align 8
.LEFDE7:
.LSFDE9:
    .long .LEFDE9-.LASFDE9
.LASFDE9:
    .long .LASFDE9-.Lframe1
    .long .LFB1496
    .long .LFE1496-.LFB1496
    .uleb128 0x0
    .byte 0x4
    .long .LCFI11-.LFB1496
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI12-.LCFI11
    .byte 0xd
    .uleb128 0x6
    .byte 0x4
    .long .LCFI14-.LCFI12
    .byte 0x83
    .uleb128 0x3
    .align 8
.LEFDE9:
    .ident "GCC: (GNU) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)"
    .section .note.GNU-stack,"",@progbits
 
Kai-Uwe Bux

Peng said:
It is pretty hard to figure out what part of assembly code is
associated with a given portion of source code.

Nobody said it would be easy.

Peng said:
For example, the
following C++ and assembly code. How do I figure out where the pow
functions are at in the code?
[snip]

Did you try modifying one line of code and seeing which portion of the
assembly changed?


Best

Kai-Uwe Bux
 
Peng Yu

Kai-Uwe Bux said:
Did you try modifying one line of code and seeing which portion of the
assembly changed?

g++ -O3 -S -masm=intel main.cc

I compiled the code and a variant of it with the command above and
compared the results with diff. But I still have difficulty understanding
what the assembly does. Is there a way to annotate the assembly code with
the corresponding C++ code?

Thanks,
Peng
 
Erik Wikström

Peng Yu said:
Would you please tell me the procedure you used to figure this out?
Sometimes I want to know what the compiler compiles the code to.

In Visual Studio you can run the program in the debugger and then bring
up the assembly code; you can step through it and switch back and forth
between the source code and the assembly code. I would imagine you
can do similar things in other IDEs and in gdb.
 
Ian Collins

Peng said:
g++ -O3 -S -masm=intel main.cc

I tried to compile the code and the variant of it. And I got the
difference by diff. But I still have difficulty to understand what it
does. Is there a way to annotate the C++ code in the assembly code?

Some compilers (Sun CC for example) do so by default. If you are
working on Solaris or Linux, give it a try.
 
Peng Yu

Erik Wikström said:
In Visual Studio you can run the program in the debugger and then bring
up the assembly code; you can step through it and switch back and forth
between the source code and the assembly code. I would imagine you
can do similar things in other IDEs and in gdb.

Will there be a problem if I use the -O3 option? The source code and the
assembly code might not have a one-to-one relationship.

Thanks,
Peng
 
Erik Wikström

Peng Yu said:
Will there be a problem if I use the -O3 option? The source code and the
assembly code might not have a one-to-one relationship.

There might be a problem if the compiler decides to remove some code
completely, otherwise no.
 
Sherm Pendley

Peng Yu said:
g++ -O3 -S -masm=intel main.cc

I tried to compile the code and the variant of it. And I got the
difference by diff. But I still have difficulty to understand what it
does. Is there a way to annotate the C++ code in the assembly code?

Yes, and I'd also disable optimization, which can make the generated
asm code more difficult to follow.

g++ -S -masm=intel -fverbose-asm main.cc

sherm--
 
Peng Yu

Sherm Pendley said:
Yes, and I'd also disable optimization, which can make the generated
asm code more difficult to follow.

g++ -S -masm=intel -fverbose-asm main.cc

But I have to enable optimization, because I want to know how the
compiler optimizes std::pow(x, n).

Thanks,
Peng
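
One way to keep optimization on and still find the relevant assembly is to move the call under study into a tiny function of its own, so that its code appears under a single, searchable label instead of being buried in main. A sketch, assuming a file called pow16.cc and a made-up function name; the compiler flags are the ones already mentioned in this thread:

```cpp
// pow16.cc (hypothetical file name)
//
// Compile to annotated, optimized assembly with, e.g.:
//   g++ -O3 -S -masm=intel -fverbose-asm pow16.cc
// then search pow16.s for the label "pow16_via_std".
#include <cmath>

// extern "C" keeps the label unmangled so it is easy to search for.
// Taking x as a parameter keeps the compiler from folding the whole
// call into a constant, which it may do when x is a known value
// such as "double x = 1;".
extern "C" double pow16_via_std(double x) {
    return std::pow(x, 16);
}
```

Whether the body of pow16_via_std shows up as a library call or as a short chain of multiplications depends on the compiler version, the language standard and the flags in use, so the output has to be checked for each setup.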
 
