When to use std::pow(x, n) instead of multiplying x by itself n times?

Discussion in 'C++' started by Peng Yu, Sep 10, 2008.

  1. Peng Yu

    Peng Yu Guest

    Hi,

    I'm wondering if there is any general guideline on when to use
    something like
    std::pow(x, n)
    rather than
    x * x * x * ... * x (n x's).

    Thanks,
    Peng
    Peng Yu, Sep 10, 2008
    #1
  2. On Tue, 09 Sep 2008 17:30:04 -0700, Peng Yu wrote:

    > Hi,
    >
    > I'm wondering if there is any general guideline on when to using
    > something like
    > std::pow(x, n)
    > rather than
    > x * x * x * ... * x (n x's).
    >
    > Thanks,
    > Peng


    It can be hard to represent fractional powers in x*x notation.
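
    To make the point concrete: a non-integer exponent has no finite
    x * x * ... * x expansion, so a library call is the only practical
    option. A minimal sketch; the helper name pow_three_halves is made up
    for illustration and is not from the thread:

```cpp
#include <cmath>

// Hypothetical helper: x raised to the power 3/2.
// This cannot be written as a product of x's alone; it needs std::pow
// (for this particular exponent, x * std::sqrt(x) would also work).
double pow_three_halves(double x) {
    return std::pow(x, 1.5);
}
```

    For example, pow_three_halves(4.0) is 4 * sqrt(4), i.e. 8.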

    sf
    jellybean stonerfish, Sep 10, 2008
    #2
  3. Peng Yu wrote:
    > I'm wondering if there is any general guideline on when to using
    > something like
    > std::pow(x, n)
    > rather than
    > x * x * x * ... * x (n x's).


    If you need to calculate that function millions of times per second,
    then using the latter form can be considerably faster up to a certain n
    (after which std::pow() becomes faster). The maximum n for which the
    latter form is faster than std::pow() can be surprisingly large,
    depending on the code (e.g. something like n=8 might not be far-fetched).

    Of course this is heavily system-dependent so there's no rule.
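
    Since the answer is system-dependent, measuring is the way to settle
    it. Below is a rough timing sketch, not a rigorous benchmark. The names
    mul8, pow8 and seconds_for are invented here, and std::chrono requires
    C++11, which postdates this thread:

```cpp
#include <chrono>
#include <cmath>

// x^8 spelled out as repeated multiplication.
inline double mul8(double x) { return x * x * x * x * x * x * x * x; }

// The same power through the library call.
inline double pow8(double x) { return std::pow(x, 8.0); }

// Run f `iterations` times on a slowly varying input and return the
// elapsed time in seconds. The sum is copied into a volatile afterwards
// so the compiler has to keep the computed values.
template <typename F>
double seconds_for(F f, int iterations) {
    double x = 1.0;
    double sum = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sum += f(x);
        x += 1e-9;  // change the argument so the call is not hoisted
    }
    const auto stop = std::chrono::steady_clock::now();
    volatile double sink = sum;  // keep the result observable
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

    Comparing seconds_for(mul8, 10000000) with seconds_for(pow8, 10000000),
    and repeating with other exponents and optimization levels, gives a
    first impression of where the crossover lies on a given system.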
    Juha Nieminen, Sep 10, 2008
    #3
  4. gpderetta

    gpderetta Guest

    On Sep 10, 4:24 pm, Juha Nieminen <> wrote:
    > Peng Yu wrote:
    > > I'm wondering if there is any general guideline on when to using
    > > something like
    > > std::pow(x, n)
    > > rather than
    > > x * x * x * ... * x (n x's).

    >
    >   If you need to calculate that function millions of times per second,
    > then using the latter form can be considerably faster up to a certain n
    > (after which std::pow() becomes faster).


    Why? There is no reason for the compiler not to transform pow(x,
    <integral-constant>) to the latter form if it were actually faster
    (and in fact some compilers do).

    --
    gpd
    gpderetta, Sep 10, 2008
    #4
  5. Juha Nieminen a écrit :
    > Peng Yu wrote:
    >> I'm wondering if there is any general guideline on when to using
    >> something like
    >> std::pow(x, n)
    >> rather than
    >> x * x * x * ... * x (n x's).

    >
    > If you need to calculate that function millions of times per second,
    > then using the latter form can be considerably faster up to a certain n
    > (after which std::pow() becomes faster). The maximum n for which the
    > latter form is faster than std::pow() can be surprisingly large,
    > depending on the code (eg. something like n=8 might not be far-fetched).
    >
    > Of course this is heavily system-dependent so there's no rule.


    In fact, some algorithms compute natural-number powers faster than
    others (for example, those derived from Russian peasant
    multiplication).

    I expect C++ libraries have a specialization for a natural number as
    the second argument that gives better results in general.

    If n is known at compile time, it is not hard to implement a template
    with the Russian peasant algorithm (the depth of instantiation recursion
    is bounded by the number of bits in n, i.e. sizeof(long) * CHAR_BIT for
    a long).
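
    A minimal sketch of such a template, assuming a non-negative exponent
    and a double base. The name Power is made up for illustration and is
    not taken from any library:

```cpp
// x^N for a compile-time exponent N, by repeated squaring.
template <unsigned N>
struct Power {
    static double of(double x) {
        // x^N = (x^(N/2))^2, times one extra x when N is odd.
        const double half = Power<N / 2>::of(x);
        return (N % 2 == 0) ? half * half : half * half * x;
    }
};

// Base case: x^0 == 1 ends the recursion.
template <>
struct Power<0> {
    static double of(double) { return 1.0; }
};
```

    Power<10>::of(2.0) instantiates Power<10>, Power<5>, Power<2>, Power<1>
    and Power<0>, so the instantiation depth grows with the number of bits
    in N rather than with N itself.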

    --
    Michael
    Michael DOUBEZ, Sep 11, 2008
    #5
  6. gpderetta wrote:
    > Why? There is no reason for the compiler not to transform pow(x,
    > <integral-constant>) to the latter form if it were actually faster
    > (and in fact some compilers do).


    Some compilers might be able to perform that optimization; others
    can't. And if n is a variable, the compiler cannot optimize it. (At most
    the pow() function itself might have optimizations in it, but in my
    experience it doesn't: with most compilers it just generates the FPU
    opcodes necessary to calculate the result.)

    But we don't have to speculate about this as it's trivially easy to
    test in practice. Go ahead and try it.
    Juha Nieminen, Sep 11, 2008
    #6
  7. gpderetta

    gpderetta Guest

    On Sep 11, 6:39 pm, Juha Nieminen <> wrote:
    > gpderetta wrote:
    > > Why? There is no reason for the compiler not to transform pow(x,
    > > <integral-constant>) to the latter form if it were actually faster
    > > (and in fact some compilers do).

    >
    > Some compilers might be able to do that optimizations, others aren't.
    > And if n is a variable, then it cannot optimize it.


    And if it is a variable, you can't write an explicit expression either.
    You could use a for loop.
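
    One way to write such a loop, sketched for a non-negative run-time
    exponent. Both function names are invented for this example; the second
    variant uses exponentiation by squaring, which nobody in the thread
    spelled out for the run-time case:

```cpp
// Plain loop: n multiplications, starting from 1.
double pow_loop(double x, unsigned n) {
    double result = 1.0;
    for (unsigned i = 0; i < n; ++i) {
        result *= x;
    }
    return result;
}

// Exponentiation by squaring: one iteration per bit of n.
double pow_squaring(double x, unsigned n) {
    double result = 1.0;
    while (n != 0) {
        if (n & 1u) {
            result *= x;  // lowest bit of n set: fold in the current power
        }
        x *= x;           // x, x^2, x^4, x^8, ...
        n >>= 1;
    }
    return result;
}
```

    Neither version handles negative exponents, and the two can differ in
    the last bits of the result because they round in a different order.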

    > (At most the pow()
    > function itself might have optimizations in it, but in my experience it
    > doesn't: With most compilers it just generates the FPU opcodes necessary
    > to calculate the result.)


    Today's hand optimizations are tomorrow's pessimizations. Let the compiler
    do its job.

    The usual rule applies: use pow, and only if the profiler tells you it is
    a bottleneck, try to optimize it by hand.

    >
    > But we don't have to speculate about this as it's trivially easy to
    > test in practice. Go ahead and try it.


    I had already tried. 'pow(x, 16)' is inlined exactly as four
    multiplies, at least with a recent gcc.
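
    The four multiplies are presumably repeated squaring. Written out by
    hand it would look roughly like this (pow16 is a name made up for the
    sketch, not what gcc emits):

```cpp
// x^16 in four multiplications: square the running value four times.
double pow16(double x) {
    const double x2 = x * x;    // x^2
    const double x4 = x2 * x2;  // x^4
    const double x8 = x4 * x4;  // x^8
    return x8 * x8;             // x^16
}
```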

    --
    gpd
    gpderetta, Sep 11, 2008
    #7
  8. gpderetta wrote:
    >> Some compilers might be able to do that optimizations, others aren't.
    >> And if n is a variable, then it cannot optimize it.

    >
    > and if it is variable, you can't write an explicit expression either.
    > You could use a for loop


    In my experience even performing a set of multiplications in a loop
    while interpreting bytecode can be faster than a single std::pow() call,
    up to a certain exponent.

    I have written a function parser/interpreter, and in practice
    interpreting, e.g., the function "x*x*x*x" (which it byte-compiles to
    three multiplications) is faster than "x^4" (which it byte-compiles to
    one std::pow() call). std::pow() can be incredibly slow.
    Juha Nieminen, Sep 12, 2008
    #8
  9. Peng Yu

    Peng Yu Guest

    On Sep 11, 4:56 pm, gpderetta <> wrote:
    > On Sep 11, 6:39 pm, Juha Nieminen <> wrote:
    >
    > > gpderetta wrote:
    > > > Why? There is no reason for the compiler not to transform pow(x,
    > > > <integral-constant>) to the latter form if it were actually faster
    > > > (and in fact some compilers do).

    >
    > > Some compilers might be able to do that optimizations, others aren't.
    > > And if n is a variable, then it cannot optimize it.

    >
    > and if it is variable, you can't write an explicit expression either.
    > You could use a for loop
    >
    > > (At most the pow()
    > > function itself might have optimizations in it, but in my experience it
    > > doesn't: With most compilers it just generates the FPU opcodes necessary
    > > to calculate the result.)

    >
    > Today hand optimizations are tomorrow pessimizations. Let the compiler
    > do its job.
    >
    > The usual rule apply: use pow, and only if the profiler tells it is a
    > bottleneck, try to optimize it by hand.
    >
    >
    >
    > > But we don't have to speculate about this as it's trivially easy to
    > > test in practice. Go ahead and try it.

    >
    > I had already tried. 'pow(x, 16)' is inlined exactly as four
    > multiplies, at least with a recent gcc.


    Would you please tell me the procedure you used to figure this out?
    Sometimes I want to know what the compiler compiles the code to.

    Thanks,
    Peng
    Peng Yu, Sep 13, 2008
    #9
  10. Peng Yu

    Peng Yu Guest

    On Sep 13, 4:09 pm, "Alf P. Steinbach" <> wrote:
    > * Peng Yu:
    >
    > > Sometimes I want to know what the compiler compile
    > > the code to.

    >
    > e.g.
    >
    > g++ -S -masm=intel x.cpp


    It is pretty hard to figure out which part of the assembly code is
    associated with a given portion of the source code. Take, for example,
    the following C++ and assembly code. How do I figure out where the pow
    calls are in the assembly?

    Thanks,
    Peng

    $ cat main.cc main.s
    #include <cmath>
    #include <iostream>

    int main() {
        double x = 1;
        std::cout << "pox(x, 1) = " << std::pow(x, 1) << std::endl;
        std::cout << "pox(x, 2) = " << std::pow(x, 2) << std::endl;
        std::cout << "pox(x, 3) = " << std::pow(x, 3) << std::endl;
        std::cout << "pox(x, 4) = " << std::pow(x, 4) << std::endl;
        std::cout << "pox(x, 5) = " << std::pow(x, 5) << std::endl;
        std::cout << "pox(x, 6) = " << std::pow(x, 6) << std::endl;
        std::cout << "pox(x, 7) = " << std::pow(x, 7) << std::endl;
        std::cout << "pox(x, 8) = " << std::pow(x, 8) << std::endl;
        std::cout << "pox(x, 9) = " << std::pow(x, 9) << std::endl;
        std::cout << "pox(x, 10) = " << std::pow(x, 10) << std::endl;
        std::cout << "pox(x, 11) = " << std::pow(x, 11) << std::endl;
        std::cout << "pox(x, 12) = " << std::pow(x, 12) << std::endl;
        std::cout << "pox(x, 13) = " << std::pow(x, 13) << std::endl;
        std::cout << "pox(x, 14) = " << std::pow(x, 14) << std::endl;
        std::cout << "pox(x, 15) = " << std::pow(x, 15) << std::endl;
        std::cout << "pox(x, 16) = " << std::pow(x, 16) << std::endl;
    }
    .file "main.cc"
    .intel_syntax
    .section .ctors,"aw",@progbits
    .align 8
    .quad _GLOBAL__I_main
    .text
    .align 2
    .type _Z41__static_initialization_and_destruction_0ii,
    @function
    _Z41__static_initialization_and_destruction_0ii:
    ..LFB1504:
    push %rbp
    ..LCFI0:
    mov %rbp, %rsp
    ..LCFI1:
    sub %rsp, 16
    ..LCFI2:
    mov DWORD PTR [%rbp-4], %edi
    mov DWORD PTR [%rbp-8], %esi
    cmp DWORD PTR [%rbp-4], 1
    jne .L5
    cmp DWORD PTR [%rbp-8], 65535
    jne .L5
    mov %edi, OFFSET FLAT:_ZSt8__ioinit
    call _ZNSt8ios_base4InitC1Ev
    mov %edx, OFFSET FLAT:__dso_handle
    mov %esi, 0
    mov %edi, OFFSET FLAT:__tcf_0
    call __cxa_atexit
    ..L5:
    leave
    ret
    ..LFE1504:
    .size _Z41__static_initialization_and_destruction_0ii, .-
    _Z41__static_initialization_and_destruction_0ii
    ..globl __gxx_personality_v0
    .align 2
    .type _GLOBAL__I_main, @function
    _GLOBAL__I_main:
    ..LFB1506:
    push %rbp
    ..LCFI3:
    mov %rbp, %rsp
    ..LCFI4:
    mov %esi, 65535
    mov %edi, 1
    call _Z41__static_initialization_and_destruction_0ii
    leave
    ret
    ..LFE1506:
    .size _GLOBAL__I_main, .-_GLOBAL__I_main
    .align 2
    .type __tcf_0, @function
    __tcf_0:
    ..LFB1505:
    push %rbp
    ..LCFI5:
    mov %rbp, %rsp
    ..LCFI6:
    sub %rsp, 16
    ..LCFI7:
    mov QWORD PTR [%rbp-8], %rdi
    mov %edi, OFFSET FLAT:_ZSt8__ioinit
    call _ZNSt8ios_base4InitD1Ev
    leave
    ret
    ..LFE1505:
    .size __tcf_0, .-__tcf_0
    ..globl __powidf2
    .section .text._ZSt3powdi,"axG",@progbits,_ZSt3powdi,comdat
    .align 2
    .weak _ZSt3powdi
    .type _ZSt3powdi, @function
    _ZSt3powdi:
    ..LFB54:
    push %rbp
    ..LCFI8:
    mov %rbp, %rsp
    ..LCFI9:
    sub %rsp, 32
    ..LCFI10:
    movsd QWORD PTR [%rbp-8], %xmm0
    mov DWORD PTR [%rbp-12], %edi
    mov %edi, DWORD PTR [%rbp-12]
    movlpd %xmm0, QWORD PTR [%rbp-8]
    call __powidf2
    movsd QWORD PTR [%rbp-24], %xmm0
    mov %rax, QWORD PTR [%rbp-24]
    mov QWORD PTR [%rbp-24], %rax
    movlpd %xmm0, QWORD PTR [%rbp-24]
    leave
    ret
    ..LFE54:
    .size _ZSt3powdi, .-_ZSt3powdi
    .section .rodata
    ..LC1:
    .string "pox(x, 1) = "
    ..LC2:
    .string "pox(x, 2) = "
    ..LC3:
    .string "pox(x, 3) = "
    ..LC4:
    .string "pox(x, 4) = "
    ..LC5:
    .string "pox(x, 5) = "
    ..LC6:
    .string "pox(x, 6) = "
    ..LC7:
    .string "pox(x, 7) = "
    ..LC8:
    .string "pox(x, 8) = "
    ..LC9:
    .string "pox(x, 9) = "
    ..LC10:
    .string "pox(x, 10) = "
    ..LC11:
    .string "pox(x, 11) = "
    ..LC12:
    .string "pox(x, 12) = "
    ..LC13:
    .string "pox(x, 13) = "
    ..LC14:
    .string "pox(x, 14) = "
    ..LC15:
    .string "pox(x, 15) = "
    ..LC16:
    .string "pox(x, 16) = "
    .text
    .align 2
    ..globl main
    .type main, @function
    main:
    ..LFB1496:
    push %rbp
    ..LCFI11:
    mov %rbp, %rsp
    ..LCFI12:
    push %rbx
    ..LCFI13:
    sub %rsp, 24
    ..LCFI14:
    movabs %rax, 4607182418800017408
    mov QWORD PTR [%rbp-16], %rax
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 1
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC1
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 2
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC2
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 3
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC3
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 4
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC4
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 5
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC5
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 6
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC6
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 7
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC7
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 8
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC8
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 9
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC9
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 10
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC10
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 11
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC11
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 12
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC12
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 13
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC13
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 14
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC14
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 15
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC15
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %rax, QWORD PTR [%rbp-16]
    mov %edi, 16
    mov QWORD PTR [%rbp-32], %rax
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZSt3powdi
    movsd QWORD PTR [%rbp-32], %xmm0
    mov %rbx, QWORD PTR [%rbp-32]
    mov %esi, OFFSET FLAT:.LC16
    mov %edi, OFFSET FLAT:_ZSt4cout
    call
    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    mov %rdi, %rax
    mov QWORD PTR [%rbp-32], %rbx
    movlpd %xmm0, QWORD PTR [%rbp-32]
    call _ZNSolsEd
    mov %rdi, %rax
    mov %esi, OFFSET
    FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
    call _ZNSolsEPFRSoS_E
    mov %eax, 0
    add %rsp, 24
    pop %rbx
    leave
    ret
    ..LFE1496:
    .size main, .-main
    .local _ZSt8__ioinit
    .comm _ZSt8__ioinit,1,1
    .weakref _Z20__gthrw_pthread_oncePiPFvvE,pthread_once
    .weakref
    _Z27__gthrw_pthread_getspecificj,pthread_getspecific
    .weakref
    _Z27__gthrw_pthread_setspecificjPKv,pthread_setspecific
    .weakref
    _Z22__gthrw_pthread_createPmPK14pthread_attr_tPFPvS3_ES3_,pthread_create
    .weakref _Z22__gthrw_pthread_cancelm,pthread_cancel
    .weakref
    _Z26__gthrw_pthread_mutex_lockP15pthread_mutex_t,pthread_mutex_lock
    .weakref
    _Z29__gthrw_pthread_mutex_trylockP15pthread_mutex_t,pthread_mutex_trylock
    .weakref
    _Z28__gthrw_pthread_mutex_unlockP15pthread_mutex_t,pthread_mutex_unlock
    .weakref
    _Z26__gthrw_pthread_mutex_initP15pthread_mutex_tPK19pthread_mutexattr_t,pthread_mutex_init
    .weakref
    _Z26__gthrw_pthread_key_createPjPFvPvE,pthread_key_create
    .weakref
    _Z26__gthrw_pthread_key_deletej,pthread_key_delete
    .weakref
    _Z30__gthrw_pthread_mutexattr_initP19pthread_mutexattr_t,pthread_mutexattr_init
    .weakref
    _Z33__gthrw_pthread_mutexattr_settypeP19pthread_mutexattr_ti,pthread_mutexattr_settype
    .weakref
    _Z33__gthrw_pthread_mutexattr_destroyP19pthread_mutexattr_t,pthread_mutexattr_destroy
    .section .eh_frame,"a",@progbits
    ..Lframe1:
    .long .LECIE1-.LSCIE1
    ..LSCIE1:
    .long 0x0
    .byte 0x1
    .string "zPR"
    .uleb128 0x1
    .sleb128 -8
    .byte 0x10
    .uleb128 0x6
    .byte 0x3
    .long __gxx_personality_v0
    .byte 0x3
    .byte 0xc
    .uleb128 0x7
    .uleb128 0x8
    .byte 0x90
    .uleb128 0x1
    .align 8
    ..LECIE1:
    ..LSFDE1:
    .long .LEFDE1-.LASFDE1
    ..LASFDE1:
    .long .LASFDE1-.Lframe1
    .long .LFB1504
    .long .LFE1504-.LFB1504
    .uleb128 0x0
    .byte 0x4
    .long .LCFI0-.LFB1504
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI1-.LCFI0
    .byte 0xd
    .uleb128 0x6
    .align 8
    ..LEFDE1:
    ..LSFDE3:
    .long .LEFDE3-.LASFDE3
    ..LASFDE3:
    .long .LASFDE3-.Lframe1
    .long .LFB1506
    .long .LFE1506-.LFB1506
    .uleb128 0x0
    .byte 0x4
    .long .LCFI3-.LFB1506
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI4-.LCFI3
    .byte 0xd
    .uleb128 0x6
    .align 8
    ..LEFDE3:
    ..LSFDE5:
    .long .LEFDE5-.LASFDE5
    ..LASFDE5:
    .long .LASFDE5-.Lframe1
    .long .LFB1505
    .long .LFE1505-.LFB1505
    .uleb128 0x0
    .byte 0x4
    .long .LCFI5-.LFB1505
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI6-.LCFI5
    .byte 0xd
    .uleb128 0x6
    .align 8
    ..LEFDE5:
    ..LSFDE7:
    .long .LEFDE7-.LASFDE7
    ..LASFDE7:
    .long .LASFDE7-.Lframe1
    .long .LFB54
    .long .LFE54-.LFB54
    .uleb128 0x0
    .byte 0x4
    .long .LCFI8-.LFB54
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI9-.LCFI8
    .byte 0xd
    .uleb128 0x6
    .align 8
    ..LEFDE7:
    ..LSFDE9:
    .long .LEFDE9-.LASFDE9
    ..LASFDE9:
    .long .LASFDE9-.Lframe1
    .long .LFB1496
    .long .LFE1496-.LFB1496
    .uleb128 0x0
    .byte 0x4
    .long .LCFI11-.LFB1496
    .byte 0xe
    .uleb128 0x10
    .byte 0x86
    .uleb128 0x2
    .byte 0x4
    .long .LCFI12-.LCFI11
    .byte 0xd
    .uleb128 0x6
    .byte 0x4
    .long .LCFI14-.LCFI12
    .byte 0x83
    .uleb128 0x3
    .align 8
    ..LEFDE9:
    .ident "GCC: (GNU) 4.1.2 20061115 (prerelease) (Debian
    4.1.1-21)"
    .section .note.GNU-stack,"",@progbits
    Peng Yu, Sep 14, 2008
    #10
  11. Kai-Uwe Bux

    Kai-Uwe Bux Guest

    Peng Yu wrote:

    > On Sep 13, 4:09 pm, "Alf P. Steinbach" <> wrote:
    >> * Peng Yu:
    >>
    >> > Sometimes I want to know what the compiler compile
    >> > the code to.

    >>
    >> e.g.
    >>
    >> g++ -S -masm=intel x.cpp

    >
    > It is pretty hard to figure out what part of assembly code is
    > associated with a give portion of source code.


    Nobody said it would be easy.


    > For example, the
    > following C++ and assembly code. How do I figure out where the pow
    > functions are at in the code?

    [snip]

    Did you try modifying one line of code and seeing which portion of the
    assembly changed?
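
    A related trick is to shrink the test case so that each call sits in
    its own small function with a recognizable label. The sketch below
    assumes GCC or Clang (__attribute__((noinline)) is an extension of
    those compilers); the file name pow_probe.cc and the function names are
    made up for this example:

```cpp
#include <cmath>

// Each power gets its own non-inlined function with C linkage, so it
// appears in the assembly under a plain, unmangled label (pow_4, pow_16).
// Taking x as a parameter keeps the compiler from folding the result
// into a constant.
extern "C" __attribute__((noinline)) double pow_4(double x) {
    return std::pow(x, 4);
}

extern "C" __attribute__((noinline)) double pow_16(double x) {
    return std::pow(x, 16);
}

// Possible ways to inspect the result with GCC and binutils:
//   g++ -O3 -S -masm=intel -fverbose-asm pow_probe.cc
//       writes pow_probe.s with extra compiler commentary
//   g++ -O3 -g -c pow_probe.cc && objdump -d -S -M intel pow_probe.o
//       interleaves the source lines with the disassembly
```

    Searching the output for the labels pow_4 and pow_16 then shows
    directly whether each body is a call into the math library or a short
    run of multiplications.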


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Sep 14, 2008
    #11
  12. Peng Yu

    Peng Yu Guest

    On Sep 13, 7:50 pm, Kai-Uwe Bux <> wrote:
    > Peng Yu wrote:
    > > On Sep 13, 4:09 pm, "Alf P. Steinbach" <> wrote:
    > >> * Peng Yu:

    >
    > >> > Sometimes I want to know what the compiler compile
    > >> > the code to.

    >
    > >> e.g.

    >
    > >> g++ -S -masm=intel x.cpp

    >
    > > It is pretty hard to figure out what part of assembly code is
    > > associated with a give portion of source code.

    >
    > Nobody said, it would be easy.
    >
    > > For example, the
    > > following C++ and assembly code. How do I figure out where the pow
    > > functions are at in the code?

    >
    > [snip]
    >
    > Did you try modifying one line of code and seeing which portion of the
    > assembly changed?


    g++ -O3 -S -masm=intel main.cc

    I compiled the code and a variant of it, and compared the two outputs
    with diff. But I still have difficulty understanding what the assembly
    does. Is there a way to annotate the assembly code with the C++ source?

    Thanks,
    Peng
    Peng Yu, Sep 14, 2008
    #12
  13. On 2008-09-13 23:06, Peng Yu wrote:
    > On Sep 11, 4:56 pm, gpderetta <> wrote:
    >> On Sep 11, 6:39 pm, Juha Nieminen <> wrote:
    >>
    >> > gpderetta wrote:
    >> > > Why? There is no reason for the compiler not to transform pow(x,
    >> > > <integral-constant>) to the latter form if it were actually faster
    >> > > (and in fact some compilers do).

    >>
    >> > Some compilers might be able to do that optimizations, others aren't.
    >> > And if n is a variable, then it cannot optimize it.

    >>
    >> and if it is variable, you can't write an explicit expression either.
    >> You could use a for loop
    >>
    >> > (At most the pow()
    >> > function itself might have optimizations in it, but in my experience it
    >> > doesn't: With most compilers it just generates the FPU opcodes necessary
    >> > to calculate the result.)

    >>
    >> Today hand optimizations are tomorrow pessimizations. Let the compiler
    >> do its job.
    >>
    >> The usual rule apply: use pow, and only if the profiler tells it is a
    >> bottleneck, try to optimize it by hand.
    >>
    >>
    >>
    >> > But we don't have to speculate about this as it's trivially easy to
    >> > test in practice. Go ahead and try it.

    >>
    >> I had already tried. 'pow(x, 16)' is inlined exactly as four
    >> multiplies, at least with a recent gcc.

    >
    > Would you please let me know the details the procedure on how you
    > figure this out? Sometimes I want to know what the compiler compile
    > the code to.


    In Visual Studio you can run the program in the debugger and then bring
    up the assembly code; you can step through it and switch back and forth
    between the source code and the assembly code. I would imagine you can
    do similar things in other IDEs and in gdb.

    --
    Erik Wikström
    Erik Wikström, Sep 14, 2008
    #13
  14. Ian Collins

    Ian Collins Guest

    Peng Yu wrote:
    > On Sep 13, 7:50 pm, Kai-Uwe Bux <> wrote:
    >> Peng Yu wrote:
    >>> On Sep 13, 4:09 pm, "Alf P. Steinbach" <> wrote:
    >>>> * Peng Yu:
    >>>>> Sometimes I want to know what the compiler compile
    >>>>> the code to.
    >>>> e.g.
    >>>> g++ -S -masm=intel x.cpp
    >>> It is pretty hard to figure out what part of assembly code is
    >>> associated with a give portion of source code.

    >> Nobody said, it would be easy.
    >>
    >>> For example, the
    >>> following C++ and assembly code. How do I figure out where the pow
    >>> functions are at in the code?

    >> [snip]
    >>
    >> Did you try modifying one line of code and seeing which portion of the
    >> assembly changed?

    >
    > g++ -O3 -S -masm=intel main.cc
    >
    > I tried to compile the code and the variant of it. And I got the
    > difference by diff. But I still have difficulty to understand what it
    > does. Is there a way to annotate the C++ code in the assembly code?
    >

    Some compilers (Sun CC for example) do so by default. If you are
    working on Solaris or Linux, give it a try.

    --
    Ian Collins.
    Ian Collins, Sep 14, 2008
    #14
  15. Peng Yu

    Peng Yu Guest

    On Sep 14, 4:37 am, Erik Wikström <> wrote:
    > On 2008-09-13 23:06, Peng Yu wrote:
    >
    >
    >
    > > On Sep 11, 4:56 pm, gpderetta <> wrote:
    > >> On Sep 11, 6:39 pm, Juha Nieminen <> wrote:

    >
    > >> > gpderetta wrote:
    > >> > > Why? There is no reason for the compiler not to transform pow(x,
    > >> > > <integral-constant>) to the latter form if it were actually faster
    > >> > > (and in fact some compilers do).

    >
    > >> > Some compilers might be able to do that optimizations, others aren't.
    > >> > And if n is a variable, then it cannot optimize it.

    >
    > >> and if it is variable, you can't write an explicit expression either.
    > >> You could use a for loop

    >
    > >> > (At most the pow()
    > >> > function itself might have optimizations in it, but in my experience it
    > >> > doesn't: With most compilers it just generates the FPU opcodes necessary
    > >> > to calculate the result.)

    >
    > >> Today hand optimizations are tomorrow pessimizations. Let the compiler
    > >> do its job.

    >
    > >> The usual rule apply: use pow, and only if the profiler tells it is a
    > >> bottleneck, try to optimize it by hand.

    >
    > >> > But we don't have to speculate about this as it's trivially easy to
    > >> > test in practice. Go ahead and try it.

    >
    > >> I had already tried. 'pow(x, 16)' is inlined exactly as four
    > >> multiplies, at least with a recent gcc.

    >
    > > Would you please let me know the details the procedure on how you
    > > figure this out? Sometimes I want to know what the compiler compile
    > > the code to.

    >
    > In Visual Studio you can run the program in the debugger and then bring
    > up the assembly code and it will show you can step through it and switch
    > back and forth between the code and assembly code. I would imagine you
    > can do similar things in other IDEs and in gdb.


    Will there be a problem if I use the -O3 option? The source code and the
    assembly code might not have a one-to-one relationship.

    Thanks,
    Peng
    Peng Yu, Sep 14, 2008
    #15
  16. On 2008-09-14 15:47, Peng Yu wrote:
    > On Sep 14, 4:37 am, Erik Wikström <> wrote:
    >> On 2008-09-13 23:06, Peng Yu wrote:


    >> > On Sep 11, 4:56 pm, gpderetta <> wrote:
    >> >> On Sep 11, 6:39 pm, Juha Nieminen <> wrote:


    >> >> Today hand optimizations are tomorrow pessimizations. Let the compiler
    >> >> do its job.

    >>
    >> >> The usual rule apply: use pow, and only if the profiler tells it is a
    >> >> bottleneck, try to optimize it by hand.

    >>
    >> >> > But we don't have to speculate about this as it's trivially easy to
    >> >> > test in practice. Go ahead and try it.

    >>
    >> >> I had already tried. 'pow(x, 16)' is inlined exactly as four
    >> >> multiplies, at least with a recent gcc.

    >>
    >> > Would you please let me know the details the procedure on how you
    >> > figure this out? Sometimes I want to know what the compiler compile
    >> > the code to.

    >>
    >> In Visual Studio you can run the program in the debugger and then bring
    >> up the assembly code and it will show you can step through it and switch
    >> back and forth between the code and assembly code. I would imagine you
    >> can do similar things in other IDEs and in gdb.

    >
    > Will there be a problem if I use the -O3 option? The source code and the
    > assembly code might not have a one-to-one relationship.


    There might be a problem if the compiler decides to remove some code
    completely, otherwise no.

    --
    Erik Wikström
    Erik Wikström, Sep 14, 2008
    #16
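
Editorial sketch, not from any poster: one way to keep the code under study from being removed at -O3 is to put each variant in its own function with external linkage and pass x in as a parameter. The compiler then cannot fold the result at compile time and has to emit a body for each function, which shows up under its own label in the -S output. The names pow16_lib and pow16_mul are made up for this illustration, and whether the two bodies end up looking alike depends on the compiler, its version, and the flags used.

```cpp
#include <cmath>

// Hypothetical names, for illustration only.
// External linkage plus a runtime parameter: the compiler must emit a
// standalone body for each function, even if it also inlines calls.

// Library version: std::pow with a constant integer exponent.
double pow16_lib(double x) {
    return std::pow(x, 16);
}

// Hand-written version: x^16 by repeated squaring, four multiplies.
double pow16_mul(double x) {
    double x2 = x * x;    // x^2
    double x4 = x2 * x2;  // x^4
    double x8 = x4 * x4;  // x^8
    return x8 * x8;       // x^16
}
```

Compiling this file with the command already quoted in the thread (g++ -O3 -S -masm=intel main.cc) should produce one labelled block per function, which makes the two easy to compare side by side.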
  17. Peng Yu <> writes:

    > g++ -O3 -S -masm=intel main.cc
    >
    > I tried to compile the code and the variant of it. And I got the
    > difference by diff. But I still have difficulty to understand what it
    > does. Is there a way to annotate the C++ code in the assembly code?


    Yes, and I'd also disable optimization, which can make the generated
    asm code more difficult to follow.

    g++ -S -masm=intel -fverbose-asm main.cc

    sherm--

    --
    My blog: http://shermspace.blogspot.com
    Cocoa programming in Perl: http://camelbones.sourceforge.net
    Sherm Pendley, Sep 14, 2008
    #17
  18. Peng Yu

    Peng Yu Guest

    On Sep 14, 10:49 am, Sherm Pendley <> wrote:
    > Peng Yu <> writes:
    > > g++ -O3 -S -masm=intel main.cc

    >
    > > I tried to compile the code and the variant of it. And I got the
    > > difference by diff. But I still have difficulty to understand what it
    > > does. Is there a way to annotate the C++ code in the assembly code?

    >
    > Yes, and I'd also disable optimization, which can make the generated
    > asm code more difficult to follow.
    >
    > g++ -S -masm=intel -fverbose-asm main.cc


    But I have to enable optimization, because I want to know how the
    compiler optimizes std::pow(x,n).

    Thanks,
    Peng
    Peng Yu, Sep 14, 2008
    #18
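
Editorial sketch, not from any poster: the two suggestions in the thread do not exclude each other, since -fverbose-asm can be combined with -O3 (for example g++ -O3 -S -masm=intel -fverbose-asm main.cc). A small test file along the following lines keeps the cases apart so that each one can be found in the output. The function names are hypothetical, and no particular code generation is implied: whether the constant-exponent call is expanded into multiplies is up to the compiler and its flags.

```cpp
#include <cmath>

// Hypothetical names, for illustration only.

// Constant exponent: a candidate for expansion into plain multiplies.
double pow_const(double x) {
    return std::pow(x, 4);
}

// Runtime exponent: n is unknown at compile time, so the call cannot be
// unrolled into a fixed chain of multiplies.
double pow_runtime(double x, int n) {
    return std::pow(x, n);
}

// Hand-written loop for a runtime, non-negative integer exponent:
// exponentiation by squaring, using O(log n) multiplies.
double ipow(double x, unsigned n) {
    double result = 1.0;
    while (n != 0) {
        if (n & 1u) {
            result *= x;  // this bit of n is set: fold in the current power
        }
        x *= x;           // advance to the next power of two
        n >>= 1;
    }
    return result;
}
```

Because each function has external linkage, each gets its own label in the generated assembly, so diffing the output of two builds or reading one function at a time becomes more manageable than reading all of main.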