RE: strange exponentiation performance

Discussion in 'Python' started by Tim Peters, Nov 23, 2003.

  1. Tim Peters

    Tim Peters Guest

    [Jeff Davis]
    > I was doing some thinking about exponentiation algorithms along with a
    > friend, and I happened upon some interesting results. Particularly, I
    > was able to outperform the ** operator in at least one case, with a
    > recursive algorithm. This leads me to believe that perhaps the **
    > operator should tune its algorithm based on inputs or some such
    > thing. Here is my data:


    I'm taking the liberty of rewriting your functions to stop trying to squash
    as much as they can on each line. It's important to do that if you want to
    modify the code quickly when experimenting.

    First you rediscovered the "left to right" binary exponentiation algorithm:

    def h(b, e):
        if e == 0:
            return 1
        if e == 1:
            return b
        t = h(b, e >> 1)
        return t*t * h(b, e & 1)

    When you tried to remove recursion from it, you rediscovered the "right to
    left" binary exponentiation algorithm:

    def f(b, e):
        n = 1
        while e > 0:
            if e & 1:
                n *= b
            e >>= 1
            b *= b
        return n

    That's close to what Python's builtin ** does, but is missing an important
    speed trick (explained later). Note that h and f aren't really the *same*
    algorithm! f squares the current value of b on every trip through the loop.
    h does not. If you print out all the inputs to all the multiplications,
    you'll see that they're not at all the same.
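
    To make that concrete, here's a small tracing sketch (mine, not from the
    original post): route every multiplication through a recording helper, and
    the two algorithms produce visibly different operand sequences.

    mults = []

    def mul(x, y):
        # record each pair of operands fed to *, then multiply
        mults.append((x, y))
        return x * y

    def h_traced(b, e):
        # h with every * routed through mul()
        if e == 0:
            return 1
        if e == 1:
            return b
        t = h_traced(b, e >> 1)
        return mul(mul(t, t), h_traced(b, e & 1))

    def f_traced(b, e):
        # f with every * routed through mul()
        n = 1
        while e > 0:
            if e & 1:
                n = mul(n, b)
            e >>= 1
            b = mul(b, b)
        return n

    h_traced(3, 10)
    print mults     # the multiplication inputs h sees
    del mults[:]
    f_traced(3, 10)
    print mults     # f's inputs -- not the same sequence at all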

    Finally (I'll skip g(), as it isn't interesting):

    def o(b, e):
        return b**e

    > ...
    > now, g() was blown out of the water, as expected, but the others were
    > close enough for another test at a higher "e" value.
    >
    > >>> test(f,19,500000)
    > 8.02366995811
    > >>> test(h,19,500000)
    > 3.66968798637
    > >>> test(o,19,500000)
    > 5.29517292976
    > >>>


    Your box is faster than mine. Here under 2.3.2 on Windows:

    f 14.6648669941
    h 6.53576912117
    o 9.59630399437
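
    The test() driver itself never appears in the thread; a plausible minimal
    reconstruction (the name and behavior are my assumptions, not the original
    harness) would be:

    import time

    def test(func, b, e):
        # time one call of func(b, e); the elapsed seconds echo at the prompt
        start = time.clock()  # time.clock() was the usual timer on 2003-era Windows
        func(b, e)
        return time.clock() - start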

    > Now, that is the interesting part. How did ** not measure up to h()?


    The left-to-right exponentiation algorithm can be faster than the
    right-to-left one. The former is clumsier to code in C, though, and nobody
    ever bothered to endure that pain in Python's implementation. If you search
    the archives hard enough, you'll eventually find a thread about this from
    Christian Tismer.

    Neither algorithm is optimal for all inputs, BTW. Turns out optimal
    exponentiation is a very hard problem; Knuth Volume 2 has an extensive
    discussion of this ("Evaluation of Powers"); another algorithm based on
    first factoring the exponent (expressing it as a product of primes) is
    better "on average" than either binary method, but sometimes loses to them.
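
    For flavor only, here's a bare sketch of that factoring idea (my
    illustration, not Knuth's code): since b**(p*q) == (b**p)**q, split a
    composite exponent at its smallest prime factor and recurse; a prime
    exponent e is handled as b**(e-1) times one more b.

    def smallest_factor(n):
        # trial division; a hypothetical helper, fine for small exponents
        d = 2
        while d * d <= n:
            if n % d == 0:
                return d
            d += 1
        return n

    def factor_pow(b, e):
        # sketch of the factor method: b**(p*q) == (b**p)**q
        if e == 0:
            return 1
        if e == 1:
            return b
        p = smallest_factor(e)
        if p == e:
            # e is prime: peel off one multiplication by b
            return factor_pow(b, e - 1) * b
        return factor_pow(factor_pow(b, p), e // p)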

    > It's also interesting that f(), which is supposed to be a more
    > efficient version of h(), is lagging behind.


    The biggest trick it's missing is that it *always* does

    b *= b

    even on the last trip through the loop. The value of b it computes on the
    last trip is never used, and b has grown very large by that time so
    computing b*b is *very* expensive then. Change f like so:

    def f(b, e):
        n = 1
        while e > 0:
            if e & 1:
                n *= b
            e >>= 1
            if e:  # new guard: only square b if the result will be used
                b *= b
        return n

    and then it's essentially identical to the builtin **:

    f 9.5959586986
    h 6.54231046447
    o 9.59677081413

    > I would like help explaining the following:
    > (1) How did my less-than-perfectly-optimized recursive algorithm win
    > against the ** operator?


    It used left-to-right instead of right-to-left, which on this pair of inputs
    is a better approach.

    > (2) How can I unwrap and optimize h()?


    Make it work left-to-right instead of right-to-left. That's tedious to do.
    You can't get the same sequence of multiplication inputs by going
    right-to-left, so the code will have to be more complicated. Start by
    finding the highest bit set in e. For example,

    def q(b, e):
        if e == 0:
            return 1
        if e == 1:
            return b
        e2, numbits = e, 0
        while e2:
            e2 >>= 1
            numbits += 1
        assert numbits >= 2
        result = b
        mask = 1L << (numbits - 2)
        for dummy in range(numbits - 1):
            result *= result
            if e & mask:
                result *= b
            mask >>= 1
        return result
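
    A quick sanity check (mine, not from the post) that q agrees with the
    builtin across a range of small inputs:

    for base in range(2, 6):
        for exp in range(50):
            assert q(base, exp) == base**exp, (base, exp)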

    That's bound to be a little faster than h on most inputs, because it also
    optimizes away multiplications by 1 (well, unless b happens to be 1). Let's
    see:

    f 9.53526793946
    o 9.53408287098
    h 6.54592768903
    q 6.11897031462

    Yup, those matter, but not a whale of a lot. The savings from skipping
    multiplications by 1 are proportional to the number of 0 bits in the
    exponent. What *didn't* matter is whether it's recursive or iterative.

    > From what I understand, recursion is never supposed to be the most
    > efficient. I suspect there are some hidden inefficiencies in using
    > while(), but I'd like to know the specifics.


    Na, none of that matters. All these algorithms do a number of
    multiplications proportional to log(e, 2), which is insignificantly tiny
    compared to the magnitude of the result. Long-int multiplication is the
    only thing that consumes significant time here, so the only thing that
    matters to the time is the exact sequence of inputs fed to *. Even whether
    you write the *driver* in Python or C or assembly language is insignificant
    compared to that.
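
    To put rough numbers on that (my arithmetic, not from the post): for
    e = 500000 the loop runs about log2(e) ~= 19 times, yet the result
    19**500000 occupies millions of bits, so the few giant multiplications
    near the end swamp everything else.

    import math

    e = 500000
    print "loop trips: ", int(math.ceil(math.log(e, 2)))  # about 19
    print "result bits:", int(e * math.log(19, 2))        # about 2.1 million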

    > If my algorithm h() is better, why can't ** use a quick test to change
    > algorithms based on inputs? Or is mine better in all cases?


    See Knuth <wink>.

    > ...
    > Also note that I'm not exactly an algorithms expert, I just happened
    > upon these results while chatting with a friend.


    You did good! The first couple things you tried only nick the surface of
    what's known about this problem. You *could* spend a year learning
    everything that's known about it. In the end, if you implemented all that,
    you could easily end up with 1000x more code, which sometimes ran faster.
    To become an algorithm expert, you should do that <wink>. In real life, a
    crucial complementary skill is learning when to say "good enough", and move
    on. That's indeed why Python's implementation settled for the
    easy-to-code-in-C right-to-left binary exponentiation method. If that part
    of Python had been written in Python instead, I would have used the
    easy-to-code-in-Python recursive left-to-right method instead. Sticking to
    easy-to-code methods also has good influence on minimizing the # of bugs, of
    course.
     
    Tim Peters, Nov 23, 2003
    #1

  2. Jeff Davis

    Jeff Davis Guest

    > I'm taking the liberty of rewriting your functions to stop trying to squash
    > as much as they can on each line. It's important to do that if you want to
    > modify the code quickly when experimenting.


    Fair enough.

    >
    > First you rediscovered the "left to right" binary exponentiation algorithm:
    >


    Interesting.

    <snip>

    > When you tried to remove recursion from it, you rediscovered the "right to
    > left" binary exponentiation algorithm:


    Interesting. I suspected it wasn't a perfect unwrapping of the recursion,
    because I didn't take the time to examine it carefully. Classic case of a
    wonderful idea I had at 3:30 in the morning :)

    <snip>


    > That's close to what Python's builtin ** does, but is missing an important
    > speed trick (explained later). Note that h and f aren't really the *same*
    > algorithm! f squares the current value of b on every trip through the loop.
    > h does not. If you print out all the inputs to all the multiplications,
    > you'll see that they're not at all the same.
    >


    Also interesting.

    <snip>

    >
    > Your box is faster than mine. Here under 2.3.2 on Windows:
    >


    That means I win, right ;)

    <snip>

    > exponentiation is a very hard problem; Knuth Volume 2 has an extensive
    > discussion of this ("Evaluation of Powers"); another algorithm based on
    > first factoring the exponent (expressing it as a product of primes) is
    > better "on average" than either binary method, but sometimes loses to
    > them.


    Ahh... I'm on vol. 1. I saw that it would be a while before I made it to
    page 100, so I actually loaned out vol. 2. I'll be getting that back now I
    guess :)



    <snip>

    >
    > def q(b, e):
    >     if e == 0:
    >         return 1
    >     if e == 1:
    >         return b
    >     e2, numbits = e, 0
    >     while e2:
    >         e2 >>= 1
    >         numbits += 1
    >     assert numbits >= 2
    >     result = b
    >     mask = 1L << (numbits - 2)
    >     for dummy in range(numbits - 1):
    >         result *= result
    >         if e & mask:
    >             result *= b
    >         mask >>= 1
    >     return result
    >


    Now I understand what you mean about the right-left and left-right
    versions. I only had a vague understanding before.

    > That's bound to be a little faster than h on most inputs, because it
    > also optimizes away multiplications by 1 (well, unless b happens to be
    > 1). Let's see:


    Might as well eliminate the multiplication by 1; for some reason I
    assumed it could be optimized out somehow by the computer.

    <snip>
    > Yup, those matter, but not a whale of a lot. The savings from skipping
    > multiplications by 1 are proportional to the number of 0 bits in the
    > exponent. What *didn't* matter is whether it's recursive or iterative.


    Yeah, I guess it's always the case: algorithm first, then optimize. I know
    the rule, and I follow it when being paid (so as not to waste anyone's
    money). When doing something more academic I always feel like I need to
    break it down to the most simple operations so I understand what's going
    on. To me, recursion hides what's going on to an extent, and I felt like I
    didn't entirely understand the algorithm.

    <snip>

    >> If my algorithm h() is better, why can't ** use a quick test to change
    >> algorithms based on inputs? Or is mine better in all cases?

    >
    > See Knuth <wink>.
    >


    Uh-oh, another problem that's more difficult than meets the eye. I'll be
    back with more questions in a few years I guess ;)

    Thanks,
    Jeff
     
    Jeff Davis, Nov 24, 2003
    #2
