Recursive Functions

Glen Herrmannsfeldt

Marcus Lessard said:
Then maybe it is just me, but I didn't explain myself clearly. The
objective of calculating powers is all well and good and no doubt of use but
what confuses me is the highly specified nature of the algorithm: (from OP):

"...breaking n down into halves(where half of n=n/2), squaring
Power(x,n/2), and multiplying by x again if n was odd..."

Will this be more efficient? Or is it just the solution demanded by the
professor?

Except for some special cases, it is the most efficient set of multiplies to
generate a given power.

In its iterative form, it is the algorithm used by every language I know of
that supplies such an operation.

If n is a power of two, such as 2**m, it will square the number m times.

I suppose as an example for recursive algorithms it is a little better than
factorial.

-- glen
 
Eric Sosman

Richard said:
Hmmm. I raised 7 to the power 7777 using recursion and iteration. (Since the
result occupies over 6500 decimal digits, I won't display it here!)

The recursive calculation took 0.22 seconds, and the iterative version 1.06
seconds - almost five times slower. Perhaps you could demonstrate an
iterative version that can hold a candle to the recursive technique?

SomeType my_power(SomeType x, unsigned int n) {
    /* This is the right-to-left binary method used
     * by the O.P., expressed iteratively rather
     * than recursively. A similar left-to-right
     * method would use one less multiplication
     * (note that the first multiplication in this
     * method is always by unity, and thus wasted),
     * but needs extra code to find the high-order
     * 1-bit of `n'. Other methods are faster for
     * certain values of `n' (e.g., 15), but are
     * also more complicated. See Knuth, "The
     * Art of Computer Programming, Volume II:
     * Seminumerical Algorithms" Section 4.6.3;
     * this is Algorithm A from that section.
     */
    SomeType y = 1;

    /* Optional: Make an explicit test for x==0 here.
     * As things stand, my_power(0,0) -> 1.
     */

    for (;;) {
        if (n & 1)
            y *= x;
        if ((n >>= 1) == 0)
            return y;
        x *= x;
    }
}
 
Marcus Lessard

"Glen Herrmannsfeldt"
....
If n is a power of two, such as 2**m, it will square the number m times.

I suppose as an example for recursive algorithms it is a little better than
factorial.

-- glen

From a mathematical point of view there is no difference in breaking up the
computation into a set of squares and then multiplying. What is going on in
the underlying processing that makes using the squares more efficient than
just multiplying x by itself n times?

ML
 
Eric Sosman

Marcus said:
"Glen Herrmannsfeldt"
...

From a mathematical point of view there is no difference in breaking up the
computation into a set of squares and then multiplying. What is going on in
the underlying processing that makes using the squares more efficient than
just multiplying x by itself n times?

Count the multiplications:

x*x*x*...*x*x*x = x^128

x*x = x^2, x^2*x^2 = x^4, ..., x^64*x^64 = x^128

Which do you think will be faster: 127 multiplications or 7?
 
Richard Heathfield

James said:
Richard, could you post your iterative version? Maybe we can help
you tweak it.

Well, I'm not really ready to release the underlying library code yet. But
here's the high-level stuff:

while (MF_Compare(a2, zero) != 0)
{
    MF_Multiply(t, a3, a1);
    MF_Copy(a3, t);
    MF_Subtract(t, a2, one);
    MF_Copy(a2, t);
}

As you can see, this does involve two copy operations, which I concede is a
bit lame, but eliminating them is not going to make a huge difference. (The
profile suggests that they take up zero time, which can't be quite true, of
course, but you get the general idea.)

The profile says that the recursive version spends 0.21 seconds doing 18
multiplications, whilst the iterative version spends 0.96 seconds doing
7777 multiplications.
 
Richard Heathfield

Wolfgang said:
6573 (!?)

Yes, exactly right. :)

In case you want a quick check against your own code, the first few digits
are 21254808227343471907345591571884156862...

and it finishes ...65911020435734015379207.
 
Dave Vandervies

Well, I'm not really ready to release the underlying library code yet. But
here's the high-level stuff:

while (MF_Compare(a2, zero) != 0)
{
    MF_Multiply(t, a3, a1);
    MF_Copy(a3, t);
    MF_Subtract(t, a2, one);
    MF_Copy(a2, t);
}

As you can see, this does involve two copy operations, which I concede is a
bit lame, but eliminating them is not going to make a huge difference. (The
profile suggests that they take up zero time, which can't be quite true, of
course, but you get the general idea.)

If you have bitwise operators (and-with-1 for "is odd" and shift-right
for "integer divide by two"), you might want to try an iterative
square-and-multiply, something like this:
--------
/*Raise a1 to the power of a2, storing result in a3.
  b is an extra bignum temporary holding the running square of the base.*/
if(MF_Compare(a2,zero) == 0) /*Assumes return 0 -> equal*/
{
    /*Special case, zeroth power is always 1*/
    MF_Copy(a3,one);
}
else
{
    MF_Copy(a3,one);
    MF_Copy(b,a1);
    while(MF_Compare(a2,zero) > 0) /*assumes return > 0 -> arg1>arg2*/
    {
        /*MF_IS_ODD can be implemented using a bitwise and with 1*/
        if(MF_IS_ODD(a2))
        {
            /*Fold the current power of the base into the result;
              the multiplication writes into t and we copy back*/
            MF_Multiply(t,a3,b);
            MF_Copy(a3,t);
        }
        /*MF_DIVIDE_BY_TWO can be implemented by a bitwise right shift,
          avoiding having to do a long-division operation
        */
        MF_DIVIDE_BY_TWO(a2);
        if(MF_Compare(a2,zero) > 0)
        {
            /*Square the base for the next bit of the exponent*/
            MF_Multiply(t,b,b);
            MF_Copy(b,t);
        }
    }
}
/*a3 now contains the desired result*/
--------
This was translated without additional checking from code I wrote that
passed a brief sanity check doing the same thing with unsigned longs.
Any bugs are a result of either translation error or incomplete testing
of the original.

The profile says that the recursive version spends 0.21 seconds doing 18
multiplications, whilst the iterative version spends 0.96 seconds doing
7777 multiplications.

Ahh. So what you were really comparing was a recursive square-and-
multiply(?) and an iterative counted-multiply.
The iterative square-and-multiply should compare somewhat more favorably
with the recursive one.


dave

--
Dave Vandervies (e-mail address removed)
[O]ur recruitment drive isn't until June, so we're short-handed as it is.
Damn! My copy of the Gay Agenda must be out of date.
--Graham Reed and Zebee Johnstone in the scary devil monastery
 
Micah Cowan

Richard Heathfield said:
Hmmm. I raised 7 to the power 7777 using recursion and iteration. (Since the
result occupies over 6500 decimal digits, I won't display it here!)

The recursive calculation took 0.22 seconds, and the iterative version 1.06
seconds - almost five times slower. Perhaps you could demonstrate an
iterative version that can hold a candle to the recursive technique?

Richard, are you serious? How about "emulating" the recursion via a
stack and iteration? There are several advantages to this, IMHO:

- You have precise control over the size of the stack
- If the stack runs out, you can exit gracefully, or even grow the
  stack. Try doing that when your implementation's stack runs out.
 
Glen Herrmannsfeldt

Arthur J. O'Dwyer said:
No, I used bignum routines written in ISO C.

[I did figure as much; still, I don't think the original discussion
was meant to generalize to bignums. I thought we'd been talking
about 'foo(double, unsigned int)'.]

Well, mostly the suggestion was int, float, and double, but the basic
algorithm should work for other types.

Though by iterative algorithm I did mean an iterative implementation of the
square and multiply algorithm.

Here is the loop from the OS/360 (not copyrighted) Fortran library routine
for double base to integer power. Not included is the check for a negative
power, and later reciprocal of the result. They use a two register shift,
instead of AND to test the low bit after the shift, which might save one or
two instructions.

PLUS  LD   FACTOR,ONE   LOAD FACTOR OF ONE IN FACTOR REG
LOOP  SRDL EXPN,1       SHIFT LOW BIT EXPN REG INTO ADDR REG
      LTR  ADDR,ADDR    TEST SIGN POS ADDR REG FOR MINUS BIT
      BC   10,JUMP      IF SIGN BIT NOT MINUS, BRANCH TO JUMP
      MDR  FACTOR,BASE  MULTIPLY FACTOR REG BY BASE NO REG
JUMP  LTR  EXPN,EXPN    CHECK IF EXPONENT PLUS, MINUS, OR ZERO
      BC   8,NEXT       IF EXPONENT NOW ZERO, BRANCH TO NEXT
      MDR  BASE,BASE    MULTIPLY BASE NO BY DOUBLING ITSELF
      BC   15,LOOP      BRANCH TO LOOP TO TEST NEXT EXPN BIT

I post this as proof that the algorithm has actually been used in production
software. With any function call overhead, I would expect it to be slower
in the recursive version. Though for bignum versions, the difference may be
small, as the multiply will be somewhat slower. Also, I don't know which
bignum libraries can do a square in place (without copying it first).

-- glen
 
Tim Woodall

What's wrong with n==15? The algorithm given is still O(lg N)
in such cases. No, probably Knuth was thinking of *better*
algorithms that could compute powers more quickly -- I don't
know of any, but perhaps a couple of exp() and log() lookup
tables, plus iteration, could do things faster than straight
recursion.
x^3 can be done with two multiplications and x^5 with three using the
square-and-multiply algorithm:

  (x^3)^5, 5 multiplies:    square-and-multiply for x^15, 6 multiplies:

  x2 = x * x                x2 = x * x
  y  = x * x2               z  = x * x2
  y2 = y * y                x4 = x2 * x2
  y4 = y2 * y2              z  = z * x4
  z  = y * y4               x8 = x4 * x4
                            z  = z * x8

So x^15 can be done with 5 multiplications by calling the square and
multiply algorithm twice. But the square and multiply algorithm
requires 6 multiplications for x^15. A general algorithm would
require the ability to factor n which is not known to be easy ;-)

(Factoring n isn't guaranteed to give a better solution than square
and multiply, 33 being the smallest counter example. Knuth has about
20 pages on finding a general optimal solution)

Tim.
 
Richard Heathfield

Dave said:
Ahh. So what you were really comparing was a recursive square-and-
multiply(?) and an iterative counted-multiply.

Indeed.

The iterative square-and-multiply should compare somewhat more favorably
with the recursive one.

++Indeed. :)
 
Tim Woodall

small, as the multiply will be somewhat slower. Also, I don't know which
bignum libraries can do a square in place (without copying it first).
void Bignum_Square(unsigned int *data, int size)
{
    all_integer_fft(data, size);       /* See Knuth 4.6.4 exercise 59 */
    convolute_with_itself(data, size);
    all_integer_ifft(data, size);
    normalize(data, size);
}

The precise representation of a Bignum inside an integer array is left as
an exercise for the reader :)

Tim.
 
Julian V. Noble

Richard said:
Hmmm. I raised 7 to the power 7777 using recursion and iteration. (Since the
result occupies over 6500 decimal digits, I won't display it here!)

The recursive calculation took 0.22 seconds, and the iterative version 1.06
seconds - almost five times slower. Perhaps you could demonstrate an
iterative version that can hold a candle to the recursive technique?

--
Richard Heathfield : (e-mail address removed)
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

To compare recursive and iterative versions you have to know what is
happening at the machine level. Some processors do recursion faster
because they can parallelize stack ops with arithmetic ops.

I found that the recursive version of the extended Euclid algorithm
for greatest common divisor runs 20-30% faster than the iterative
version given by Knuth, on Pentium-class machines.

I expect this would not be true on machines with enough registers that
all the intermediate results could be allocated to registers.

--
Julian V. Noble
Professor Emeritus of Physics
(e-mail address removed)
^^^^^^^^^^^^^^^^^^
http://galileo.phys.virginia.edu/~jvn/

"Science knows only one commandment: contribute to science."
-- Bertolt Brecht, "Galileo".
 
James Hu

Well, I'm not really ready to release the underlying library code yet. But
here's the high-level stuff:

while (MF_Compare(a2, zero) != 0)
{
    MF_Multiply(t, a3, a1);
    MF_Copy(a3, t);
    MF_Subtract(t, a2, one);
    MF_Copy(a2, t);
}

As you can see, this does involve two copy operations, which I concede is a
bit lame, but eliminating them is not going to make a huge difference. (The
profile suggests that they take up zero time, which can't be quite true, of
course, but you get the general idea.)

As others have noted, you are not comparing the same algorithms :).

But anyway, as a way to tweak your existing implementation, maybe
you could maintain pointers:

while (MF_Compare(*a2, zero) != 0)
{
    void *p;
    MF_Multiply(*t, *a3, a1);
    p = t; t = a3; a3 = p;
    MF_Subtract(*t, *a2, one);
    p = t; t = a2; a2 = p;
}

-- James
 
James Hu

I'm surprised there is much if any difference:

long mpow(long x, long e)
{
    long r = 1;

    while (e)
    {
        if (e & 1)
            r *= x;
        x *= x;
        e >>= 1;
    }
    return r;
}

It's fairly trivial to remove one multiplication at the cost of n if
tests (where n is the number of bits in e), and one multiplication can
become an assignment; but other than that I don't think fewer
multiplications are possible using the square-and-multiply idiom.

double ulpow(double x, unsigned long n)
{
    double t = 1;

    if (n & 1) t = x;
    while (n >>= 1) {
        x *= x;
        if (n & 1) t *= x;
    }
    return t;
}

Am I missing something? Where did we add an additional test per
iteration?

For comparison, here is an alternative recursive implementation.

static double ulpow_r(double x, unsigned long n, double t)
{
    if (n == 1) return t * x;
    return ulpow_r(x * x, n >> 1, ((n & 1) ? t * x : t));
}

double ulpow(double x, unsigned long n)
{
    if (n == 0) return 1;
    return ulpow_r(x, n, 1);
}


-- James
 
Richard Heathfield

James said:
As others have noted, you are not comparing the same algorithms :).

Mea culpa; I think of the divide-and-conquer method with 18 multiplies as
being "recursive" and the rollup method with 7777 multiplies as being
"iterative". Hence the confusion. As the man said, "si tacuisses,
philosophus mansisses"; I should heed his advice more often.
But anyway, as a way to tweak your existing implementation, maybe
you could maintain pointers:

Yes, I know. :)
 
Wolfgang Riedel

Richard said:
Yes, exactly right. :)

In case you want a quick check against your own code, the first few digits
are 21254808227343471907345591571884156862...

and it finishes ...65911020435734015379207.


3001 - 3020:
58808357019792422968
 
Dan Pop

Richard Heathfield said:
The profile says that the recursive version spends 0.21 seconds doing 18
multiplications, whilst the iterative version spends 0.96 seconds doing
7777 multiplications.

That's the kind of comparison I would have expected from Mark McIntyre...

It makes NO sense to talk about "versions" in such a comparison, you're
comparing implementations of *different* algorithms.

Dan
 
Richard Heathfield

Dan said:
In <[email protected]> Richard Heathfield wrote:
The profile says that the recursive version spends 0.21 seconds doing 18
multiplications, whilst the iterative version spends 0.96 seconds doing
7777 multiplications.
[...]
It makes NO sense to talk about "versions" in such a comparison, you're
comparing implementations of *different* algorithms.

Well, I was indeed comparing different algorithms. My mistake was to label
them "recursive" and "iterative". It was in fact the difference between
square-and-multiply and multiply-n-times that I was attempting to
highlight, and I made a complete pig's breakfast of it by attaching
informal labels ("recursive" and "iterative") to these algorithms instead
of spelling out precisely what I mean. Silly of me.
 
