Computing a*b/c without overflow in the preprocessor

P

Peter Nilsson

Tim Rentsch said:
I believe I have a solution to this problem.  Is anyone
still interested?

I imagine you spent some little time working on it. Posting
takes 10 seconds to format and send. So, why not just post
it?
 
T

Tim Rentsch

Peter Nilsson said:
I imagine you spent some little time working on it. Posting
takes 10 seconds to format and send. So, why not just post
it?

The short answer is that providing an explanation along with
the macro definition would take a lot more than ten seconds,
and this isn't the sort of thing someone wants to read without
an explanation. (Also I have a rule about not talking when I
don't think anyone is listening.)
 
T

Tim Rentsch

Jason Earl said:
I am listening. In fact, I would really appreciate it.

Here is a sketch.

We start with the identity given in the first posting

a*b/c == (a/c)*b + (b/c)*(a%c) + (a%c)*(b%c)/c

or writing in something more C-like

#define MD(a,b,c) (a/c*b + b/c*(a%c) + a%c *(b%c) /c)

This gets some part of the result, because the first two
terms can be evaluated directly; only the last term might
cause overflow. (Incidentally, normal good practice in
writing macro definitions will put parentheses around
macro paramters where ever they occur in the macro body.
I'm not going to do that, to make the presentation more
compact. Obviously these parentheses should be put in
when writing the actual macro definitions.)

Re-writing the above just slightly,

#define MD(a,b,c) (a/c*b + b/c*(a%c) + Q( (a%c), (b%c), c) )

where Q(a,b,c) means "a*b/c", we have a nice recursive
formulation of the problem. So we would like to do again
something like MD (through a succession of Q0, Q1, Q2, ...)
to get more exact answers. Unfortunately using the same
identity again doesn't help, because (a%c) and (b%c) are
both less than c, so the first two terms of MD (if used
in Q) would be zero.

Suppose we consider an actual example. Suppose we have
a = 186737000, b = 116080197, c = 1430650345. We're looking
to get a*b/c = 15151478. Choose K to be (2**32-1)/a. We
have the rough identity

a*b/c ~~= a*K/c * (b/K)

(because the K's cancel, and we've divided by c before
multiplying, but that's just part of what makes it
an approximation). If we calculate, we get

186737000 * (4294967295 / 186737000) / 1430650345
* (116080197 / (4294967295 / 186737000))
== 15140895

which is a very nice approximation! (Of course, we
got lucky with this one, but we'll deal with that
shortly.) Notice the nice thing about the approximation
formula -- both 'a*K/c' and 'b/K' can be calculated without
fear of overflow in 32 bits. Because K == (2**32-1)/a,
a*K will always fit in 32 bits. So this term can be
computed without ever overflowing (and hence also the
product 'a*K/c * (b/K)' as long as we know a*b/c fits
in 32 bits).

Here is a more fleshed out version of the approximation
given above

a*b/c ~= K*a /c *(b/K) + b%K *a /c + K*a %c *(b/K) /c

What's missing in this second version is the remainders.
Using the identity

(a+b)/c == a/c + b/c + (a%c + b%c)/c

we get an exact identity

a*b/c == K*a /c *(b/K) + b%K *a /c + K*a %c *(b/K) /c
+ (b%K *a %c + K*a %c *(b/K) %c) /c

We can use this identity as the basis of macro definitions
for a series of Q's. To illustrate,

#define QK0(K,a,b,c) ( \
K*a /c *(b/K) + b%K *a /c + Q1( K*a %c, b/K, c ) \
+ (b%K *a %c + R1( K*a %c, b/K, c ) /c \
)

#define Q1(a,b,c) QK1( (0xffffffffUL/a), a, b, c )

... etc ...

where Rn(a,b,c) means "a*b%c". We use a similar recursive
formulation for the Rn's (details left to the reader). The
series terminates when we've gotten the approximation close
enough, eg, perhaps

#define Q5(a,b,c) (a*b/c)
#define R5(a,b,c) (a*b%c)

Well, that's most of it. What's left is some protection
against problem conditions and some engineering. Here
are the main considerations:

1. Have to protect against dividing by a if a is 0.
If a is 0, a*b/c == 0, and a*b%c == 0. Use ?:.

2. It turns out the convergence is faster if we switch
the order of the arguments in the Qn/Rn macros.
Mathematically it doesn't matter, but practically
we want the calls to be

Qn( b/K, K*a %c, c )

and

Rn( b/K, K*a %c, c )

rather than how they are written above.

3. This formulation will give an exact answer when the
result is under about 20,000, provided we choose
the smaller of (a%c) and (b%c) as the first argument
to Q0. Use ?: to do that.

4. When c is large and a*b/c is large, this approach
doesn't quite converge fast enough. Fortunately
there is a simple approximation that works

a*b/c ~= (a/25)*b/(c/25)

Applying Q0 to these arguments yields an approximate
value to better than one part in 10,000. (This time
we want to use the larger of (a%c) and (b%c) as the
first argument).

5. Deciding which realm we are in (and so whether to
use the approximation explained in (4) can be done
by calculating an approximate answer. It's easy
to get an approximate answer that good to within
a factor of, oh, 25%, and that's easily good enough
to choose between case (4) and all the others.


Also: there is a more elaborate approach that produces an exact
answer for all values in the range desired (a,b,c <= 2000000000),
but that's more complicated so I thought I'd just give this one.
Folks who are mathematically minded may enjoy working out the
more elaborate approach using the sketched approach as a starting
point.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,776
Messages
2,569,602
Members
45,185
Latest member
GluceaReviews

Latest Threads

Top