# Optimizing pow() function

Discussion in 'C Programming' started by pozzugno@gmail.com, Apr 22, 2013.

1. ### Guest

I have a simple (16-bit) embedded platform and I have to make the following calculations.

unsigned int
func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
return (unsigned int)(pow((double)adc, exp) *
(double)pwr / pow((double)pt, exp));
}

I noticed this calculation takes too much time, most probably for pow() function call and double conversion.

I'm finding a way to simplify/optimize this function. Consider that adc and pt parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be 50-10000.

Any suggestion to avoid pow() function call?

, Apr 22, 2013

2. ### Eric SosmanGuest

On 4/22/2013 11:23 AM, wrote:
> I have a simple (16-bit) embedded platform and I have to make the following calculations.
>
> unsigned int
> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
> return (unsigned int)(pow((double)adc, exp) *
> (double)pwr / pow((double)pt, exp));
> }
>
> I noticed this calculation takes too much time, most probably for pow() function call and double conversion.
>
> I'm finding a way to simplify/optimize this function. Consider that adc and pt parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be 50-10000.
>
> Any suggestion to avoid pow() function call?

Simple enough to avoid half of them:

return pwr * pow( (double)adc / (double)pt, exp);

This is "algebraically equivalent" to the original, although
it probably won't give exactly the same result every time.

--
Eric Sosman
d

Eric Sosman, Apr 22, 2013

3. ### Fred KGuest

On Monday, April 22, 2013 11:08:24 AM UTC-7, Rich Webb wrote:
> On Mon, 22 Apr 2013 08:23:51 -0700 (PDT), wrote: >I have a simple (16-bit) embedded platform and I have to make the following calculations. > >unsigned int >func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) { > return (unsigned int)(pow((double)adc, exp) * > (double)pwr / pow((double)pt, exp)); >} > >I noticed this calculation takestoo much time, most probably for pow() function call and double conversion.. > >I'm finding a way to simplify/optimize this function. Consider that adc and pt parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be 50-10000. > >Any suggestion to avoid pow() function call? You might try asking over in comp.arch.embedded, where this kind of an issue is more often addressed than here in a general C group. Depending on whatyou do with the result of func(), you might be able to get away with operating on the logarithms, applying Jack Crenshaw's bitlog() function as a fast approximation.

If pt is indeed in the range 0...1023, you have to handle the extreme case where pt=0 somehow, since the result is infinity, unless adc is also zero, in which case the result is indeterminant.
--
Fred K

Fred K, Apr 22, 2013
4. ### Edward A. FalkGuest

In article <>,
<> wrote:
>I have a simple (16-bit) embedded platform and I have to make the following calculations.
>
>unsigned int
>func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
> return (unsigned int)(pow((double)adc, exp) *
> (double)pwr / pow((double)pt, exp));
>}
>
>I noticed this calculation takes too much time, most probably for pow() function call
>and double conversion.
>
>I'm finding a way to simplify/optimize this function. Consider that adc and pt
>parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be
>50-10000.

Are you not worried about overflow?

Your embedded processor almost certainly does not have a floating-point
unit. This explains why your code is so slow.

I see all values are guaranteed to be integer values.

If you're not worried about overflow in the intermediate expressions,
or you can simplify your expression to reduce the risk, then consider
writing your own "ipow()" function.

--
-Ed Falk,
http://thespamdiaries.blogspot.com/

Edward A. Falk, Apr 22, 2013
5. ### James KuyperGuest

On 04/22/2013 02:34 PM, Edward A. Falk wrote:
> In article <>,
> <> wrote:
>> I have a simple (16-bit) embedded platform and I have to make the following calculations.
>>
>> unsigned int
>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>> return (unsigned int)(pow((double)adc, exp) *
>> (double)pwr / pow((double)pt, exp));
>> }
>>
>> I noticed this calculation takes too much time, most probably for pow() function call
>> and double conversion.
>>
>> I'm finding a way to simplify/optimize this function. Consider that adc and pt
>> parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be
>> 50-10000.

>
> Are you not worried about overflow?
>
> Your embedded processor almost certainly does not have a floating-point
> unit. This explains why your code is so slow.
>
> I see all values are guaranteed to be integer values.

The variable exp is double, and he didn't say that only integral values
of exp are allowed. If that is the case, he can speed up the code a lot
by using multiplications, rather than calls to pow().

Note: 'exp' is a bad name for a variable, given that it hides the
<math.h> function of the same name.

James Kuyper, Apr 22, 2013
6. ### BartCGuest

<> wrote in message
news:...
> I have a simple (16-bit) embedded platform and I have to make the
> following calculations.
>
> unsigned int
> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
> return (unsigned int)(pow((double)adc, exp) *
> (double)pwr / pow((double)pt, exp));
> }
>
> I noticed this calculation takes too much time, most probably for pow()
> function call and double conversion.
>
> I'm finding a way to simplify/optimize this function. Consider that adc
> and pt parameters are in the range 0..1023, exp is in the range 1.00-3.00,
> and pwr can be 50-10000.
>
> Any suggestion to avoid pow() function call?

Is there a version of pow() that uses float instead of double? That might be
a bit faster if floating point has to be done in software.

Can exp be anything from 1.00 to 3.00? How often will it be 1.0, 2.0, and
3.0? (These can be optimised.) Will it also only be specified to two
decimals? That suggests only 201 different values, although making use of a
200K lookup table indexed by adc/pt and 100*exp is probably not that
practical, and the results will be approximate.

Are the same adc/pt and exp combinations likely to recur frequently? It
might be possible to cache the results of the calculations (in a table a lot
smaller than 200K entries), but execution time might become non-linear. If
the combinations are predictable, then the table could be precalculated (it
all depends how this func() function is used).

--
Bartc

BartC, Apr 22, 2013
7. ### James KuyperGuest

On 04/22/2013 03:09 PM, BartC wrote:
....
> Is there a version of pow() that uses float instead of double? ...

C99 introduced powf(), which takes float arguments and returns a float
value.

James Kuyper, Apr 22, 2013
8. ### Malcolm McLeanGuest

On Monday, April 22, 2013 8:09:31 PM UTC+1, Bart wrote:
> <> wrote in message
>
>
> Can exp be anything from 1.00 to 3.00? How often will it be 1.0, 2.0, and
> 3.0? (These can be optimised.) Will it also only be specified to two
> decimals? That suggests only 201 different values, although making use of a
> 200K lookup table indexed by adc/pt and 100*exp is probably not that
> practical, and the results will be approximate.
>

pow(x, y) = exp(y * log(x)) for positive values.

the expensive part is working out log x. But if you can look it up, you need
far fewer than 200k entries.

--
You can read about how to implement the math library in my book Basic Algorithms
http://www.malcolmmclean.site11.com/www

Malcolm McLean, Apr 22, 2013
9. ### Keith ThompsonGuest

James Kuyper <> writes:
> On 04/22/2013 03:09 PM, BartC wrote:
> ...
>> Is there a version of pow() that uses float instead of double? ...

>
> C99 introduced powf(), which takes float arguments and returns a float
> value.

But I wouldn't count on it being significantly faster than pow().

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Apr 22, 2013
10. ### BartCGuest

"Keith Thompson" <> wrote in message
news:...
> James Kuyper <> writes:
>> On 04/22/2013 03:09 PM, BartC wrote:
>> ...
>>> Is there a version of pow() that uses float instead of double? ...

>>
>> C99 introduced powf(), which takes float arguments and returns a float
>> value.

>
> But I wouldn't count on it being significantly faster than pow().

Not on a desktop PC with floating point hardware. But the OP is using a
'simple' 16-bit processor.

--
Bartc

BartC, Apr 22, 2013
11. ### glen herrmannsfeldtGuest

James Kuyper <> wrote:
> On 04/22/2013 02:34 PM, Edward A. Falk wrote:
>> <> wrote:
>>> I have a simple (16-bit) embedded platform and I have to
>>> make the following calculations.

>>> unsigned int
>>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>>> return (unsigned int)(pow((double)adc, exp) *
>>> (double)pwr / pow((double)pt, exp));

>>> I noticed this calculation takes too much time, most probably
>>> for pow() function call and double conversion.

>>> I'm finding a way to simplify/optimize this function. Consider that adc and pt
>>> parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be
>>> 50-10000.

>> Are you not worried about overflow?

>> Your embedded processor almost certainly does not have a floating-point
>> unit. This explains why your code is so slow.

>> I see all values are guaranteed to be integer values.

> The variable exp is double, and he didn't say that only integral values
> of exp are allowed. If that is the case, he can speed up the code a lot
> by using multiplications, rather than calls to pow().

If only a small number of different exp values occur, a look-up
table would be faster than pow. If I want an int result, I try to do
the whole calculation in integer arithmetic, even if it needs some
shifts to get the binary point in the right place.

Routines for fixed point scaled exp, log, and I believe pow are
in Knuth's "Metafont: The Program."

With only 10 bit input, I presume you don't need 53 bits of double.
With a little work doing scaling, it can all be done in fixed point
arithmetic. (Except that pow comes in as double, but either generate
the appropriate scaled fixed point value or send it in that way.)

-- glen

-- glen

glen herrmannsfeldt, Apr 22, 2013
12. ### Keith ThompsonGuest

"BartC" <> writes:
> "Keith Thompson" <> wrote in message
> news:...
>> James Kuyper <> writes:
>>> On 04/22/2013 03:09 PM, BartC wrote:
>>> ...
>>>> Is there a version of pow() that uses float instead of double? ...
>>>
>>> C99 introduced powf(), which takes float arguments and returns a float
>>> value.

>>
>> But I wouldn't count on it being significantly faster than pow().

>
> Not on a desktop PC with floating point hardware. But the OP is using
> a 'simple' 16-bit processor.

I still wouldn't *count* on it being significantly faster. If it
matters, measure it.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Apr 22, 2013
13. ### glen herrmannsfeldtGuest

Keith Thompson <> wrote:
> James Kuyper <> writes:
>> On 04/22/2013 03:09 PM, BartC wrote:

>>> Is there a version of pow() that uses float instead of double? ...

>> C99 introduced powf(), which takes float arguments and
>> returns a float value.

> But I wouldn't count on it being significantly faster than pow().

In a software only implementation, one might hope that it is
faster, but yes I wouldn't be surprised if it wasn't. It should
use logf() and expf(), which again might not be faster.

-- glen

glen herrmannsfeldt, Apr 23, 2013
14. ### pozzGuest

Il 22/04/2013 20:22, Fred K ha scritto:
> On Monday, April 22, 2013 11:08:24 AM UTC-7, Rich Webb wrote:
>> On Mon, 22 Apr 2013 08:23:51 -0700 (PDT), wrote: >I have a simple (16-bit) embedded platform and I have to make the following calculations. > >unsigned int >func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) { > return (unsigned int)(pow((double)adc, exp) * > (double)pwr / pow((double)pt, exp)); >} > >I noticed this calculation takes too much time, most probably for pow() function call and double conversion. > >I'm finding a way to simplify/optimize this function. Consider that adc and pt parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be 50-10000. > >Any suggestion to avoid pow() function call? You might try asking over in comp.arch.embedded, where this kind of an issue is more often addressed than here in a general C group. Depending on what you do with the result of func(), you might be able to get away with operating on the logarithms, applying Jack Crenshaw's bitlog() function as a fast approximation.

>
> If pt is indeed in the range 0...1023, you have to handle the extreme case where pt=0 somehow, since the result is infinity, unless adc is also zero, in which case the result is indeterminant.
>

I have to make another restriction. pt and adc parameters could be in
the range 0..1023, but I'm using pt=800.
At the moment I don't need other values for pt, but I'm interested to
generalize this calculation with different values for pt in a small
range around 800 (for example, 700-900).

pozz, Apr 23, 2013
15. ### pozzGuest

Il 22/04/2013 17:42, Eric Sosman ha scritto:
> On 4/22/2013 11:23 AM, wrote:
>> I have a simple (16-bit) embedded platform and I have to make the
>> following calculations.
>>
>> unsigned int
>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>> return (unsigned int)(pow((double)adc, exp) *
>> (double)pwr / pow((double)pt, exp));
>> }
>>
>> I noticed this calculation takes too much time, most probably for
>> pow() function call and double conversion.
>>
>> I'm finding a way to simplify/optimize this function. Consider that
>> adc and pt parameters are in the range 0..1023, exp is in the range
>> 1.00-3.00, and pwr can be 50-10000.
>>
>> Any suggestion to avoid pow() function call?

>
> Simple enough to avoid half of them:
>
> return pwr * pow( (double)adc / (double)pt, exp);
>
> This is "algebraically equivalent" to the original, although
> it probably won't give exactly the same result every time.
>

Sure, I have halfed the time in this way, but I'd like to reduce the
process time more, if possible.

pozz, Apr 23, 2013
16. ### pozzGuest

Il 22/04/2013 20:49, James Kuyper ha scritto:
> On 04/22/2013 02:34 PM, Edward A. Falk wrote:
>> In article <>,
>> <> wrote:
>>> I have a simple (16-bit) embedded platform and I have to make the following calculations.
>>>
>>> unsigned int
>>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>>> return (unsigned int)(pow((double)adc, exp) *
>>> (double)pwr / pow((double)pt, exp));
>>> }
>>>
>>> I noticed this calculation takes too much time, most probably for pow() function call
>>> and double conversion.
>>>
>>> I'm finding a way to simplify/optimize this function. Consider that adc and pt
>>> parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be
>>> 50-10000.

>>
>> Are you not worried about overflow?
>>
>> Your embedded processor almost certainly does not have a floating-point
>> unit. This explains why your code is so slow.
>>
>> I see all values are guaranteed to be integer values.

>
> The variable exp is double, and he didn't say that only integral values
> of exp are allowed. If that is the case, he can speed up the code a lot
> by using multiplications, rather than calls to pow().

exp can be in the range 1.00 to 3.00 (it derives from an integer value
in the range 100-300 divided by 100).

> Note: 'exp' is a bad name for a variable, given that it hides the
> <math.h> function of the same name.
>

You are right, thank you for this observation.

pozz, Apr 23, 2013
17. ### pozzGuest

Il 22/04/2013 21:09, BartC ha scritto:
> <> wrote in message
> news:...
>> I have a simple (16-bit) embedded platform and I have to make the
>> following calculations.
>>
>> unsigned int
>> func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
>> return (unsigned int)(pow((double)adc, exp) *
>> (double)pwr / pow((double)pt, exp));
>> }
>>
>> I noticed this calculation takes too much time, most probably for pow()
>> function call and double conversion.
>>
>> I'm finding a way to simplify/optimize this function. Consider that adc
>> and pt parameters are in the range 0..1023, exp is in the range
>> 1.00-3.00,
>> and pwr can be 50-10000.
>>
>> Any suggestion to avoid pow() function call?

>
> Is there a version of pow() that uses float instead of double? That
> might be
> a bit faster if floating point has to be done in software.

I will check.

> Can exp be anything from 1.00 to 3.00?

Yes, it is a sort of "fine calibration".

> How often will it be 1.0, 2.0, and
> 3.0? (These can be optimised.)

The same occurence of 1.03 or 2.43. Anyway, after exp value is
calibrated on the field during first stages of installation/setup, it
remains always the same.

> Will it also only be specified to two
> decimals? That suggests only 201 different values, although making use of a
> 200K lookup table indexed by adc/pt and 100*exp is probably not that
> practical, and the results will be approximate.

I already thought on this approach, but the look-up tables will be too big.

> Are the same adc/pt and exp combinations likely to recur frequently?It
> might be possible to cache the results of the calculations (in a table a
> lot
> smaller than 200K entries), but execution time might become non-linear.

In "real-time" systems, I don't like to have tasks that takes a
"non-linear" time to execute.

> If
> the combinations are predictable, then the table could be precalculated
> (it all depends how this func() function is used).
>

The exp parameter is a calibration value that is defined during first
stages. After that, it will remain always the same value.
I can pre-calculate the look-up table for that particular value. I need
a table of adc/pt combinations length (1023 integers, considering a
fixed pt value), so 2046 bytes... I don't have this space in RAM :-(

pozz, Apr 23, 2013
18. ### Francois GrieuGuest

Pozz wrote:

> I have a simple (16-bit) embedded platform and I have to make
> the following calculations.

unsigned int
func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
return (unsigned int)(pow((double)adc, exp) *
(double)pwr / pow((double)pt, exp));
}

> adc and pt parameters are in the range 0..1023,
> exp is in the range 1.00-3.00, and
> pwr can be 50-10000.

> I have to make another restriction. pt and adc parameters could be
> in the range 0..1023, but I'm using pt=800.
> At the moment I don't need other values for pt, but I'm interested to
> generalize this calculation with different values
> for pt in a small range around 800 (for example, 700-900).

Could you state what in the context is constant, or could be subject to
one-time precomputation, and what is variable (like, I guess, ADC?)

Also: does your environment has powf, which is to float what pow
is to double? It might be much faster.

If pt is constant we can save a little with

#define pt 800

unsigned func(unsigned adc, unsigned pwr, double exp) {
return pwr * pow( (1./pt) * adc, exp );
}

Similarly, if pt is a runtime parameter that seldom vary, we can do

double ginvpt;

// call when pt changes
void reparameterize( unsigned pt ) {
ginvpt = 1./pt;
}

unsigned func(unsigned adc, unsigned pwr, double exp) {
return pwr * pow( ginvpt * adc, exp );
}

We would do significantly better, including getting rid of pow, and perhaps
of double or float, if exp is constant (or seldom vary); and avoid the
final multiplication, and I guess double altogether, if pwr is constant
(or seldom vary).

Francois Grieu

Francois Grieu, Apr 23, 2013
19. ### Guest

On Monday, April 22, 2013 5:23:51 PM UTC+2, wrote:
> Any suggestion to avoid pow() function call?

Ok, it's better to explain what I'm trying to do.

I have a 10-bit ADC that acquires a voltage signal and convert it to a decimal number 0..1023.

This signal measures a power (Watt). The relation between voltage and poweris theorically quadratic:

P = k * x ^ 2

where k is a constant that depends on the system and x is the ADC result.

In order to calibrate k, I measure with an instrument the power at a nominal/reference value, for example I have Pr power measured ad Vr voltage that xr ADC points. So I can write:

P = Pr * (x / xr) ^ 2

Usually I make the reference measurement at xr=800pt.

The quadratic law above is simple to implement, but in practice I have to fine-tune this law, because at low power levels the voltage/power relation seems to deviate from the quadratic law (I don't know exactly why, maybe some non-linear behaviour of some components). So I changed the formula with:

P = Pr * (x / xr) ^ a

where a is in the range 1-3.

As you can see, after calibration/setup/installation/test steps I finally have three constants: Pr, xr and a. For simplicity xr could be considered constant at compile time (800pt). Pr and a is machine dependent.

So my problem is to calculate:

p = (x / 800) ^ a = exp(a * ln(x / 800))

I already tried to create a look-up table for ln(x) with x in the range 0.001 (about 1 / 800) to 1.3 (about 1023 / 800), leaving only the exp() function call. It works on desktop computer, but I couldn't find free space for the table (about 2600 bytes) in the embedded platform, both in RAM and in Flash.

The best would be to have a simple formula (polynomial function?) that approximate well enough the above equation at run-time, without too big tables.

, Apr 23, 2013
20. ### BartCGuest

"glen herrmannsfeldt" <> wrote in message
news:kl4r0b\$64f\$...
> Keith Thompson <> wrote:
>> James Kuyper <> writes:
>>> On 04/22/2013 03:09 PM, BartC wrote:

>
>>>> Is there a version of pow() that uses float instead of double? ...

>
>>> C99 introduced powf(), which takes float arguments and
>>> returns a float value.

>
>> But I wouldn't count on it being significantly faster than pow().

>
> In a software only implementation, one might hope that it is
> faster, but yes I wouldn't be surprised if it wasn't. It should
> use logf() and expf(), which again might not be faster.

For what reasons wouldn't it be faster to emulate only 32-bit floating point
arithmetic instead of 64-bits, on a 16-bit processor?

--
Bartc

BartC, Apr 23, 2013