# Optimizing pow() function


#### pozzugno

I have a simple (16-bit) embedded platform and I have to make the following calculations.

unsigned int
func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
    return (unsigned int)(pow((double)adc, exp) * (double)pwr / pow((double)pt, exp));
}

I noticed this calculation takes too much time, most probably because of the pow() function calls and the double conversions.

I'm looking for a way to simplify/optimize this function. Note that the adc and pt parameters are in the range 0..1023, exp is in the range 1.00-3.00, and pwr can be 50-10000.

Any suggestions for avoiding the pow() function call?


#### Eric Sosman

Simple enough to avoid half of them:

return pwr * pow( (double)adc / (double)pt, exp);

This is "algebraically equivalent" to the original, although
it probably won't give exactly the same result every time.


#### Fred K

If pt is indeed in the range 0..1023, you have to handle the extreme case pt=0 somehow, since the result is then infinite, unless adc is also zero, in which case the result is indeterminate.


#### Edward A. Falk

Are you not worried about overflow?

Your embedded processor almost certainly does not have a floating-point
unit. This explains why your code is so slow.

I see all values are guaranteed to be integer values.

If you're not worried about overflow in the intermediate expressions,
or you can simplify your expression to reduce the risk, then consider


#### James Kuyper

Edward A. Falk said:
Are you not worried about overflow?

Your embedded processor almost certainly does not have a floating-point
unit. This explains why your code is so slow.

I see all values are guaranteed to be integer values.

The variable exp is double, and he didn't say that only integral values
of exp are allowed. If that is the case, he can speed up the code a lot
by using multiplications, rather than calls to pow().

Note: 'exp' is a bad name for a variable, given that it hides the
<math.h> function of the same name.
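
If the exponent did turn out to be a small integer, the multiplication approach could look like the sketch below. The helper name `pow_uint` is hypothetical, and this only applies to integral exponents.

```c
/* Sketch: base^n for a small non-negative integer n, by squaring and multiplying. */
double pow_uint(double base, unsigned int n)
{
    double result = 1.0;

    while (n > 0) {
        if (n & 1u)
            result *= base;   /* this bit of n is set: include the current power */
        base *= base;         /* base, base^2, base^4, ... */
        n >>= 1;
    }
    return result;
}
```

For an exponent of 2 or 3 this costs two or three multiplications instead of a pow() call.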


#### BartC

Is there a version of pow() that uses float instead of double? That might be
a bit faster if floating point has to be done in software.

Can exp be anything from 1.00 to 3.00? How often will it be 1.0, 2.0, and
3.0? (These can be optimised.) Will it also only be specified to two
decimals? That suggests only 201 different values, although making use of a
200K lookup table indexed by adc/pt and 100*exp is probably not that
practical, and the results will be approximate.

Are the same adc/pt and exp combinations likely to recur frequently? It
might be possible to cache the results of the calculations (in a table a lot
smaller than 200K entries), but execution time might become non-linear. If
the combinations are predictable, then the table could be precalculated (it
all depends how this func() function is used).


#### James Kuyper

On 04/22/2013 03:09 PM, BartC wrote:
....
Is there a version of pow() that uses float instead of double? ...

C99 introduced powf(), which takes float arguments and returns a float
value.


#### Malcolm McLean

BartC said:
Can exp be anything from 1.00 to 3.00? How often will it be 1.0, 2.0, and
3.0? (These can be optimised.) Will it also only be specified to two
decimals? That suggests only 201 different values, although making use of a
200K lookup table indexed by adc/pt and 100*exp is probably not that
practical, and the results will be approximate.

pow(x, y) = exp(y * log(x)) for positive x.

The expensive part is working out log(x). But if you can look it up, you need
far fewer than 200K entries.


#### Keith Thompson

James Kuyper said:
On 04/22/2013 03:09 PM, BartC wrote:
...

C99 introduced powf(), which takes float arguments and returns a float
value.

But I wouldn't count on it being significantly faster than pow().


#### BartC

Keith Thompson said:
But I wouldn't count on it being significantly faster than pow().

Not on a desktop PC with floating point hardware. But the OP is using a
'simple' 16-bit processor.


#### glen herrmannsfeldt

James Kuyper said:
The variable exp is double, and he didn't say that only integral values
of exp are allowed. If that is the case, he can speed up the code a lot
by using multiplications, rather than calls to pow().

If only a small number of different exp values occur, a look-up
table would be faster than pow. If I want an int result, I try to do
the whole calculation in integer arithmetic, even if it needs some
shifts to get the binary point in the right place.

Routines for fixed point scaled exp, log, and I believe pow are
in Knuth's "Metafont: The Program."

With only 10-bit input, I presume you don't need the 53 bits of double.
With a little work doing scaling, it can all be done in fixed point
arithmetic. (Except that exp comes in as double, but either generate
the appropriate scaled fixed point value or send it in that way.)

-- glen


#### Keith Thompson

BartC said:
Not on a desktop PC with floating point hardware. But the OP is using
a 'simple' 16-bit processor.

I still wouldn't *count* on it being significantly faster. If it
matters, measure it.


#### glen herrmannsfeldt

Keith Thompson said:
But I wouldn't count on it being significantly faster than pow().

In a software only implementation, one might hope that it is
faster, but yes I wouldn't be surprised if it wasn't. It should
use logf() and expf(), which again might not be faster.

-- glen


#### pozz

On 22/04/2013 20:22, Fred K wrote:
If pt is indeed in the range 0..1023, you have to handle the extreme case pt=0 somehow, since the result is then infinite, unless adc is also zero, in which case the result is indeterminate.

I have to add another restriction. The pt and adc parameters can be in
the range 0..1023, but I'm using pt=800.
At the moment I don't need other values for pt, but I'm interested in
generalizing this calculation to different values of pt in a small
range around 800 (for example, 700-900).


#### pozz

On 22/04/2013 17:42, Eric Sosman wrote:
Simple enough to avoid half of them:

return pwr * pow( (double)adc / (double)pt, exp);

This is "algebraically equivalent" to the original, although
it probably won't give exactly the same result every time.

Sure, I have halved the time this way, but I'd like to reduce the
processing time further, if possible.


#### pozz

On 22/04/2013 20:49, James Kuyper wrote:
The variable exp is double, and he didn't say that only integral values
of exp are allowed. If that is the case, he can speed up the code a lot
by using multiplications, rather than calls to pow().

exp can be in the range 1.00 to 3.00 (it derives from an integer value
in the range 100-300 divided by 100).

Note: 'exp' is a bad name for a variable, given that it hides the
<math.h> function of the same name.

You are right, thank you for this observation.


#### pozz

On 22/04/2013 21:09, BartC wrote:
Is there a version of pow() that uses float instead of double? That
might be a bit faster if floating point has to be done in software.

I will check.

Can exp be anything from 1.00 to 3.00?

Yes, it is a sort of "fine calibration".

How often will it be 1.0, 2.0, and
3.0? (These can be optimised.)

No more often than 1.03 or 2.43. Anyway, once the exp value has been
calibrated in the field during the first stages of installation/setup, it
always remains the same.

Will it also only be specified to two
decimals? That suggests only 201 different values, although making use of a
200K lookup table indexed by adc/pt and 100*exp is probably not that
practical, and the results will be approximate.

I already thought about this approach, but the look-up tables would be too big.

Are the same adc/pt and exp combinations likely to recur frequently? It
might be possible to cache the results of the calculations (in a table a
lot smaller than 200K entries), but execution time might become non-linear.

In "real-time" systems, I don't like having tasks that take a
"non-linear" time to execute.

If the combinations are predictable, then the table could be precalculated
(it all depends how this func() function is used).

The exp parameter is a calibration value that is defined during the first
stages. After that, it always keeps the same value.
I can pre-calculate the look-up table for that particular value. I need
a table covering the adc/pt combinations (1023 integers, assuming a
fixed pt value), so 2046 bytes... I don't have that much space in RAM :-(


#### Francois Grieu

Pozz said:
I have a simple (16-bit) embedded platform and I have to make
the following calculations.

unsigned int
func(unsigned int adc, unsigned int pwr, unsigned int pt, double exp) {
    return (unsigned int)(pow((double)adc, exp) * (double)pwr / pow((double)pt, exp));
}
adc and pt parameters are in the range 0..1023,
exp is in the range 1.00-3.00, and
pwr can be 50-10000.
I have to make another restriction. pt and adc parameters could be
in the range 0..1023, but I'm using pt=800.
At the moment I don't need other values for pt, but I'm interested to
generalize this calculation with different values
for pt in a small range around 800 (for example, 700-900).

Could you state what in the context is constant, or could be subject to
one-time precomputation, and what is variable (like, I guess, ADC?)

Also: does your environment have powf, which is to float what pow
is to double? It might be much faster.

If pt is constant we can save a little with

#define pt 800

unsigned func(unsigned adc, unsigned pwr, double exp) {
    return pwr * pow( (1./pt) * adc, exp );
}

Similarly, if pt is a runtime parameter that seldom varies, we can do

double ginvpt;

// call when pt changes
void reparameterize( unsigned pt ) {
    ginvpt = 1./pt;
}

unsigned func(unsigned adc, unsigned pwr, double exp) {
    return pwr * pow( ginvpt * adc, exp );
}

We would do significantly better, including getting rid of pow, and perhaps
of double or float, if exp is constant (or seldom varies); and we could avoid
the final multiplication, and I guess double altogether, if pwr is constant
(or seldom varies).

Francois Grieu


#### pozzugno

Any suggestion to avoid pow() function call?

OK, I'd better explain what I'm trying to do.

I have a 10-bit ADC that acquires a voltage signal and converts it to an integer in the range 0..1023.

This signal measures a power (in watts). The relation between voltage and power is theoretically quadratic:

P = k * x ^ 2

where k is a constant that depends on the system and x is the ADC result.

To calibrate k, I measure the power with an instrument at a nominal/reference value: for example, I measure power Pr at voltage Vr, which corresponds to xr ADC points. So I can write:

P = Pr * (x / xr) ^ 2

Usually I make the reference measurement at xr=800pt.

The quadratic law above is simple to implement, but in practice I have to fine-tune it, because at low power levels the voltage/power relation seems to deviate from the quadratic law (I don't know exactly why, maybe some non-linear behaviour of some components). So I changed the formula to:

P = Pr * (x / xr) ^ a

where a is in the range 1-3.

As you can see, after the calibration/setup/installation/test steps I end up with three constants: Pr, xr and a. For simplicity, xr can be considered a compile-time constant (800pt). Pr and a are machine dependent.

So my problem is to calculate:

p = (x / 800) ^ a = exp(a * ln(x / 800))

I already tried to create a look-up table for ln(x) with x in the range 0.001 (about 1 / 800) to 1.3 (about 1023 / 800), leaving only the exp() function call. It works on a desktop computer, but I couldn't find free space for the table (about 2600 bytes) on the embedded platform, either in RAM or in Flash.

The best would be a simple formula (a polynomial function?) that approximates the above equation well enough at run time, without overly large tables.


#### BartC

glen herrmannsfeldt said:
In a software only implementation, one might hope that it is
faster, but yes I wouldn't be surprised if it wasn't. It should
use logf() and expf(), which again might not be faster.

Why wouldn't it be faster to emulate 32-bit floating-point arithmetic
rather than 64-bit, on a 16-bit processor?