float algorithm is slow

Wenfei · Jul 5, 2005

float percentage;

for (j = 0; j < 10000000; j++)
{
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

Thanks,

Wenfei

E. Robert Tisdale · Jul 5, 2005

Wenfei said:
float percentage;

for (j = 0; j < 10000000; j++) {
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

> cat main.c

#include <stdlib.h>
#include <math.h>

int main(int argc, char* argv[]) {

const
size_t n = 10000000;
float buffer[n];
size_t totalBytes = 0;
const
float_t frequency = 1.0;
const
float_t sampleFreq = 1.0;
const
float_t pi = 3.14159265358979323846;
const
float_t volume = 1.0;

for (size_t j = 0; j < n; ++j) {
float_t percentage = sinf(frequency*j*2*pi/sampleFreq);
buffer[totalBytes] =ceilf(volume*percentage) + volume;
totalBytes++;
}

return 0;
}

> gcc -Wall -std=c99 -pedantic -O2 -o main main.c -lm
> time ./main

3.694u 0.258s 0:03.92 100.5% 0+0k 0+0io 0pf+0w

Eric Sosman · Jul 5, 2005

Wenfei said:
float percentage;

for (j = 0; j < 10000000; j++)
{
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

Yes: Change the iteration count from 10000000 to 0, and
the code will almost certainly run faster.

In other words, micro-benchmarks of this sort are not
very informative. What are you really trying to do?

Gordon Burditt · Jul 5, 2005

float percentage;

for (j = 0; j < 10000000; j++)
{
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable,

My guess is that if you GOT RID OF the float variable, it would
take about the same time:

for (j = 0; j < 10000000; j++)

{
buffer[totalBytes] =ceilf(volume * sinf(frequency * j * 2 * 3.14159 / sampleFreq )) + volume;
totalBytes++;
}

the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

You haven't demonstrated why taking two seconds is a problem yet.
Cut down the number of iterations? Get a faster machine?
Doing the calculation in double might make it faster (although on
Intel *86 it probably won't).

Gordon L. Burditt

Anonymous 7843 · Jul 5, 2005

float percentage;

for (j = 0; j < 10000000; j++)
{
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

* Use a small table of sine values instead of sinf() function.
It appears that you're downsampling to the width of a char
anyway, so table of somewhat more than 256 sine values probably
won't make things much worse.

* Or just map out one complete waveform and just repeatedly copy that
throughout the rest of the buffer.

* Or Just make one copy of the waveform and index into that as
appropriate when you need the results.

Robert Gamble · Jul 5, 2005

Wenfei said:
float percentage;

for (j = 0; j < 10000000; j++)
{
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

<OT>
It is unlikely that using double will be any faster.

On my system, 66% of the time is spent in the first statement inside
the loop, 27% on the second. The sinf and ceilf function calls are by
far the most intensive parts of these statements. If your system is
similiar, you might try replacing the sinf call with a lookup table of
precalculated values with a tradeoff of accuracy and memory usage. You
could also use a macro of inline function in place of ceilf.

You didn't specify how much faster you need it, what tradeoffs are
feasible, what you have tried already, or even what exactly you are
trying to accomplish. Perhaps the algorithm itself could be improved
but it is difficult to attempt to do so without knowing what the
specifications are.
</OT>

Robert Gamble

Erik de Castro Lopo · Jul 5, 2005

Wenfei said:
float percentage;

for (j = 0; j < 10000000; j++)
{
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

I assume your linux machine has an x86 based processor. If thats the
case then you problem is probably the ceilf() function. The implementation
of that function requires that the FPU's control word to be modified at
the start of ceilf() and restored afterwards. Every time the FPU control
word is modified, it causes a stall in the FPU pipeline.

You might try replacing ceilf() with the C99 function lrintf().
Unfortunately lrintf() is a round function, not a ceil function so you
might need to replace

ceilf (x)

with

lrintf (x + 0.5).

Also have a look at this paper:

http://www.mega-nerd.com/FPcast/

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo (e-mail address removed) (Yes it's valid)
+-----------------------------------------------------------+
"It has been discovered that C++ provides a remarkable facility
for concealing the trival details of a program -- such as where
its bugs are." -- David Keppel

Anonymous 7843 · Jul 5, 2005

buffer[totalBytes] =ceilf(volume * percentage) + volume;

Sorry for posting twice to the same thread, but ceilf() may not be
strictly really necessary. If buffer is an array of some kind of
integer type then the assignment is going to do a conversion from float
to int anyway. Unless you have some mathematically rigorous standard
to uphold, you might get a "close enough" result by rounding up. E.g.

buffer[totalBytes] = (volume * percentage + 0.5) + volume;

Christian Bau · Jul 5, 2005

"Wenfei said:
float percentage;

for (j = 0; j < 10000000; j++)
{
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

I wouldn't worry about the speed, because your results are garbage
anyway.

When j has the maximum value 1e7 - 1, what is the difference between j *
2 * 3.14159 and j * 2 * pi?

Once you've fixed this, just a hint (mathematics is fun): Consecutive
terms of any function of the form

f (n) = a * sin (x * n + b) + c * cos (x * n + d)

can be calculated using one single multiplication and one single
addition.

Lawrence Kirby · Jul 6, 2005

On Tue, 05 Jul 2005 22:57:23 +0100, Christian Bau wrote:

....

Once you've fixed this, just a hint (mathematics is fun): Consecutive
terms of any function of the form

f (n) = a * sin (x * n + b) + c * cos (x * n + d)

can be calculated using one single multiplication and one single
addition.

But you need to be careful about accumulation of errors.

Lawrence

Lawrence Kirby · Jul 6, 2005

You may gain a little by precalculating the loop invariant
frequency * 2 * 3.14159 / sampleFreq outside the loop.

buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

Click to expand...

* Use a small table of sine values instead of sinf() function.
It appears that you're downsampling to the width of a char
anyway, so table of somewhat more than 256 sine values probably
won't make things much worse.

On of the problems is that the type of buffer is not specified

* Or just map out one complete waveform and just repeatedly copy that
throughout the rest of the buffer.

* Or Just make one copy of the waveform and index into that as
appropriate when you need the results.

These only work if the cycle length corresponds to an integral number of
positions in buffer. There's nothing to suggest that this is the case.

Lawrence

Anonymous 7843 · Jul 6, 2005

These only work if the cycle length corresponds to an integral number of
positions in buffer. There's nothing to suggest that this is the case.

Given the choice to use float, round pi off at the 6th digit and the
(likely) stuffing of results into a char it didn't seem like fidelity
was foremost.

One hopes that this project was just the first exercise in a DSP course
and that our intrepid correspondent Wenfei soon outlearns us.

Wenfei · Jul 8, 2005

I am converting Notes format to PCM format to play sounds on the ceil
phone. It is Linux os and it is not FPU, so it is slow.

But finally, I used a sinf lookup table, and the sound can be played
instantly.

Thanks for all your guys idea.

Wenfei

websnarf · Jul 10, 2005

Wenfei said:
float percentage;
for (j = 0; j < 10000000; j++) {
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

sinf() appears to be a Microsoft VC++ extension that computes sin() but
uses only 32-bit floats. Interesting that the zealots in this
newsgroup didn't call you on that.

Ok, anyhow remembering our trigonometry:

sin(a + b) = sin(a)*cos(b) + sin(b)cos(a)
cos(a + b) = cos(a)*cos(b) - sin(b)sin(a)

Now we can strength reduce the argument to sinf:

arg += (freqency*2*3.14159/sampleFreq);

But that value is a constant:

arg += deltaAngle; /* deltaAngle is precomputed as:
freqency*2*3.14159/sampleFreq */

So we can simplify this to:

s1 = percentage*cos(deltaAngle) + c*sin(deltaAngle);
c = c*cos(deltaAngle) + percentage*sin(deltaAngle);
percentage = s1;

And of course we can replace the cos(deltaAngle) and sin(deltaAngle)
with some precomputed values, and we initialize c to 1, and percentage
to 0.

This removes all the trigonometric functions altogether.

The problem is that it will have "accumulating accuracy" problems.
These problems are not trivial, because you will lose the sin^2+cos^2=1
property, which will start scaling your results (either upward or
downward) globally, which could get bad.

A simple way to mitigate this is to split the loop into groups of, say,
100 or 1000 at a time, and reset the parameters with their true
mathematical value with the raw sin/cos functions:

percentage = sinf (frequency * j * 2 * 3.14159 / sampleFreq);
c = cosf (frequency * j * 2 * 3.14159 / sampleFreq);

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

This takes time because you are calling sinf(), not because its
floating point.

CBFalconer · Jul 10, 2005

Wenfei said:
Wenfei said:

float percentage;
for (j = 0; j < 10000000; j++) {
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Click to expand...

sinf() appears to be a Microsoft VC++ extension that computes
sin() but uses only 32-bit floats. Interesting that the zealots
in this newsgroup didn't call you on that.

The usage is proper. The foolish comment is not. From N869:

7.12.4.6 The sin functions

Synopsis

[#1]
#include <math.h>
double sin(double x);
float sinf(float x);
long double sinl(long double x);

Description

[#2] The sin functions compute the sine of x (measured in
radians).

Returns

[#3] The sin functions return the sine value.

Netocrat · Jul 10, 2005

Wenfei said:
Wenfei said:

float percentage;
for (j = 0; j < 10000000; j++) {
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume; totalBytes++;
}
}

Click to expand...

sinf() appears to be a Microsoft VC++ extension that computes sin() but
uses only 32-bit floats. Interesting that the zealots in this newsgroup
didn't call you on that.

Ok, anyhow remembering our trigonometry:

sin(a + b) = sin(a)*cos(b) + sin(b)cos(a) cos(a + b) = cos(a)*cos(b) -
sin(b)sin(a)

Now we can strength reduce the argument to sinf:

arg += (freqency*2*3.14159/sampleFreq);

But that value is a constant:

arg += deltaAngle; /* deltaAngle is precomputed as:
freqency*2*3.14159/sampleFreq */

So we can simplify this to:

s1 = percentage*cos(deltaAngle) + c*sin(deltaAngle);
c = c*cos(deltaAngle) + percentage*sin(deltaAngle);

You meant to subtract there, not add, I believe.

percentage = s1;

And of course we can replace the cos(deltaAngle) and sin(deltaAngle) with
some precomputed values, and we initialize c to 1, and percentage to 0.

This removes all the trigonometric functions altogether.

Good suggestion. That would seem to be the major contributor to the
calculation time, although you'd have to compare the approaches to be
sure. It's theoretically possible that on a specially optimised machine
sin(j*deltaAngle) could be faster than your approach, but I very much
doubt that this is the case on the OP's cell phone...

The problem is that it will have "accumulating accuracy" problems. These
problems are not trivial, because you will lose the sin^2+cos^2=1
property, which will start scaling your results (either upward or
downward) globally, which could get bad.

A simple way to mitigate this is to split the loop into groups of, say,
100 or 1000 at a time, and reset the parameters with their true
mathematical value with the raw sin/cos functions:

percentage = sinf (frequency * j * 2 * 3.14159 / sampleFreq); c
= cosf (frequency * j * 2 * 3.14159 / sampleFreq);

Yes and you could tune the loop grouping number beforehand by looking at
how many cycles it takes before the values grow too inaccurate for
your needs.

<snip>

Christian Bau · Jul 10, 2005

Wenfei said:
Wenfei said:

float percentage;
for (j = 0; j < 10000000; j++) {
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Click to expand...

sinf() appears to be a Microsoft VC++ extension that computes sin() but
uses only 32-bit floats. Interesting that the zealots in this
newsgroup didn't call you on that.

That's no surprise because it is part of C99.

P.J. Plauger · Jul 10, 2005

Wenfei said:
Wenfei said:

float percentage;
for (j = 0; j < 10000000; j++) {
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Click to expand...

sinf() appears to be a Microsoft VC++ extension that computes sin() but
uses only 32-bit floats. Interesting that the zealots in this
newsgroup didn't call you on that.

Click to expand...

That's no surprise because it is part of C99.

It's even defined, but optional, in C89.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Wenfei · Jul 11, 2005

I am converting Notes format to PCM format to play sounds on the ceil
phone. It is Linux os and it is not FPU, so it is slow.

But finally, I used a sinf lookup table, and the sound can be played
instantly.

Thanks for all your guys idea.

Wenfei

Tydr Schnubbis · Jul 12, 2005

E. Robert Tisdale said:
Wenfei said:

float percentage;

for (j = 0; j < 10000000; j++) {
percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
buffer[totalBytes] =ceilf(volume * percentage) + volume;
totalBytes++;
}

Because the float variable, the above loop take 2 seconds in c or c++
on Linux machine. Does anybody has a solution to reduce the time?

Click to expand...

cat main.c

Click to expand...

#include <stdlib.h>
#include <math.h>

int main(int argc, char* argv[]) {

const
size_t n = 10000000;
float buffer[n];
size_t totalBytes = 0;
const
float_t frequency = 1.0;
const
float_t sampleFreq = 1.0;
const
float_t pi = 3.14159265358979323846;
const
float_t volume = 1.0;

for (size_t j = 0; j < n; ++j) {
float_t percentage = sinf(frequency*j*2*pi/sampleFreq);
buffer[totalBytes] =ceilf(volume*percentage) + volume;
totalBytes++;
}

return 0;
}

gcc -Wall -std=c99 -pedantic -O2 -o main main.c -lm
time ./main

Click to expand...

3.694u 0.258s 0:03.92 100.5% 0+0k 0+0io 0pf+0w

If you are posting a version of someone's example that looks different,
but really does the exact same thing in the exact same way, it's best if
you state that that's what you're doing. Don't make people read your
code and compare to figure out you didn't change the logic, nor optimize
it at all. But sure, you version is better style.

-Tydr

Computation slow with float than double.	13	Jun 6, 2005
file seek is slow	11	Mar 9, 2010
A Practical Introduction to Data Structures and Algorithm Analysis2Ed by Shaffer	0	Feb 4, 2010
Why is Java so slow????	58	Nov 19, 2007
Request for help	22	Sep 20, 2007
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Mar 1, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Dec 15, 2007
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Nov 1, 2007

float algorithm is slow

Wenfei

E. Robert Tisdale

Eric Sosman

Gordon Burditt

Anonymous 7843

Robert Gamble

Erik de Castro Lopo

Anonymous 7843

Christian Bau

Lawrence Kirby

Lawrence Kirby

Anonymous 7843

Wenfei

websnarf

CBFalconer

Netocrat

Christian Bau

P.J. Plauger

Wenfei

Tydr Schnubbis

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads