float algorithm is slow

Discussion in 'C Programming' started by Wenfei, Jul 5, 2005.

  1. Wenfei

    Wenfei Guest

    float percentage;

    for (j = 0; j < 10000000; j++)
    {
    percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    buffer[totalBytes] =ceilf(volume * percentage) + volume;
    totalBytes++;
    }

    Because the float variable, the above loop take 2 seconds in c or c++
    on Linux machine. Does anybody has a solution to reduce the time?

    Thanks,

    Wenfei
     
    Wenfei, Jul 5, 2005
    #1
    1. Advertising

  2. Wenfei wrote:

    > float percentage;
    >
    > for (j = 0; j < 10000000; j++) {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }
    >
    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?


    > cat main.c

    #include <stdlib.h>
    #include <math.h>

    int main(int argc, char* argv[]) {

    const
    size_t n = 10000000;
    float buffer[n];
    size_t totalBytes = 0;
    const
    float_t frequency = 1.0;
    const
    float_t sampleFreq = 1.0;
    const
    float_t pi = 3.14159265358979323846;
    const
    float_t volume = 1.0;

    for (size_t j = 0; j < n; ++j) {
    float_t percentage = sinf(frequency*j*2*pi/sampleFreq);
    buffer[totalBytes] =ceilf(volume*percentage) + volume;
    totalBytes++;
    }

    return 0;
    }

    > gcc -Wall -std=c99 -pedantic -O2 -o main main.c -lm
    > time ./main

    3.694u 0.258s 0:03.92 100.5% 0+0k 0+0io 0pf+0w
     
    E. Robert Tisdale, Jul 5, 2005
    #2
    1. Advertising

  3. Wenfei

    Eric Sosman Guest

    Wenfei wrote:
    > float percentage;
    >
    > for (j = 0; j < 10000000; j++)
    > {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }
    >
    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?


    Yes: Change the iteration count from 10000000 to 0, and
    the code will almost certainly run faster.

    In other words, micro-benchmarks of this sort are not
    very informative. What are you really trying to do?

    --
    Eric Sosman
    lid
     
    Eric Sosman, Jul 5, 2005
    #3
  4. >float percentage;
    >
    >for (j = 0; j < 10000000; j++)
    >{
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    >}
    >
    >Because the float variable,


    My guess is that if you GOT RID OF the float variable, it would
    take about the same time:

    >for (j = 0; j < 10000000; j++)

    {
    buffer[totalBytes] =ceilf(volume * sinf(frequency * j * 2 * 3.14159 / sampleFreq )) + volume;
    totalBytes++;
    }

    >the above loop take 2 seconds in c or c++
    >on Linux machine. Does anybody has a solution to reduce the time?


    You haven't demonstrated why taking two seconds is a problem yet.
    Cut down the number of iterations? Get a faster machine?
    Doing the calculation in double might make it faster (although on
    Intel *86 it probably won't).

    Gordon L. Burditt
     
    Gordon Burditt, Jul 5, 2005
    #4
  5. In article <>,
    Wenfei <> wrote:
    >
    >
    > float percentage;
    >
    > for (j = 0; j < 10000000; j++)
    > {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }
    >
    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?


    * Use a small table of sine values instead of sinf() function.
    It appears that you're downsampling to the width of a char
    anyway, so table of somewhat more than 256 sine values probably
    won't make things much worse.

    * Or just map out one complete waveform and just repeatedly copy that
    throughout the rest of the buffer.

    * Or Just make one copy of the waveform and index into that as
    appropriate when you need the results.
    --
    7842++
     
    Anonymous 7843, Jul 5, 2005
    #5
  6. Wenfei wrote:
    > float percentage;
    >
    > for (j = 0; j < 10000000; j++)
    > {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }
    >
    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?


    <OT>
    It is unlikely that using double will be any faster.

    On my system, 66% of the time is spent in the first statement inside
    the loop, 27% on the second. The sinf and ceilf function calls are by
    far the most intensive parts of these statements. If your system is
    similiar, you might try replacing the sinf call with a lookup table of
    precalculated values with a tradeoff of accuracy and memory usage. You
    could also use a macro of inline function in place of ceilf.

    You didn't specify how much faster you need it, what tradeoffs are
    feasible, what you have tried already, or even what exactly you are
    trying to accomplish. Perhaps the algorithm itself could be improved
    but it is difficult to attempt to do so without knowing what the
    specifications are.
    </OT>

    Robert Gamble
     
    Robert Gamble, Jul 5, 2005
    #6
  7. Wenfei wrote:
    >
    > float percentage;
    >
    > for (j = 0; j < 10000000; j++)
    > {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }
    >
    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?


    I assume your linux machine has an x86 based processor. If thats the
    case then you problem is probably the ceilf() function. The implementation
    of that function requires that the FPU's control word to be modified at
    the start of ceilf() and restored afterwards. Every time the FPU control
    word is modified, it causes a stall in the FPU pipeline.

    You might try replacing ceilf() with the C99 function lrintf().
    Unfortunately lrintf() is a round function, not a ceil function so you
    might need to replace

    ceilf (x)

    with

    lrintf (x + 0.5).


    Also have a look at this paper:

    http://www.mega-nerd.com/FPcast/

    Erik
    --
    +-----------------------------------------------------------+
    Erik de Castro Lopo (Yes it's valid)
    +-----------------------------------------------------------+
    "It has been discovered that C++ provides a remarkable facility
    for concealing the trival details of a program -- such as where
    its bugs are." -- David Keppel
     
    Erik de Castro Lopo, Jul 5, 2005
    #7
  8. In article <>,
    Wenfei <> wrote:
    >
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;


    Sorry for posting twice to the same thread, but ceilf() may not be
    strictly really necessary. If buffer is an array of some kind of
    integer type then the assignment is going to do a conversion from float
    to int anyway. Unless you have some mathematically rigorous standard
    to uphold, you might get a "close enough" result by rounding up. E.g.

    buffer[totalBytes] = (volume * percentage + 0.5) + volume;
    --
    7842++
     
    Anonymous 7843, Jul 5, 2005
    #8
  9. In article <>,
    "Wenfei" <> wrote:

    > float percentage;
    >
    > for (j = 0; j < 10000000; j++)
    > {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }
    >
    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?


    I wouldn't worry about the speed, because your results are garbage
    anyway.

    When j has the maximum value 1e7 - 1, what is the difference between j *
    2 * 3.14159 and j * 2 * pi?

    Once you've fixed this, just a hint (mathematics is fun): Consecutive
    terms of any function of the form

    f (n) = a * sin (x * n + b) + c * cos (x * n + d)

    can be calculated using one single multiplication and one single
    addition.
     
    Christian Bau, Jul 5, 2005
    #9
  10. On Tue, 05 Jul 2005 22:57:23 +0100, Christian Bau wrote:

    ....

    > Once you've fixed this, just a hint (mathematics is fun): Consecutive
    > terms of any function of the form
    >
    > f (n) = a * sin (x * n + b) + c * cos (x * n + d)
    >
    > can be calculated using one single multiplication and one single
    > addition.


    But you need to be careful about accumulation of errors.

    Lawrence
     
    Lawrence Kirby, Jul 6, 2005
    #10
  11. On Tue, 05 Jul 2005 21:15:21 +0000, Anonymous 7843 wrote:

    > In article <>,
    > Wenfei <> wrote:
    >>
    >>
    >> float percentage;
    >>
    >> for (j = 0; j < 10000000; j++)
    >> {
    >> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );


    You may gain a little by precalculating the loop invariant
    frequency * 2 * 3.14159 / sampleFreq outside the loop.

    >> buffer[totalBytes] =ceilf(volume * percentage) + volume;
    >> totalBytes++;
    >> }
    >>
    >> Because the float variable, the above loop take 2 seconds in c or c++
    >> on Linux machine. Does anybody has a solution to reduce the time?

    >
    > * Use a small table of sine values instead of sinf() function.
    > It appears that you're downsampling to the width of a char
    > anyway, so table of somewhat more than 256 sine values probably
    > won't make things much worse.


    On of the problems is that the type of buffer is not specified

    > * Or just map out one complete waveform and just repeatedly copy that
    > throughout the rest of the buffer.
    >
    > * Or Just make one copy of the waveform and index into that as
    > appropriate when you need the results.


    These only work if the cycle length corresponds to an integral number of
    positions in buffer. There's nothing to suggest that this is the case.

    Lawrence
     
    Lawrence Kirby, Jul 6, 2005
    #11
  12. In article <>,
    Lawrence Kirby <> wrote:
    >
    > On Tue, 05 Jul 2005 21:15:21 +0000, Anonymous 7843 wrote:
    >
    > > In article <>,
    > > Wenfei <> wrote:
    > >>
    > >> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );

    >
    > > * Or Just make one copy of the waveform and index into that as
    > > appropriate when you need the results.

    >
    > These only work if the cycle length corresponds to an integral number of
    > positions in buffer. There's nothing to suggest that this is the case.


    Given the choice to use float, round pi off at the 6th digit and the
    (likely) stuffing of results into a char it didn't seem like fidelity
    was foremost.

    One hopes that this project was just the first exercise in a DSP course
    and that our intrepid correspondent Wenfei soon outlearns us.
    --
    7842++
     
    Anonymous 7843, Jul 6, 2005
    #12
  13. Wenfei

    Wenfei Guest

    I am converting Notes format to PCM format to play sounds on the ceil
    phone. It is Linux os and it is not FPU, so it is slow.

    But finally, I used a sinf lookup table, and the sound can be played
    instantly.

    Thanks for all your guys idea.

    Wenfei
     
    Wenfei, Jul 8, 2005
    #13
  14. Wenfei

    Guest

    Wenfei wrote:
    > float percentage;
    > for (j = 0; j < 10000000; j++) {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }


    sinf() appears to be a Microsoft VC++ extension that computes sin() but
    uses only 32-bit floats. Interesting that the zealots in this
    newsgroup didn't call you on that.

    Ok, anyhow remembering our trigonometry:

    sin(a + b) = sin(a)*cos(b) + sin(b)cos(a)
    cos(a + b) = cos(a)*cos(b) - sin(b)sin(a)

    Now we can strength reduce the argument to sinf:

    arg += (freqency*2*3.14159/sampleFreq);

    But that value is a constant:

    arg += deltaAngle; /* deltaAngle is precomputed as:
    freqency*2*3.14159/sampleFreq */

    So we can simplify this to:

    s1 = percentage*cos(deltaAngle) + c*sin(deltaAngle);
    c = c*cos(deltaAngle) + percentage*sin(deltaAngle);
    percentage = s1;

    And of course we can replace the cos(deltaAngle) and sin(deltaAngle)
    with some precomputed values, and we initialize c to 1, and percentage
    to 0.

    This removes all the trigonometric functions altogether.

    The problem is that it will have "accumulating accuracy" problems.
    These problems are not trivial, because you will lose the sin^2+cos^2=1
    property, which will start scaling your results (either upward or
    downward) globally, which could get bad.

    A simple way to mitigate this is to split the loop into groups of, say,
    100 or 1000 at a time, and reset the parameters with their true
    mathematical value with the raw sin/cos functions:

    percentage = sinf (frequency * j * 2 * 3.14159 / sampleFreq);
    c = cosf (frequency * j * 2 * 3.14159 / sampleFreq);

    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?


    This takes time because you are calling sinf(), not because its
    floating point.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    , Jul 10, 2005
    #14
  15. Wenfei

    CBFalconer Guest

    wrote:
    > Wenfei wrote:
    >
    >> float percentage;
    >> for (j = 0; j < 10000000; j++) {
    >> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    >> buffer[totalBytes] =ceilf(volume * percentage) + volume;
    >> totalBytes++;
    >> }

    >
    > sinf() appears to be a Microsoft VC++ extension that computes
    > sin() but uses only 32-bit floats. Interesting that the zealots
    > in this newsgroup didn't call you on that.


    The usage is proper. The foolish comment is not. From N869:

    7.12.4.6 The sin functions

    Synopsis

    [#1]
    #include <math.h>
    double sin(double x);
    float sinf(float x);
    long double sinl(long double x);

    Description

    [#2] The sin functions compute the sine of x (measured in
    radians).

    Returns

    [#3] The sin functions return the sine value.

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
     
    CBFalconer, Jul 10, 2005
    #15
  16. Wenfei

    Netocrat Guest

    On Sat, 09 Jul 2005 17:35:27 -0700, websnarf wrote:

    > Wenfei wrote:
    >> float percentage;
    >> for (j = 0; j < 10000000; j++) {
    >> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    >> buffer[totalBytes] =ceilf(volume * percentage) + volume; totalBytes++;
    >> }
    >> }

    > sinf() appears to be a Microsoft VC++ extension that computes sin() but
    > uses only 32-bit floats. Interesting that the zealots in this newsgroup
    > didn't call you on that.
    >
    > Ok, anyhow remembering our trigonometry:
    >
    > sin(a + b) = sin(a)*cos(b) + sin(b)cos(a) cos(a + b) = cos(a)*cos(b) -
    > sin(b)sin(a)
    >
    > Now we can strength reduce the argument to sinf:
    >
    > arg += (freqency*2*3.14159/sampleFreq);
    >
    > But that value is a constant:
    >
    > arg += deltaAngle; /* deltaAngle is precomputed as:
    > freqency*2*3.14159/sampleFreq */
    >
    > So we can simplify this to:
    >
    > s1 = percentage*cos(deltaAngle) + c*sin(deltaAngle);
    > c = c*cos(deltaAngle) + percentage*sin(deltaAngle);


    You meant to subtract there, not add, I believe.

    > percentage = s1;
    >
    > And of course we can replace the cos(deltaAngle) and sin(deltaAngle) with
    > some precomputed values, and we initialize c to 1, and percentage to 0.
    >
    > This removes all the trigonometric functions altogether.


    Good suggestion. That would seem to be the major contributor to the
    calculation time, although you'd have to compare the approaches to be
    sure. It's theoretically possible that on a specially optimised machine
    sin(j*deltaAngle) could be faster than your approach, but I very much
    doubt that this is the case on the OP's cell phone...

    > The problem is that it will have "accumulating accuracy" problems. These
    > problems are not trivial, because you will lose the sin^2+cos^2=1
    > property, which will start scaling your results (either upward or
    > downward) globally, which could get bad.
    >
    > A simple way to mitigate this is to split the loop into groups of, say,
    > 100 or 1000 at a time, and reset the parameters with their true
    > mathematical value with the raw sin/cos functions:
    >
    > percentage = sinf (frequency * j * 2 * 3.14159 / sampleFreq); c
    > = cosf (frequency * j * 2 * 3.14159 / sampleFreq);


    Yes and you could tune the loop grouping number beforehand by looking at
    how many cycles it takes before the values grow too inaccurate for
    your needs.

    <snip>
     
    Netocrat, Jul 10, 2005
    #16
  17. In article <>,
    wrote:

    > Wenfei wrote:
    > > float percentage;
    > > for (j = 0; j < 10000000; j++) {
    > > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > > totalBytes++;
    > > }

    >
    > sinf() appears to be a Microsoft VC++ extension that computes sin() but
    > uses only 32-bit floats. Interesting that the zealots in this
    > newsgroup didn't call you on that.


    That's no surprise because it is part of C99.
     
    Christian Bau, Jul 10, 2005
    #17
  18. Wenfei

    P.J. Plauger Guest

    "Christian Bau" <> wrote in message
    news:...

    > In article <>,
    > wrote:
    >
    >> Wenfei wrote:
    >> > float percentage;
    >> > for (j = 0; j < 10000000; j++) {
    >> > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    >> > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    >> > totalBytes++;
    >> > }

    >>
    >> sinf() appears to be a Microsoft VC++ extension that computes sin() but
    >> uses only 32-bit floats. Interesting that the zealots in this
    >> newsgroup didn't call you on that.

    >
    > That's no surprise because it is part of C99.


    It's even defined, but optional, in C89.

    P.J. Plauger
    Dinkumware, Ltd.
    http://www.dinkumware.com
     
    P.J. Plauger, Jul 10, 2005
    #18
  19. Wenfei

    Wenfei Guest

    I am converting Notes format to PCM format to play sounds on the ceil
    phone. It is Linux os and it is not FPU, so it is slow.


    But finally, I used a sinf lookup table, and the sound can be played
    instantly.


    Thanks for all your guys idea.


    Wenfei


    Wenfei wrote:
    > float percentage;
    >
    > for (j = 0; j < 10000000; j++)
    > {
    > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    > buffer[totalBytes] =ceilf(volume * percentage) + volume;
    > totalBytes++;
    > }
    >
    > Because the float variable, the above loop take 2 seconds in c or c++
    > on Linux machine. Does anybody has a solution to reduce the time?
    >
    > Thanks,
    >
    > Wenfei
     
    Wenfei, Jul 11, 2005
    #19
  20. E. Robert Tisdale wrote:
    > Wenfei wrote:
    >
    >> float percentage;
    >>
    >> for (j = 0; j < 10000000; j++) {
    >> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
    >> buffer[totalBytes] =ceilf(volume * percentage) + volume;
    >> totalBytes++;
    >> }
    >>
    >> Because the float variable, the above loop take 2 seconds in c or c++
    >> on Linux machine. Does anybody has a solution to reduce the time?

    >
    > > cat main.c

    > #include <stdlib.h>
    > #include <math.h>
    >
    > int main(int argc, char* argv[]) {
    >
    > const
    > size_t n = 10000000;
    > float buffer[n];
    > size_t totalBytes = 0;
    > const
    > float_t frequency = 1.0;
    > const
    > float_t sampleFreq = 1.0;
    > const
    > float_t pi = 3.14159265358979323846;
    > const
    > float_t volume = 1.0;
    >
    > for (size_t j = 0; j < n; ++j) {
    > float_t percentage = sinf(frequency*j*2*pi/sampleFreq);
    > buffer[totalBytes] =ceilf(volume*percentage) + volume;
    > totalBytes++;
    > }
    >
    > return 0;
    > }
    >
    > > gcc -Wall -std=c99 -pedantic -O2 -o main main.c -lm
    > > time ./main

    > 3.694u 0.258s 0:03.92 100.5% 0+0k 0+0io 0pf+0w

    If you are posting a version of someone's example that looks different,
    but really does the exact same thing in the exact same way, it's best if
    you state that that's what you're doing. Don't make people read your
    code and compare to figure out you didn't change the logic, nor optimize
    it at all. But sure, you version is better style.

    -Tydr
     
    Tydr Schnubbis, Jul 12, 2005
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ahmed Moustafa
    Replies:
    0
    Views:
    785
    Ahmed Moustafa
    Nov 15, 2003
  2. Bapaiah Katepalli
    Replies:
    1
    Views:
    1,505
    Mike Treseler
    Jun 23, 2006
  3. bd
    Replies:
    0
    Views:
    642
  4. Wenfei

    float algorithm is slow

    Wenfei, Jul 5, 2005, in forum: C++
    Replies:
    11
    Views:
    533
    Wenfei
    Jul 8, 2005
  5. Carsten Fuchs
    Replies:
    45
    Views:
    1,578
    James Kanze
    Oct 8, 2009
Loading...

Share This Page