floating point calculation problem

Discussion in 'C++' started by bei, Jan 6, 2010.

  1. bei

    bei Guest

    Hello,

    I was bothered by the following problem for a while and could not
    understand how it happened.

    They are a lot of floating point calculation in the code.
    Basically, i am doing an interpolation from a random position to the
    closest grid point position. It is a 1d problem.
    The domain is from (-L, L) and N is the number of grid point at a half
    domain.
    The grid size is dv, where dv = L/N; The grid is cell centered (-0.5
    in the code below).
    L, dv are declared to be double type and N to be integer type.

    Below is a part of the code that bothered me (where v is the position,
    j is the closest grid point index number)

    int j0 = (int)(floor((v/dv+(N-0.5))));
    double a = -2.5*pow(((v/dv+(N-0.5))-j0), 2.0);
    double b = 1.5*pow(((v/dv+(N-0.5))-j0), 3.0);
    double q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+
    (N-0.5))-j0), 3.0);
    double q0 = 1.0 +a + b; (q0 is equal to q in the language of math.)

    the output is
    v = 2.818975726362143e+00;
    dv = 4.394531250000000e-03;
    N = 1146;
    j0 = 1786;
    a = -2.369681598890769e+00;
    b = 1.384255444361780e+00;

    nothing wrong until here, below is the problem
    q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+(N-0.5))-j0),
    3.0)
    = -1.369681598890769e+00; (wrong answer)
    q0 = 1.0 +a + b
    = 1.457384547101137e-02; (correct answer)

    It looked like if a formula involvs a lot of floating point
    calculation, it is safer to save the temporary data into a variable.

    Why this is happened and how could I fix this problem? Any suggestion
    is appreciated. Thank you!
    bei, Jan 6, 2010
    #1
    1. Advertising

  2. bei

    bei Guest

    On Jan 5, 7:09 pm, bei <> wrote:
    > Hello,
    >
    > I was bothered by the following problem for a while and could not
    > understand how it happened.
    >
    > They are a lot of floating point calculation in the code.
    > Basically, i am doing an interpolation from a random position to the
    > closest grid point position. It is a 1d problem.
    > The domain is from (-L, L) and N is the number of grid point at a half
    > domain.
    > The grid size is dv, where dv = L/N; The grid is cell centered (-0.5
    > in the code below).
    > L, dv are declared to be double type and N to be integer type.
    >
    > Below is a part of the code that bothered me (where v is the position,
    > j is the closest grid point index number)
    >
    >  int j0 = (int)(floor((v/dv+(N-0.5))));
    >  double a = -2.5*pow(((v/dv+(N-0.5))-j0), 2.0);
    >  double b = 1.5*pow(((v/dv+(N-0.5))-j0), 3.0);
    >  double q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+
    > (N-0.5))-j0), 3.0);
    >  double q0 = 1.0 +a + b; (q0 is equal to q in the language of math.)
    >
    > the output is
    > v = 2.818975726362143e+00;
    > dv = 4.394531250000000e-03;
    > N = 1146;
    > j0 = 1786;
    > a = -2.369681598890769e+00;
    > b = 1.384255444361780e+00;
    >
    > nothing wrong until here, below is the problem
    > q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+(N-0.5))-j0),
    > 3.0)
    >    = -1.369681598890769e+00; (wrong answer)
    > q0 = 1.0 +a + b
    >     = 1.457384547101137e-02; (correct answer)
    >
    > It looked like if a formula involvs a lot of floating point
    > calculation, it is safer to save the temporary data into a variable.
    >
    > Why this is happened and how could I fix this problem? Any suggestion
    > is appreciated. Thank you!


    btw, i am using g++ 4.3.3 with -O3 optimization turned on. The machine
    is 64bits.
    bei, Jan 6, 2010
    #2
    1. Advertising

  3. bei

    Rune Allnor Guest

    On 6 Jan, 04:11, bei <> wrote:
    > On Jan 5, 7:09 pm, bei <> wrote:
    >
    >
    >
    >
    >
    > > Hello,

    >
    > > I was bothered by the following problem for a while and could not
    > > understand how it happened.

    >
    > > They are a lot of floating point calculation in the code.
    > > Basically, i am doing an interpolation from a random position to the
    > > closest grid point position. It is a 1d problem.
    > > The domain is from (-L, L) and N is the number of grid point at a half
    > > domain.
    > > The grid size is dv, where dv = L/N; The grid is cell centered (-0.5
    > > in the code below).
    > > L, dv are declared to be double type and N to be integer type.

    >
    > > Below is a part of the code that bothered me (where v is the position,
    > > j is the closest grid point index number)

    >
    > >  int j0 = (int)(floor((v/dv+(N-0.5))));
    > >  double a = -2.5*pow(((v/dv+(N-0.5))-j0), 2.0);
    > >  double b = 1.5*pow(((v/dv+(N-0.5))-j0), 3.0);
    > >  double q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+
    > > (N-0.5))-j0), 3.0);
    > >  double q0 = 1.0 +a + b; (q0 is equal to q in the language of math.)

    >
    > > the output is
    > > v = 2.818975726362143e+00;
    > > dv = 4.394531250000000e-03;
    > > N = 1146;
    > > j0 = 1786;
    > > a = -2.369681598890769e+00;
    > > b = 1.384255444361780e+00;

    >
    > > nothing wrong until here, below is the problem
    > > q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+(N-0.5))-j0),
    > > 3.0)
    > >    = -1.369681598890769e+00; (wrong answer)
    > > q0 = 1.0 +a + b
    > >     = 1.457384547101137e-02; (correct answer)

    >
    > > It looked like if a formula involvs a lot of floating point
    > > calculation, it is safer to save the temporary data into a variable.

    >
    > > Why this is happened and how could I fix this problem? Any suggestion
    > > is appreciated. Thank you!

    >
    > btw, i am using g++ 4.3.3 with -O3 optimization turned on. The machine
    > is 64bits.


    The first place to look is data types. You have j0 as an int,
    and all the other factors as doubles. That's the kind of thing
    that almost always causes surprises and trouble. Declare j0 as
    double first and try again. Repeat for all integers in the
    computations.

    Next have a look at the pow function. Since you use powers of
    small integers, do something like

    pbase = (((v/dv+(N-0.5))-j0);
    p2 = pbase*pbase;
    p3 = p2*pbase;

    instead of using the pow() function. This would almost
    certainly be significantly faster than pow().

    As for why the problem occurs, I would *guess* that the
    optimizer recognizes that pow() is called twice in the same
    statement, with the same mantissa and different exponents.
    If this is a template implementation (which might well be
    the case), the optimizer might try to reduce the number of
    repeated computations in the two calls, messing things up
    in the process.

    Rune
    Rune Allnor, Jan 6, 2010
    #3
  4. bei

    bei Guest

    On Jan 5, 10:16 pm, Rune Allnor <> wrote:
    > On 6 Jan, 04:11, bei <> wrote:
    >
    >
    >
    > > On Jan 5, 7:09 pm, bei <> wrote:

    >
    > > > Hello,

    >
    > > > I was bothered by the following problem for a while and could not
    > > > understand how it happened.

    >
    > > > They are a lot of floating point calculation in the code.
    > > > Basically, i am doing an interpolation from a random position to the
    > > > closest grid point position. It is a 1d problem.
    > > > The domain is from (-L, L) and N is the number of grid point at a half
    > > > domain.
    > > > The grid size is dv, where dv = L/N; The grid is cell centered (-0.5
    > > > in the code below).
    > > > L, dv are declared to be double type and N to be integer type.

    >
    > > > Below is a part of the code that bothered me (where v is the position,
    > > > j is the closest grid point index number)

    >
    > > >  int j0 = (int)(floor((v/dv+(N-0.5))));
    > > >  double a = -2.5*pow(((v/dv+(N-0.5))-j0), 2.0);
    > > >  double b = 1.5*pow(((v/dv+(N-0.5))-j0), 3.0);
    > > >  double q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+
    > > > (N-0.5))-j0), 3.0);
    > > >  double q0 = 1.0 +a + b; (q0 is equal to q in the language of math.)

    >
    > > > the output is
    > > > v = 2.818975726362143e+00;
    > > > dv = 4.394531250000000e-03;
    > > > N = 1146;
    > > > j0 = 1786;
    > > > a = -2.369681598890769e+00;
    > > > b = 1.384255444361780e+00;

    >
    > > > nothing wrong until here, below is the problem
    > > > q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+(N-0.5))-j0),
    > > > 3.0)
    > > >    = -1.369681598890769e+00; (wrong answer)
    > > > q0 = 1.0 +a + b
    > > >     = 1.457384547101137e-02; (correct answer)

    >
    > > > It looked like if a formula involvs a lot of floating point
    > > > calculation, it is safer to save the temporary data into a variable.

    >
    > > > Why this is happened and how could I fix this problem? Any suggestion
    > > > is appreciated. Thank you!

    >
    > > btw, i am using g++ 4.3.3 with -O3 optimization turned on. The machine
    > > is 64bits.

    >
    > The first place to look is data types. You have j0 as an int,
    > and all the other factors as doubles. That's the kind of thing
    > that almost always causes surprises and trouble. Declare j0 as
    > double first and try again. Repeat for all integers in the
    > computations.
    >
    > Next have a look at the pow function. Since you use powers of
    > small integers, do something like
    >
    > pbase = (((v/dv+(N-0.5))-j0);
    > p2 = pbase*pbase;
    > p3 = p2*pbase;
    >
    > instead of using the pow() function. This would almost
    > certainly be significantly faster than pow().
    >
    > As for why the problem occurs, I would *guess* that the
    > optimizer recognizes that pow() is called twice in the same
    > statement, with the same mantissa and different exponents.
    > If this is a template implementation (which might well be
    > the case), the optimizer might try to reduce the number of
    > repeated computations in the two calls, messing things up
    > in the process.
    >
    > Rune


    Rune, very helpful suggestion. Thank you.
    bei, Jan 6, 2010
    #4
  5. bei

    bei Guest

    On Jan 6, 9:27 am, "io_x" <> wrote:
    > "bei" <> ha scritto nel messaggionews:...
    >
    >
    >
    > > Hello,

    >
    > > I was bothered by the following problem for a while and could not
    > > understand how it happened.

    > .......
    > > the output is
    > > v = 2.818975726362143e+00;
    > > dv = 4.394531250000000e-03;
    > > N = 1146;
    > > j0 = 1786;
    > > a = -2.369681598890769e+00;
    > > b = 1.384255444361780e+00;

    >
    > > nothing wrong until here, below is the problem
    > > q = 1.0-2.5*pow(((v/dv+(N-0.5))-j0), 2.0)+1.5*pow(((v/dv+(N-0.5))-j0),
    > > 3.0)
    > >   = -1.369681598890769e+00; (wrong answer)
    > > q0 = 1.0 +a + b
    > >    = 1.457384547101137e-02; (correct answer)

    >
    > > It looked like if a formula involvs a lot of floating point
    > > calculation, it is safer to save the temporary data into a variable.

    >
    > > Why this is happened and how could I fix this problem? Any suggestion
    > > is appreciated. Thank you!

    >
    > there is no problem here:
    > #include  <iostream>
    > using namespace std;
    >
    > int  main(void)
    > {double     v = 2.818975726362143e+00;
    >  double    dv = 4.394531250000000e-03;
    >  double     a = -2.369681598890769e+00;
    >  double     b = 1.384255444361780e+00;
    >  double    q, q0;
    >  unsigned  j0 = 1786;
    >  unsigned   N = 1146;
    >
    >  q = 1.0-2.5*pow(v/dv+N-0.5-j0, 2.0)
    >         +1.5*pow(v/dv+N-0.5-j0, 3.0);
    >  q0= 1.0 +a + b;
    >  cout.setf(ios::scientific);
    >  cout.precision(15);
    >
    >  cout <<"q==" << q << " ; q0=" << q0 << "\n";
    >
    >  return  0;}
    >
    > has result
    > q==1.457384547101137e-02 ; q0=1.457384547101115e-02


    I got the same answer. I didn't get any problem if I cut the port of
    the trouble code and ran it as above. Rune might be right. The problem
    could be the pow() and v/dv+N-0.5-j0, 2.0, I modified the code as he
    suggested, so far it looks ok.
    bei, Jan 7, 2010
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. bei
    Replies:
    0
    Views:
    224
  2. Saraswati lakki
    Replies:
    0
    Views:
    1,295
    Saraswati lakki
    Jan 6, 2012
  3. Hyun chul Park

    Floating point calculation problem

    Hyun chul Park, Jul 8, 2008, in forum: Ruby
    Replies:
    3
    Views:
    99
    Axel Etzold
    Jul 8, 2008
  4. Chris Angelico

    Re: Floating point calculation problem

    Chris Angelico, Feb 2, 2013, in forum: Python
    Replies:
    16
    Views:
    229
    Michael Torrie
    Feb 3, 2013
  5. Chris Rebert

    Re: Floating point calculation problem

    Chris Rebert, Feb 2, 2013, in forum: Python
    Replies:
    0
    Views:
    104
    Chris Rebert
    Feb 2, 2013
Loading...

Share This Page