conversion problem int <-> double ?

Discussion in 'C++' started by Markus Dehmann, Apr 1, 2008.

  1. I have two integers i1 and i2, the second of which is guaranteed to be
    between 0 and 99, and I encode them into one double:

    double encoded = (double)i1 + (double)i2 / (double)100;

    So, for example, 324 and 2 become 324.02. Now I want to decode them
    using the function given below but it decodes the example as 324 and
    1, instead of 324 and 2.

    Can anyone tell me what's wrong and how to do this right? (my code see
    below)

    Thanks!
    Markus

    #include <cstdlib>   // EXIT_SUCCESS
    #include <iostream>

    void decode(double n, int& i1, int& i2){
        i1 = int(n);
        double rest = n - int(n);
        i2 = int(rest * 100.0); // i2 is 1, should be 2
    }

    int main(int argc, char** argv){
        double n = 324.02;
        int p;
        int i;
        decode(n, p, i);
        std::cerr << "n=" << n << ", p=" << p << ", i=" << i << std::endl;
        return EXIT_SUCCESS;
    }
     
    Markus Dehmann, Apr 1, 2008
    #1

  2. Markus Dehmann

    Greg Herlihy Guest

    On Mar 31, 7:04 pm, Markus Dehmann <> wrote:
    > I have two integers i1 and i2, the second of which is guaranteed to be
    > between 0 and 99, and I encode them into one double:
    >
    > double encoded = (double)i1 + (double)i2 / (double)100;
    >
    > So, for example, 324 and 2 become 324.02. Now I want to decode them
    > using the function given below but it decodes the example as 324 and
    > 1, instead of 324 and 2.


    The problem is that floating point values usually have a binary
    representation. So precise decimal values (such as 324.02) often
    cannot be exactly represented with a floating point type. Instead, the
    floating point type stores the representable value nearest to the
    value specified (for example, the nearest representable value to
    324.02 is likely 324.01999).

    One solution would be to use decimal floating point arithmetic.
    Decimal floating point arithmetic would be able to represent 324.02
    exactly. But although support for decimal floating arithmetic is
    likely coming to the C++ library, actual implementations of this
    feature are not that common.

    A more likely (and practical) solution might be to use "fixed-point"
    arithmetic. Fixed point arithmetic is completely accurate - up to the
    specified resolution. For example, performing the above calculation
    with fixed point arithmetic (with 1/100 resolution) might look
    something like this:

    typedef long Fixed; // in 1/100ths of a unit

    Fixed n = 32402;
    long p = n/100;
    long i = n%100;
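
    As a rough sketch of the fixed-point idea above, packing and unpacking can be done entirely with integer arithmetic, so no rounding takes place. The helper names encode_fixed and decode_fixed are made up for illustration, and non-negative inputs are assumed:

```cpp
#include <cassert>

// Hypothetical helpers: the value is stored as a count of 1/100ths.
typedef long Fixed;

// Pack a whole part and a hundredths part (0..99) into one integer.
Fixed encode_fixed(long whole, long hundredths) {
    assert(whole >= 0 && hundredths >= 0 && hundredths <= 99);
    return whole * 100 + hundredths;
}

// Recover both parts exactly with integer division and remainder.
void decode_fixed(Fixed n, long& whole, long& hundredths) {
    whole = n / 100;
    hundredths = n % 100;
}
```

    With these, encode_fixed(324, 2) gives 32402, and decode_fixed recovers 324 and 2 from it.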

    Greg
     
    Greg Herlihy, Apr 1, 2008
    #2

  3. Markus Dehmann

    Jack Klein Guest

    On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
    <> wrote in comp.lang.c++:

    > I have two integers i1 and i2, the second of which is guaranteed to be
    > between 0 and 99, and I encode them into one double:
    >
    > double encoded = (double)i1 + (double)i2 / (double)100;


    This is not a very good idea, as you have found. The floating point
    data types in C, and just about every other computer language, use a
    fixed number of bits, and that limits their precision.

    In particular, the floating point representation in almost all
    computer systems, and certainly in all common ones programmed in C++,
    uses binary fractions. That means a value like .125, or .25, or .5 is
    exactly representable in the fractional part of floating point values,
    but fractions that are not 1/(a power of 2) are not. They get rounded
    to the nearest binary fraction.

    > So, for example, 324 and 2 become 324.02. Now I want to decode them
    > using the function given below but it decodes the example as 324 and
    > 1, instead of 324 and 2.


    Actually, it does not become 324.02, it becomes some value slightly
    greater or smaller than 324.02, because .02 cannot be exactly
    represented in a binary fraction.

    > Can anyone tell me what's wrong and how to do this right? (my code see
    > below)


    Your basic idea is wrong.

    > Thanks!
    > Markus
    >
    > #include <iostream>
    >
    > void decode(double n, int& i1, int& i2){
    > i1 = int(n);
    > double rest = n - int(n);
    > i2 = int(rest * 100.0); // i2 is 1, should be 2
    > }
    >
    > int main(int argc, char** argv){
    > double n = 324.02;
    > int p;
    > int i;
    > decode(n, p, i);
    > std::cerr << "n=" << n <<", p=" << p << ", i=" << i << std::endl;
    > return EXIT_SUCCESS;
    > }
    >


    Look at this short program:

    #include <iostream>
    #include <iomanip>

    int main()
    {
    double d = 304.0;
    d += (2 / 100.0);

    std::cout << "The value is " << std::setprecision(20) << d <<
    std::endl;
    return 0;
    }

    Here is the output of that program on my computer:

    The value is 304.01999999999998

    If you can't think of any better idea than trying to stick two integer
    values into a double, and there is almost certainly a better way, here
    are a few possible approaches:

    1. Since one number is always between 0 and 99, you could multiply
    the other number by 100 and add the second one. This will work if the
    first value is not too large to fit into a double when multiplied by
    100. You can calculate this by using the value of the macro DBL_DIG
    in the <cfloat> or <float.h> header.

    In my implementation, this value is 15, which means that a double can
    hold a whole number value up to 999,999,999,999,999 with no loss of
    precision. So if the first number is guaranteed not to be greater
    than 1/100 of this value, approach 1 will work.

    2. If you must stick integer values into floating point fractions, do
    not simply multiply them back up and assign them to an int. Assignment
    to an integer type causes truncation, any fractional portion is just
    chopped off. So .01999999999998 * 100 equals 1.999999999998 which
    gets truncated to 1.

    Instead, if you know the fraction is positive, pass it to the
    std::ceil() function before converting to int.
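
    Approach 1 might be sketched roughly as follows. The function names are hypothetical, non-negative inputs are assumed, and the bound of 10^DBL_DIG follows the reasoning given for approach 1 rather than any stricter analysis:

```cpp
#include <cassert>
#include <cfloat>

// 10^DBL_DIG: whole numbers below this survive a round trip through double.
double max_exact_whole() {
    double limit = 1.0;
    for (int k = 0; k < DBL_DIG; ++k) limit *= 10.0;
    return limit;
}

// Store i1 * 100 + i2 as a whole number in a double (no fractional part).
double pack_whole(long long i1, int i2) {
    assert(i1 >= 0 && i2 >= 0 && i2 <= 99);
    assert(static_cast<double>(i1) < max_exact_whole() / 100.0);
    return static_cast<double>(i1) * 100.0 + i2;
}

// Convert back to an integer first, then split with / and %.
void unpack_whole(double packed, long long& i1, int& i2) {
    long long whole = static_cast<long long>(packed);
    i1 = whole / 100;
    i2 = static_cast<int>(whole % 100);
}
```

    Because the double only ever holds a whole number here, the conversion back to an integer type loses nothing.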

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://c-faq.com/
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Apr 1, 2008
    #3
  4. Markus Dehmann wrote:
    > I have two integers i1 and i2, the second of which is guaranteed to be
    > between 0 and 99, and I encode them into one double:
    >
    > double encoded = (double)i1 + (double)i2 / (double)100;
    > ...


    As others already noted, floating-point numbers are normally represented
    in binary internally. For this reason, in order to keep your 'i2'
    encoded precisely in the fractional part of the floating-point number,
    you should use a power of 2 as a divisor. Since your 'i2' is in 0..99
    range, use 128 as a divisor in the encoder (and multiplier in the decoder)

    double encoded = i1 + (double) i2 / 128;

    This is still pretty thin ice to be walking on, so you might be
    better off following the other suggestions. Yet replacing 100 with 128
    would fix the very basic error in your implementation of your original
    approach.
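
    A minimal sketch of the power-of-2 variant, assuming non-negative int inputs (encode128 and decode128 are illustrative names, not from the original code). Since i2 / 128 is an exact binary fraction, truncation in the decoder is harmless:

```cpp
#include <cassert>

// Encode with a power-of-2 divisor so the fraction is exactly representable.
double encode128(int i1, int i2) {
    assert(i1 >= 0 && i2 >= 0 && i2 <= 99);
    return i1 + i2 / 128.0;
}

// Decode by multiplying the fractional part back up by the same power of 2.
void decode128(double n, int& i1, int& i2) {
    i1 = static_cast<int>(n);
    i2 = static_cast<int>((n - i1) * 128.0);
}
```

    Note that the encoded value no longer reads as a decimal: 324 and 2 become 324.015625, not 324.02.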

    --
    Best regards,
    Andrey Tarasevich
     
    Andrey Tarasevich, Apr 1, 2008
    #4
  5. Markus Dehmann

    James Kanze Guest

    On Apr 1, 4:44 am, Jack Klein <> wrote:
    > On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
    > <> wrote in comp.lang.c++:


    [Just a few odd comments, since the basic problem has
    already been addressed...]

    > > I have two integers i1 and i2, the second of which is
    > > guaranteed to be between 0 and 99, and I encode them into
    > > one double:


    > > double encoded = (double)i1 + (double)i2 / (double)100;


    The obvious question is: why? If you really do receive a value
    in this format (i.e. integer part and hundredths as two separate
    values), and want to treat it as a single value, fine, but then
    I don't understand why you want to go the other direction later.
    And I can't think of any other reason why one would want to do
    this.

    > This is not a very good idea, as you have found. The floating
    > point data types in C, and just about every other computer
    > language, use a fixed number of bits, and that limits their
    > precision.


    > In particular, the floating point representation in almost all
    > computer systems, and certainly in all common ones programmed
    > in C++, use binary fractions.


    At least one architecture that is relatively common (IBM
    mainframes) used base 16, and there's at least one base 8 out
    there still being sold, but that doesn't change anything---all
    of your comments which follow apply to any base which is a power
    of 2.

    So does anyone know of a machine for which there existed a C++
    compiler (or even a C compiler) which doesn't use a base which
    is a power of 2? I know that machines using base 10 existed in
    the past, but the ones I know of were out of production long
    before even C came along. Or maybe there is a compiler for IBM
    mainframes which uses their decimal arithmetic, rather than
    their floating point, for float and double (but I'd be very
    surprised).

    > That means a value like .125, or .25, or .5 is exactly
    > representable in the fractional part of floating point values,
    > but fractions that are not 1/(a power of 2) are not. They get
    > rounded to the nearest binary fraction.


    Just a nit, but that should be fractions that are not n/(a power
    of 2), where n is an integer. Something like .75 is no problem
    either. (Of course, if the power of 2 is greater than something
    like 51, you might get problems with some of those as well.)

    > > So, for example, 324 and 2 become 324.02. Now I want to
    > > decode them using the function given below but it decodes
    > > the example as 324 and 1, instead of 324 and 2.


    > Actually, it does not become 324.02, it becomes some value
    > slightly greater or smaller than 324.02, because .02 cannot be
    > exactly represented in a binary fraction.


    > > Can anyone tell me what's wrong and how to do this right?
    > > (my code see below)


    > Your basic idea is wrong.


    Hard to say without really knowing what his basic idea is:).
    Why does he want to do this? Anyway, two "obvious" solutions
    come to mind:

    -- pass through a textual representation:

    std::ostringstream s1 ;
    s1.precision( 2 ) ;
    s1.setf( std::ios::fixed, std::ios::floatfield ) ;
    s1 << encoded ;
    std::istringstream s2( s1.str() ) ;
    char dummyForDecimal ;
    s2 >> i1 >> dummyForDecimal >> i2 ;

    -- use the correct functions from C:

    double i1d ;
    i2 = nearbyint( 100.0 * modf( encoded, &i1d ) ) ;
    i1 = i1d ;

    Modf is in C90, and thus in C++ (in <cmath>). Nearbyint is an
    addition of C99, and thus will be in the next version of C++,
    and is possibly already available in your current C++ compiler.
    If not, replace the line with
    i2 = floor( 100.0 * modf( encoded, &i1d ) + 0.5 ) ;
    Although less robust, it should work for positive values
    constructed as above.
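
    Put together as a complete function, the modf approach with the floor fallback might look something like this. The name decode_rounded is made up for illustration, and the sketch assumes a non-negative value built as i1 + i2 / 100.0:

```cpp
#include <cassert>
#include <cmath>

// Split with modf, then round the hundredths to nearest instead of truncating.
void decode_rounded(double encoded, int& i1, int& i2) {
    assert(encoded >= 0.0);
    double whole = 0.0;
    double frac = std::modf(encoded, &whole);  // frac in [0, 1), whole gets the integer part
    i1 = static_cast<int>(whole);
    i2 = static_cast<int>(std::floor(100.0 * frac + 0.5));
}
```

    Rounding absorbs the small representation error, so 324.02 decodes to 324 and 2 even though the stored fraction is slightly below .02.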

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Apr 1, 2008
    #5
  6. Markus Dehmann

    Jack Klein Guest

    On Tue, 1 Apr 2008 01:10:24 -0700 (PDT), James Kanze
    <> wrote in comp.lang.c++:

    > On Apr 1, 4:44 am, Jack Klein <> wrote:
    > > On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
    > > <> wrote in comp.lang.c++:

    >
    > [Just a few odd comments, since the basic problem has
    > already been addressed...]
    >
    > > > I have two integers i1 and i2, the second of which is
    > > > guaranteed to be between 0 and 99, and I encode them into
    > > > one double:

    >
    > > > double encoded = (double)i1 + (double)i2 / (double)100;

    >
    > The obvious question is: why? If you really do receive a value
    > in this format (i.e. integer part and hundredths as two separate
    > values), and want to treat it as a single value, fine, but then
    > I don't understand why you want to go the other direction later.
    > And I can't think of any other reason why one would want to do
    > this.


    I rather think I covered this in my next sentence:

    > > This is not a very good idea, as you have found. The floating
    > > point data types in C, and just about every other computer
    > > language, use a fixed number of bits, and that limits their
    > > precision.

    >
    > > In particular, the floating point representation in almost all
    > > computer systems, and certainly in all common ones programmed
    > > in C++, use binary fractions.

    >
    > At least one architecture that is relatively common (IBM
    > mainframes) used base 16, and there's at least one base 8 out
    > there still being sold, but that doesn't change anything---all
    > of your comments which follow apply to any base which is a power
    > of 2.


    I've never actually programmed an IBM mainframe, but in the dim and
    distant past (> .25 century), I did use a C compiler for an early
    microprocessor (without hardware floating point) that used base 16 as
    well.

    But base 16 would add complications and not really change the problem.
    You can't represent .02 exactly in a base 16 fraction, either.

    > So does anyone know of a machine for which there existed a C++
    > compiler (or even a C compiler) which doesn't use a base which
    > is a power of 2. I know that machines using base 10 existed in
    > the past, but the ones I know of were out of production long
    > before even C came along. Or maybe there is a compiler for IBM
    > mainframes which uses their decimal arithmetic, rather than
    > their floating point, for float and double (but I'd be very
    > surprised).


    Can't help you there, never used (or seen) anything other than base 2
    and base 16, and the base 16 was before C++ was even a bright idea in
    Bjarne's mind, I think.

    > > That means a value like .125, or .25, or .5 is exactly
    > > representable in the fractional part of floating point values,
    > > but fractions that are not 1/(a power of 2) are not. They get
    > > rounded to the nearest binary fraction.

    >
    > Just a nit, but that should be fractions that are not n/(a power
    > of 2), where n is an integer. Something like .75 is no problem
    > either. (Of course, if the power of 2 is greater than something
    > like 51, you might get problems with some of those as well.)


    You're right actually, n/(a power of 2) is better than my wording.

    > > > So, for example, 324 and 2 become 324.02. Now I want to
    > > > decode them using the function given below but it decodes
    > > > the example as 324 and 1, instead of 324 and 2.

    >
    > > Actually, it does not become 324.02, it becomes some value
    > > slightly greater or smaller than 324.02, because .02 cannot be
    > > exactly represented in a binary fraction.

    >
    > > > Can anyone tell me what's wrong and how to do this right?
    > > > (my code see below)

    >
    > > Your basic idea is wrong.

    >
    > Hard to say without really knowing what his basic idea is:).
    > Why does he want to do this? Anyway, two "obvious" solutions
    > come to mind:


    Without knowing his reasoning, I gave him the benefit of the doubt,
    and still decided that he was wrong. If he worked for my company, he
    wouldn't write code that way a second time after the first code
    review.

    > -- pass through a textual representation:
    >
    > std::ostringstream s1 ;
    > s1.precision( 2 ) ;
    > s1.setf( std::ios::fixed, std::ios::floatfield ) ;
    > s1 << encoded ;
    > std::istringstream s2( s1.str() ) ;
    > char dummyForDecimal ;
    > s2 >> i1 >> dummyForDecimal >> i2 ;
    >
    > -- use the correct functions from C:
    >
    > double i1d ;
    > i2 = nearbyint( 100.0 * modf( encoded, &i1d ) ) ;
    > i1 = i1d ;
    >
    > Modf is in C90, and thus in C++ (in <cmath>). Nearbyint is an
    > addition of C99, and thus will be in the next version of C++,
    > and is possibly already available in your current C++ compiler.
    > If not, replace the line with
    > i2 = floor( 100.0 * modf( encoded, &i1d ) + 0.5 ) ;
    > Although less robust, it should work for positive values
    > constructed as above.


    There are no "non-icky" ways to do this. If it's a space issue of some
    type, I will bet there are very few platforms where sizeof(std::div_t)
    is greater than sizeof(double).

    Putting the two values into a std::div_t would retain all the integer
    bits with no loss, and still allow easy conversion to a double if
    actually needed for some arcane purpose.
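
    A rough sketch of the std::div_t idea, with hypothetical helper names make_parts and to_double. The members are assigned by name because the order of quot and rem inside the struct is not specified:

```cpp
#include <cassert>
#include <cstdlib>

// Keep both integers intact in a std::div_t (members: quot and rem).
std::div_t make_parts(int i1, int i2) {
    assert(i2 >= 0 && i2 <= 99);
    std::div_t v;
    v.quot = i1;  // whole part
    v.rem = i2;   // hundredths, 0..99
    return v;
}

// Convert to double only at the point where a double is actually needed.
double to_double(const std::div_t& v) {
    return v.quot + v.rem / 100.0;
}
```

    The integers themselves are never reconstructed from the double, so the truncation problem does not arise. If the value arrives as a single count of hundredths, std::div(32402, 100) fills the same struct with quot 324 and rem 2.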

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://c-faq.com/
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Apr 2, 2008
    #6
  7. Markus Dehmann

    James Kanze Guest

    On Apr 2, 6:47 am, Jack Klein <> wrote:
    > On Tue, 1 Apr 2008 01:10:24 -0700 (PDT), James Kanze
    > <> wrote in comp.lang.c++:
    > > On Apr 1, 4:44 am, Jack Klein <> wrote:
    > > > On Mon, 31 Mar 2008 19:04:01 -0700 (PDT), Markus Dehmann
    > > > <> wrote in comp.lang.c++:


    > > > > double encoded = (double)i1 + (double)i2 / (double)100;


    > > The obvious question is: why? If you really do receive a value
    > > in this format (i.e. integer part and hundredths as two separate
    > > values), and want to treat it as a single value, fine, but then
    > > I don't understand why you want to go the other direction later.
    > > And I can't think of any other reason why one would want to do
    > > this.


    > I rather think I covered this in my next sentence:


    Which is:

    > > > This is not a very good idea, as you have found. The
    > > > floating point data types in C, and just about every other
    > > > computer language, use a fixed number of bits, and that
    > > > limits their precision.


    I can see reasons why one might want to do this on input. Some
    external source is providing an integral value, followed by an
    integral number of 100ths, and you want to do various
    calculations on those values. Since the input is with an
    accuracy of at most a 100th, you'll normally only output with
    this accuracy as well, and for most trivial computations, you can
    pretty much ignore the rounding errors (which will be far
    smaller).

    I can't see a reason why one would want to go back, however.
    (Maybe outputting to the same device?)

    [...]
    > > > > Can anyone tell me what's wrong and how to do this
    > > > > right? (my code see below)


    > > > Your basic idea is wrong.


    > > Hard to say without really knowing what his basic idea
    > > is:). Why does he want to do this? Anyway, two "obvious"
    > > solutions come to mind:


    > Without knowing his reasoning, I gave him the benefit of the doubt,
    > and still decided that he was wrong. If he worked for my company, he
    > wouldn't write code that way a second time after the first code
    > review.


    Even if it was what the requirements specification demanded?

    > There are no "non-icky" ways to do this. If its a space issue
    > of some type, I will bet there are very few platforms where
    > sizeof(std::div_t) is greater than sizeof(double).


    I can't really believe that it's a space issue, since a double
    generally is the size of two int, and his input is two ints.

    > Putting the two values into a std::div_t would retain all the
    > integer bits with no loss, and still allow easy conversion to
    > a double if actually needed for some arcane purpose.


    I wouldn't call computing a new value an "arcane purpose". And
    if some external device is providing input in this format, then
    you have to deal with it. The question is why the round trip.
    Why does he want to go back to the original format?

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Apr 2, 2008
    #7
