double -> text -> double

Discussion in 'C++' started by Ole Nielsby, Nov 29, 2006.

  1. Ole Nielsby

    Ole Nielsby Guest

    First, sorry if this is off-topic, not strictly being a C++ issue.
    I could not find a ng on numerics or serialization and I figure
    this ng is the closest I can get.

    Now the question:

    I want to serialize doubles in human-readable decimal form
    and be sure I get the exact same binary values when I read
    them back. (Right now, I don't care about NaN, infinities etc.)

    In essence, this boils down to converting a double to a
    (large) integer mantissa and a decimal exponent, and back,
    so that 3.1416 would be represented as {31416, -4}.

    I wrote a converter that always calculates the scaling
    separately and does it exactly the same way when reading
    and writing, using up to a 17-digit mantissa, which should
    be sufficient precision. I then tested it with "dirty" numbers
    generated by trigonometry and exp functions, and found
    that it seems to work OK for exponents in the range +/-20
    approximately, but outside of that range, a few percent of
    the numbers come out different.

    Does anybody know of an algorithm that is known to
    work?

    I tried an algorithm that, when converting double->text,
    would convert back to double and try to adjust the mantissa
    if this double was different from the original, but this only
    made things worse.

    Regards/Ole Nielsby
    Ole Nielsby, Nov 29, 2006
    #1

  2. Ole Nielsby wrote:
    > First, sorry if this is off-topic, not strictly being a C++ issue.
    > I could not find a ng on numerics or serialization and I figure
    > this ng is the closest I can get.


    It's good enough. Every language will have a solution, I am guessing,
    but it would be language-specific.

    > Now the question:
    >
    > I want to serialize doubles in human-readable decimal form
    > and be sure I get the exact same binary values when I read
    > them back. (Right now, I don't care about NaN, infinities etc.)
    >


    Output more digits than the precision of the 'double'. See the
    'std::numeric_limits' template.
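
    A minimal sketch of this suggestion, assuming IEEE 754 doubles and a
    C++11 or later library. std::numeric_limits<double>::max_digits10 was
    added in C++11 (it is 17 for IEEE doubles); the helper names
    write_double and read_double are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format x with enough significant digits that a
// correctly rounding reader recovers the same double (17 for IEEE double).
std::string write_double(double x) {
    assert(std::isfinite(x));  // NaN and infinities are out of scope here
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << x;
    return out.str();
}

// Hypothetical helper: parse the text back into a double.
double read_double(const std::string& text) {
    std::istringstream in(text);
    double x = 0.0;
    in >> x;
    assert(!in.fail());
    return x;
}
```

    With this, read_double(write_double(x)) == x should hold for finite x,
    provided the library's stream conversions round correctly. The price is
    readability: 0.1 is written as 0.10000000000000001.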

    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
    Victor Bazarov, Nov 29, 2006
    #2

  3. Ole Nielsby

    Noah Roberts Guest

    Ole Nielsby wrote:

    > I wrote a converter that always calculates the scaling
    > separately and does it exactly the same way when reading
    > and writing, using up to a 17-digit mantissa, which should
    > be sufficient precision. I then tested it with "dirty" numbers
    > generated by trigonometry and exp functions, and found
    > that it seems to work OK for exponents in the range +/-20
    > approximately, but outside of that range, a few percent of
    > the numbers come out different.


    How are you comparing results?

    How are you doing the conversion from two ints into a double?

    Maybe there is room in those two things for some minor errors.
    Noah Roberts, Nov 29, 2006
    #3
  4. Ole Nielsby

    Geo Guest

    Ole Nielsby wrote:
    > First, sorry if this is off-topic, not strictly being a C++ issue.
    > I could not find a ng on numerics or serialization and I figure
    > this ng is the closest I can get.
    >
    > Now the question:
    >
    > I want to serialize doubles in human-readable decimal form
    > and be sure I get the exact same binary values when I read
    > them back. (Right now, I don't care about NaN, infinities etc.)
    >
    > In essence, this boils down to converting a double to a
    > (large) integer mantissa and a decimal exponent, and back,
    > so that 3.1416 would be represented as {31416, -4}.
    >
    > I wrote a converter that always calculates the scaling
    > separately and does it exactly the same way when reading
    > and writing, using up to a 17-digit mantissa, which should
    > be sufficient precision. I then tested it with "dirty" numbers
    > generated by trigonometry and exp functions, and found
    > that it seems to work OK for exponents in the range +/-20
    > approximately, but outside of that range, a few percent of
    > the numbers come out different.
    >
    > Does anybody know of an algorithm that is known to
    > work?
    >
    > I tried an algorithm that, when converting double->text,
    > would convert back to double and try to adjust the mantissa
    > if this double was different from the original, but this only
    > made things worse.
    >
    > Regards/Ole Nielsby


    Do you mean on the same platform? Are you reading and writing from the
    same application?
    If so then it probably is possible, though you would need to output
    the full precision for the number. Saving in binary would be better.

    If you want to transfer the data between different platforms then it
    may not be possible. The two sets of hardware may not be able to
    exactly represent an identical set of doubles, you will have to live
    with some loss of accuracy. Anyway you should be prepared for this in
    all your double calculations, 'exact' with floating point numbers is
    not a very meaningful concept.
    Geo, Nov 29, 2006
    #4
  5. Floating point numbers that are not in the BCD format are usually stored
    so that the mantissa and the exponent are binary numbers and the
    exponent is a power of 2 instead of a power of 10.

    AFAIK the mantissa is not usually a binary integer but a fractional binary
    number. I.e. if you have a 64-bit mantissa with binary representation B
    the real value of the mantissa is something like B/(2^64).

    Suppose that we have a floating point number R, the binary
    representation of the mantissa of R is B (an unsigned integer) and the
    exponent of R is E. Suppose that the width of the mantissa is W bits.
    Then R = S * B/(2^W) * (2^E) where S=+-1 is the sign of the number.

    We can write R = S * B * 2^(E-W).

    If E-W >= 0 then R is an integer. We may find the largest nonnegative
    integer E' so that R is divisible by 10^E' and represent R as
    R = S * M' * 10 ^E'.

    If E-W < 0 we can write

    R = S * (B * 5^(W-E)) * 10^(E-W)

    where W-E > 0 and 5^(W-E) is an integer.
    If it is necessary we may check if B * 5^(W-E) is divisible by some
    power of 10 and write

    R = S * (B * 5^(W-E) * 10^(-E')) * 10^(E-W+E')

    where B * 5^(W-E) is divisible by 10^E' and E' is a nonnegative integer.
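
    The derivation above can be sketched in code. This is only an
    illustration, assuming IEEE 754 doubles (53-bit mantissa); the names
    ExactDecimal, mul_small and exact_decimal are made up here. It uses
    frexp to obtain an integer b and a binary exponent e with |R| = b * 2^e,
    then applies the power-of-5 trick with a simple digit-vector
    multiplication, so the result is the exact decimal value of the double.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type: value = sign * digits * 10^exp10, exactly.
struct ExactDecimal {
    bool negative;
    std::string digits;  // decimal digits of the integer mantissa
    int exp10;           // decimal exponent, always <= 0 in this sketch
};

// Multiply a little-endian vector of decimal digits by a small factor.
static void mul_small(std::vector<int>& d, int factor) {
    int carry = 0;
    for (std::size_t i = 0; i < d.size(); ++i) {
        int t = d[i] * factor + carry;
        d[i] = t % 10;
        carry = t / 10;
    }
    while (carry > 0) { d.push_back(carry % 10); carry /= 10; }
}

ExactDecimal exact_decimal(double x) {
    assert(std::isfinite(x));
    ExactDecimal r{std::signbit(x), "0", 0};
    if (x == 0.0) return r;

    // |x| = b * 2^e with b an integer below 2^53; this plays the role of
    // B * 2^(E-W) in the post above.
    int e = 0;
    double f = std::frexp(std::fabs(x), &e);           // f in [0.5, 1)
    std::uint64_t b = static_cast<std::uint64_t>(std::ldexp(f, 53));
    e -= 53;
    while ((b & 1) == 0) { b >>= 1; ++e; }             // drop trailing zero bits

    std::vector<int> d;                                // little-endian digits of b
    for (; b > 0; b /= 10) d.push_back(static_cast<int>(b % 10));

    if (e >= 0) {
        for (int i = 0; i < e; ++i) mul_small(d, 2);   // integer: b * 2^e
    } else {
        for (int i = 0; i < -e; ++i) mul_small(d, 5);  // b * 5^(-e) * 10^e
        r.exp10 = e;
    }
    r.digits.assign(d.size(), '0');
    for (std::size_t i = 0; i < d.size(); ++i)
        r.digits[d.size() - 1 - i] = static_cast<char>('0' + d[i]);
    return r;
}
```

    For example, exact_decimal(0.375) yields digits "375" with exp10 = -3.
    Because b is made odd before scaling, b * 5^(-e) is never divisible by
    10, so no further power of 10 (the E' above) can be factored out in the
    fractional case. The exact form can need hundreds of digits, far more
    than the 17 that suffice for a round trip.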

    You should check the details of the floating point format that you are
    using. AFAIK the exponent is not always represented in 2's complement
    representation and it is possible that you have to add 1 in front of the
    binary representation B of the mantissa (the real value of the
    mantissa would be S*(1 + B/(2^W))).
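
    For the common IEEE 754 binary64 layout, the details mentioned above
    look roughly as follows. This is a sketch that assumes double is
    binary64 (checked with a static_assert); DoubleFields, split_fields and
    join_fields are hypothetical names. The exponent is stored with a bias
    of 1023 rather than in 2's complement, and for normal numbers the
    leading 1 of the mantissa is implicit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical struct for the three fields of an IEEE 754 binary64 value.
struct DoubleFields {
    bool negative;
    int biased_exponent;     // 11 bits, stored with a bias of 1023
    std::uint64_t fraction;  // the 52 explicitly stored mantissa bits
};

DoubleFields split_fields(double x) {
    static_assert(std::numeric_limits<double>::is_iec559 &&
                  sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes IEEE 754 binary64");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bits
    DoubleFields f;
    f.negative = (bits >> 63) != 0;
    f.biased_exponent = static_cast<int>((bits >> 52) & 0x7FF);
    f.fraction = bits & ((std::uint64_t{1} << 52) - 1);
    return f;
}

// Reassemble the bit pattern; the inverse of split_fields.
double join_fields(const DoubleFields& f) {
    assert(f.biased_exponent >= 0 && f.biased_exponent <= 0x7FF);
    assert(f.fraction < (std::uint64_t{1} << 52));
    std::uint64_t bits = (static_cast<std::uint64_t>(f.negative) << 63) |
                         (static_cast<std::uint64_t>(f.biased_exponent) << 52) |
                         f.fraction;
    double x = 0.0;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}
```

    For a normal number the value is
    (-1)^negative * (1 + fraction / 2^52) * 2^(biased_exponent - 1023).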

    See also

    http://en.wikipedia.org/wiki/IEEE_Floating_Point_Standard

    --
    Tommi Höynälänmaa
    sähköposti / e-mail:
    kotisivu / homepage: http://www.iki.fi/tohoyn/
    Tommi Höynälänmaa, Nov 29, 2006
    #5
  6. If you do not need to have the serialized number in a human readable
    format (and you use floating point numbers whose exponent has base of 2,
    such as IEEE) it is far easier and more efficient to serialize the
    number so that the base of the exponent is not converted from 2 to 10.

    So if you have R = S * (1 + B / (2^W)) * 2^E you only need to print the
    integers S, B, and E (and W if it is not assumed to be a constant).
    These integers can also be printed in hexadecimal format.

    --
    Tommi Höynälänmaa
    sähköposti / e-mail:
    kotisivu / homepage: http://www.iki.fi/tohoyn/
    Tommi Höynälänmaa, Nov 29, 2006
    #6
  7. If you do not need to have the serialized number in a human readable
    format (and you use floating point numbers whose exponent has base of 2,
    such as IEEE) it is far easier and more efficient to serialize the
    number so that the base of the exponent is not converted from 2 to 10.

    So if you have R = S * (1 + B / (2^W)) * 2^E you only need to print the
    integers S, B, and E (and W if it is not assumed to be a constant).
    These integers can also be printed in hexadecimal format.
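
    One ready-made form of this idea is the hexadecimal floating-point
    notation of C99, also available in C++11: the "%a" conversion of printf
    writes the sign, the mantissa in hex and the power-of-2 exponent, and
    strtod reads that notation back. A short sketch, with to_hex_text and
    from_hex_text as made-up helper names:

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: write x in hexadecimal floating-point notation.
// No base conversion takes place, so no rounding is involved.
std::string to_hex_text(double x) {
    assert(std::isfinite(x));
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%a", x);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, static_cast<std::size_t>(n));
}

// Hypothetical helper: strtod accepts the same notation on input.
double from_hex_text(const std::string& text) {
    char* end = nullptr;
    double x = std::strtod(text.c_str(), &end);
    assert(end != text.c_str());  // at least one character was consumed
    return x;
}
```

    The exact spelling of the output (for instance the leading hex digit)
    may vary between libraries, but any conforming strtod should read it
    back to the same value. The text is compact, though less readable than
    decimal.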

    --
    Tommi Höynälänmaa
    sähköposti / e-mail:
    kotisivu / homepage: http://www.iki.fi/tohoyn/
    Tommi Höynälänmaa, Nov 29, 2006
    #7
  8. "Ole Nielsby" <> wrote in message
    news:456db67f$0$49204$...

    > I want to serialize doubles in human-readable decimal form
    > and be sure I get the exact same binary values when I read
    > them back. (Right now, I don't care about NaN, infinities etc.)


    > In essence, this boils down to converting a double to a
    > (large) integer mantissa and a decimal exponent, and back,
    > so that 3.1416 would be represented as {31416, -4}.


    > I wrote a converter that always calculates the scaling
    > separately and does it exactly the same way when reading
    > and writing, using up to a 17-digit mantissa, which should
    > be sufficient precision. I then tested it with "dirty" numbers
    > generated by trigonometry and exp functions, and found
    > that it seems to work OK for exponents in the range +/-20
    > approximately, but outside of that range, a few percent of
    > the numbers come out different.


    > Does anybody know of an algorithm that is known to
    > work?


    Such an algorithm exists, but it's not easy.

    If I remember correctly, the IEEE 754 floating-point standard requires that
    when you convert a character string to floating-point, the result must be
    equal to what you would get if you correctly rounded the infinite-precision
    representation of that character string. When you convert a floating-point
    value to a string with enough digits, the result must be within 0.47 LSB of
    the exact binary value. This latter constraint guarantees that converting a
    floating-point number to character and back to floating-point will give you
    exactly the same result, provided that there are enough digits in the
    character version. Proving that the constraint was sufficient was Jerome
    Coonen's PhD thesis, which suggests how difficult the problem is.

    So if your implementation meets the IEEE 754 standard, the problem is easy
    to solve :)

    If it doesn't meet the standard, you have to figure it out yourself. Either
    you have to implement something that's as good as the standard, which isn't
    easy, or you're going to have to come up with another way of doing it that
    you can prove is as good, which is even harder.
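
    If the library conversions can be trusted in this way, the "easy"
    solution might look like the sketch below. It assumes that snprintf and
    strtod round correctly, as described above, and tries increasing
    precisions until the text reads back to the same value, so short
    numbers stay short. The name shortest_round_trip is made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: the shortest "%.<p>g" text, p = 1..17, that strtod
// maps back to exactly x. If no shorter text matches, the 17-digit text is
// returned.
std::string shortest_round_trip(double x) {
    assert(std::isfinite(x));
    char buf[64];
    for (int precision = 1; precision <= 17; ++precision) {
        std::snprintf(buf, sizeof buf, "%.*g", precision, x);
        if (std::strtod(buf, nullptr) == x) break;  // reads back exactly
    }
    return buf;
}
```

    Here 3.1416 is written as "3.1416" instead of its 17-digit form. The
    loop costs up to 17 conversions per number, which is wasteful but keeps
    the sketch simple.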
    Andrew Koenig, Nov 29, 2006
    #8
  9. On 2006-11-29 17:34, Ole Nielsby wrote:
    > First, sorry if this is off-topic, not strictly being a C++ issue.
    > I could not find a ng on numerics or serialization and I figure
    > this ng is the closest I can get.
    >
    > Now the question:
    >
    > I want to serialize doubles in human-readable decimal form
    > and be sure I get the exact same binary values when I read
    > them back. (Right now, I don't care about NaN, infinities etc.)


    Perhaps I'm missing something here but for each value of a double there
    exists a real number, so step 1 would be to output all of the double as
    text (base 10 is nice but any would do). Step 2 would then be to read
    the double in again. If you have written the exact value of the double
    then when parsing the text into a double there should exist only one
    possible representation of that value which is the one that ought to be
    chosen.

    You could run into trouble when reading in a double if there exists no
    exact representation for it (as others have pointed out) but since the
    value was a double from the beginning an exact representation must exist.

    I've thrown together a small program that does just this using
    stringstreams to convert to and from strings which seems to work. It's
    not something I'm proud of (put together from pieces of code from other
    projects and some found on the net) but it should give you an idea of
    how to do it. Of course, this depends on the stringstreams to correctly
    translate from double to text and back again, if they don't you have a
    problem, but I expect that any compliant implementation can do this
    correctly.

    Code here: http://www.chalmers.it/~eriwik/main.cpp

    --
    Erik Wikström
    Erik Wikström, Nov 29, 2006
    #9
  10. Ole Nielsby

    Ole Nielsby Guest

    Andrew Koenig <> wrote:

    > "Ole Nielsby" <> wrote in message
    > news:456db67f$0$49204$...
    >
    >> I want to serialize doubles in human-readable decimal form
    >> and be sure I get the exact same binary values when I read
    >> them back. (Right now, I don't care about NaN, infinities etc.)

    >
    >> In essence, this boils down to converting a double to a
    >> (large) integer mantissa and a decimal exponent, and back,
    >> so that 3.1416 would be represented as {31416, -4}.

    >
    >> I wrote a converter [...] but [...] a few percent of
    >> the numbers come out different.

    >
    >> Does anybody know of an algorithm that is known to
    >> work?

    >
    > Such an algorithm exists, but it's not easy.
    >
    > If I remember correctly, the IEEE 754 floating-point standard requires
    > that when you convert a character string to floating-point, the result
    > must be equal to what you would get if you correctly rounded the
    > infinite-precision representation of that character string. When you
    > convert a floating-point value to a string with enough digits, the result
    > must be within 0.47 LSB of the exact binary value. This latter constraint
    > guarantees that converting a floating-point number to character and back
    > to floating-point will give you exactly the same result, provided that
    > there are enough digits in the character version. Proving that the
    > constraint was sufficient was Jerome Coonen's PhD thesis, which suggests
    > how difficult the problem is.
    >
    > So if your implementation meets the IEEE 754 standard, the problem is easy
    > to solve :)
    >
    > If it doesn't meet the standard, you have to figure it out yourself.
    > Either you have to implement something that's as good as the standard,
    > which isn't easy, or you're going to have to come up with another way of
    > doing it that you can prove is as good, which is even harder.


    The setting is this: I am implementing a homebrew fp language (PILS)
    by writing an interpreter in C++. Like Lisp, simple data can be serialized
    by outputting them in the syntax of the language. It is important that
    this doesn't change numbers, i.e. if a number is printed and re-read by
    the same process, it must be the same.

    The current implementation is in VC8/Win32 and stores numbers as
    double, i.e. 64 bit fpu format. The precision model is set to "high"
    which means the FPU uses a 64 bit mantissa internally, but the mantissa
    is rounded to 53 bits (52 stored explicitly) when stored in a double
    variable. I use up to 18 digit integers, which should be more than the
    17 digits required for a 53 bit mantissa.

    To isolate the precision issues from formatting details, I wrote
    a small class that does the conversion to/from a long long
    mantissa and a decimal exponent.

    My conversion class looks as follows (please bear with my
    less-than-perfect C++ habits, I took up C++ to implement
    PILS because it seems next to impossible to interface asm
    to .NET...). Note: the power of 10 to multiply or divide is
    constructed naively by multiplying tens; this is not the
    optimal solution for large exponents, but this shouldn't
    make the numbers differ - the scale is constructed in
    the same way for reading/writing.


    class FloatSplit {
    public:
        long long mantissa;
        long exponent;
        double get(); //FloatSplit -> double
        void set(double value); //double -> FloatSplit
    };

    double FloatSplit::get() //after reading a number, convert to double
    {
        double scale = 1;
        double value = (double)mantissa;
        if (exponent > 0) {
            // Naive scale computation
            for (long e = 0; e < exponent; e++) scale *= 10;
            value = mantissa * scale;
        }
        else if (exponent < 0) {
            // Naive scale computation
            for (long e = 0; e > exponent; e--) scale *= 10;
            value = mantissa / scale;
        }
        return value;
    }

    void FloatSplit::set(double value) //Split the value for writing
    {
        mantissa = (long long)value;
        exponent = 0;
        if ((double)mantissa == value) return; /*integral values*/
        double absValue = value < 0 ? -value : value;
        double scale = 1;
        if (absValue >= 1e18) {
            exponent++;
            // Naive scale computation
            scale *= 10;
            while (absValue / scale >= 1e18 && exponent < 1000) {
                exponent++;
                // Naive scale computation
                scale *= 10;
            }
            mantissa = (long long)(absValue / scale);
            /* try to adjust mantissa - disabled, made things worse */
            // if (absValue > (double)mantissa * scale) mantissa++;
            // if (absValue < (double)mantissa * scale) mantissa--;
        }
        else if (absValue < 1e17) {
            while (absValue * scale < 1e17
                   && absValue != (double)mantissa / scale
                   && exponent > -1000)
            {
                // Naive scale computation
                scale *= 10;
                exponent--;
                mantissa = (long long)(absValue * scale);
                /* try to adjust mantissa - disabled, made things worse */
                // if (absValue > (double)mantissa / scale) mantissa++;
                // if (absValue < (double)mantissa / scale) mantissa--;
            }
        }
        if (value < 0) mantissa = -mantissa;
    }

    I tested like this:
    for (int i = -300; i <= 300; i++) testConvert(exp(i) * sin(i));
    and it failed for i = -278, -210, 61, 109, 129, 144, 160, 161,
    167, 172, 187, 200, 209, 220, 223, 245, 249, 253, 259, 262, 269,
    280, 299, 300.

    ---end---
    Ole Nielsby, Nov 30, 2006
    #10
  11. Ole Nielsby

    Kai-Uwe Bux Guest

    Ole Nielsby wrote:

    > First, sorry if this is off-topic, not strictly being a C++ issue.
    > I could not find a ng on numerics or serialization and I figure
    > this ng is the closest I can get.
    >
    > Now the question:
    >
    > I want to serialize doubles in human-readable decimal form
    > and be sure I get the exact same binary values when I read
    > them back. (Right now, I don't care about NaN, infinities etc.)


    Try something like this:

    #include <limits>
    #include <sstream>
    #include <string>
    #include <stdexcept>
    #include <cmath>
    #include <iomanip>

    template < typename Float >
    std::string to_string ( Float f ) {
        std::stringstream in;
        unsigned long const digits =
            static_cast< unsigned long >
            ( - std::log( std::numeric_limits<Float>::epsilon() )
              / std::log( 10.0 ) );
        if ( in << std::dec << std::setprecision(2+digits) << f ) {
            return ( in.str() );
        } else {
            throw ( std::invalid_argument( "conversion float to string failed" ) );
        }
    }

    template < typename Float >
    Float to_float ( std::string const & str ) {
        std::stringstream out ( str );
        Float result;
        if ( out >> result ) {
            return ( result );
        } else {
            throw ( std::invalid_argument( "conversion string to float failed" ) );
        }
    }

    #include <iostream>

    int main ( void ) {
        volatile double pi = 3.141592653589793238;
        std::string rep = to_string( pi );
        std::cout << rep << '\n';
        volatile double x = to_float<double>( rep );
        std::cout << ( x == pi ) << '\n';
    }


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Nov 30, 2006
    #11
  12. The errors you get may occur because you use floating point arithmetic
    for handling the mantissa and the exponent. Try to extract the mantissa
    and exponent as integer numbers out of the binary representation of the
    floating point number and use integer arithmetic for them in
    FloatSplit::set.
    You should also use integer arithmetic in FloatSplit::get.
    If you divide an integer by some power of 10 you may get a
    number whose binary representation has an infinitely long
    fractional part (i.e. something like 0.33333... with decimal numbers).
    This causes small rounding errors when the result is represented as a
    floating point number.

    Note that 64-bit integers may not be sufficient for this. OTOH, the
    width of the mantissa for IEEE double is less than 64 bits so it may be
    possible to handle that with 64-bit integers.
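
    A different route to the same goal, sketched here as an alternative to
    hand-written integer arithmetic: let the C library do the scaling. This
    is not the FloatSplit code from the thread. It assumes that snprintf
    and strtod round correctly and that the "C" locale is in effect; the
    names DecimalSplit, split_decimal and join_decimal are made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>

// Hypothetical counterpart of FloatSplit: value = mantissa * 10^exponent.
struct DecimalSplit {
    long long mantissa;  // signed, at most 17 significant decimal digits
    long exponent;
};

DecimalSplit split_decimal(double value) {
    assert(std::isfinite(value));
    char buf[64];
    // One digit before the point plus 16 after it: 17 significant digits.
    std::snprintf(buf, sizeof buf, "%.16e", value);

    DecimalSplit s = {0, 0};
    const char* p = buf;
    bool negative = (*p == '-');
    if (negative) ++p;
    for (; *p != 'e'; ++p)                     // collect digits, skip the point
        if (*p >= '0' && *p <= '9') s.mantissa = s.mantissa * 10 + (*p - '0');
    s.exponent = std::strtol(p + 1, nullptr, 10) - 16;

    while (s.mantissa != 0 && s.mantissa % 10 == 0) {  // drop trailing zeros
        s.mantissa /= 10;
        ++s.exponent;
    }
    if (s.mantissa == 0) s.exponent = 0;
    if (negative) s.mantissa = -s.mantissa;
    return s;
}

double join_decimal(const DecimalSplit& s) {
    char buf[64];
    // Print "<mantissa>e<exponent>" and let strtod do the rounding; no
    // floating point multiplication or division by a power of 10 happens.
    std::snprintf(buf, sizeof buf, "%llde%ld", s.mantissa, s.exponent);
    return std::strtod(buf, nullptr);
}
```

    Because the scaling never goes through a double, the inexact powers of
    10 that the naive loop produces for large exponents do not enter the
    result. Note that 3.1416 comes out as a 17-digit mantissa here rather
    than {31416, -4}; getting the short form needs a search for the
    shortest digit string that still reads back exactly.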

    See also

    http://www.gnu.org/software/libc/ma...zation-Functions.html#Normalization-Functions


    --
    Tommi Höynälänmaa
    sähköposti / e-mail:
    kotisivu / homepage: http://www.iki.fi/tohoyn/
    Tommi Höynälänmaa, Nov 30, 2006
    #12