converting floating point types round off error ....

Discussion in 'C++' started by ma740988, Oct 5, 2008.

  1. ma740988

    ma740988 Guest

    Consider the equation (flight dynamics stuff):

    Yaw (Degrees) = Azimuth Angle(Radians) * 180 (Degrees) /
    3.1415926535897932384626433832795 (Radians)

    There's a valid reason to use single precision floating point types.
    The number of decimal digits guaranteed to be correct on my
    implementation is 6. (i.e numeric_limits < float >::digits10 = 6 )

    If I'm reading the IEEE standard correctly, I could paraphrase the issue
    surrounding conversion to a string and back _without_ loss of
    precision as follows:

    If a float is converted to a decimal string with at least 6 significant
    decimal digits, and then converted back to a float, then the final
    number must match the original.

    IOW: given
    float a = 1.F ;
    float aa = 0. ;
    std::stringstream s ;
    s.precision ( 6 ) ;
    s << std::scientific << a ;
    s >> aa;
    assert ( a == aa ) ;

    No sweat

    I have to serialize the Yaw answer above. The question: Is it safe to
    state that my PI representation is useless beyond six significant
    digits? I'd like for the C++ source to reflect my Matlab models but
    I'm starting to get concerned here with the conversion aspect.

    Is there a good source out there that will show me how far out I could
    represent a value ( say PI ) for both single and double precision
    before truncation/round off loss kicks in? ( I tend to struggle with
    numeric_limits at times + coupled with all the idiosyncrasies of
    machines and floating point types )
     
    ma740988, Oct 5, 2008
    #1

  2. ma740988

    Hans Bos Guest

    "ma740988" <> schreef in bericht
    news:...
    > Consider the equation (flight dynamics stuff):
    >
    > Yaw (Degrees) = Azimuth Angle(Radians) * 180 (Degrees) /
    > 3.1415926535897932384626433832795 (Radians)

    Note that here 3.141... is converted to a double not a float.
    >
    > There's a valid reason to use single precision floating point types.
    > The number of decimal digits guaranteed to be correct on my
    > implementation is 6. (i.e numeric_limits < float >::digits10 = 6 )
    >

    ....
    >
    > I have to serialize the Yaw answer above. The question: Is it safe to
    > state that my PI representation is useless beyond six significant
    > digits? I'd like for the C++ source to reflect my Matlab models but
    > I'm starting to get concerned here with the conversion aspect.



    The 6 digits is a minimal guarantee. That is if you convert a number with 6
    decimal digits to a float and then back to a string, the result is the same.

    When converting PI to a float, you want the floating point number that is
    closest to the number PI, not a float that, when converted to a
    string, will have the same 6 digits as PI.
    So using more digits can give you a single precision floating point number
    closer to PI.

    Note also that every operation can result in a rounding error. So if you
    divide by an approximation of PI, and the result cannot be represented
    exactly as a floating point number, the result will be rounded.
    So even if your input is accurate to 6 digits, you should use double (or
    even long double) to perform your calculations.
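
    A minimal sketch of that advice, assuming IEEE float and double. The
    names kPi, yaw_degrees, yaw_degrees_6digit and yaw_error are made up for
    illustration; they are not from the original model. The first function
    does the arithmetic in double and narrows once at the end, the second
    uses a six-digit PI in float throughout.

```cpp
#include <cassert>
#include <cmath>

// Full-precision pi. The literal has type double, so arithmetic with it is done in double.
constexpr double kPi = 3.14159265358979323846;

// Convert radians to degrees in double, then narrow to float once at the end.
inline float yaw_degrees(float azimuth_rad) {
    assert(std::isfinite(azimuth_rad));
    return static_cast<float>(azimuth_rad * 180.0 / kPi);
}

// For comparison: pi cut to six significant digits, all arithmetic in float.
inline float yaw_degrees_6digit(float azimuth_rad) {
    return azimuth_rad * 180.0f / 3.14159f;
}

// Absolute error of a computed yaw against a reference evaluated in double.
inline double yaw_error(float computed, double azimuth_rad) {
    return std::fabs(static_cast<double>(computed) - azimuth_rad * 180.0 / kPi);
}
```

    For an azimuth of 1 radian, the six-digit version should be off by
    roughly 5e-5 degrees, while the double version is limited only by the
    final rounding to float.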

    See "27 bits are not enough for 8-digit accuracy" by I. Bennett Goldberg,
    "What every computer scientist should know about floating-point arithmetic"
    by David Goldberg, and the home page of William Kahan
    (http://www.cs.berkeley.edu/~wkahan/) for more info.

    Greetings,
    Hans.
     
    Hans Bos, Oct 5, 2008
    #2

  3. ma740988

    James Kanze Guest

    On Oct 5, 1:13 am, ma740988 <> wrote:
    > Consider the equation (flight dynamics stuff):


    > Yaw (Degrees) = Azimuth Angle(Radians) * 180 (Degrees) /
    > 3.1415926535897932384626433832795 (Radians)


    > There's a valid reason to use single precision floating point
    > types.
    > The number of decimal digits guaranteed to be correct on my
    > implementation is 6. (i.e numeric_limits < float >::digits10 = 6 )


    I'm not quite sure what you mean by "number of decimal digits
    guaranteed to be correct". Correct compared to what? To
    represent an IEEE floating point exactly in decimal, you need
    something like 24 digits. Typically, however, there's
    absolutely no need to represent it exactly.

    What numeric_limits<>::digits10 guarantees is that any decimal
    number with that many digits, converted to the floating point
    type and back to decimal with the same number of digits, will
    result in the same decimal number. (Indirectly, this also
    guarantees that two different decimal numbers with no more
    digits will result in two different floating point numbers in
    the machine.)
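
    The digits10 guarantee can be sketched as follows. The helper name
    decimal_round_trip is hypothetical, and the sketch assumes the
    library's text conversions are correctly rounded. It goes decimal to
    float to decimal, which is the direction digits10 covers.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>

// Parse a decimal string into a float, then print it back with
// numeric_limits<float>::digits10 significant digits (6 for IEEE single).
inline std::string decimal_round_trip(const std::string& decimal) {
    float value = 0.0f;
    std::istringstream in(decimal);
    in >> value;
    assert(!in.fail());  // the input must be a valid number

    std::ostringstream out;
    out.precision(std::numeric_limits<float>::digits10);
    out << value;  // default format: precision counts significant digits
    return out.str();
}
```

    Any input with at most six significant digits (and within float's range)
    should come back unchanged, e.g. decimal_round_trip("3.14159").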

    > If I'm reading the IEEE standard correctly, I could paraphrase the
    > issue surrounding conversion to a string and back _without_
    > loss of precision as follows:


    > If a float is converted to a decimal string with at least 6
    > significant decimal digits, and then converted back to a
    > float, then the final number must match the original.


    > IOW: given
    > float a = 1.F ;
    > float aa = 0. ;
    > std::stringstream s ;
    > s.precision ( 6 ) ;
    > s << std::scientific << a ;
    > s >> aa;
    > assert ( a == aa ) ;


    Except that numeric_limits<>::digits10 doesn't make any
    guarantees about converting to a string and back. For this, you
    need numeric_limits<>::max_digits10, which will only be
    available in the next version of the standard. (For an IEEE
    float, the value is 9.)

    Note that even a little bit of reasoning will reveal that 6
    isn't enough. The mantissa of an IEEE floating point is 24
    bits; since the high order bit is always 1, this means that it
    can take on 2^23 different values. Since 2^23 > 10^6, quite
    clearly some different values will map to the same 6 digit
    decimal value.
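
    A small sketch of the other direction (float to text to float),
    assuming IEEE single precision, a C++11 library (for max_digits10) and
    correctly rounded conversions. The helper name text_round_trip is made
    up for illustration.

```cpp
#include <cassert>
#include <limits>
#include <sstream>

// Write a float with `digits` significant digits, then read it back.
inline float text_round_trip(float value, int digits) {
    assert(digits > 0);
    std::stringstream s;
    s.precision(digits);
    s << value;  // default format: precision counts significant digits
    float result = 0.0f;
    s >> result;
    return result;
}
```

    With value = 3.14159265358979f, six digits print as "3.14159", which
    reads back as a different float. Nine digits (the value of
    std::numeric_limits<float>::max_digits10 for IEEE single) recover the
    original. A value such as 1.0f survives with six digits only because it
    happens to need few digits.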

    > No sweat


    > I have to serialize the Yaw answer above. The question: Is it
    > safe to state that my PI representation is useless beyond six
    > significant digits?


    Certainly not. Nine digits may be sufficient, however.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Oct 6, 2008
    #3
  4. ma740988

    ma740988 Guest

    I'd like to thank you all (James - as always) for clearing up my
    confusion here. The claim was made that since the data is being
    serialized I'll be subject to round off and truncation errors above 6
    digits (for single precision floating point types). As Hans pointed
    out, the 6 digits is a minimal guarantee. As you all pointed out (and
    I'm paraphrasing), the excess precision doesn't hurt. I have a
    follow-on question. Given this source:


    #include <cstring>   // std::memcpy
    #include <iostream>

    class Serializer {

        // Reverses the bytes of var in place.
        template <typename T>
        void Swap( T& var ) {
            char* start,*end;
            char tmp;
            start = (char * ) &var;
            end = (char *)&var;
            end += sizeof(T)-1;
            while(start < end) {
                tmp = *start;
                *start = *end;
                *end = tmp;
                start++;end--;
            }
        }

    public :

        //////////////////////////////////////////
        /// @name Overloaded functions using "double"
        //////////////////////////////////////////
        char* put_data(char* out, const double& source) {
            *(double *)out = source;
            return out + sizeof(double);
        }
        char* get_data(double& target, char* source) {
            target = *(double *)source;
            return source + sizeof(double);
        }
        char* put_swapped_data(char* out, const double& source) {
            *(double *)out = source;
            Swap(*(double *)out);
            return out + sizeof(double);
        }
        char* get_swapped_data(double& target, char* source) {
            target = *(double *)source;
            Swap(target);
            return source + sizeof(double);
        }

        //////////////////////////////////////////
        /// @name Overloaded functions using "float"
        //////////////////////////////////////////
        char* put_data(char* out, const float& source) {
            *(float *)out = source;
            return out + sizeof(float);
        }
        char* get_data(float& target, char* source) {
            target = *(float *)source;
            return source + sizeof(float);
        }
        char* put_swapped_data(char* out, const float& source) {
            *(float *)out = source;
            Swap(*(float *)out);
            return out + sizeof(float);
        }
        char* GetSwappedData(float& target, char* source) {
            target = *(float *)source;
            Swap(target);
            return source + sizeof(float);
        }

        ///////////////////////////////////////////////
        /// @name Overloaded functions using "unsigned char"
        ///////////////////////////////////////////////
        char* put_data(char* out, const unsigned char& source) {
            *(unsigned char *)out = source;
            return out + sizeof(unsigned char);
        }
        char* get_data(unsigned char& target, char* source) {
            target = *(unsigned char *)source;
            return source + sizeof(unsigned char);
        }
        char* put_swapped_data(char* out, const unsigned char& source) {
            *(unsigned char *)out = source;
            return out + sizeof(unsigned char);
        }
        char* GetSwappedData(unsigned char& target, char* source) {
            target = *(unsigned char *)source;
            return source + sizeof(unsigned char);
        }

        ///////////////////////////////////////////////
        /// @name Overloaded functions using short*
        ///////////////////////////////////////////////
        char* put_data(char* out, short* source, unsigned length16BitUnits) {
            std::memcpy(out, (char *)source, length16BitUnits*2);
            return out + length16BitUnits*2;
        }

        char* get_data(short* target, char* source, unsigned length16BitUnits) {
            std::memcpy((char *)target, source, length16BitUnits*2);
            return source + length16BitUnits*2;
        }

        char* put_swapped_data(char* out, short* source, unsigned length16BitUnits) {
            unsigned i;
            char* tmp = put_data(out, source, length16BitUnits);
            short* tmpShort = (short *)out;
            for(i = 0; i < length16BitUnits; i++) {
                Swap(*tmpShort);
                tmpShort++;
            }
            return tmp;
        }
        char* GetSwappedData(short* target, char* source, unsigned length16BitUnits) {
            unsigned i;
            char* tmp = get_data(target, source, length16BitUnits);
            short* tmpShort = target;
            for(i = 0; i < length16BitUnits; i++) {
                Swap(*tmpShort);
                tmpShort++;
            }
            return tmp;
        }

    };

    int main () {
        Serializer is ;
        int const size_of_type = 40 ;
        double source = 3.1415926535897932384626433832795;
        char buffer [ size_of_type ] = { 0 };
        char *ptr = is.put_data ( buffer, source ) ;
        //std::cout << *ptr << std::endl;
        std::cin.get() ;
    }

    First I think this ought to be written to use generic programming.
    That aside (I'm not a fan of char* but the vendor string facility -
    from what i understand is lacking), how would I verify that the
    contents of source is in the buffer and what is the required buffer
    size? (i.e based on the function prototype - should the buffer size be
    size of type - double in this case )
     
    ma740988, Oct 6, 2008
    #4
  5. ma740988

    James Kanze Guest

    On Oct 6, 2:38 pm, ma740988 <> wrote:
    > I'd like thank you all (James - as always) for clearing up my
    > confusion here. The claim was made that since the data is
    > being serialized I'll be subject to round off and truncation
    > errors above 6 digits (for single precision floating point
    > types). As Hans pointed out the 6 digits is a minimal
    > guarantee. As you all pointed out (and I'm paraphrasing) the
    > excess precision doesn't hurt. I have a follow- on question:
    > Now given this source.


    > # include <iostream>


    > class Serializer {
    >
    > template <typename T>
    > void Swap( T& var ) {
    > char* start,*end;
    > char tmp;
    > start = (char * ) &var;
    > end = (char *)&var;


    That's a reinterpret_cast. That should tell you immediately
    that something is wrong.

    > end += sizeof(T)-1;
    > while(start < end) {
    > tmp = *start;
    > *start = *end;
    > *end = tmp;
    > start++;end--;
    > }
    > }


    And the entire function looks very much like std::reverse< char* >,
    called with a reinterpret_cast, e.g.:

    template< typename T >
    void swap( T& var )
    {
        std::reverse( reinterpret_cast< char* >( &var ),
                      reinterpret_cast< char* >( &var + 1 ) ) ;
    }

    Any attempt to access the argument after having called this
    function (unless T is a character type) is undefined behavior.
    If T is an integral type, it will simply give an unspecified
    value on most modern machines; if T is a floating point type,
    there's a good chance of a core dump.
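
    One way to sidestep the problem, as a sketch only: reverse a copy of the
    bytes and leave the object itself alone. The name reversed_bytes is
    hypothetical, and the sketch assumes T is trivially copyable (true for
    the arithmetic types used here).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstring>

// Return the object representation of `value` with its bytes reversed.
// The object itself is never modified, so no invalid float or double is formed.
template <typename T>
std::array<unsigned char, sizeof(T)> reversed_bytes(const T& value) {
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &value, sizeof(T));  // copy out the raw bytes
    std::reverse(bytes.begin(), bytes.end());      // swap byte order on the copy
    return bytes;
}
```

    The resulting array can be written to the output buffer directly; the
    swapped pattern is only ever handled as raw bytes.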

    > public :


    > //////////////////////////////////////////
    > /// @name Overloaded functions using "double"
    > //////////////////////////////////////////
    > char* put_data(char* out, const double& source) {
    > *(double *)out = source;


    And this will core dump 7 times in 8 on my machine (Sun Sparc).

    > return out + sizeof(double);
    > }
    > char* get_data(double& target, char* source) {
    > target = *(double *)source;


    As will this.

    > return source + sizeof(double);
    > }
    > char* put_swapped_data(char* out, const double& source) {
    > *(double *)out = source;


    And this.

    > Swap(*(double *)out);
    > return out + sizeof(double);
    > }
    > char* get_swapped_data(double& target, char* source) {
    > target = *(double *)source;


    And this.

    > Swap(target);
    > return source + sizeof(double);
    > }
    > //////////////////////////////////////////
    > /// @name Overloaded functions using "float"
    > //////////////////////////////////////////
    > char* put_data(char* out, const float& source) {
    > *(float *)out = source;
    > return out + sizeof(float);
    > }
    > char* get_data(float& target, char* source) {
    > target = *(float *)source;
    > return source + sizeof(float);
    > }
    > char* put_swapped_data(char* out, const float& source) {
    > *(float *)out = source;
    > Swap(*(float *)out);
    > return out + sizeof(float);
    > }
    > char* GetSwappedData(float& target, char* source) {
    > target = *(float *)source;
    > Swap(target);
    > return source + sizeof(float);
    > }


    As above, except these will only core dump 3 times in 4, rather
    than 7 in 8.

    You can't take a char*, and assign a float or a double to it;
    there's no guarantee that it is a legal address for a float or a
    double.

    > ///////////////////////////////////////////////
    > /// @name Overloaded functions using "unsigned char"
    > ///////////////////////////////////////////////
    > char* put_data(char* out, const unsigned char& source) {
    > *(unsigned char *)out = source;
    > return out + sizeof(unsigned char);
    > }
    > char* get_data(unsigned char& target, char* source) {
    > target = *(unsigned char *)source;
    > return source + sizeof(unsigned char);
    > }
    > char* put_swapped_data(char* out, const unsigned char& source) {
    > *(unsigned char *)out = source;
    > return out + sizeof(unsigned char);
    > }
    > char* GetSwappedData(unsigned char& target, char* source) {
    > target = *(unsigned char *)source;
    > return source + sizeof(unsigned char);
    > }


    > ///////////////////////////////////////////////
    > /// @name Overloaded functions using short*
    > ///////////////////////////////////////////////
    > char* put_data(char* out, short* source,unsigned length16BitUnits)
    > {
    > memcpy(out, (char *)source, length16BitUnits*2);
    > return out + length16BitUnits*2;
    > }


    > char* get_data(short* target, char* source,unsigned
    > length16BitUnits) {
    > memcpy((char *)target,source,length16BitUnits*2);
    > return source+length16BitUnits*2;
    > }


    > char* put_swapped_data(char* out, short* source,unsigned
    > length16BitUnits) {
    > unsigned i;
    > char* tmp = put_data(out,source,length16BitUnits);
    > short* tmpShort = (short *)out;
    > for(i = 0; i < length16BitUnits; i++) {
    > Swap(*tmpShort);
    > tmpShort++;
    > }
    > return tmp;
    > }
    > char* GetSwappedData(short* target, char* source,unsigned
    > length16BitUnits) {
    > unsigned i;
    > char* tmp = get_data(target,source,length16BitUnits);
    > short* tmpShort=target;
    > for(i = 0; i < length16BitUnits; i++) {
    > Swap(*tmpShort);
    > tmpShort++;
    > }
    > return tmp;
    > }
    > };


    > int main () {
    > Serializer is ;
    > int const size_of_type = 40 ;
    > double source = 3.1415926535897932384626433832795;
    > char buffer [ size_of_type ] = { 0 };
    > char *ptr = is.put_data ( buffer, source ) ;
    > //std::cout << *ptr << std::endl;
    > std::cin.get() ;
    > }


    > First I think this ought to be written to use generic
    > programming. That aside (I'm not a fan of char* but the
    > vendor string facility - from what i understand is lacking),
    > how would I verify that the contents of source is in the
    > buffer and what is the required buffer size? (i.e based on the
    > function prototype - should the buffer size be size of type -
    > double in this case )


    I'm not too sure what you're trying to do, but it looks like
    you're playing funny games with types, which will get you into
    trouble in the long run. (There are a few that you can
    sometimes play, if you have to for performance reasons, but you
    really have to know what you are doing.)

    If the problem is just serialization, I'd go with your original
    attempt to use textual formatting. It's a lot easier to debug,
    for starters. For IEEE floating point, it is guaranteed that 9
    decimal digits suffice for a round trip conversion for float,
    and 17 for double (provided the conversion routines are
    correct). See http://www.validlab.com/goldberg/paper.pdf, in
    particular the section "Binary to Decimal Conversion".
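
    For environments without stringstream, the same round trip can be
    sketched with the C library. This assumes IEEE float/double, a C++11
    library (for std::strtof), and a printf implementation with floating
    point support, which some embedded libraries leave out. The function
    names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Format with enough significant digits for an exact round trip on IEEE
// hardware: 9 for float, 17 for double. Returns what snprintf returns: the
// length the full text needs (excluding the terminator), or a negative value.
inline int format_float(char* buf, std::size_t size, float value) {
    assert(buf != nullptr && size > 0);
    return std::snprintf(buf, size, "%.9g", static_cast<double>(value));
}

inline int format_double(char* buf, std::size_t size, double value) {
    assert(buf != nullptr && size > 0);
    return std::snprintf(buf, size, "%.17g", value);
}

// Parse the text produced above back into a binary value.
inline float parse_float(const char* text) { return std::strtof(text, nullptr); }
inline double parse_double(const char* text) { return std::strtod(text, nullptr); }
```

    A 32-byte buffer is comfortably large enough for either format, since
    the longest "%.17g" output is 24 characters plus the terminator.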

     
    James Kanze, Oct 6, 2008
    #5
  6. ma740988

    ma740988 Guest

    On Oct 6, 11:06 am, James Kanze <> wrote:
    > You can't take a char*, and assign a float or a double to it;
    > there's no guarantee that it is a legal address for a float or a
    > double.


    Point taken.

    > I'm not too sure what you're trying to do, but it looks like
    > you're playing funny games with types, which will get you into
    > trouble in the long run.  (There are a few that you can
    > sometimes play, if you have to for performance reasons, but you
    > really have to know what you are doing.)


    Well, the developer hired to do this came up with the source shown. I'm a
    huge fan of std::string, so when I saw 'char *' used to the extent that it
    is in this code, I became nervous. (Sadly, support for std::string is
    always limited or non-existent when you go the embedded route. Not
    sure why.) Long story short, he had to take off on a 3 week vacation
    and I was trying to understand what his issue was with serializing the
    data given the requirements I gave him.

    >
    > If the problem is just serialization, I'd go with your original
    > attempt to use textual formatting.  It's a lot easier to debug,
    > for starters.  


    Got it. I'll have him figure out the right way to do this the 'C' way
    (used sparingly), since stringstream and string are off limits.
     
    ma740988, Oct 7, 2008
    #6
  7. ma740988

    Guest

    On Oct 6, 10:06 am, James Kanze <> wrote:
    > On Oct 6, 2:38 pm,ma740988<> wrote:


    > >    char* put_swapped_data(char* out, const float& source) {
    > >      *(float *)out = source;
    > >      Swap(*(float *)out);
    > >      return out + sizeof(float);
    > >    }
    > >    char* GetSwappedData(float& target, char* source) {
    > >      target = *(float *)source;
    > >      Swap(target);
    > >      return source + sizeof(float);
    > >    }

    >
    > As above, except these will only core dump 3 times in 4, rather
    > than 7 in 8.

    Curiosity question. How were you able to arrive at the ratios 3/4
    (float), 7/8(double)?

    > You can't take a char*, and assign a float or a double to it;
    > there's no guarantee that it is a legal address for a float or a
    > double.


    What do you mean by 'no guarantee it is a legal address for a float or
    a double'? Now assume the problem is binary serialization, I suspect
    converting to an unsigned integer large enough for a float or double
    then playing games with bit shifting might work?
     
    , Dec 1, 2008
    #7
  8. ma740988

    Kai-Uwe Bux Guest

    wrote:

    > On Oct 6, 10:06 am, James Kanze <> wrote:
    >> On Oct 6, 2:38 pm,ma740988<> wrote:

    >
    >> > char* put_swapped_data(char* out, const float& source) {
    >> > *(float *)out = source;
    >> > Swap(*(float *)out);
    >> > return out + sizeof(float);
    >> > }
    >> > char* GetSwappedData(float& target, char* source) {
    >> > target = *(float *)source;
    >> > Swap(target);
    >> > return source + sizeof(float);
    >> > }

    >>
    >> As above, except these will only core dump 3 times in 4, rather
    >> than 7 in 8.

    > Curiosity question. How were you able to arrive at the ratios 3/4
    > (float), 7/8(double)?
    >
    >> You can't take a char*, and assign a float or a double to it;
    >> there's no guarantee that it is a legal address for a float or a
    >> double.

    >
    > What do you mean by 'no guarantee it is a legal address for a float or
    > a double'?


    Presumably, he means that the address designated by the char* does not
    satisfy the alignment requirements for float and double.

    > Now assume the problem is binary serialization, I suspect
    > converting to an unsigned integer large enough for a float or double
    > then playing games with bit shifting might work?


    a) There is no guarantee that such an integer type exists.

    b) Even if it does, the char* is not required to satisfy the alignment
    requirements of that mythical integer type.


    Instead of casting a char* to a float* and assigning from source, one could
    go the other way and cast &source, which is a float*, to a char* and copy
    sizeof(float) chars from there.
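
    A minimal sketch of that approach, using std::memcpy to do the byte
    copy. The templates reuse the put_data/get_data names from the posted
    class but are an illustration, not the original developer's code. They
    assume T is trivially copyable and that reader and writer share the
    same representation and byte order.

```cpp
#include <cassert>
#include <cstring>

// Copy the bytes of `source` into the buffer. `out` needs no particular
// alignment, only room for sizeof(T) bytes.
template <typename T>
char* put_data(char* out, const T& source) {
    assert(out != nullptr);
    std::memcpy(out, &source, sizeof(T));
    return out + sizeof(T);
}

// Copy sizeof(T) bytes from the buffer into `target`.
template <typename T>
const char* get_data(T& target, const char* source) {
    assert(source != nullptr);
    std::memcpy(&target, source, sizeof(T));
    return source + sizeof(T);
}
```

    Each call consumes exactly sizeof(T) bytes, so that is the buffer space
    a single value needs, and reading the value back with get_data and
    comparing it is one way to verify the buffer contents.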

    Anyway, is there a reason to do binary serialization and not go through
    text?


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Dec 1, 2008
    #8
  9. ma740988

    James Kanze Guest

    On Dec 1, 4:11 am, wrote:
    > On Oct 6, 10:06 am, James Kanze <> wrote:


    > > On Oct 6, 2:38 pm,ma740988<> wrote:
    > > > char* put_swapped_data(char* out, const float& source) {
    > > > *(float *)out = source;
    > > > Swap(*(float *)out);
    > > > return out + sizeof(float);
    > > > }
    > > > char* GetSwappedData(float& target, char* source) {
    > > > target = *(float *)source;
    > > > Swap(target);
    > > > return source + sizeof(float);
    > > > }


    > > As above, except these will only core dump 3 times in 4,
    > > rather than 7 in 8.


    > Curiosity question. How were you able to arrive at the ratios
    > 3/4 (float), 7/8(double)?


    Alignment considerations. A float must be aligned on a multiple
    of four, a double on a multiple of eight.
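
    The ratios can be checked with a small sketch. The function name
    misaligned_offsets is made up, and the code assumes that pointers
    convert to integers linearly, as they do on mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Count how many of the first `alignment` byte offsets into an 8-byte-aligned
// buffer are NOT multiples of `alignment`.
inline int misaligned_offsets(std::size_t alignment) {
    assert(alignment >= 1 && alignment <= 8 && 8 % alignment == 0);
    alignas(8) static char buffer[16];
    int count = 0;
    for (std::size_t i = 0; i < alignment; ++i) {
        // Assumption: the integer value of a pointer is its byte address.
        if (reinterpret_cast<std::uintptr_t>(buffer + i) % alignment != 0) {
            ++count;
        }
    }
    return count;
}
```

    For an alignment of 4 this counts 3 bad offsets out of 4, and for 8 it
    counts 7 out of 8, which is where 3/4 and 7/8 come from if the char*
    lands on an arbitrary byte.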

    > > You can't take a char*, and assign a float or a double to
    > > it; there's no guarantee that it is a legal address for a
    > > float or a double.


    > What do you mean by 'no guarantee it is a legal address for a
    > float or a double'?


    Just that. The value in a char* may not be a legal address for
    a float or a double. (Of course, if you're messing around with
    reinterpret_cast, the value in a float* or a double* may not be
    a legal address for a float or a double. Don't use
    reinterpret_cast unless you really know what you're doing.)

    > Now assume the problem is binary serialization, I suspect
    > converting to an unsigned integer large enough for a float or
    > double then playing games with bit shifting might work?


    That's the way I usually do it:). Strictly speaking, it's not
    100% portable; for starters, you're not even guaranteed that
    such an unsigned integral type exists. (There is, in fact, at
    least one platform where it doesn't.) And even if it does,
    there's no guarantee concerning the format of a float. For
    maximum portability, you should define your serialized floating
    point format, and play games with frexp and ldexp to create it.
    Something like:

    bool isNeg = source < 0 ;
    if ( isNeg ) {
        source = - source ;
    }
    int exp ;
    if ( source == 0.0 ) {
        exp = 0 ;
    } else {
        source = ldexp( frexp( source, &exp ), 24 ) ;
        exp += 126 ;
    }
    uint32_t mant = source ;
    dest.put( (isNeg ? 0x80 : 0x00) | exp >> 1 ) ;
    dest.put( ((exp << 7) & 0x80) | ((mant >> 16) & 0x7F) ) ;
    dest.put( mant >> 8 ) ;
    dest.put( mant ) ;

    and

    uint32_t tmp ;
    operator>>( tmp ) ; // shifts and or's...
    if ( *this ) {
        float f = 0.0 ;
        if ( (tmp & 0x7FFFFFFF) != 0 ) {
            f = ldexp( ((tmp & 0x007FFFFF) | 0x00800000),
                       (int)((tmp & 0x7F800000) >> 23) - 126 - 24 ) ;
        }
        if ( (tmp & 0x80000000) != 0 ) {
            f = -f ;
        }
        dest = f ;
    }

    (This results in XDR representation for floats.)
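
    The two fragments above can be wrapped into a self-contained sketch.
    The function names and the use of std::array in place of the dest
    stream are assumptions for illustration. Like the fragments, it only
    handles zero and finite, normalized values; infinities, NaNs and
    denormals would need extra cases.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Encode a float as 4 big-endian bytes in IEEE single (XDR) layout,
// without looking at the machine's own float representation.
inline std::array<unsigned char, 4> encode_xdr_float(float value) {
    assert(std::isfinite(value));
    assert(value == 0.0f || std::fabs(value) >= std::numeric_limits<float>::min());
    double source = value;
    const bool is_neg = source < 0;
    if (is_neg) source = -source;
    int exp = 0;
    if (source != 0.0) {
        source = std::ldexp(std::frexp(source, &exp), 24);  // mantissa in [2^23, 2^24)
        exp += 126;  // IEEE bias of 127, less 1 because frexp returns [0.5, 1)
    }
    const std::uint32_t mant = static_cast<std::uint32_t>(source);
    return {{
        static_cast<unsigned char>((is_neg ? 0x80 : 0x00) | (exp >> 1)),
        static_cast<unsigned char>(((exp << 7) & 0x80) | ((mant >> 16) & 0x7F)),
        static_cast<unsigned char>((mant >> 8) & 0xFF),
        static_cast<unsigned char>(mant & 0xFF)
    }};
}

// Decode 4 big-endian bytes in IEEE single layout back into a float.
inline float decode_xdr_float(const std::array<unsigned char, 4>& bytes) {
    const std::uint32_t tmp = (std::uint32_t(bytes[0]) << 24) | (std::uint32_t(bytes[1]) << 16)
                            | (std::uint32_t(bytes[2]) << 8) | std::uint32_t(bytes[3]);
    double f = 0.0;
    if ((tmp & 0x7FFFFFFFu) != 0) {
        // Restore the implicit leading 1, then scale by the unbiased exponent.
        f = std::ldexp(static_cast<double>((tmp & 0x007FFFFFu) | 0x00800000u),
                       static_cast<int>((tmp & 0x7F800000u) >> 23) - 126 - 24);
    }
    if ((tmp & 0x80000000u) != 0) f = -f;
    return static_cast<float>(f);
}
```

    As a check, 1.0f encodes to the bytes 3F 80 00 00 and -2.5f to
    C0 20 00 00, the usual IEEE single bit patterns in big-endian order.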

    If your portability needs are limited to machines supporting
    IEEE floating point, however, memcpy'ing the floating point
    value into an unsigned integral type of the same size, then
    shifting and or'ing, is sufficient, and may be slightly faster.
    (At least on a Sparc, however, the above is not outrageously
    slow.)

     
    James Kanze, Dec 1, 2008
    #9
  10. ma740988

    Guest


    > Just that.  The value in a char* may not be a legal address for
    > a float or a double.  (Of course, if you're messing around with
    > reinterpret_cast, the value in a float* or a double* may not be
    > a legal address for a float or a double.  Don't use
    > reinterpret_cast unless you really know what you're doing.)


    I hate to become a nuisance, but do you have a source where I can obtain
    some more information on all this? I'm having a hard time
    understanding how a char* address could be illegal.


    > > Now assume the problem is binary serialization, I suspect
    > > converting to an unsigned integer large enough for a float or
    > > double then playing games with bit shifting might work?

    >
    > That's the way I usually do it:).  


    OK! I sent a private message to the OP a few days ago to see if/how
    he/she resolved this. He hasn't. The OP said he's seen a handful of
    your posts where you convert the value to an unsigned integer type
    large enough then parse it to the appropriate floating point type,
    then suggested I ask you to show an example of this.


    > (This results in XDR representation for floats.)
    >
    > If your portability needs are limited to machines supporting
    > IEEE floating point, however, memcpy'ing the floating point
    > value into an unsigned integral type of the same size, then
    > shifting an or'ing, is sufficient, and may be slightly faster.
    > (At least on a Sparc, however, the above is not outrageously
    > slow.)


    Even so, there are no guarantees in all this, correct?

    It seems to me that there's no portable way to reinterpret_cast
    a T* to a char* or vice versa. True/False?
     
    , Dec 1, 2008
    #10
  11. ma740988

    Rolf Magnus Guest

    wrote:

    >> Just that. The value in a char* may not be a legal address for
    >> a float or a double. (Of course, if you're messing around with
    >> reinterpret_cast, the value in a float* or a double* may not be
    >> a legal address for a float or a double. Don't use
    >> reinterpret_cast unless you really know what you're doing.)

    >
    > I hate to become a nuisance but do you have a source where I obtain
    > some more information on all this? I'm having a hard time
    > understanding how a char* address could be illegal.


    The address is not illegal as a char*, but as a double*, it can be. On some
    machines, some types have alignment requirements, like e.g. that a double
    can only be at an address that is a multiple of sizeof(double). If you try
    to dereference a pointer that doesn't meet this alignment requirement,
    a CPU exception might be the result. On some other machines mis-aligned
    values can still be accessed, but at reduced performance.

    >> (This results in XDR representation for floats.)
    >>
    >> If your portability needs are limited to machines supporting
    >> IEEE floating point, however, memcpy'ing the floating point
    >> value into an unsigned integral type of the same size, then
    >> shifting an or'ing, is sufficient, and may be slightly faster.
    >> (At least on a Sparc, however, the above is not outrageously
    >> slow.)

    >
    > Even so there's no guarantees in all this correct?
    >
    > It seems to me that you're there's no portable way to reinterpret_cast
    > a T* to a char* or vice versa. True/False?


    The cast is not the problem. It's possible to portably cast the T* to
    char* and back, as long as you don't modify the pointer's value in between.
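
    As a tiny sketch of that guarantee (the function name is made up for
    illustration): the pointer goes to char* and straight back, with no
    arithmetic in between, so it still points at the original object.

```cpp
#include <cassert>

// Cast a double* to char* and back without changing the pointer value.
inline double* through_char_pointer(double* p) {
    assert(p != nullptr);
    char* as_bytes = reinterpret_cast<char*>(p);   // view the object as raw bytes
    return reinterpret_cast<double*>(as_bytes);    // same address, original type
}
```

    What is not covered is the reverse case from the posted code: starting
    from an arbitrary char* into a buffer and casting that to double*.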
     
    Rolf Magnus, Dec 1, 2008
    #11
  12. ma740988

    James Kanze Guest

    On Dec 1, 9:19 pm, wrote:
    > > Just that. The value in a char* may not be a legal address for
    > > a float or a double. (Of course, if you're messing around with
    > > reinterpret_cast, the value in a float* or a double* may not be
    > > a legal address for a float or a double. Don't use
    > > reinterpret_cast unless you really know what you're doing.)


    > I hate to become a nuisance but do you have a source where I
    > obtain some more information on all this? I'm having a hard
    > time understanding how a char* address could be illegal.


    Modern byte addressed hardware usually requires floats to be at
    an address that is a multiple of four, and doubles at an address
    which is a multiple of eight. A char* can be any address. What
    happens when you try to access a float or a double with a
    misaligned pointer depends on the machine, but it's usually not
    good.

    > > > Now assume the problem is binary serialization, I suspect
    > > > converting to an unsigned integer large enough for a float or
    > > > double then playing games with bit shifting might work?


    > > That's the way I usually do it:).


    > OK! I sent a private message to the OP a few days ago to see
    > if/how he/she resolved this. He hasn't. The OP said he's
    > seen a handful of your posts where you convert the value to an
    > unsigned integer type large enough then parse it to the
    > appropriate floating point type, then suggested I ask you to
    > show an example of this.


    For output, it's pretty straightforward. With most compilers
    (but I think not g++), you can get by with a reinterpret_cast
    between double/float and the equivalently sized unsigned
    integer, at least for writing. (For reading, you have to
    consider the possibility that you'll end up with a signaling
    NaN.) Or if the reinterpret_cast doesn't always work with your
    compiler, you can resort to memcpy. Having gotten a 4/8 byte
    unsigned integer, it's simply a matter of shifting and masking
    to output the desired value, PROVIDED your system uses the same
    floating point format as that used externally. (In practice,
    this generally means IEEE. So you're OK on PC's, Sun Sparcs,
    and most other mainstream Unix machines. But not on any of the
    mainframes I know of.) Input is similar: you read bytes,
    shifting and or'ing them into an appropriately sized unsigned
    integer. Then you check to ensure that it isn't a signaling
    NaN, and if not, you can safely move it into the float/double.
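
    The memcpy variant described above might look roughly like the following. This is a sketch, not the code referred to in the thread: it assumes a 32-bit IEEE 754 float on the host, writes the most significant byte first (XDR order), and assumes the common convention in which a clear top mantissa bit marks a NaN as signaling (some older architectures use the opposite convention). The names put_float, get_float and is_signaling_nan are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE 754 floats on the host");

// Writes value as four bytes, most significant byte first.
inline void put_float(unsigned char* out, float value)
{
    assert(out != nullptr);
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);   // copy the bit pattern
    out[0] = static_cast<unsigned char>((bits >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((bits >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((bits >>  8) & 0xFF);
    out[3] = static_cast<unsigned char>( bits        & 0xFF);
}

// Assumption: quiet NaNs have the top mantissa bit set, signaling NaNs
// have it clear (true for x86, ARM and SPARC, among others).
inline bool is_signaling_nan(std::uint32_t bits)
{
    const bool exponent_all_ones = (bits & 0x7F800000u) == 0x7F800000u;
    const bool mantissa_nonzero  = (bits & 0x007FFFFFu) != 0;
    const bool quiet_bit_set     = (bits & 0x00400000u) != 0;
    return exponent_all_ones && mantissa_nonzero && !quiet_bit_set;
}

// Reads four bytes, most significant first. Returns false and leaves
// dest untouched if the pattern is a signaling NaN.
inline bool get_float(const unsigned char* in, float& dest)
{
    assert(in != nullptr);
    const std::uint32_t bits = (static_cast<std::uint32_t>(in[0]) << 24)
                             | (static_cast<std::uint32_t>(in[1]) << 16)
                             | (static_cast<std::uint32_t>(in[2]) <<  8)
                             |  static_cast<std::uint32_t>(in[3]);
    if (is_signaling_nan(bits)) {
        return false;
    }
    std::memcpy(&dest, &bits, sizeof dest);
    return true;
}
```

    Because the bytes are produced by shifting the integer rather than by looking at memory, the output order is the same on big-endian and little-endian hosts.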

    > > (This results in XDR representation for floats.)


    > > If your portability needs are limited to machines supporting
    > > IEEE floating point, however, memcpy'ing the floating point
    > > value into an unsigned integral type of the same size, then
    > > shifting and or'ing, is sufficient, and may be slightly
    > > faster. (At least on a Sparc, however, the above is not
    > > outrageously slow.)


    > Even so, there are no guarantees in all this, correct?


    Which of all this? The code I posted is guaranteed to output a
    floating point value in XDR format, regardless of the machine,
    provided that the value is representable.
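
    For readers without the earlier post to hand, here is a rough sketch of the frexp/ldexp idea. It is not the code referred to above, only an illustration under stated assumptions: it builds the IEEE 754 single-precision bit pattern (the layout XDR uses for floats) arithmetically, so it does not depend on how the host stores a float. It handles zero and normal finite values only; subnormals, infinities and NaNs are left out, and negative zero is collapsed to positive zero. It also assumes the host float has at least 24 bits of mantissa. The names encode_single and decode_single are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Builds the IEEE 754 single bit pattern for a zero or normal finite value.
inline std::uint32_t encode_single(float value)
{
    assert(std::isfinite(value));
    if (value == 0.0f) {
        return 0;                       // note: -0.0 becomes +0.0 here
    }
    std::uint32_t sign = 0;
    if (value < 0.0f) {
        sign = 0x80000000u;
        value = -value;
    }
    int exponent = 0;
    // value == fraction * 2^exponent, with 0.5 <= fraction < 1
    const float fraction = std::frexp(value, &exponent);
    // IEEE wants 1.m * 2^e, so e == exponent - 1; the bias is 127.
    const int biased = exponent - 1 + 127;
    assert(biased >= 1 && biased <= 254);   // subnormals not handled
    // fraction * 2^24 is an integer in [2^23, 2^24); drop the implicit 1.
    const std::uint32_t mantissa =
        static_cast<std::uint32_t>(std::ldexp(fraction, 24)) & 0x007FFFFFu;
    return sign | (static_cast<std::uint32_t>(biased) << 23) | mantissa;
}

// Inverse of encode_single, with the same restrictions.
inline float decode_single(std::uint32_t bits)
{
    const int biased = static_cast<int>((bits >> 23) & 0xFF);
    const std::uint32_t mantissa = bits & 0x007FFFFFu;
    if (biased == 0 && mantissa == 0) {
        return 0.0f;
    }
    assert(biased >= 1 && biased <= 254);   // normal values only
    // Restore the implicit leading 1, then scale by 2^(e - 23).
    const float magnitude = std::ldexp(
        static_cast<float>(mantissa | 0x00800000u), biased - 127 - 23);
    return (bits & 0x80000000u) ? -magnitude : magnitude;
}
```

    The resulting 32-bit integer would then be written out with shifts and masks, most significant byte first.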

    > It seems to me that there's no portable way to
    > reinterpret_cast a T* to a char* or vice versa. True/False?


    There's nothing you can portably do with reinterpret_cast
    (except maybe casting null pointers). What you generally can do
    is reinterpret_cast between types of the same size, and get the
    bit pattern from one interpreted as if it were the other. What
    that means is also rather implementation dependent, however.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 2, 2008
    #12
  13. ma740988

    Guest

    On Dec 1, 4:55 am, James Kanze <> wrote:

    > That's the way I usually do it:).  Strictly speaking, it's not
    > 100% portable; for starters, you're not even guaranteed that
    > such an unsigned integral type exists.  (There is, in fact, at
    > least one platform where it doesn't.)  And even if it does,
    > there's no guarantee concerning the format of a float.  For
    > maximum portability, you should define your serialized floating
    > point format, and play games with frexp and ldexp to create it.
    > Something like:


    >     uint32_t            tmp ;
    >     operator>>( tmp ) ;     //  shifts and or's...


    Referencing your thread found here:

    http://groups.google.com/group/comp...568e455a6?hl=en&q=ByteGetter#91fa1c0c87a67f34

    From the looks of it you're taking the contents of a stream and
    parsing it accordingly. Now let's take the value 5.14245, i.e.
    std::istringstream iss ( "5.14245" ) ;

    Would it be safe to say that after the call to operator>> tmp* would
    be equal to: 0x352e313432343500 ( which is the ASCII hex
    representation for the value 5.14245)?


    * I'm assuming 'ixdrstream& ixdrstream::operator>>( GB_uint64_t&
    dest )' gets invoked.
     
    , Dec 14, 2008
    #13
  14. ma740988

    James Kanze Guest

    On Dec 14, 3:55 am, wrote:
    > On Dec 1, 4:55 am, James Kanze <> wrote:


    > > That's the way I usually do it:). Strictly speaking, it's not
    > > 100% portable; for starters, you're not even guaranteed that
    > > such an unsigned integral type exists. (There is, in fact, at
    > > least one platform where it doesn't.) And even if it does,
    > > there's no guarantee concerning the format of a float. For
    > > maximum portability, you should define your serialized floating
    > > point format, and play games with frexp and ldexp to create it.
    > > Something like:
    > > uint32_t tmp ;
    > > operator>>( tmp ) ; // shifts and or's...


    > Referencing your thread found here:


    > http://groups.google.com/group/comp.lang.c .moderated/browse_thread/...


    > From the looks of it you're taking the contents of a stream
    > and parsing it accordingly. Now let's take the value 5.14245,
    > i.e. std::istringstream iss ( "5.14245" ) ;


    > Would it be safe to say that after the call to operator>> tmp* would
    > be equal to: 0x352e313432343500 ( which is the ASCII hex
    > representation for the value 5.14245)?


    Certainly not. First, the operator cited above is part of an
    ixdrstream, not an std::istream, so you couldn't even read a
    istringstream with it. Second, it reads four bytes, not eight.
    If the contents of the input stream were the four bytes 40 A4 8E
    F3, in that order, the variable tmp will contain 0x40A48EF3
    (regardless of the byte order of the machine). And after
    the appropriate manipulations, the float (not double) will
    contain the closest possible representation of 5.14245.
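
    To make the contrast between the text route and the binary route concrete, here is a small sketch, assuming a 32-bit IEEE 754 float on the host. Under that assumption the bit pattern closest to 5.14245 is 0x40A48EF3, so a big-endian (XDR) stream carries it as the bytes 40 A4 8E F3, whereas the string "5.14245" is seven ASCII characters that only a formatted std::istream extractor turns into a number. The function names are invented for the example and are not part of any ixdrstream class.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <sstream>
#include <string>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE 754 floats on the host");

// Binary route, step 1: assemble four bytes, most significant first.
// The result depends only on the order of the bytes in the stream,
// not on the byte order of the host.
inline std::uint32_t read_be32(const unsigned char* in)
{
    assert(in != nullptr);
    return (static_cast<std::uint32_t>(in[0]) << 24)
         | (static_cast<std::uint32_t>(in[1]) << 16)
         | (static_cast<std::uint32_t>(in[2]) <<  8)
         |  static_cast<std::uint32_t>(in[3]);
}

// Binary route, step 2: reinterpret the bit pattern as a float.
// (A real reader would reject signaling NaNs before this step.)
inline float float_from_bits(std::uint32_t bits)
{
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Text route: parse decimal characters with the formatted extractor.
inline float float_from_text(const std::string& text)
{
    std::istringstream iss(text);
    float value = 0.0f;
    iss >> value;
    return value;
}
```

    Both float_from_bits(read_be32(wire)), with wire holding 40 A4 8E F3, and float_from_text("5.14245") yield the same float, but they start from entirely different byte sequences.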

    > * I'm assuming 'ixdrstream& ixdrstream::operator>>(
    > GB_uint64_t& dest )' gets invoked.


    Which won't happen if the type is uint32_t. (Or
    GB_uint32_t---the GB_ is because for portability reasons, I
    define my own, and need to avoid clashes. And the code predates
    namespaces.)

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 14, 2008
    #14