Portable marshalling of floating point data

Brian · Dec 10, 2009

In a previous thread, James Kanze said:
It's possible for the server to systematically send and receive data in
the internal format of the client, yes. Provided it knows this format
(the client tells it). Does it buy you anything? I'm not really sure.

The issue is particularly important with regards to floating
point; if the protocol specifies a format with a fixed maximum
precision (e.g. IEEE double), and the machines at both ends
both support higher precision, then information is lost
unnecessarily.

I'm working on the floating point support now and wondering
how to check if "both support higher precision." My intent
is to support IEEE floating point at the minimum. I assume
there's something the client tells the server similar to how
it tells it its byte order.

On c.l.c++.m, Bart van Ingen Schenau posted the
following functions.

ostream& write_ieee(ostream& os, double val)
{
    int power;
    double significand;
    unsigned char sign;
    unsigned long long mantissa;
    unsigned char bytes[8];

    if(val<0)
    {
        sign=1;
        val = -val;
    }
    else
    {
        sign=0;
    }
    significand = frexp(val,&power);

    if (power < -1022 || power > 1023)
    {
        cerr << "ieee754: exponent out of range" << endl;
        os.setstate(ios::failbit);
    }
    else
    {
        power += 1022;
    }
    mantissa = (significand-0.5) * pow(2,53);

    bytes[0] = ((sign & 0x01) << 7) | ((power & 0x7ff) >> 4);
    bytes[1] = ((power & 0xf)) << 4 |
               ((mantissa & 0xfffffffffffffLL) >> 48);
    bytes[2] = (mantissa >> 40) & 0xff;
    bytes[3] = (mantissa >> 32) & 0xff;
    bytes[4] = (mantissa >> 24) & 0xff;
    bytes[5] = (mantissa >> 16) & 0xff;
    bytes[6] = (mantissa >> 8) & 0xff;
    bytes[7] = mantissa & 0xff;
    return os.write(reinterpret_cast<const char*>(bytes), 8);
}

istream& read_ieee(istream& is, double& val)
{
    unsigned char bytes[8];

    is.read(reinterpret_cast<char*>(bytes), 8);
    if (is)
    {
        int power;
        double significand;
        unsigned char sign;
        unsigned long long mantissa;

        mantissa = ( ((unsigned long long)bytes[7]) |
                     (((unsigned long long)bytes[6]) << 8) |
                     (((unsigned long long)bytes[5]) << 16) |
                     (((unsigned long long)bytes[4]) << 24) |
                     (((unsigned long long)bytes[3]) << 32) |
                     (((unsigned long long)bytes[2]) << 40) |
                     (((unsigned long long)bytes[1]) << 48) )
                   & 0xfffffffffffffLL;
        significand = (mantissa/pow(2,53)) + 0.5;
        power = (((bytes[1] >> 4) |
                  (((unsigned int)bytes[0]) << 4)) & 0x7ff) - 1022;
        sign = bytes[0] >> 7;
        val = ldexp(significand, power);
        if (sign) val = -val;
    }
    return is;
}

---------------------------------------

I plan to use them as the basis of the floating point
support I'm working on. In the write function he has:

bytes[1] = ((power & 0xf)) << 4 |
((mantissa & 0xfffffffffffffLL) >> 48);

Would it be equivalent to write it like this:
bytes[1] = ((power & 0xf)) << 4 |
((mantissa >> 48) & 0xf);

?

Please let me know if anyone detects problems
with the above functions.

Brian Wood
http://webEbenezer.net

James Kanze · Dec 11, 2009

In a previous thread James Kanze wrote:

I'm working on the floating point support now and wondering
how to check if "both support higher precision." My intent
is to support IEEE floating point at the minimum. I assume
there's something the client tells the server similar to how
it tells it its byte order.

Again, it's a question of how portable you want to be. For
integers, it's generally sufficient to know byte order, but
there are exceptions, and if you really want to support the
client format, you need to know the number of bits, whether
there are reserved bits, and where they are, and the
representation of negative numbers. (There's at least one 1's
complement 36 bit machine still in production, and one which
uses 48 bit signed magnitude, with 8 reserved bits which must be
0.) For floating point, in addition, you need to know the base,
the exponent representation (excess what, except that at least
one machine uses signed magnitude for the exponent as well),
normalization policies, how many bits in each field and where
the field is, etc., in addition to the information you need for
integers. (FWIW, of the formats I know, all use signed
magnitude for the mantissa, all but one use excess something for
the exponent, and the only bases I've seen are 2, 8 and 16. And
there are at least three different normalization policies.)
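
As a rough illustration of the kind of description a client could send, here is a minimal sketch built on std::numeric_limits. The FloatFormat struct, describe() and wider_than_ieee_double() are hypothetical names, not part of any protocol discussed in this thread. Note that numeric_limits only covers part of the list above (base, mantissa digits, exponent range); it says nothing about field positions, exponent representation or normalization policy, so a real descriptor would need more.

```cpp
#include <cassert>
#include <limits>

// Hypothetical descriptor a client might send during a handshake.
struct FloatFormat {
    int  radix;         // base of the exponent (2, 8, 16, ...)
    int  digits;        // mantissa digits, counted in that radix
    int  min_exponent;  // numeric_limits<T>::min_exponent
    int  max_exponent;  // numeric_limits<T>::max_exponent
    bool is_iec559;     // true if the type claims IEEE 754 conformance
};

// Fills the descriptor for a floating point type on the local machine.
template <typename T>
FloatFormat describe() {
    typedef std::numeric_limits<T> L;
    static_assert(!L::is_integer, "floating point types only");
    return FloatFormat{ L::radix, L::digits, L::min_exponent,
                        L::max_exponent, L::is_iec559 };
}

// True if the format carries more mantissa bits than an IEEE double (53).
// Assumption: the radix is a power of two, which holds for 2, 8 and 16.
bool wider_than_ieee_double(const FloatFormat& f) {
    assert(f.radix >= 2);
    int bits_per_digit = 0;
    for (int r = f.radix; r > 1; r >>= 1) ++bits_per_digit;
    return f.digits * bits_per_digit > 53;
}
```

If both ends report a format for which wider_than_ieee_double() is true, marshalling through an IEEE double would drop bits that both machines could have kept.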

On c.l.c++.m, Bart van Ingen Schenau posted the
following functions.

Note that IMHO, this is more complicated than needed, because it
works directly on the bytes. In my implementation, at least
when reading, I leverage off the fact that I already have a routine
which can read 64 bit unsigned ints. (I also do a little less
error checking at present.) Otherwise, my code is more or less
similar.

ostream& write_ieee(ostream& os, double val)
{
    int power;
    double significand;
    unsigned char sign;
    unsigned long long mantissa;
    unsigned char bytes[8];

    if(val<0)
    {
        sign=1;
        val = -val;
    }
    else
    {
        sign=0;
    }
    significand = frexp(val,&power);

    if (power < -1022 || power > 1023)
    {
        cerr << "ieee754: exponent out of range" << endl;
        os.setstate(ios::failbit);
    }
    else
    {
        power += 1022;
    }
    mantissa = (significand-0.5) * pow(2,53);

Just a nit, but I think using ldexp would be faster (and
possibly more precise) here. My own code for the above is:

    bool isNeg = source < 0 ;
    if ( isNeg ) {
        source = - source ;
    }
    int exp ;
    if ( source == 0.0 ) {
        exp = 0 ;
    } else {
        source = ldexp( frexp( source, &exp ), 53 ) ;
        exp += 1022 ;
    }
    unsigned long long mant = static_cast< unsigned long long >
        ( source ) ;

A check that the exp is in range wouldn't be out of order here
(although I'm not sure what's the best thing to do about it if
it is---set failbit, as your example does, or consider it a
precondition error, i.e. a contract violation).
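
Putting the frexp/ldexp approach and the range check together, a minimal sketch might look like the following. The name encode_binary64 is hypothetical, and this version takes the "precondition" route: out-of-range exponents, infinities and NaNs trip an assert rather than setting failbit. It also uses std::signbit instead of `source < 0`, which is my own choice so that negative zero keeps its sign.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Returns the IEEE 754 binary64 bit pattern for a finite value.
// Sketch only: subnormals, infinities and NaNs are precondition violations.
std::uint64_t encode_binary64(double source) {
    assert(std::isfinite(source));
    const std::uint64_t sign = std::signbit(source) ? 1 : 0;
    if (sign) source = -source;
    if (source == 0.0) return sign << 63;          // +0.0 or -0.0

    int exp;
    // frexp yields a value in [0.5, 1); scaling by 2^53 gives [2^52, 2^53).
    const double scaled = std::ldexp(std::frexp(source, &exp), 53);
    exp += 1022;                                   // biased exponent
    assert(exp >= 1 && exp <= 2046);               // normalized range only

    // Bit 52 of mant is the implicit leading 1, so it is masked off below.
    const std::uint64_t mant = static_cast<std::uint64_t>(scaled);
    const std::uint64_t fraction_mask = (1ULL << 52) - 1;
    return (sign << 63)
         | (static_cast<std::uint64_t>(exp) << 52)
         | (mant & fraction_mask);
}
```

The resulting 64-bit value can then go through whatever routine already writes unsigned 64-bit integers in the protocol's byte order.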

    bytes[0] = ((sign & 0x01) << 7) | ((power & 0x7ff) >> 4);
    bytes[1] = ((power & 0xf)) << 4 |
               ((mantissa & 0xfffffffffffffLL) >> 48);

Another nit: the '&' here aren't necessary, since the values are
guaranteed to fit, by construction.

    bytes[2] = (mantissa >> 40) & 0xff;
    bytes[3] = (mantissa >> 32) & 0xff;
    bytes[4] = (mantissa >> 24) & 0xff;
    bytes[5] = (mantissa >> 16) & 0xff;
    bytes[6] = (mantissa >> 8) & 0xff;
    bytes[7] = mantissa & 0xff;

    return os.write(reinterpret_cast<const char*>(bytes), 8);
}

istream& read_ieee(istream& is, double& val)
{
    unsigned char bytes[8];

    is.read(reinterpret_cast<char*>(bytes), 8);
    if (is)
    {
        int power;
        double significand;
        unsigned char sign;
        unsigned long long mantissa;

        mantissa = ( ((unsigned long long)bytes[7]) |
                     (((unsigned long long)bytes[6]) << 8) |
                     (((unsigned long long)bytes[5]) << 16) |
                     (((unsigned long long)bytes[4]) << 24) |
                     (((unsigned long long)bytes[3]) << 32) |
                     (((unsigned long long)bytes[2]) << 40) |
                     (((unsigned long long)bytes[1]) << 48) )
                   & 0xfffffffffffffLL;
        significand = (mantissa/pow(2,53)) + 0.5;
        power = (((bytes[1] >> 4) |
                  (((unsigned int)bytes[0]) << 4)) & 0x7ff) - 1022;
        sign = bytes[0] >> 7;
        val = ldexp(significand, power);
        if (sign) val = -val;
    }
    return is;
}

And this is slightly shorter if you read an unsigned long long
and mask and shift it.
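
For illustration, a sketch of the "read a 64 bit unsigned, then mask and shift" variant. Both function names are hypothetical; read_be64 stands in for whatever existing routine reads a 64 bit unsigned integer (big-endian wire order is assumed here). Like the code above, it handles only zero and normalized values.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Stand-in for an existing "read 64 bit unsigned" routine (big-endian wire order).
std::uint64_t read_be64(const unsigned char* p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) v = (v << 8) | p[i];
    return v;
}

// Rebuilds a double from an IEEE 754 binary64 bit pattern.
// Sketch only: subnormals, Inf and NaN are treated as precondition violations.
double decode_binary64(std::uint64_t bits) {
    const bool neg = (bits >> 63) != 0;
    const int exp = static_cast<int>((bits >> 52) & 0x7FF);
    std::uint64_t mant = bits & ((1ULL << 52) - 1);

    double result;
    if (exp == 0 && mant == 0) {
        result = 0.0;
    } else {
        assert(exp != 0 && exp != 0x7FF);
        mant |= 1ULL << 52;                        // restore implicit leading 1
        // value = mant * 2^(exp - 1023 - 52)
        result = std::ldexp(static_cast<double>(mant), exp - 1075);
    }
    return neg ? -result : result;
}
```

Usage would be along the lines of `double d = decode_binary64(read_be64(buffer));` once eight bytes have been read from the stream.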

Neither his code nor mine takes NaNs and Infs into consideration.
(What should happen if you read a trapping NaN?)
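
One possible way to handle this, sketched below, is to classify the 64 bit pattern with integer operations before any floating point value is built, so a signaling NaN on the wire never reaches a floating point register. All names here are hypothetical. The sketch assumes the convention that the top fraction bit set means "quiet"; some older architectures used the opposite convention. Turning a signaling NaN into a quiet one is only one policy among several, and the host is assumed to have infinities and quiet NaNs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

enum class Kind { Zero, Subnormal, Normal, Infinity, QuietNaN, SignalingNaN };

// Classifies a binary64 bit pattern without touching floating point.
Kind classify_binary64(std::uint64_t bits) {
    const std::uint64_t exp  = (bits >> 52) & 0x7FF;
    const std::uint64_t mant = bits & ((1ULL << 52) - 1);
    if (exp == 0x7FF) {
        if (mant == 0) return Kind::Infinity;
        // Assumption: top fraction bit set means quiet.
        return (mant >> 51) ? Kind::QuietNaN : Kind::SignalingNaN;
    }
    if (exp == 0) return mant == 0 ? Kind::Zero : Kind::Subnormal;
    return Kind::Normal;
}

// Maps a special pattern to a host value; signaling NaNs become quiet ones.
double special_to_host(Kind k, bool negative) {
    assert(k == Kind::Infinity || k == Kind::QuietNaN || k == Kind::SignalingNaN);
    const double v = (k == Kind::Infinity)
        ? std::numeric_limits<double>::infinity()
        : std::numeric_limits<double>::quiet_NaN();
    return negative ? -v : v;
}
```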

---------------------------------------

I plan to use them as the basis of the floating point support
I'm working on. In the write function he has:

bytes[1] = ((power & 0xf)) << 4 |
((mantissa & 0xfffffffffffffLL) >> 48);

Would it be equivalent to write it like this:
bytes[1] = ((power & 0xf)) << 4 |
((mantissa >> 48) & 0xf);
?

Yes. For that matter, you could write:

bytes[1] = ((power << 4) | (mantissa >> 48)) & 0xFF;

The way mantissa is constructed, it's guaranteed that the bits
higher than 52 are all 0 (in his case, *not* in mine---I do need
the &0x0F).
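
To make the difference concrete, here is a small sketch comparing the three spellings of byte 1. The function names are hypothetical. With a mantissa below 2^52 all three agree; once bit 52 (the implicit leading 1) is left in the mantissa, the single-mask form lets that bit leak into the low exponent bit.

```cpp
#include <cassert>
#include <cstdint>

// Original form: mask the mantissa to 52 bits, then shift.
unsigned char byte1_original(int power, std::uint64_t mantissa) {
    assert(power >= 0 && power <= 0x7ff);
    return static_cast<unsigned char>(
        ((power & 0xf) << 4) | ((mantissa & ((1ULL << 52) - 1)) >> 48));
}

// Proposed form: shift first, then keep the low four bits.
unsigned char byte1_shift_then_mask(int power, std::uint64_t mantissa) {
    assert(power >= 0 && power <= 0x7ff);
    return static_cast<unsigned char>(
        ((power & 0xf) << 4) | ((mantissa >> 48) & 0xf));
}

// Single mask at the end: only safe when mantissa < 2^52.
unsigned char byte1_single_mask(int power, std::uint64_t mantissa) {
    assert(power >= 0 && power <= 0x7ff);
    return static_cast<unsigned char>(
        ((static_cast<std::uint64_t>(power) << 4) | (mantissa >> 48)) & 0xFF);
}
```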

Brian · Dec 13, 2009

Again, it's a question of how portable you want to be. For
integers, it's generally sufficient to know byte order, but
there are exceptions, and if you really want to support the
client format, you need to know the number of bits, whether
there are reserved bits, and where they are, and the
representation of negative numbers. (There's at least one 1's
complement 36 bit machine still in production, and one which
uses 48 bit signed magnitude, with 8 reserved bits which must be
0.)

My intent is to support machines with 2's complement, 8 bit
bytes, big or little endian byte order, and that have uint8_t,
uint16_t, uint32_t, uint64_t, int8_t, int16_t, int32_t and
int64_t. Possibly in the future I'll relax those
restrictions.

For floating point, in addition, you need to know the base,
the exponent representation (excess what, except that at least
one machine uses signed magnitude for the exponent as well),
normalization policies, how many bits in each field and where
the field is, etc., in addition to the information you need for
integers. (FWIW, of the formats I know, all use signed
magnitude for the mantissa, all but one use excess something for
the exponent, and the only bases I've seen are 2, 8 and 16. And
there are at least three different normalization policies.)

I think by posting copies of those functions posted by Bart
van Ingen Schenau I've confused matters. Since then I
was reading a Dec., 2008 post by you (Kanze) which says:

"If your portability needs are limited to machines supporting
IEEE floating point, however, memcpy'ing the floating point
value into an unsigned integral type of the same size, then
shifting and or'ing, is sufficient, and may be slightly faster.
(At least on a Sparc, however, the above is not outrageously
slow.)" At this point I want to add IEEE floating point
support, so iiuc I don't need to use those functions now.
Sorry for the confusion. If at a later date I want to beef
things up in this area, I'll return to this thread.

I still am not sure about how to go about avoiding what
you wrote here:

"The issue is particularly important with regards to floating
point; if the protocol specifies a format with a fixed
maximum precision (e.g. IEEE double), and the machines at
both ends both support higher precision, then information is
lost unnecessarily."

Maybe you are alluding to what the IEEE spec talks about
with real*8 and real*10... both ends are using real*10
but the marshalling uses real*8. If that's correct,
how do you figure out what the ends are using?

Brian Wood
http://webEbenezer.net

James Kanze · Dec 14, 2009

My intent is to support machines with 2's complement, 8 bit
bytes, big or little endian byte order, and that have uint8_t,
uint16_t, uint32_t, uint64_t, int8_t, int16_t, int32_t and
int64_t. Possibly in the future I'll relax those
restrictions.

That already covers a lot of machines. Except for the two
Unisys mainframe architectures, I don't know of any modern
machine which doesn't use 2's complement, and isn't 8 bit byte
based (which leads to integral types of 8, 16, 32, 64... bits).

I think by posting copies of those functions posted by Bart
van Ingen Schenau I've confused matters. Since then I was
reading a Dec., 2008 post by you (Kanze) which says:

"If your portability needs are limited to machines supporting
IEEE floating point, however, memcpy'ing the floating point
value into an unsigned integral type of the same size, then
shifting and or'ing, is sufficient, and may be slightly faster.
(At least on a Sparc, however, the above is not outrageously
slow.)" At this point I want to add IEEE floating point
support, so iiuc I don't need to use those functions now.

No. With perhaps some caveats with regards to NaNs and Inf. (I
seem to recall reading somewhere that there were some
incompatibilities in that regard, due to different
interpretations of the IEEE standard. But it's something I've
never had to deal with, so I don't know.)
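
For reference, the memcpy approach from the quoted post can be sketched as follows. The function names and the choice of big-endian wire order are assumptions. The sketch also assumes the host stores doubles and 64 bit integers in the same byte order, which holds on common hardware but is not guaranteed by the language.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 &&
              sizeof(double) == sizeof(std::uint64_t),
              "this sketch requires a 64 bit IEEE 754 double");

// Copies the bit pattern into an integer, then shifts out big-endian bytes.
void put_double_be(double value, unsigned char* out) {
    assert(out != nullptr);
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (56 - 8 * i));
}

// Reassembles the integer from big-endian bytes, then copies it into a double.
double get_double_be(const unsigned char* in) {
    assert(in != nullptr);
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) bits = (bits << 8) | in[i];
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because only the integer is shifted, the code is independent of the host's byte order; NaN payloads and infinities pass through bit for bit, for better or worse.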

Note that restricting your portability to IEEE is a lot more
limiting than restricting int's to 2's complement---as far as I
know, no mainframe today uses IEEE (although IBM has added it as
an option---last time I looked, however, it was significantly
slower than the traditional IBM format).

Sorry for the confusion. If at a later date I want to beef
things up in this area, I'll return to this thread.

I still am not sure about how to go about avoiding what
you wrote here:

"The issue is particularly important with regards to floating
point; if the protocol specifies a format with a fixed
maximum precision (e.g. IEEE double), and the machines at
both ends both support higher precision, then information is
lost unnecessarily."

Maybe you are alluding to what the IEEE spec talks about with
real*8 and real*10... both ends are using real*10 but the
marshalling uses real*8. If that's correct, how do you figure
out what the ends are using?

No. I'm alluding to the fact that an IBM double has more
precision than IEEE. There are values of IBM doubles which
can't be converted precisely to IEEE doubles, and there are sets of
values which round to the same IEEE double.
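
A small sketch of the effect, using 64 bit integers as a stand-in since no IBM arithmetic is assumed to be available. IBM's hexadecimal long format has a 56 bit fraction, so every integer below 2^56 is exactly representable there, while an IEEE double has only 53 mantissa bits. The function names are hypothetical, and the default round-to-nearest mode is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

static_assert(std::numeric_limits<double>::radix == 2 &&
              std::numeric_limits<double>::digits == 53,
              "this sketch assumes a 53 bit binary mantissa");

// Converts an integer that would fit a 56 bit fraction to the local double.
double to_binary64(std::uint64_t n) {
    assert(n < (1ULL << 56));
    return static_cast<double>(n);
}

// True if the value survives a trip through an IEEE double unchanged.
bool round_trips(std::uint64_t n) {
    return static_cast<std::uint64_t>(to_binary64(n)) == n;
}
```

For example, 2^53 and 2^53 + 1 are distinct values that both fit in 56 bits, yet to_binary64 maps them to the same double, so round_trips(2^53 + 1) is false.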
