Type-punning / casting problem

Phil Endecott · Sep 14, 2007

Dear Experts,

I need a function that takes a float, swaps its endianness (htonl) in
place, and returns a char* pointer to its first byte. This is one of a
family of functions that prepare different data types for passing to
another process.

I have got confused by the rules about what won't work, what will work,
and what might work, when casting. Specifically, I have an
implementation that works until I remove my debugging, at which point
the compiler seems to decide that it can optimise away the writes to the
bytes other than the first, or something like that. Here it is:

template <typename T>
inline const char* encode_arg(T& t); // linker error if you try to
                                     // encode a type for which there
                                     // is no implementation

// This one works:
template <>
inline const char* encode_pq_arg<int>(int& i) {
    i = htonl(i);
    return reinterpret_cast<const char*>(&i);
}

// This one doesn't:
template <>
inline const char* encode_arg<float>(float& f) {
    uint32_t* ptr = reinterpret_cast<uint32_t*>(&f);
    *ptr = htonl(*ptr);
    const char* cptr = reinterpret_cast<const char*>(ptr);
    return cptr;
}

When I dump cptr[0] to cptr[3] before the return, it works. Without the
debug, it fails; it's obviously hard to see what ends up in the result
in that case, but it looks undefined.

So, is there some tweak that will make this work, and be certain to
work? Am I better off using a union, or is that even less defined?
Does anyone know what the rules actually are?

Many thanks,

Phil.

Victor Bazarov · Sep 14, 2007

Phil said:
I need a function that takes a float, swaps its endianness (htonl) in
place, and returns a char* pointer to its first byte. This is one of
a family of functions that prepare different data types for passing to
another process.

I have got confused by the rules about what won't work, what will
work, and what might work, when casting. Specifically, I have an
implementation that works until I remove my debugging, at which point
the compiler seems to decide that it can optimise away the writes to
the bytes other than the first, or something like that. Here it is:
[..]
So, is there some tweak that will make this work, and be certain to
work? Am I better off using a union, or is that even less defined?
Does anyone know what the rules actually are?

You need to make it platform-specific. When you need to reorder, do:

template<> char* encode<float>(float& f)
{
    char* pc = reinterpret_cast<char*>(&f);
    // swap byte i with its mirror image, working in from both ends
    for (int i = 0, s = sizeof(float); i < s/2; ++i)
        std::swap(pc[i], pc[s - i - 1]);
    return pc;
}

When you don't need to reorder, don't.

V
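
For readers who want to try the byte-reversal approach above in isolation, here is a minimal sketch. The name reverse_bytes is hypothetical (it does not appear in the thread), and the sketch uses std::reverse rather than a hand-written swap loop. It relies only on the rule that any object may be accessed through unsigned char*.

```cpp
#include <algorithm>

// Hypothetical helper, not from the thread: reverses the object
// representation of `value` in place and returns a pointer to its
// first byte.
template <typename T>
unsigned char* reverse_bytes(T& value) {
    // Any object may be inspected and modified through unsigned char*.
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&value);
    std::reverse(bytes, bytes + sizeof(T));
    return bytes;
}
```

Whether the reversal is needed at all still depends on the host byte order, so the caller has to decide that per platform.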

Phil Endecott · Sep 14, 2007

Victor said:
Phil said:

I need a function that takes a float, swaps its endianness (htonl) in
place, and returns a char* pointer to its first byte. This is one of
a family of functions that prepare different data types for passing to
another process.

I have got confused by the rules about what won't work, what will
work, and what might work, when casting. Specifically, I have an
implementation that works until I remove my debugging, at which point
the compiler seems to decide that it can optimise away the writes to
the bytes other than the first, or something like that. Here it is:
[..]
So, is there some tweak that will make this work, and be certain to
work? Am I better off using a union, or is that even less defined?
Does anyone know what the rules actually are?

You need to make it platform-specific. When you need to reorder, do:

template<> char* encode<float>(float& f)
{
    char* pc = reinterpret_cast<char*>(&f);
    for (int i = 0, s = sizeof(float); i < s/2; ++i)
        std::swap(pc[i], pc[s - i - 1]);
    return pc;
}

When you don't need to reorder, don't.

Hi Victor,

htonl is already a "no-op" on platforms where host-order==network-order.
In glibc it expands to an asm statement on x86 which does the
byte-swap in (I think) one instruction.

Can anyone see which type-punning rules are broken in the code that I
posted?

Phil.

Ian Collins · Sep 14, 2007

Phil said:
Dear Experts,

I need a function that takes a float, swaps its endianness (htonl) in
place, and returns a char* pointer to its first byte. This is one of a
family of functions that prepare different data types for passing to
another process.

I have got confused by the rules about what won't work, what will work,
and what might work, when casting. Specifically, I have an
implementation that works until I remove my debugging, at which point
the compiler seems to decide that it can optimise away the writes to the
bytes other than the first, or something like that. Here it is:

template <typename T>
inline const char* encode_arg(T& t); // linker error if you try to
// encode a type for which there
// is no implementation

// This one works:
template <>
inline const char* encode_pq_arg<int>(int& i) {

Typo? ^^

i = htonl(i);
return reinterpret_cast<const char*>(&i);
}

Did you intend the bytes of the passed value to be swapped?

// This one doesn't:
template <>
inline const char* encode_arg<float>(float& f) {
uint32_t* ptr = reinterpret_cast<uint32_t*>(&f);
*ptr = htonl(*ptr);
const char* cptr = reinterpret_cast<const char*>(ptr);
return cptr;
}

When I dump cptr[0] to cptr[3] before the return, it works. Without the
debug, it fails; it's obviously hard to see what ends up in the result
in that case, but it looks undefined.

What doesn't work?

Phil Endecott · Sep 15, 2007

Ian said:
Typo? ^^

Yes, copy&pastism, sorry!

Did you intend the bytes of the passed value to be swapped?

Yes; the caller doesn't need the value any more, so I swap it in-place
rather than making a copy. If I made a copy then I would have to
allocate memory for it and free it later.

// This one doesn't:
template <>
inline const char* encode_arg<float>(float& f) {
uint32_t* ptr = reinterpret_cast<uint32_t*>(&f);
*ptr = htonl(*ptr);
const char* cptr = reinterpret_cast<const char*>(ptr);
return cptr;
}

When I dump cptr[0] to cptr[3] before the return, it works. Without the
debug, it fails; it's obviously hard to see what ends up in the result
in that case, but it looks undefined.

What doesn't work?

The four bytes pointed to by the const char* that this function returns
eventually get sent to a socket, and I can observe them at the other
end. They are wrong, i.e. if I pass 2.0 then the value at the other end
might be 8.923461290e-44. There is some non-determinism in the values I
see, which makes me think that I'm looking at uninitialised memory
rather than a systematic corruption. My first thought was to add
something like this:

std::cout << "cptr[0] = " << static_cast<int>(cptr[0]) << ....etc for
[1] to [3] ... << "\n";

However, as soon as I add this (just before the return statement), it
works: the correct value is seen in the other process. My feeling is
that the cptr[n] expressions in the debugging tell the compiler that
these bytes are needed; when the debugging is not there, it thinks that
they are not used, and optimises them away. Can you think of any other
explanation?

Regards,

Phil.

Victor Bazarov · Sep 15, 2007

Phil said:
[..]
The four bytes pointed to by the const char* that this function
returns eventually get sent to a socket, and I can observe them at
the other end. They are wrong, i.e. if I pass 2.0 then the value at
the other end might be 8.923461290e-44. There is some
non-determinism in the values I see, which makes me think that I'm
looking at uninitialised memory rather than a systematic corruption. My
first thought was to add something like this:

std::cout << "cptr[0] = " << static_cast<int>(cptr[0]) << ....etc for
[1] to [3] ... << "\n";

However, as soon as I add this (just before the return statement), it
works: the correct value is seen in the other process. My feeling is
that the cptr[n] expressions in the debugging tell the compiler that
these bytes are needed; when the debugging is not there, it thinks that
they are not used, and optimises them away. Can you think of any
other explanation?

Compilers are written by humans. Errare humanum est. Hence all
compilers have bugs, known and unknown. If you want to know whether
the bug you're encountering is known, contact the compiler writers.
If you just want a work-around, disable optimization for the small
module in which this function (these functions) is (are), and see if
it makes any difference. I've seen a significant improvement from
seemingly random behaviour with the HP and Sun compilers before, and
even <gasp!> with Microsoft's VC++ compiler, if optimizations are
disabled locally.

V

terminator · Sep 15, 2007

Phil said:
Phil said:

[..]
The four bytes pointed to by the const char* that this function
returns eventually get sent to a socket, and I can observe them at
the other end. They are wrong, i.e. if I pass 2.0 then the value at
the other end might be 8.923461290e-44. There is some
non-determinism in the values I see, which makes me think that I'm
looking at uninitialised memory rather than a systematic corruption. My
first thought was to add something like this:

std::cout << "cptr[0] = " << static_cast<int>(cptr[0]) << ....etc for
[1] to [3] ... << "\n";

However, as soon as I add this (just before the return statement), it
works: the correct value is seen in the other process. My feeling is
that the cptr[n] expressions in the debugging tell the compiler that
these bytes are needed; when the debugging is not there, it thinks that
they are not used, and optimises them away. Can you think of any
other explanation?

Compilers are written by humans. Errare humanum est. Hence all
compilers have bugs, known and unknown. If you want to know whether
the bug you're encountering is known, contact the compiler writers.
If you just want a work-around, disable optimization for the small
module in which this function (these functions) is (are), and see if
it makes any difference. I've seen a significant improvement from
seemingly random behaviour with the HP and Sun compilers before, and
even <gasp!> with Microsoft's VC++ compiler, if optimizations are
disabled locally.

It looks like an optimization error. Maybe, as a result of pointer
arithmetic on a small argument type, the enregistered parameter is
copied to the stack and operated on there, and is filled with rubbish on
return (you are returning a pointer to the automatic variable).
Can you write this:

union exg
{
    float f;
    char cl[sizeof(float)];
    exg& IndianSwap();
};

exg e={value};
e.IndianSwap();

and leave the optimization to the compiler (I think modern compilers
can do fine on this special case)?

regards,
FM.

Victor Bazarov · Sep 15, 2007

terminator said:
[..]
can you write this:

union exg
{
float f;
char cl[sizeof(float)];
exg& IndianSwap();
};

exg e={value};
e.IndianSwap();

and leave the optimization to the compiler (I think modern compilers
can do fine on this special case)?

Initialising (or assigning to) one part of a union and using another
has undefined behaviour. Or at least it used to. So, I'd stay away
from such code, it is definitely non-portable, although probably just
as reinterpret_cast to an array of chars.

V

James Kanze · Sep 15, 2007

I need a function that takes a float, swaps its endianness (htonl) in
place, and returns a char* pointer to its first byte. This is one of a
family of functions that prepare different data types for passing to
another process.

I have got confused by the rules about what won't work, what will work,
and what might work, when casting. Specifically, I have an
implementation that works until I remove my debugging, at which point
the compiler seems to decide that it can optimise away the writes to the
bytes other than the first, or something like that. Here it is:

template <typename T>
inline const char* encode_arg(T& t); // linker error if you try to
// encode a type for which there
// is no implementation
// This one works:
template <>
inline const char* encode_pq_arg<int>(int& i) {
i = htonl(i);
return reinterpret_cast<const char*>(&i);
}

// This one doesn't:
template <>
inline const char* encode_arg<float>(float& f) {
uint32_t* ptr = reinterpret_cast<uint32_t*>(&f);
*ptr = htonl(*ptr);

What type does htonl take? And what does it do with it? This
looks like undefined behavior to me.

const char* cptr = reinterpret_cast<const char*>(ptr);
return cptr;
}

When I dump cptr[0] to cptr[3] before the return, it works. Without the
debug, it fails; it's obviously hard to see what ends up in the result
in that case, but it looks undefined.

So, is there some tweak that will make this work, and be certain to
work? Am I better off using a union, or is that even less defined?

Even less defined.

Does anyone know what the rules actually are?

The rules are that if the variable is declared as a float, the
only way you can access it is as a float, or as an array of char
or unsigned char. Anything else is undefined behavior.

In practice, most compilers allow more; I think that g++
explicitly guarantees type punning with a union, and not a few
compilers guarantee it when a cast is used. Still, if you
really want to use htonl, the "correct" solution is to memcpy
the bytes of the float into a uint32_t, and call the function on
that. (Technically, even that can fail, but in practice, you're
probably safe.)

The "correct" way to stream a float, of course, is to extract
the exponent, sign and mantissa using functions like frexp and
ldexp. I've experimented with doing so; it's less complex than
it sounds, and at least on a Sun Sparc under Solaris, it's not
outrageously expensive in runtime either (which, I'll admit,
surprised me). My current code for writing a float, for
example, looks something like:

bool isNeg = source < 0 ;
if ( isNeg ) {
    source = - source ;
}
int exp ;
if ( source == 0.0 ) {
    exp = 0 ;
} else {
    source = ldexp( frexp( source, &exp ), 24 ) ;
    exp += 126 ;
}
unsigned long mant = source ;
dest.put( (isNeg ? 0x80 : 0x00) | exp >> 1 ) ;
dest.put( ((exp << 7) & 0x80) | ((mant >> 16) & 0x7F) ) ;
dest.put( mant >> 8 ) ;
dest.put( mant ) ;

Note that this code is independent of the host byte order, or
even its floating point representation. It will always output
an IEEE float, high byte first, regardless of what the local
hardware does.
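
The fragment above leaves source and dest undeclared. A self-contained sketch of the same frexp/ldexp idea might look like the following. The function name encode_ieee_float and the four-byte output buffer are assumptions made for illustration, and the sketch handles only zero and finite values in the normal range (no NaN, infinity, or denormal handling).

```cpp
#include <cmath>

// Hypothetical wrapper around the frexp/ldexp technique: writes `source`
// to out[0..3] as an IEEE 754 single, high byte first.
// Sketch only: assumes zero or a finite normal value.
void encode_ieee_float(float source, unsigned char out[4]) {
    bool isNeg = source < 0;
    if (isNeg) {
        source = -source;
    }
    int exp = 0;
    unsigned long mant = 0;
    if (source != 0.0f) {
        // frexp yields a fraction in [0.5, 1); scaling it by 2^24 gives a
        // 24-bit integer whose top bit is the implicit leading 1.
        mant = static_cast<unsigned long>(std::ldexp(std::frexp(source, &exp), 24));
        exp += 126;  // frexp's exponent is one higher than IEEE's; the bias is 127
    }
    out[0] = static_cast<unsigned char>((isNeg ? 0x80 : 0x00) | (exp >> 1));
    out[1] = static_cast<unsigned char>(((exp << 7) & 0x80) | ((mant >> 16) & 0x7F));
    out[2] = static_cast<unsigned char>((mant >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(mant & 0xFF);
}
```

For 2.0f this produces the bytes 0x40 0x00 0x00 0x00, which is the big-endian IEEE encoding of 2.0, whatever the host byte order.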

James Kanze · Sep 15, 2007

Phil said:
Phil said:

[..]
The four bytes pointed to by the const char* that this function
returns eventually get sent to a socket, and I can observe them at
the other end. They are wrong, i.e. if I pass 2.0 then the value at
the other end might be 8.923461290e-44. There is some
non-determinism in the values I see, which makes me think that I'm
looking at uninitialised memory rather than a systematic corruption. My
first thought was to add something like this:
std::cout << "cptr[0] = " << static_cast<int>(cptr[0]) << ....etc for
[1] to [3] ... << "\n";
However, as soon as I add this (just before the return statement), it
works: the correct value is seen in the other process. My feeling is
that the cptr[n] expressions in the debugging tell the compiler that
these bytes are needed; when the debugging is not there, it thinks that
they are not used, and optimises them away. Can you think of any
other explanation?

Compilers are written by humans. Errare humanum est. Hence all
compilers have bugs, known and unknown. If you want to know whether
the bug you're encountering is known, contact the compiler writers.
If you just want a work-around, disable optimization for the small
module in which this function (these functions) is (are), and see if
it makes any difference. I've seen a significant improvement from
seemingly random behaviour with the HP and Sun compilers before, and
even <gasp!> with Microsoft's VC++ compiler, if optimizations are
disabled locally.

It looks like an optimization error. Maybe, as a result of pointer
arithmetic on a small argument type, the enregistered parameter is
copied to the stack and operated on there, and is filled with rubbish on
return (you are returning a pointer to the automatic variable).

It's not a bug. The compiler is allowed to assume that a
variable is only modified/read through expressions of its type,
or through expressions of char or unsigned char type. The fact
that something is modified through an uint32_t* is irrelevant
when the compiler is optimizing accesses to a float, for
example.

can you write this:

union exg
{
float f;
char cl[sizeof(float)];
exg& IndianSwap();
};

exg e={value};
e.IndianSwap();

and leave the optimization to the compiler (I think modern
compilers can do fine on this special case)?

If "IndianSwap" accesses cl, then it's undefined behavior.

James Kanze · Sep 15, 2007

terminator said:
terminator said:

[..]
can you write this:
union exg
{
float f;
char cl[sizeof(float)];
exg& IndianSwap();
};
exg e={value};
e.IndianSwap();
and leave the optimization to the compiler (I think modern compilers
can do fine on this special case)?

Initialising (or assigning to) one part of a union and using another
has undefined behaviour. Or at least it used to.

It still does, although some compilers may guarantee this as an
extension. (I think g++ does.)

So, I'd stay away
from such code, it is definitely non-portable, although probably just
as reinterpret_cast to an array of chars.

reinterpret_cast to an array of char's is probably OK. The
problem in the original code was assigning through the
uint32_t*; that's undefined behavior if, as was the case, the
pointer was the result of a reinterpret_cast from a float*.

Greg Herlihy · Sep 16, 2007

Dear Experts,

I need a function that takes a float, swaps its endianness (htonl) in
place, and returns a char* pointer to its first byte. This is one of a
family of functions that prepare different data types for passing to
another process.

I have got confused by the rules about what won't work, what will work,
and what might work, when casting. Specifically, I have an
implementation that works until I remove my debugging, at which point
the compiler seems to decide that it can optimise away the writes to the
bytes other than the first, or something like that. Here it is:

template <typename T>
inline const char* encode_arg(T& t); // linker error if you try to
// encode a type for which there
// is no implementation

// This one works:
template <>
inline const char* encode_pq_arg<int>(int& i) {
i = htonl(i);
return reinterpret_cast<const char*>(&i);

}

// This one doesn't:
template <>
inline const char* encode_arg<float>(float& f) {
uint32_t* ptr = reinterpret_cast<uint32_t*>(&f);
*ptr = htonl(*ptr);
const char* cptr = reinterpret_cast<const char*>(ptr);
return cptr;

}

And it shouldn't. The encode_arg() function is effectively returning a
pointer to a local variable (the parameter f). So the caller of
encode_arg() receives a pointer to an object that no longer exists;
and therefore the value of the bytes obtained by dereferencing the
returned pointer - could be anything.

Greg

Victor Bazarov · Sep 16, 2007

Greg said:
And it shouldn't. The encode_arg() function is effectively returning a
pointer to a local variable (the parameter f).

A pointer to the argument 'f'? You mean the address of it? The arg
is a reference. The address of a reference is the address of the
referenced object. Nothing is destroyed. Please revise your analysis.

So the caller of
encode_arg() receives a pointer to an object that no longer exists;
and therefore the value of the bytes obtained by dereferencing the
returned pointer - could be anything.

Greg

V
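
The point about references can be checked with a small sketch. The function first_byte is hypothetical and only mirrors the shape of encode_arg: because the parameter is a reference, taking its address yields the address of the caller's object, which outlives the call.

```cpp
// Hypothetical function with the same shape as encode_arg: it takes a
// reference and returns a pointer to the first byte of the referenced object.
inline const char* first_byte(float& f) {
    // &f is the address of the caller's float, not of a local copy.
    return reinterpret_cast<const char*>(&f);
}
```

So the returned pointer stays valid for as long as the caller's variable does. Returning a dangling pointer would only be an issue if the parameter were taken by value.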

Phil Endecott · Sep 16, 2007

Hi James, thanks for your reply.

James said:
On Sep 14, 9:02 pm, Phil Endecott wrote:

The rules are that if the variable is declared as a float, the
only way you can access it is as a float, or as an array of char
or unsigned char. Anything else is undefined behavior.

Thanks.

In practice, most compilers allow more; I think that g++
explicitly guarantees type punning with a union, and not a few
compilers guarantee it when a cast is used. Still, if you
really want to use htonl, the "correct" solution is to memcpy
the bytes of the float into a uint32_t, and call the function on
that. (Technically, even that can fail, but in practice, you're
probably safe.)

Right, I'll try that:

static inline float htonfloat(float f) {
    uint32_t i;
    memcpy(&i,&f,4);
    i = htonl(i);
    float r;
    memcpy(&r,&i,4);
    return r;
}

template <>
inline const char* encode_arg<float>(float& f) {
    f = htonfloat(f);
    return reinterpret_cast<const char*>(&f);
}

....it works! hurray!
(Yes, I could do it with just one memcpy, but I think they're optimised
away anyway.)

The "correct" way to stream a float, of course, is to extract
the exponent, sign and mantissa using functions like frexp and
ldexp. I've experimented with doing so; it's less complex than
it sounds, and at least on a Sun Sparc under Solaris, it's not
outrageously expensive in runtime either (which, I'll admit,
surprised me). My current code for writing a float, for
example, looks something like:

bool isNeg = source < 0 ;
if ( isNeg ) {
source = - source ;
}
int exp ;
if ( source == 0.0 ) {
exp = 0 ;
} else {
source = ldexp( frexp( source, &exp ), 24 ) ;
exp += 126 ;
}
unsigned long mant = source ;
dest.put( (isNeg ? 0x80 : 0x00) | exp >> 1 ) ;
dest.put( ((exp << 7) & 0x80) | ((mant >> 16) & 0x7F) ) ;
dest.put( mant >> 8 ) ;
dest.put( mant ) ;

Hmm. Interesting, but I think I prefer what I have above.

Regards,

Phil.

James Kanze · Sep 16, 2007

On Sep 16, 6:56 pm, Phil Endecott wrote:

[...]

Hmm. Interesting, but I think I prefer what I have above.

You prefer something which isn't portable, and which contains a
lot of undocumented dependencies on how the types are
represented, to something that is guaranteed to work everywhere?

Phil Endecott · Sep 16, 2007

James said:
On Sep 16, 6:56 pm, Phil Endecott wrote:

[...]

Hmm. Interesting, but I think I prefer what I have above.

You prefer something which isn't portable, and which contains a
lot of undocumented dependencies on how the types are
represented, to something that is guaranteed to work everywhere?

I'm only interested in platforms with IEEE 754 floating point. I think
that my memcpy-then-htonl code will work on all such platforms, won't it?

Regards,

Phil.

Victor Bazarov · Sep 17, 2007

Phil said:
James said:

[..]
You prefer something which isn't portable, and which contains a
lot of undocumented dependencies on how the types are
represented, to something that is guaranteed to work everywhere?

I'm only interested in platforms with IEEE 754 floating point. I
think that my memcpy-then-htonl code will work on all such platforms,
won't it?

You could simply convert the float into its hex representation using
the %a format (IIRC it's the lossless one), then sscanf it back...

V
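
A rough sketch of the hex-text round trip suggested above, for anyone who wants to try it. The helper names float_to_hex and hex_to_float are made up for illustration, and the sketch reads the value back with std::strtof rather than sscanf. It assumes a C++11 (or C99) library, where %a and hexadecimal input to strtof are available.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: formats a float as a hexadecimal floating-point
// string. The float is promoted to double by the variadic call, which is
// exact, and %a prints the value exactly.
std::string float_to_hex(float value) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%a", value);
    return std::string(buf);
}

// Hypothetical helper: parses the hexadecimal string back into a float.
float hex_to_float(const std::string& text) {
    return std::strtof(text.c_str(), nullptr);
}
```

This only helps when both ends agree on a text representation. It does not produce the four-byte binary form the original question asked for.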

James Kanze · Sep 17, 2007

James said:
James said:

On Sep 16, 6:56 pm, Phil Endecott wrote:
[...]

The "correct" way to stream a float, of course, is to extract
the exponent, sign and mantissa using functions like frexp and
ldexp. I've experimented with doing so; it's less complex than
it sounds, and at least on a Sun Sparc under Solaris, it's not
outrageously expensive in runtime either (which, I'll admit,
surprised me). My current code for writing a float, for
example, looks something like:
bool isNeg = source < 0 ;
if ( isNeg ) {
source = - source ;
}
int exp ;
if ( source == 0.0 ) {
exp = 0 ;
} else {
source = ldexp( frexp( source, &exp ), 24 ) ;
exp += 126 ;
}
unsigned long mant = source ;
dest.put( (isNeg ? 0x80 : 0x00) | exp >> 1 ) ;
dest.put( ((exp << 7) & 0x80) | ((mant >> 16) & 0x7F) ) ;
dest.put( mant >> 8 ) ;
dest.put( mant ) ;

Hmm. Interesting, but I think I prefer what I have above.

You prefer something which isn't portable, and which contains a
lot of undocumented dependencies on how the types are
represented, to something that is guaranteed to work everywhere?

I'm only interested in platforms with IEEE 754 floating point.
I think that my memcpy-then-htonl code will work on all such
platforms, won't it?

I think that the memcpy should work. htonl, on the other hand,
is a Berkeley-ism. It's made its way into Posix, but I imagine
that a lot of non-Posix systems don't have it. And it's an
unnecessary hack; there are cleaner ways of getting the job
done.
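
One way to read "cleaner ways" is to keep the memcpy but drop htonl, emitting the bytes with shifts so the result does not depend on the host byte order. The sketch below is an assumption about what is meant, not code from the thread. The name put_float_be is hypothetical, and it assumes that float is a 32-bit IEEE 754 single stored with the same byte order as uint32_t on the host.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: float and uint32_t have the same size on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Hypothetical helper: writes the IEEE bit pattern of `value` to out[0..3],
// most significant byte first.
void put_float_be(float value, unsigned char out[4]) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to get the bits
    // Shifts operate on the integer's value, so no htonl is needed.
    out[0] = static_cast<unsigned char>((bits >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((bits >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((bits >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(bits & 0xFF);
}
```

Unlike the in-place version, this writes into a caller-supplied buffer and leaves the original float untouched.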

James Kanze · Sep 17, 2007

Phil said:
Phil said:

James said:

[..]
You prefer something which isn't portable, and which contains a
lot of undocumented dependencies on how the types are
represented, to something that is guaranteed to work everywhere?

I'm only interested in platforms with IEEE 754 floating point. I
think that my memcpy-then-htonl code will work on all such platforms,
won't it?

You could simply convert the float into its hex representation using
the %a format (IIRC it's the lossless one), then sscanf it back...

file << std::setprecision( 9 ) << value ;

is also guaranteed to be lossless (for IEEE float, which needs 9
significant decimal digits to round-trip). For that
matter, he *is* allowed to cast the pointer to an unsigned
char*, and output that (as hex, or otherwise). However, if the
protocol says that floating point numbers are in IEEE float
format, with the sign on the top bit, then the exponent, and the
mantissa on the bottom bits, and output with the high byte
first, then outputting hex or whatever won't help. When
outputting a protocol, you absolutely must conform to the
requirements of the protocol. Obviously, a protocol is easier
to implement and debug if it is all text, but not all protocols
are all text.
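
For the all-text case, a minimal sketch of a decimal round trip follows. The helper names are hypothetical. The precision comes from std::numeric_limits<float>::max_digits10, which is 9 for an IEEE single and is the number of significant digits needed for a float to survive conversion to decimal text and back.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: writes a float as decimal text with enough
// significant digits to recover the exact value later.
std::string float_to_text(float value) {
    std::ostringstream os;
    os << std::setprecision(std::numeric_limits<float>::max_digits10) << value;
    return os.str();
}

// Hypothetical helper: reads the decimal text back into a float.
float text_to_float(const std::string& text) {
    std::istringstream is(text);
    float value = 0.0f;
    is >> value;
    return value;
}
```

As noted above, this only applies when the protocol allows text. A binary protocol still needs the exact byte layout it specifies.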

terminator · Sep 17, 2007

terminator said:
terminator said:

[..]
can you write this:

union exg
{
float f;
char cl[sizeof(float)];
exg& IndianSwap();
};

exg e={value};
e.IndianSwap();

and leave the optimization to the compiler (I think modern compilers
can do fine on this special case)?

Initialising (or assigning to) one part of a union and using another
has undefined behaviour. Or at least it used to. So, I'd stay away
from such code, it is definitely non-portable, although probably just
as reinterpret_cast to an array of chars.

V

As the program is about changing the representation (endianness) of a
number, I see little chance for portability. We are just looking for a
program that works.

regards,
FM.
