Converting float to long bits

ylegoc

The following code doesn't work with gcc 4.3.2:

long toI32Bits(float value) {
    return *reinterpret_cast<long*>(&value);
}

but this one works:

long toI32Bits2(float value) {
    float* av = new float(value);
    long v = *reinterpret_cast<long*>(av);
    delete av;
    return v;
}

Is this due to the float value argument lifetime that is "released"
before the conversion in toI32Bits?
Is there a way to avoid a copy of value?
 
Öö Tiib

Hacking the bits of a float? The bit layout of a float is
implementation-specific. If you really need to reinterpret a float as
bits, use one of the unsigned types from <stdint.h> (unsigned types are
better suited to bit manipulation). If your compiler does not have
<stdint.h>, use <boost/cstdint.hpp>.

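#include <boost/cstdint.hpp>
#include <boost/static_assert.hpp>

typedef boost::uint32_t Bits32;

Bits32 toBits32(float value)
{
    // Refuse to compile if float is not exactly as wide as Bits32.
    BOOST_STATIC_ASSERT( sizeof( float ) == sizeof( Bits32 ) );
    union Turner
    {
        float from_;
        Bits32 to_;
    } turner;

    turner.from_ = value;  // write the float member ...
    return turner.to_;     // ... and read the same bytes back as an integer
}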
 
Marcel Müller

Hi,
The following code doesn't work with gcc 4.3.2:

long toI32Bits(float value) {
return *reinterpret_cast<long*>(&value);
}

Well, undefined behavior anyway.

In fact, a compiler optimization is also involved.

printf("%x\n", toI32Bits(1.5));

with option -O3 and above

subl $36, %esp
movl -8(%ebp), %eax <-- !!!
movl $0x3fc00000, -8(%ebp) <-- !!!
movl $.LC1, (%esp)
movl %eax, 4(%esp)
call printf
addl $36, %esp

with option -O

subl $20, %esp
movl $0x3fc00000, (%esp)
call _Z9toI32Bitsf
movl %eax, 4(%esp)
movl $.LC1, (%esp)
call printf
movl $0, %eax
addl $20, %esp

Note that the instruction scheduler did not respect the dependency
between the two marked movl instructions: the load from -8(%ebp) is
issued before the store to it, most probably because the two pointers
have different types. Note the pointer aliasing rules!

Swapping the two assembler lines cures the problem. So does the option
-fno-strict-aliasing, but that prevents a bunch of optimizations.

Is this due to the float value argument lifetime that is "released"
before the conversion in toI32Bits?

No.

Is there a way to avoid a copy of value?

Fix the bug above and avoid dirty hacks like that.

The C functions frexp and ldexp provide a defined way to operate with
floating point values.
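
The frexp/ldexp route mentioned above might look roughly like the sketch below. It is only an illustration, not code from this thread: the FloatParts struct and the names decompose and recompose are invented here, it assumes a C++11 <cmath> for std::signbit and std::isfinite, and it handles finite values only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: value == (negative ? -1 : 1) * mantissa * 2^exponent.
struct FloatParts {
    bool  negative;
    int   exponent;
    float mantissa;  // in [0.5, 1), or 0 when the value is zero
};

// Split a finite float into sign, mantissa and power-of-two exponent
// without ever looking at its bit pattern.
FloatParts decompose(float value) {
    assert(std::isfinite(value));  // infinities and NaN need separate handling
    FloatParts p;
    p.negative = std::signbit(value);
    p.mantissa = std::frexp(std::fabs(value), &p.exponent);
    return p;
}

// Rebuild the float from its parts.
float recompose(const FloatParts& p) {
    float magnitude = std::ldexp(p.mantissa, p.exponent);
    return p.negative ? -magnitude : magnitude;
}
```

For 1.5f this yields a mantissa of 0.75 and an exponent of 1, since 1.5 == 0.75 * 2^1. Nothing here depends on the in-memory representation, which is the point of using these functions.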


Marcel
 
orz

IIRC I've used this successfully with some compilers, for both reading
and writing:
long &toI32Bits2(float &value) {return (long&)value;}
 
James Kanze

The following code doesn't work with gcc 4.3.2:
long toI32Bits(float value) {
return *reinterpret_cast<long*>(&value);
}

It should. Although formally undefined behavior, the
"undefined" is there to cover cases where 1) long has trapping
representations, or some other invalid bit patterns, and the bit
pattern in a float might correspond to them (the case on a very
few mainframes), or 2) long and float have different sizes (the
case on most modern machines, where long is 64 bits, and float
only 32).
but this one works:
long toI32Bits2(float value) {
float* av = new float(value);
long v = *reinterpret_cast<long*>(av);
delete av;
return v;
}
Is this due to the float value argument lifetime that is
"released" before the conversion in toI32Bits?

No.

Define "works", or at least how you're determining that one
works, and the other not. And at least throw in an
assert(sizeof(float) == sizeof(long)) at the start of the
function; this isn't usually the case except under Windows.
 
James Kanze

well, undefined behavior anyway.

Undefined, because there really isn't anything the standard can
say about this. (Imagine what would happen on a tagged
architecture, for example.) On the other hand, the intent is
clearly that the results be "unsurprising" for someone familiar
with the architecture of the machine in question. (But the
issues are very subtle at times.)

In fact compiler problem with optimizations is also involved.

printf("%x\n", toI32Bits(1.5));

with option -O3 and above

subl $36, %esp
movl -8(%ebp), %eax <-- !!!
movl $0x3fc00000, -8(%ebp) <-- !!!
movl $.LC1, (%esp)
movl %eax, 4(%esp)
call printf
addl $36, %esp

Apparently, the compiler has inlined the code for the function.
Putting it in a separate translation unit should solve this
problem, most of the time.

Of course, it's also a bit of perversity on the part of the
compiler writers: a reinterpret_cast clearly results in aliases
of different types, and any responsible compiler author will
recognize it, and handle it correctly. When the
reinterpret_cast is visible, of course; if you have a function
taking an int* and a float*, and you generate one of its
arguments using a reinterpret_cast, then all bets are off, but
given toI32Bits, defined as above, the only reason it might not
work is because the compiler authors are being perverse, and are
trying intentionally to trip you up, even at the expense of
ignoring the intent of the standard.

with option -O

subl $20, %esp
movl $0x3fc00000, (%esp)
call _Z9toI32Bitsf
movl %eax, 4(%esp)
movl $.LC1, (%esp)
call printf
movl $0, %eax
addl $20, %esp

Note that the instruction scheduler did not handle the
dependency of the two movl instructions correctly. Most
probably because the types of the two pointers are different.
Note the pointer aliasing rules!
Swapping the two assembler lines cures the problem. Using the
option -fno-strict-aliasing also. But that prevents a bunch of
optimizations.

Including some that aren't legal anyway:).

[...]
The C functions frexp and ldexp provide a defined way to
operate with floating point values.

If you want to be really, really portable, such functions are
the only way. But suppose you need to output floats in IEEE
binary format (e.g. for XDR), and your portability requirements
only include Windows and the major Unix platforms. All of which
use IEEE internally. Some ugly type punning, like the above,
can be significantly faster (and results in a lot shorter code).

Anyway, one safe way of doing it is by using memcpy; the
compiler isn't allowed to mungle that one, since the integer and
the float are in fact two different variables, and the copy uses
void*, which the compiler must consider as a possible alias,
regardless of the other type.
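
For the XDR-style case above, a memcpy-based version might look like the following sketch. It is an illustration under stated assumptions, not code from this thread: the function name putFloatBigEndian is invented, and it assumes a C++11 compiler, a 32-bit IEEE 754 float, and a platform where float and integer share the same byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// Write value into out[0..3] as an IEEE 754 single, most significant
// byte first (the byte order XDR uses).
void putFloatBigEndian(float value, unsigned char out[4]) {
    std::uint32_t bits;
    // memcpy copies the object representation without creating a long*
    // or uint32_t* alias to the float, so no aliasing rule is violated.
    std::memcpy(&bits, &value, sizeof bits);
    out[0] = static_cast<unsigned char>((bits >> 24) & 0xffu);
    out[1] = static_cast<unsigned char>((bits >> 16) & 0xffu);
    out[2] = static_cast<unsigned char>((bits >> 8) & 0xffu);
    out[3] = static_cast<unsigned char>(bits & 0xffu);
}
```

Using shifts rather than copying the bytes straight into the buffer keeps the output order independent of the host's endianness, given the assumptions stated above.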
 
SG

The following code doesn't work with gcc 4.3.2:

long toI32Bits(float value) {
        return *reinterpret_cast<long*>(&value);
}

You're accessing the float object via an lvalue of type long.
According to the C++ standard section 3.10 paragraph 15 this is
undefined behaviour. This rule allows a compiler to assume that a
pointer of type long* does not refer to bytes where your float object
is located. Under the as-if rule a compiler is allowed to transform
the code (reordering, eliminating things, etc). It's possible that if
you violate 3.10/15 some kind of (legal) code transformation done by
the compiler can have a visible effect on your program's output. I
guess this is what has happened here.

If I remember correctly I wrote a similar function and it stopped
working when I turned on GCC's optimizations. The compiler is not to
blame here, though. Turning on optimizations just made my violation of
3.10/15 visible.
but this one works:

long toI32Bits2(float value) {
        float* av = new float(value);
        long v = *reinterpret_cast<long*>(av);
        delete av;
        return v;
}

Technically, still undefined behaviour. In this case, however, you can
simply write

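#include <cstring>  // std::memcpy

unsigned long floatbits(float x) {
    unsigned long r;
    // Assumes sizeof(unsigned long) == sizeof(float); see the remark below.
    std::memcpy(&r, &x, sizeof r);
    return r;
}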

which doesn't violate 3.10/15 anymore and should work as expected. You
might want to use uint32_t (stdint.h, cstdint) instead of "unsigned
long" and check whether sizeof(uint32_t)==sizeof(float) holds as a
minimal sanity check. numeric_limits<float>::is_iec559 should also be
true if you rely on the implementation providing IEEE 754 floats.
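
Putting those suggestions together might give something like the sketch below. It is an illustration rather than code from this thread: the names floatBits32 and floatFromBits32 are invented here, and it assumes a C++11 compiler for static_assert and <cstdint> (with an older compiler, BOOST_STATIC_ASSERT and <boost/cstdint.hpp> could play the same role).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Compile-time sanity checks: refuse to build where the assumptions fail.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "float must be exactly 32 bits wide");
static_assert(std::numeric_limits<float>::is_iec559,
              "float must be an IEEE 754 type");

// Copy the object representation of a float into a 32-bit unsigned integer.
std::uint32_t floatBits32(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Inverse direction: rebuild a float from its 32-bit pattern.
float floatFromBits32(std::uint32_t bits) {
    float x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}
```

With these checks in place, floatBits32(1.5f) gives 0x3fc00000, the same constant that appears in the assembly listings earlier in the thread.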

Cheers,
SG
 
tni

Of course, it's also a bit of perversity on the part of the
compiler writers: a reinterpret_cast clearly results in aliases
of different types, and any responsible compiler author will
recognize it, and handle it correctly.

GCC has a long history with being very aggressive about the aliasing
rules (and breaking stuff that isn't strictly legal/well defined
according to the standard).
Anyway, one safe way of doing it is by using memcpy; the
compiler isn't allowed to mungle that one, since the integer and
the float are in fact two different variables, and the copy uses
void*, which the compiler must consider as a possible alias,
regardless of the other type.

That works quite well (including GCC) and is generally optimized away
(no performance penalty compared to using the cast).
 
Marcel Müller

James said:
Apparently, the compiler has inlined the code for the function.
Putting it in a separate translation unit should solve this
problem, most of the time.

As long as the function implementation does not share a similar problem.
In fact it does not.

[...]
given toI32Bits, defined as above, the only reason it might not
work is because the compiler authors are being perverse, and are
trying intentionally to trip you up, even at the expense of
ignoring the intent of the standard.

:)

A bug report would not be that bad.

[...]
The C functions frexp and ldexp provide a defined way to
operate with floating point values.

If you want to be really, really portable, such functions are
the only way. But suppose you need to output floats in IEEE
binary format (e.g. for XDR), and your portability requirements
only include Windows and the major Unix platforms. All of which
use IEEE internally. Some ugly type punning, like the above,
can be significantly faster (and results in a lot shorter code).

Anyway, one safe way of doing it is by using memcpy; the
compiler isn't allowed to mungle that one, since the integer and
the float are in fact two different variables, and the copy uses
void*, which the compiler must consider as a possible alias,
regardless of the other type.

A union should do the job as well. And it results in more or less the
same UB as before.

long toI32Bits(float value)
{
    union
    {
        float f;
        long l;
    } data;
    data.f = value;
    return data.l;
}

does the job. Even when the function is inlined. Of course, the code is
a bit less efficient.


Marcel
 
James Kanze

As long as the function implementation does not share a similar problem.
In fact it does not.
given toI32Bits, defined as above, the only reason it might not
work is because the compiler authors are being perverse, and are
trying intentionally to trip you up, even at the expense of
ignoring the intent of the standard.

A bug report would not be that bad.

The behavior is intentional.
[...]
The C functions frexp and ldexp provide a defined way to
operate with floating point values.
If you want to be really, really portable, such functions are
the only way. But suppose you need to output floats in IEEE
binary format (e.g. for XDR), and your portability requirements
only include Windows and the major Unix platforms. All of which
use IEEE internally. Some ugly type punning, like the above,
can be significantly faster (and results in a lot shorter code).
Anyway, one safe way of doing it is by using memcpy; the
compiler isn't allowed to mungle that one, since the integer and
the float are in fact two different variables, and the copy uses
void*, which the compiler must consider as a possible alias,
regardless of the other type.
A union should do the job as well. And it results in more or
less the same UB as before.

There's a long history in this. Historically, before ISO C,
a union was the "approved" way of doing such type punning
(although I've used pre-standard compilers on which it didn't
work). For some reason, the ISO C committee more or less
blessed the pointer cast approach---still undefined behavior,
but the intent was that the behavior be what someone familiar
with the architecture would expect. The idea was, I think, to
allow some sort of "discriminating" implementation of a union,
for debugging; one which would crash the program if you accessed
a different member than the last one assigned to.

Practically, the actual wording was such that it guarantees
things I don't think the committee meant to guarantee, e.g.:

int f(float* f, int* i)
{
    int retval = *i;
    *f = 3.14159f;
    return retval;
}

and in a different translation unit:

union U { float f; int i; };
U u;
u.i = 42;
printf( "%d", f( &u.f, &u.i ) );

As currently worded, both C and C++ claim that this is
guaranteed to work. G++ breaks it, and practically speaking, if
the non-aliasing guarantees are to be of any use at all, it
should be legally broken; in this case, I'd go with g++, and say
that it is the standard that is broken in requiring the above to
work.

long toI32Bits(float value)
{
    union
    {
        float f;
        long l;
    } data;
    data.f = value;
    return data.l;
}

does the job. Even when the function is inlined. Of course,
the code is a bit less efficient.

It shouldn't be, once the compiler gets through with it. It
offers no more guarantees than the original version; in fact, it
offers less, in the sense that there isn't even an intent in the
standard to support it. G++ does guarantee it (provided all of
the accesses to the union are in the same function, or some
similar restriction), but I've used other compilers where it
didn't work, and the cast did.

From a practical point of view, a responsible compiler writer
will make both the cast and the union work, *provided* all use
is local to the function, where the compiler can easily detect
that the aliasing guarantees are broken.
 
