Conversion between float and long


d major

I am very puzzled about the conversion between float and long. I
can't understand how a long value can be converted to a float, as the
code below shows:

typedef unsigned long u_long;
float val = 3.14159;
u_long nw_val = *((u_long *) &val);

Then nw_val equals 1078530000. I then did the reverse conversion:
float d_val = *((float*)&nw_val);

and got d_val = 3.14159.

Can anybody help me understand the expression *((u_long*) &val)?

best regards!
 

Michael DOUBEZ

d major wrote:
I am very puzzled about the conversion between float and long. I
can't understand how a long value can be converted to a float, as the
code below shows:

typedef unsigned long u_long;
float val = 3.14159;

A float has a given bit representation.
In your case it is 0x40490FD0.
u_long nw_val = *((u_long *) &val);

In this line, you reinterpret the bits as being an unsigned long.
The value of 0x40490FD0 interpreted as an unsigned long is 1078530000 on your
system.

Then nw_val equals 1078530000. I then did the reverse conversion:
float d_val = *((float*)&nw_val);

Here you do the same but the other way around: you reinterpret the bits
as being a float.

and got d_val = 3.14159.

The value of 0x40490FD0 interpreted as a float is 3.14159 on your system
(which should follow the IEEE 754 standard).

Can anybody help me understand the expression *((u_long*) &val)?

In C++:
*reinterpret_cast<u_long*>(&val);

In plain language, you take the address of val (&val), reinterpret
this address as a pointer to an unsigned long, and then dereference this
pointer to get the value.

You would get the same result with a union:
union
{
    unsigned long ul;
    float f;
} value;
value.f = 3.14159;
assert(value.ul == 1078530000);
 

Tim Love

d major said:
I was very puzzled about the conversion between float and long,
http://www-h.eng.cam.ac.uk/help/tpl/languages/C++/strongtyping.html
might help.

Can anybody help me explains this sentence *((u_long*) &val) ?

You're creating a pointer of type u_long* and pointing it at a place
where a float has been stored. When you dereference this pointer (using
the first of the "*" symbols) the bit-pattern representation of the
float's value is being treated as if it were the representation of a u_long.
floats and ints are stored in different formats, so you get a strange value.
 

gpderetta

 http://www-h.eng.cam.ac.uk/help/tpl/languages/C++/strongtyping.html
might help.


You're creating a pointer of type u_long* and pointing it at a place
where a float has been stored. When you dereference this pointer (using
the first of the "*" symbols) the bit-pattern representation of the
float's value is being treated as if it were the representation of a u_long.
floats and ints are stored in different formats, so you get a strange value.

And, BTW, reading from a memory location using a type different from
its dynamic type is undefined behavior as it breaks strict aliasing,
and will actually break on real modern compilers.

Note that the union trick, explained elsewhere in this thread, is also
undefined but it is a common extension to the language.

The correct way to implement this type of type punning is using
std::memcpy.
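
A minimal sketch of the std::memcpy approach, assuming a 32-bit IEEE 754 float (the size is checked at compile time). The helper names float_to_bits, bits_to_float and demo_memcpy_punning are made up for this illustration; the values in the demo are the ones from the original question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy the object representation of a float into a 32-bit unsigned integer.
// Hypothetical helper; assumes float is exactly 32 bits wide.
inline std::uint32_t float_to_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// The reverse direction: rebuild a float from a 32-bit pattern.
inline float bits_to_float(std::uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Reproduces the numbers from the question (IEEE 754 assumed).
inline void demo_memcpy_punning()
{
    const float val = 3.14159f;
    const std::uint32_t nw_val = float_to_bits(val);  // 0x40490FD0
    assert(nw_val == 1078530000u);
    assert(bits_to_float(nw_val) == val);
}
```

No pointer of the wrong type is ever dereferenced here, so the strict aliasing rule is not involved; compilers typically turn such a fixed-size memcpy into a plain register move.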

HTH,
 

Frank Birbacher

Hi!

d major said:
I am very puzzled about the conversion between float and long. I
can't understand how a long value can be converted to a float,

It can. And apart from the discussion about the *pointer* cast I want to
show the simple numeric cast:

#include <iostream>
#include <ostream>
int main()
{
    const float f = 3.1415962f;
    const long l = f;    // float -> long: the fractional part is dropped
    const float f2 = l;  // long -> float
    std::cout << f2 << std::endl;
}

This simply prints "3". "l" is created from "f", which is a float. "l =
f" is a numeric conversion that drops the fractional part and will
likely produce a compiler warning. "f2 = l" is another conversion in the
reverse direction: from long to float. It usually does not produce a
warning, although it is a floating-integral conversion rather than a
promotion: the range of a float is larger than the range of a long, so
the value cannot overflow (hehe, loss of precision for large values
probably, but anyway).
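
The loss of precision mentioned above can be made visible with a small sketch. It assumes an IEEE 754 single-precision float with a 24-bit significand; the function names are invented for this example.

```cpp
#include <cassert>

// float -> long: the fractional part is discarded.
inline long truncate_to_long(float f)
{
    return static_cast<long>(f);
}

// long -> float -> long: exact only if the value fits in the significand.
inline long through_float(long l)
{
    const float f = static_cast<float>(l);
    return static_cast<long>(f);
}

inline void demo_numeric_conversion()
{
    assert(truncate_to_long(3.1415962f) == 3);
    assert(through_float(16777216L) == 16777216L);  // 2^24 survives the round trip
    assert(through_float(16777217L) != 16777217L);  // 2^24 + 1 is not representable
}
```

Unlike the pointer cast, these conversions work on the numeric value, not on the bit pattern.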

Frank
 

James Kanze

http://www-h.eng.cam.ac.uk/help/tpl/languages/C++/strongtyping.html
might help.
You're creating a pointer of type u_long* and pointing it at
a place where a float has been stored. When you dereference
this pointer (using the first of the "*" symbols) the
bit-pattern representation of the float's value is being
treated as if it were the representation of a u_long.
floats and ints are stored in different formats, so you get
a strange value.
And, BTW, reading from a memory location using a type
different than its dynamic type is undefined behavior as it
breaks strict aliasing, and will actually break on real modern
compilers.

It's well defined if the target type is a character pointer;
you're allowed to read the raw bytes of an "object". It's also
fairly clearly the intent of the standard that it should work
more or less as expected for other basic types.
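
As a sketch of the character-pointer case: the bytes of a float can be read through an unsigned char pointer without running into the aliasing rule. The names float_bytes and demo_float_bytes are hypothetical, and the demo assumes an IEEE 754 float, where 0.5f has the single non-zero byte 0x3F (first or last in memory, depending on endianness).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Read the object representation of a float byte by byte.
// Accessing any object through unsigned char* is permitted.
inline std::array<unsigned char, sizeof(float)> float_bytes(float f)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&f);
    std::array<unsigned char, sizeof(float)> out{};
    for (std::size_t i = 0; i < sizeof(float); ++i)
        out[i] = p[i];
    return out;
}

inline void demo_float_bytes()
{
    const std::array<unsigned char, sizeof(float)> b = float_bytes(0.5f);
    // 0.5f is 0x3F000000: one byte is 0x3F, the others are zero.
    assert(b.front() == 0x3F || b.back() == 0x3F);
}
```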

It also happens that making it work wreaks havoc with the
optimizer, and can slow code down considerably, so compilers
don't normally do it unless the casts are very local and very
visible. (A good compiler will turn off optimizing if it sees a
reinterpret_cast in a block. That still won't help if you pass
the converted pointer to another function, however.)

Note that the union trick, explained elsewhere in this thread, is also
undefined but it is a common extension to the language.

The advantage of the union trick (from a compiler author's point
of view) is that the aliasing is immediately visible. But it
doesn't necessarily work either if you take the address of each
of the members, and pass those addresses to another function.
The correct way to implement this type of type punning is
using std::memcpy.

The more correct thing to do is not to implement it at all:).
 

Michael DOUBEZ

gpderetta wrote:
And, BTW, reading from a memory location using a type different than
its dynamic
type is undefined behavior as it breaks strict aliasing,

As soon as you have a reinterpret_cast<>(), you have UB, however.

and will actually break on real modern compilers.

IIRC on gcc, strict aliasing is only activated from -O2 optimisation and
you can always pass -fno-strict-aliasing to avoid the optimisation.

Does it really break when only reading values or passing a pointer
around (float*->long*->float*)? I thought aliasing was a problem only
upon writing (i.e. the value is not propagated to heterogeneous readers).
 

gpderetta

It's well defined if the target type is a character pointer;
you're allowed to read the raw bytes of an "object".  

Yes, forgot to mention that.
It's also
fairly clearly the intent of the standard that it should work
more or less as expected for other basic types.

I'm not convinced about that. In fact I have read experts discussing
this who made it clear that strict aliasing also applies to all types
(modulo exceptions like chars). In particular the guaranteed freedom of
aliasing between integral and floating point types could very well speed
up real numeric code (which uses integers for indexes and floats for
computation). Or at least, this is what I've read.

Type aliasing is still a very controversial topic, as you can see
by browsing gcc bugzilla :)

There are also a couple of open issues on the C standard regarding this
topic.
[...]
The correct way to implement this type of type punning is
using std::memcpy.

The more correct thing to do is not to implement it at all:).

;)

Of course, but in real life it is sometimes necessary for some system
specific operations...

... or optimizations *ducks*.
 

gpderetta

gpderetta wrote:

As soon as you have a reinterpret_cast<>(), you have UB, however.

I think that it is actually implementation defined.

For example, the POSIX standard practically requires it in order to deal
with struct sockaddr and friends.
IIRC on gcc, strict aliasing is only activated from -O2 optimisation

In current releases; in the future it could be enabled even at lower
optimization levels...

and
you can always pass -fno-strict-aliasing to avoid the optimisation.

... and, as the man page says, -fno-strict-aliasing is not guaranteed
to be supported in the future. I would be surprised if it were to be
dropped though, way too much software would break.

Does it really break when only reading values or passing a pointer
around (float*->long*->float*)? I thought aliasing was a problem only
upon writing (i.e. the value is not propagated to heterogeneous readers).

I do not know, it might or might not. I think the gcc developers have
explicitly refused to guarantee it.
 

d major

gpderetta wrote:

IIRC on gcc, strict aliasing is only activated from -O2 optimisation and
you can always pass -fno-strict-aliasing to avoid the optimisation.

Does it really break when only reading values or passing a pointer
around (float*->long*->float*)? I thought aliasing was a problem only
upon writing (i.e. the value is not propagated to heterogeneous readers).

Thanks to all.
I think Michael's answer is the most accurate. I tried again this
morning.
When the float is converted, its value is first put into network byte order:
u_long nw_val = htonl(*((u_long *) &val));
This time I defined val = 0.5.
I get nw_val = 63, and the bytes of val are 00111111 00000000 00000000
00000000, just as in the IEEE 754 standard.
Then I do
u_long h_val = ntohl(nw_val);
float d_val = *((float*)&h_val);

and get h_val = 1056964608 = *((u_long *) &val),
so that d_val = 0.5.

Now I don't understand where 1056964608 comes from.
 

Michael DOUBEZ

d major wrote:
I get nw_val = 63, and the byte of val is 00111111 00000000 00000000
00000000 just IEEE754 standard.
Then I do
u_long h_val = ntohl(nw_val);
float d_val = *((float*)&h_val);

get h_val = 1056964608 = *((u_long *) &val)
so that d_val = 0.5

Now I don't understand 1056964608 come out?

h_val contains 00111111 00000000 00000000 00000000 (MSB first).
In unsigned integer representation, this means the value is (taking ^ as
the power operator)

h_val = 0*2^0 + 0*2^1 + ... + 0*2^23 + 1*2^24 + 1*2^25 + ... + 1*2^29 + 0*2^30 + 0*2^31
      = 2^24*(1+2+4+8+16+32) = 2^24*(2^6-1)
      = 63*2^24
      = 1056964608

These bits mean 0.5 as a float and 1056964608 as an integer (in binary
base; if we had used another system such as Gray code, it would be
another value).

The underlying idea is that a sequence of bits has no meaning
except the one you give to it. In the case of integers and floats, the
meaning is built into the processor.
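
The arithmetic can be checked with a short sketch, assuming a 32-bit IEEE 754 float. The helpers bits_of and swap_bytes are invented for this example; swap_bytes mimics what htonl/ntohl do on a little-endian host, which would explain why nw_val came out as 63.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bit pattern of a float as a 32-bit unsigned integer (via memcpy).
inline std::uint32_t bits_of(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t swap_bytes(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

inline void demo_half_bits()
{
    const std::uint32_t h_val = bits_of(0.5f);  // 00111111 00000000 00000000 00000000
    assert(h_val == (63u << 24));               // 63 * 2^24
    assert(h_val == 1056964608u);
    assert(swap_bytes(h_val) == 63u);           // byte-swapped value seen as nw_val
}
```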
 

James Kanze

Yes, forgot to mention that.

Note that we're talking about the standard here. Any
relationship between what the standard requires and what any
particular implementation does is purely coincidental.
I'm not convinced about that.

The note in §5.2.10/4, concerning the mapping done by
reinterpret_cast: "it is intended to be unsurprising to those
who know the addressing structure of the underlying machine."
Strictly speaking, this note concerns the mapping between
integers and pointer types, but it seems reasonable to expect it
to further apply between two pointer types.
In fact I have read experts discussing this that made it clear
that strict aliasing also applies to all types (modulo
exceptions like chars). In particular the guaranteed freedom
of aliasing between integral and floating point types could
very well speed up real numeric code (which uses integers for
indexes and floats for computation). Or at least, this is what
I've read.

It's a case of the left hand not knowing what the right hand is
doing:). The rules concerning aliasing are very important for
optimizing, and C++ very clearly does say in its object model
that accessing an object via an lvalue of a different type
(other than a character type) is undefined behavior. Regardless
of how you do it. On the other hand, reinterpret_cast is
useless unless you can do it. Clearly, reinterpret_cast is not
meant for portable code, but arguably, it should be usable in an
implementation defined manner. And the "undefined behavior" in
the object model is not because of optimizing, but because
accessing an int as a double (for example) might result in a
trapping representation. (But we don't have a rationale for the
C++ standard, so we don't know the real motivations.)

Practically, from a quality of implementation point of view, I'd
expect such accesses to behave in a manner "unsurprising to
those who know the addressing structure of the underlying
machine", and the exact representations of the types involved,
if, but only if, the reinterpret_cast is clearly visible to the
compiler, i.e. if I write something like:

void
f( double* d )
{
    unsigned long long* p
        = reinterpret_cast< unsigned long long* >( d ) ;
    *p = 0x4000000000000000ULL ;
}

I expect the double pointed to by d to be modified to contain
the specified bit pattern, and that code calling this function,
say:

void
g()
{
    double d = 0.0 ;
    f( &d ) ;
    std::cout << d << std::endl ;
}

will output the expected value (2, since on IEEE 754 the literal
0x4000000000000000 is the bit pattern of 2.0; 1.0 would be
0x3FF0000000000000). Either the compiler knows what is in f()
(e.g. because it is inline), can see the reinterpret_cast, and
so knows that strict aliasing no longer applies, or it doesn't
know, in which case, it has to assume that f modifies d, and
thus reread the value before calling operator<<.

I don't expect it to work in a function which gets the two
pointers (one double*, one unsigned long long*) from some
external, unknown source.
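
For comparison, a sketch of the same write done with std::memcpy, which does not depend on how the compiler treats the cast. It assumes a 64-bit IEEE 754 double; set_double_bits is a made-up name, not something from the posts above. On such a platform 0x4000000000000000 is the pattern of 2.0 and 0x3FF0000000000000 that of 1.0.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Overwrite the representation of *d with the given 64-bit pattern.
// Hypothetical memcpy-based counterpart of f() above.
inline void set_double_bits(double* d, std::uint64_t bits)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits");
    std::memcpy(d, &bits, sizeof bits);
}

inline void demo_set_double_bits()
{
    double d = 0.0;
    set_double_bits(&d, 0x4000000000000000ULL);
    assert(d == 2.0);
    set_double_bits(&d, 0x3FF0000000000000ULL);
    assert(d == 1.0);
}
```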
Type aliasing is still a very controversial topic, as you can
see by browsing gcc bugzilla :)

It's essential for good optimization. All that one can
reasonably ask is that the compiler drop it when it sees a
reinterpret_cast.
There are also a couple of open issues on the C standard
regarding this topic.

I can believe it.

My understanding of the intent in C90 was that casting, and not
unions should be used for type punning. Admittedly, however,
this is based on somewhat uncertain memories of vague
discussions many years ago, so I'm not sure how reliable it is.
Still, it seems clear to me that a union introduces still an
additional constraint:

union U
{
    double d1 ;
    double d2 ;
} u ;

Formally, if you write to u.d1, and read from u.d2, you have
undefined behavior. Supposedly, in theory at least, an
implementation could keep a tag cached somewhere hidden, and
check it on each access. (How such an implementation would
deal with something like *(double*)(&u), I don't know, since
I think there is a guarantee that casting the address of a union
to a pointer to the type of one of its members results in a
pointer to that member.)

Practically (again from a quality of implementation point of
view), I'd expect type punning with a union to work as long as
the accesses are all directly to the union. Again, if you take
the address of two members of different types, and pass them to
another function, I think that that function has the right to
suppose that different types means non-overlapping objects. But
I've used at least one C compiler where this was NOT the case.
[...]
The correct way to implement this type of type punning is
using std::memcpy.
The more correct thing to do is not to implement it at all:).

Of course, but in real life it is sometimes necessary for some
system specific operations...

If you're writing very low level software, you almost have to.
How would you write a garbage collector without breaking typing,
for example?
... or optimizations *ducks*.

No need to duck. As I've said more than once, if the profiler
says you have to, you have to.

Practically: see the thread about reading floating point values.
You generally have an engineering decision: you don't need the
type punning, but if it can be done reasonably safely, and saves
a couple of days of development time...
 
