A basic (?) problem with addresses (gcc)

Piotrne

Hi,

I have a strange problem with casting addresses to different
types. I have a float variable. The sequence of 4 bytes
representing its value should be copied to an int
(the same size as a float: 4 bytes) and then back
to the float variable. Example:

#include <stdio.h>

int main(int argc, char **argv)
{
    float x = 4.3;
    int y;

    y = *(int*)&x; /* copying of 4 bytes to int */
    x = *(float*)&y; /* and back to float */

    printf("x=%f\n",x); /* 4.3 expected here */
    printf("y=%d\n",y);
    return 0;
}

I have compiled this example on Linux, using gcc 4.1.1, with
different optimization options. Here are the results:

$ gcc cast2.c
$ ./a.out
x=4.300000
y=1082759578

$ gcc -O2 cast2.c
$ ./a.out
x=-167393728255558576456059409056680378368.000000
y=134513634

The first result is correct, but what happened to the second?
Probably addresses have been shifted for some reason. But such
constructions seem to be an elementary property of the C language,
and they don’t work...

Regards
Piotr
 
Adrian

Piotrne said:
> $ gcc cast2.c
> $ ./a.out
> x=4.300000
> y=1082759578
>
> $ gcc -O2 cast2.c
> $ ./a.out
> x=-167393728255558576456059409056680378368.000000
> y=134513634

Hi Piotr

I tried this with gcc 4.5 and both versions produce the same expected
result (and the same assembler).

Have you tried compiling with -S to check the assembler? Maybe 4.1 is
optimizing away something it should not.

I also tried gcc 4.1.1 on x64 and got 0 for the optimized version.

If you compile with -S you will see that the optimized version has very
different assembler.

gcc 4.1.1 has had a few bugs.

HTH

Adrian Cornish
 
Morris Keesan

Piotrne said:
> The first result is correct, but what happened to the second?
> Probably addresses have been shifted for some reason. But such
> constructions seem to be an elementary property of the C language,
> and they don't work...

Casting a pointer to a pointer of a different type is not guaranteed
to work. Worth trying here would be adding the lines

    if (sizeof(int) != sizeof(float))
        printf("int and float different sizes\n");

    if ((void *)&x != (void *)(int *)&x)
        printf("Casting &x to (int *) changes its value\n");

    if ((void *)&y != (void *)(float *)&x)
        printf("Casting &y to (float *) changes its value\n");


The optimizer may be changing the alignment requirements of float and
int objects. If you're trying to copy bytes between incompatible
types, using memcpy() would seem safer and more readable.
But what you're trying to do here seems suspicious, in any case.
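
A minimal sketch of the memcpy() approach suggested above, assuming a 32-bit float (checked at compile time). The helper names float_to_bits and bits_to_float are made up for illustration and do not come from the thread. memcpy() copies the object representation byte by byte, so no object is read through an lvalue of the wrong type:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float occupies exactly 32 bits on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: copy the bytes of a float into a 32-bit integer. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: copy a 32-bit pattern back into a float. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Calling bits_to_float(float_to_bits(x)) should give back x unchanged at any optimization level, since both directions are plain byte copies.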
 
Piotrne

Morris said:
> Casting a pointer to a pointer of a different type is not guaranteed
> to work.

The value of the pointer should be retained, and it is:

> Worth trying here would be adding the lines
>
> if (sizeof(int) != sizeof(float))
> printf("int and float different sizes\n");
>
> if ((void *)&x != (void *)(int *)&x)
> printf("Casting &x to (int *) changes its value\n");
>
> if ((void *)&y != (void *)(float *)&x)
> printf("Casting &y to (float *) changes its value\n");

I have checked them (changing x to y in the last "if")
and got no messages in either compiled version, including the
optimized one with the wrong copying.

I have even checked (by printing) the values of the addresses after
casting to float* or int*: they are the same.

It seems that the way the pointer is used to read
memory causes the problem. Placing the float variable
in an array (as a middle element of it) changes
the result. Anyway, it is strange. I'll write if I find
something about this.

Piotr
 
Piotrne

pete said:
> (sizeof (float) == 4) isn't an elementary property
> of the C language.

In this case it is satisfied; I removed the check
to make the code shorter.

P.
 
Nick Bowler

Piotrne said:
> It seems that the way the pointer is used to read memory causes
> the problem. Placing the float variable in an array (as a middle element
> of it) changes the result. Anyway, it is strange. I'll write if I find
> something about this.

Newer versions of GCC have an optimization called "strict aliasing",
enabled by default at the higher -O levels. It allows the compiler to
assume (among other things) that pointers to int are never used to
access objects of type float and vice versa. Such behaviour is
consistent with the C standard. Violating that assumption will lead
to unpredictable consequences.
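
A small sketch of the kind of assumption described above. The function store_both is a made-up example, not code from the thread. Because an int object and a float object may not alias, the compiler is allowed to assume that the store through fp leaves *ip untouched, and may return the constant 1 without re-reading memory:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only. */
static int store_both(int *ip, float *fp)
{
    assert(ip != NULL && fp != NULL);

    *ip = 1;
    *fp = 2.0f;   /* assumed not to modify *ip under strict aliasing */
    return *ip;   /* so the compiler may simply return 1 here */
}
```

Called with two distinct objects, this is well defined and returns 1. Called as store_both(&y, (float *)&y), the behaviour is undefined, and the result may differ between optimization levels. GCC documents the option -fno-strict-aliasing for turning the assumption off.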
 
Seebs

Piotrne said:
> The first result is correct, but what happened to the second?

Your code was wrong. The compiler did whatever it wanted.

> Probably addresses have been shifted for some reason. But such
> constructions seem to be an elementary property of the C language,
> and they don't work...

Nope. You invoked undefined behavior, and the compiler caught you at it.

Don't do that.

-s
 
Keith Thompson

Piotrne said:
> I have a strange problem with casting addresses to different
> types. I have a float variable. The sequence of 4 bytes
> representing its value should be copied to an int
> (of the same size like float: 4 bytes) and then back
> to the float variable. Example:
>
> #include <stdio.h>
>
> int main(int argc, char **argv)
> {
>     float x = 4.3;
>     int y;
>
>     y = *(int*)&x; /* copying of 4 bytes to int */
>     x = *(float*)&y; /* and back to float */
>
>     printf("x=%f\n",x); /* 4.3 expected here */
>     printf("y=%d\n",y);
>     return 0;
> }

Your code is at least non-portable. It assumes, among other things,
that int and float are the same size and that they have similar
alignment requirements. Your system (apparently) happens to satisfy
those assumptions, but others may not.

Even so, the compiler may assume that a float* won't be used to access
an int object, and that an int* won't be used to access a float object.

If you really want to copy the representation of one object into an
object of a different type, use memcpy().

What exactly are you trying to accomplish?
 
BartC

Seebs said:
> Your code was wrong. The compiler did whatever it wanted.
>
> Nope. You invoked undefined behavior, the compiler caught you at it.

What was wrong with it? Assuming int and float are the same size and are
aligned in a compatible way.
 
jacob navia

On 15/12/10 21:45, BartC wrote:
> What was wrong with it? Assuming int and float are the same sizes and
> are aligned in a compatible way.

It is wrong because gcc with optimizations breaks it on some machines.

On my Macintosh (OSX Intel) gcc gives the SAME results with and without
optimizations, presumably because here Apple does a good job for us.
/tmp $ gcc -v
Using built-in specs.
Target: i686-apple-darwin10
[snip]
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5664)


Using Open Suse (inside VirtualBox on the Macintosh) I get the same
good results with both optimized and non-optimized code.

gcc -v gives

Using built-in specs.
Target: x86_64-suse-linux
[snip]
Thread model: posix
gcc version 4.4.1 [gcc-4_4-branch revision 150839] (SUSE Linux)

The problem with many people is that they will never accept that gcc has
bugs. This is politically incorrect since gcc is GNU and GNU means

Gcc has No bUgs

:)
 
Keith Thompson

jacob navia said:
> On 15/12/10 21:45, BartC wrote:
> > What was wrong with it? Assuming int and float are the same sizes and
> > are aligned in a compatible way.
>
> It is wrong because gcc with optimizations screws it in some machines [snip]
>
> The problem with many people is that they will never accept that gcc has
> bugs. This is politically incorrect since gcc is GNU and GNU means
>
> Gcc has No bUgs
>
> :)

Here's the original program:

#include <stdio.h>

int main(int argc, char **argv)
{
    float x = 4.3;
    int y;

    y = *(int*)&x; /* copying of 4 bytes to int */
    x = *(float*)&y; /* and back to float */

    printf("x=%f\n",x); /* 4.3 expected here */
    printf("y=%d\n",y);
    return 0;
}

Assuming int and float have the same alignment requirements, the
pointer conversions are ok (C99 6.3.2.3p7), but 6.5p7 says:

    An object shall have its stored value accessed only by an lvalue
    expression that has one of the following types:

    -- a type compatible with the effective type of the object,

    -- a qualified version of a type compatible with the effective
       type of the object,

    -- a type that is the signed or unsigned type corresponding to
       the effective type of the object,

    -- a type that is the signed or unsigned type corresponding to
       a qualified version of the effective type of the object,

    -- an aggregate or union type that includes one of the
       aforementioned types among its members (including,
       recursively, a member of a subaggregate or contained
       union), or

    -- a character type.

The effective types of x and y are float and int, respectively
(their declared types, see 6.5p6). (Storing a value into an object
with no declared type can change its effective type; that doesn't
apply here.) The first assignment accesses the stored value of x
by an lvalue of type int; likewise, the second accesses the stored
value of y by an lvalue of type float. Both accesses violate 6.5p7,
so the program's behavior is undefined. gcc apparently assumes
that y will not be modified via an lvalue of type float, and that
x will not be modified via an lvalue of type int, and performs some
optimizations based on those assumptions.

It even warns about what it's doing:

c.c:8: warning: dereferencing type-punned pointer will break strict-aliasing rules
c.c:9: warning: dereferencing type-punned pointer will break strict-aliasing rules

I do not claim or believe for one moment that gcc is bug-free
(and I seem to recall someone here saying recently that gcc's
"strict-aliasing rules" might go beyond what the standard permits),
but in this case the bug is in the program, not in the compiler.
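
As an illustration of the last item in the 6.5p7 list quoted above ("a character type"): looking at the bytes of a float through an unsigned char pointer is permitted, unlike reading it through an int pointer. A minimal sketch follows; the helper name bytes_to_hex is made up for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the bytes of any object as hex digits,
   in memory order. `out` must have room for 2 * size + 1 characters. */
static void bytes_to_hex(const void *obj, size_t size, char *out)
{
    const unsigned char *p = obj;   /* character-type access, allowed by 6.5p7 */

    assert(obj != NULL && out != NULL);

    for (size_t i = 0; i < size; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)p[i]);
    out[2 * size] = '\0';
}
```

For a float x, a buffer declared as char hex[2 * sizeof x + 1] is large enough for bytes_to_hex(&x, sizeof x, hex). The digits that come out depend on the platform's float format and byte order.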
 
BartC

Keith Thompson said:
> int main(int argc, char **argv)
> {
>     float x = 4.3;
>     int y;
>
>     y = *(int*)&x; /* copying of 4 bytes to int */
>     x = *(float*)&y; /* and back to float */
>
>     printf("x=%f\n",x); /* 4.3 expected here */
>     printf("y=%d\n",y);
>     return 0;
> }
>
> It even warns about what it's doing:
>
> c.c:8: warning: dereferencing type-punned pointer will break strict-aliasing rules
> c.c:9: warning: dereferencing type-punned pointer will break strict-aliasing rules
>
> I do not claim or believe for one moment that gcc is bug-free
> (and I seem to recall someone here saying recently that gcc's
> "strict-aliasing rules" might go beyond what the standard permits),
> but in this case the bug is in the program, not in the compiler.

How then do you do this (vaguely Fortran code) in C:

    integer*4 i
    real*4 a
    equivalence (a,i)

This is a related problem: both i and a share the same address, and those
four bytes can be accessed as an integer or a float value.

And do it without doing any unnecessary copying (memcpy) or using unions
(not always practical, and which could anyway have the same problems).
(Assume you know the hardware would have no problems with this, and you
don't care about portability.)
 
Keith Thompson

BartC said:
> How then do you do this (vaguely Fortran code) in C:
>
>     integer*4 i
>     real*4 a
>     equivalence (a,i)
>
> This is a related problem: both i and a share the same address, and those
> four bytes can be accessed as an integer or a float value.

Use a union.

> And do it without doing any unnecessary copying (memcpy) or using unions
> (not always practical, and which could anyway have the same problems).
> (Assume you know the hardware would have no problems with this, and you
> don't care about portability.)

I think a union is the best solution. A footnote on C99 6.5.2.3p3 says:

    If the member used to access the contents of a union object
    is not the same as the member last used to store a value in
    the object, the appropriate part of the object representation
    of the value is reinterpreted as an object representation in
    the new type as described in 6.2.6 (a process sometimes called
    "type punning"). This might be a trap representation.

So they're not going to have the same problem (assuming you can avoid
trap representations).
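
A minimal sketch of the union approach, mirroring the Fortran EQUIVALENCE of a and i. The type name int_float and the helper float_as_int are made up for illustration, and a 32-bit float is assumed (checked at compile time). int32_t is used for the integer member because it has no padding bits, so reading it cannot hit a trap representation:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumption: float occupies exactly 32 bits on this platform. */
static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits wide");

/* Hypothetical type: both members share the same storage. */
union int_float {
    int32_t i;
    float   a;
};

/* Store through one member, read through the other (type punning). */
static int32_t float_as_int(float value)
{
    union int_float u;
    u.a = value;
    return u.i;
}
```

The value returned depends on the platform's float representation. On the original poster's system, 4.3 gave 1082759578 in the unoptimized build.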
 
Seebs

BartC said:
> What was wrong with it? Assuming int and float are the same sizes and are
> aligned in a compatible way.

It tried to read something through an lvalue of the wrong type. Ultimately,
this violates the strict aliasing rules; the compiler is allowed to ignore
the reference or do anything it wants with it.

-s
 
Seebs

BartC said:
> How then do you do this (vaguely Fortran code) in C:

You don't -- it violates one of the rules. Any attempt to do this
is *necessarily* undefined behavior.

> And do it without doing any unnecessary copying (memcpy) or using unions
> (not always practical, and which could anyway have the same problems).
> (Assume you know the hardware would have no problems with this, and you
> don't care about portability.)

The way you express that is with a union. Apart from that, it's undefined
behavior and you *can't* express it in plain C.

One of the points of using C, rather than assembly, is that the language
spec defines the language in a way that, at least a little, cares about
portability.

You might be able to fake something up by declaring things with "volatile"
somewhere in them, but...

Basically, if you are assuming you know the hardware has no problems with
this, you're not writing C, but a machine-specific variant which the
compiler may not support, and isn't obliged to.

-s
 
BartC

Keith Thompson said:
> Use a union.

OK. But apart from the inconvenience of wrapping these things in unions and then
having to use field selection to access the data, how do you do something
like this:

    integer*4 i(20)
    real*8 a
    equivalence (a,i(7))

(So the 8 bytes at i(7..8) are shared with the floating point number.)

The OP's method (perhaps wrapped in a macro) would have been ideal for this:

    #define asdouble(x) *(double*)&(x)

    asdouble(i[7]);

> I think a union is the best solution. A footnote on C99 6.5.2.3p3 says:
>
>     If the member used to access the contents of a union object
>     is not the same as the member last used to store a value in
>     the object, the appropriate part of the object representation
>     of the value is reinterpreted as an object representation in
>     the new type as described in 6.2.6 (a process sometimes called
>     "type punning"). This might be a trap representation
>
> So they're not going to have the same problem (assuming you can avoid
> trap representations).

I must have got the idea somewhere that you could only read out the same
member that was last written.
 
Jens Thoms Toerring

BartC said:
> OK. But apart from the inconvenience of wrapping these things in unions then
> having to use field selection to access the data, how do you do something
> like this:
>
>     integer*4 i(20)
>     real*8 a
>     equivalence (a,i(7))
>
> (So the 8 bytes at i(7..8) are shared with the floating point number.)

You don't. The EQUIVALENCE stuff in FORTRAN is just a horrible hack
IMHO (besides computed GOTOs and COMMON blocks it's one of the most
effective ways to write completely obfuscated FORTRAN programs ;-).
I never went too far with FORTRAN (actually my first language, but
then quickly forgotten), so how does FORTRAN deal with this when
on a certain system a real must be 8-byte aligned but an integer
only 4-byte aligned? Then accessing 'a' if 'i' starts at an 8-byte
aligned address might be "interesting". To make it transparent
to the programmer (and not result in a SIGBUS) the compiler
would have to do something equivalent to a memcpy() to a temporary
(correctly aligned for real) variable each time 'a' is accessed...

But then you can get a similar effect in C anyway with memcpy(),
just the normal syntax of the language doesn't support it for good
reasons IMHO. Why make something inherently broken (unless under
some very special circumstances) easy to do?
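
A rough sketch of that memcpy() route for the i(7) case. The helper names load_double and store_double are made up for this example, and the caller is assumed to guarantee that sizeof(double) bytes are available starting at the chosen element. Since the bytes are copied to or from a properly aligned double, alignment of the int array does not matter:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read a double from the bytes starting at ints[index].
   Assumption: the array extends at least sizeof(double) bytes past that point. */
static double load_double(const int *ints, size_t index)
{
    double d;

    assert(ints != NULL);
    memcpy(&d, &ints[index], sizeof d);
    return d;
}

/* Hypothetical helper: write a double into the bytes starting at ints[index]. */
static void store_double(int *ints, size_t index, double d)
{
    assert(ints != NULL);
    memcpy(&ints[index], &d, sizeof d);
}
```

Fortran arrays are 1-based by default, so i(7) corresponds to index 6 here: store_double(i, 6, value) followed by load_double(i, 6) should return the same value.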

> I must have got the idea somewhere that you could only read out the same
> member that was last written.

I would guess the standard as cited by Keith (with its emphasis on the
problem with trap representations) is pretty clear, i.e. if you're
lucky (no trap representation) it "works". What "works" actually
means is another question: if you e.g. try to read the value of
a float as an int then, of course, what you get will depend on the
bit representation of floats and ints on that system. So the result
will be inherently system dependent, but then it already is, because
for this to somehow "work" a float and an int must have the
same size.

Regards, Jens
 
lawrence.jones

BartC said:
> I must have got the idea somewhere that you could only read out the same
> member that was last written.

C89. The rules were changed in C99 to bless what everyone expected and
all known implementations did anyway.
 
Seebs

BartC said:
> OK. But apart from the inconvenience of wrapping these things in unions then
> having to use field selection to access the data, how do you do something
> like this:
>
>     integer*4 i(20)
>     real*8 a
>     equivalence (a,i(7))
>
> (So the 8 bytes at i(7..8) are shared with the floating point number.)

You don't. C doesn't support or allow for overlap like this, so far as I
know.

> I must have got the idea somewhere that you could only read out the same
> member that was last written.

You can only read the same member that was last written if you want to know
what you'll get. :)

-s
 
