Address of a union member

Noob

Hello,

My compiler complains when I take the address of a member in a union.

$ cat mu.c
union foo
{
    int i;
    double d;
};

int main(void)
{
    union foo bar = { 0 };
    int *p = &(bar.i);
    return *p;
}

$ cc mu.c
w "mu.c",L10/C12(#241): Address of a union member is being used as a
| pointer. This may violate an assumption made by the optimizer.
| To be safe, you should recompile your program at a lower
| optimization level; or else, turn off the BEHAVED toggle.
No errors 1 warning

I don't see what the problem is, and gcc did not seem to mind.

$ gcc -O2 -std=c89 -Wall -Wextra mu.c
/* NO OUTPUT */

Is this an aliasing problem?

What am I doing wrong?

Regards.
 
Noob

Eric said:
Nothing that I can see. My guess is that the compiler wants
to assume that an int* and a double* point to different objects
since they point to different types. That is, if you pass both
&bar.i and &bar.d as arguments to

        void silly(int *ip, double *dp) {
            *ip = 42;
            *dp = 42.0;
            *ip = 42;
        }

... the compiler might assume that the two differently-typed
parameters point to two differently-typed and distinct objects,
so the third assignment could be omitted.

... but if I were you, I'd read my compiler's documentation
starting with what it says about "the BEHAVED toggle."

Good advice :)

"""
When it assumes that code is well-behaved, the compiler can be less conservative
in generating code for pointer-based objects. Well-behaved code follows these rules:

o The address of a union member is never assigned to a pointer.
o A value of a pointer type is never cast to an incompatible pointer type.

Given these assumptions, the compiler might be able to generate substantially
better code in referencing pointer-based variables. The compiler issues an
appropriate warning if either of these assumptions is violated in such a way as
to affect assumptions made by the optimizer. You must decide whether the
warnings can be safely ignored or whether the program should be compiled at a
lower optimization level.

CAUTION: The compiler might not catch all instances of misbehaved code.
For example, a pointer-to-char might be passed to an undeclared
(unprototyped) external function expecting a pointer-to-int.
Therefore, it is possible for a program to compile at optimization
level 6 without warnings (and run incorrectly), but run correctly
when compiled at a lower optimization level.
"""

I'm not sure what they mean by "You must decide whether the warnings can be
safely ignored". How do I tell whether it is safe? :)

Regards.
 
Ben Bacarisse

Noob said:
My compiler complains when I take the address of a member in a union.

$ cat mu.c
union foo
{
    int i;
    double d;
};

int main(void)
{
    union foo bar = { 0 };
    int *p = &(bar.i);
    return *p;
}

$ cc mu.c
w "mu.c",L10/C12(#241): Address of a union member is being used as a
| pointer. This may violate an assumption made by the optimizer.
| To be safe, you should recompile your program at a lower
| optimization level; or else, turn off the BEHAVED toggle.
No errors 1 warning

I don't see what the problem is, and gcc did not seem to mind.

$ gcc -O2 -std=c89 -Wall -Wextra mu.c
/* NO OUTPUT */

Is this an aliasing problem?

I think so, yes. In section 6.5 you will find:

7 An object shall have its stored value accessed only by an lvalue
expression that has one of the following types[76]

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type
of the object,

— a type that is the signed or unsigned type corresponding to the
effective type of the object,

— a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or

— a character type.

Footnote 76 is: "The intent of this list is to specify those
circumstances in which an object may or may not be aliased."

The compiler is warning you that it can assume that the union (in
particular the other member) will not change as a result of changes
through the pointer you have just obtained.

There is no problem in taking the address, but the compiler can
assume that *p does not change when bar.d changes (and vice versa).
This is often called the "strict aliasing rule".
 
Guest

| I'm not sure what they mean by "You must decide whether the warnings can be
| safely ignored". How do I tell whether it is safe? :)

You will have to understand what the optimization does on each platform
and decide whether that optimization will conflict with the behaviour being
coded. For example, in the silly() function shown earlier, is it OK for the
3rd assignment to be skipped?
 
Keith Thompson

Noob said:
My compiler complains when I take the address of a member in a union.

$ cat mu.c
union foo
{
    int i;
    double d;
};

int main(void)
{
    union foo bar = { 0 };
    int *p = &(bar.i);
    return *p;
}

$ cc mu.c
w "mu.c",L10/C12(#241): Address of a union member is being used as a
| pointer. This may violate an assumption made by the optimizer.
| To be safe, you should recompile your program at a lower
| optimization level; or else, turn off the BEHAVED toggle.
No errors 1 warning

I don't see what the problem is, and gcc did not seem to mind.
[...]

Compilers are allowed to warn about anything they like.

In this case, the compiler appears to be warning about a *potential*
problem, not one that actually occurs in the code you posted.

Consider, for example:

    union foo bar;
    int *p = &bar.i;
    *p = 10;
    bar.d = 12.34;
    printf("*p = %d\n", *p);

The optimizer might assume that, since you just stored the value 10 in
*p, the value 10 must still be there, and optimize the printf to
something like puts("*p = 10").

(The standard has more to say about whether this behavior is defined
or undefined and whether an optimizer is allowed to make this
assumption; I don't have my copy of the standard handy right now and
I'm too lazy to look it up.)
 
Peter Nilsson

Eric Sosman said:
     Nothing that I can see.  My guess is that the compiler
wants to assume that an int* and a double* point to different
objects since they point to different types.  That is, if you
pass both &bar.i and &bar.d as arguments to

        void silly(int *ip, double *dp) {
            *ip = 42;
            *dp = 42.0;
            *ip = 42;
        }

... the compiler might assume that the two differently-typed
parameters point to two differently-typed and distinct objects,
so the third assignment could be omitted.

The first perhaps, but not the third. Strictly conforming code
could detect the difference...

union { int i; double d; } u;
silly(&u.i, &u.d);
printf("%d\n", u.i);
 
Ben Bacarisse

Peter Nilsson said:
The first perhaps, but not the third. Strictly conforming code
could detect the difference...

union { int i; double d; } u;
silly(&u.i, &u.d);
printf("%d\n", u.i);

I disagree but given your history of being correct, I currently
suspect that I have missed something here. Does not your snippet of
code violate the "shall" from section 6.5 paragraph 7?
 
Nobody

Noob said:
I'm not sure what they mean by "You must decide whether the warnings can be
safely ignored". How do I tell whether it is safe? :)

Examine the resulting assembler output (or disassembly) to see if it does
what you want.

If you don't understand assembler, either reduce the optimisation level or
avoid constructs which the compiler complains about.

In this case, it looks like the compiler is being overly conservative
about checking for potential aliasing bugs. If you are actually
referencing both union members for the same object, that may well be a bug.
 
Richard Bos

Keith Thompson said:
Consider, for example:

    union foo bar;
    int *p = &bar.i;
    *p = 10;
    bar.d = 12.34;
    printf("*p = %d\n", *p);

The optimizer might assume that, since you just stored the value 10 in
*p, the value 10 must still be there, and optimize the printf to
something like puts("*p = 10").

(The standard has more to say about whether this behavior is defined
or undefined and whether an optimizer is allowed to make this
assumption; I don't have my copy of the standard handy right now and
I'm too lazy to look it up.)

You're reading a member of a union which is not the last member that has
been assigned to. You're reading it indirectly, but you're still reading
it. This means that its bytes have unspecified values, and therefore
that its value may be a trap value[1]; hence, in theory undefined
behaviour, but most likely to result in nonsense values. And AFAICT it's
_allowed_ to result in the same nonsense value no matter what you store
in bar.d, or even in different nonsense values even if you store the
same value in bar.d more than once.

Richard

[1] Of the member not last assigned to, _not_ of the union as a, dare I
say it, thing-in-itself.
 
Peter Nilsson

Ben Bacarisse said:
I disagree but given your history of being correct,

It may have happened. I think it was a Tuesday. ;)

Ben Bacarisse said:
I currently suspect that I have missed something here. Does
not your snippet of code violate the "shall" from section 6.5
paragraph 7?

I don't see how. The last u.i accesses an object that was last
assigned via an int lvalue. That assignment imposed the
effective type. [6.5p6]
 
Ben Bacarisse

Peter Nilsson said:
I disagree but given your history of being correct,

It may have happened. I think it was a Tuesday. ;)
I currently suspect that I have missed something here.  Does
not your snippet of code violate the "shall" from section 6.5
paragraph 7?

I don't see how. The last u.i accesses an object that was last
assigned via an int lvalue. That assignment imposed the
effective type. [6.5p6]

Duh! I was reading the earlier quote as if the programmer were
permitted to remove the third line, not the compiler.
 
Ben Bacarisse

Ben Bacarisse said:
Duh! I was reading the earlier quote as if the programmer were
permitted to remove the third line, not the compiler.

OK, even that does not make sense. Take it that I misread everything!
 
Eric Sosman

Peter said:
The first perhaps, but not the third. Strictly conforming code
could detect the difference...

union { int i; double d; } u;
silly(&u.i, &u.d);
printf("%d\n", u.i);

Yes, that was my point: The compiler's assumption that
differently typed pointers point to distinct objects can be
incorrect. Hence (I guess) the compiler's warning that it
might be a good idea to run the optimizer at a less aggressive
level, because at high levels it's a bit over-optimistic.
 
Tim Rentsch

Richard Bos said:
You're reading a member of a union which is not the last member that has
been assigned to. You're reading it indirectly, but you're still reading
it. This means that its bytes have unspecified values,

Probably you are misremembering. It's only bytes /other than/
the bytes of the member last stored that take unspecified
values. Bytes that overlap the member last stored take on
the values that were stored as a result of assigning to
that member.

Richard Bos said:
and therefore
that its value may be a trap value[1]; hence, in theory undefined
behaviour, but most likely to result in nonsense values. And AFAICT it's
_allowed_ to result in the same nonsense value no matter what you store
in bar.d, or even in different nonsense values even if you store the
same value in bar.d more than once.

There's a common misconception that reading a (non-character type)
union member other than the last member stored is automatically
undefined behavior. It isn't. Of course, it's possible to
get undefined behavior if there's a trap representation, but
if trap representations can be ruled out, the result is only
implementation defined behavior. For example,

    union {
        int i;
        unsigned u;
    } x;
    x.i = 5;
    return x.u;

must return the value 5.
 
lawrence.jones

Tim Rentsch said:
There's a common misconception that reading a (non-character type)
union member other than the last member stored is automatically
undefined behavior. It isn't.

It was, prior to C99.
 
Tim Rentsch

lawrence.jones said:
It was, prior to C99.

That's good to know. Was this deliberate or accidental?
(I expect it was deliberate, but it seems right to ask.)
If it was deliberate, what prompted the change?
 
Tim Rentsch

Richard Heathfield said:
That's a common misconception. From C89 3.3.2.3:

" With one exception, if a member of a union object is accessed after
a value has been stored in a different member of the object, the
behavior is implementation-defined."

I don't have a C89 document readily available -- can you
find out what the exception was?
 
Tim Rentsch

Richard Heathfield said:
Common initial sequence (for a union made up of several structures).
The exception is /well/-defined (not undefined).

Ahhh, that makes sense. Thank you for the followup.
 
