Joshua Maurice said:
Sorry. Some silly questions if I may, please? Consider the following
programs:
int main(void)
{
    union { int x; float y; } u;
    u.y = 2;
    u.x = 1;
    return u.x;
}
/* ---- */
int main(void)
{
    union { int x; float y; } u;
    float * y = &u.y;
    *y = 2;
    int * x = &u.x;
    *x = 1;
    return u.x;
}
/* ---- */
int main(void)
{
    union { int x; float y; } u;
    float * y = &u.y;
    int * x = &u.x;
    *y = 2;
    *x = 1;
    return u.x;
}
/* ---- */
void foo(int * x, float * y)
{
    *y = 2;
    *x = 1;
}
int main(void)
{
    union { int x; float y; } u;
    float * y = &u.y;
    int * x = &u.x;
    foo(x, y);
    return u.x;
}
/* ---- */
The last program above, except with foo in a different translation
unit.
Where exactly do you think we cross from defined behavior to undefined
behavior? I would argue that the first example is clearly not UB, and
the last example with foo() in a different translation unit is
probably UB. Specifically, the intent of the effective type rules is
to allow the compiler to do additional aliasing analysis and reorder
reads and writes that are sufficiently differently typed. With foo()
in a different translation unit, we want the compiler to be able to
reorder the writes to x and y in foo() from type aliasing analysis,
but if we do that then we'll change the semantics of the last program
and have it return garbage.
I don't have a strong opinion on this one. It seems that the intent of
the type access rules and the existence of unions is an inherent
contradiction - with several plausible ways out, of course.
I'm sorry, I didn't see any silly questions. Is it okay if I
just answer what you asked? (See, there's an example of a silly
question.)
If we take the effective type rules at face value, I don't think
any of these are undefined behavior. In each case the stores that
are done are consistent with the declared type of the member whose
object is being stored into. Going through the different sequences
(and I admit I haven't checked them as carefully as I might have)
and referring to the effective type rules in each case, I don't see
any violations. That includes the last case where the foo()
function is defined in a different TU, although AFAIK that doesn't
change whether effective type rules are violated.
Of course, this is upsetting, because intuitively we expect that
when it looks like reordering might muck things up then either the
reordering isn't allowed (presumably due to effective type rules
considerations) or the program has crossed over into undefined
behavior (probably because effective type rules have been
violated). None of the obvious alternatives seems appealing, eg,
"no reordering can be done in cases like this" (ick), or "stores
through the x and y pointers can be reordered, and the later access
of u.x just gets one or the other -- ie, unspecified behavior, but
not undefined behavior" (at odds with other parts of the Standard),
or "even though these cases follow the letter of the law, effective
type wise, they violate its spirit, and therefore are undefined
behavior" (lacks evidence to be convincing). Of course, any
sensible developer would instinctively shy away from writing such
code, but that doesn't resolve the question.
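To make the reordering worry concrete, here is a minimal sketch of
the transformation at issue. foo() is the function from the quoted
examples; foo_reordered() and check_distinct_objects() are
hypothetical names introduced only for illustration, and the swap
shown is one transformation that type-based alias analysis might
license, not what any particular compiler is known to emit.

```c
#include <assert.h>

/* foo() as written in the examples: store the float, then the int. */
void foo(int *x, float *y)
{
    *y = 2;
    *x = 1;
}

/* Hypothetical: the same stores in the opposite order, as a compiler
   might emit them if it assumes an int and a float never overlap. */
void foo_reordered(int *x, float *y)
{
    *x = 1;
    *y = 2;
}

/* With two distinct objects the order of the stores is unobservable. */
void check_distinct_objects(void)
{
    int a1, a2;
    float b1, b2;

    foo(&a1, &b1);
    foo_reordered(&a2, &b2);
    assert(a1 == 1 && a2 == 1);
    assert(b1 == 2.0f && b2 == 2.0f);
}
```

When x and y instead point at members of one union object, the two
versions leave different bytes in that object, and that is the case
the alternatives listed above are trying to classify.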
I have two principal takeaways to offer.
First, how the effective type rules are phrased is somewhat broken,
or at least incomplete. If these examples are defined behavior,
that has serious negative consequences for code reordering. If
they are supposed to have undefined behavior, the effective type
rules don't express that adequately. Neither of those consequences
is acceptable, I would say, and in either case the Standard needs
to clarify what is meant.
Second, as a practical matter, this kind of pattern (taking
addresses of several members of the same union object, storing
through the resultant pointers, then using . or -> to get the value
of one of those members) is likely to be unspecified behavior as
far as which store occurred last. That behavior is what I think
most seasoned developers would expect, how most actual compilers
will generate code, and (I opine) what the Standard would prescribe
if a suitable way of expressing that presented itself. My feeling
is that cases like this one _should_ be unspecified behavior, and not
undefined behavior, but I also know that finding suitable language
to delimit the boundaries -- clearly, correctly, and exactly --
is not at all an easy task.
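For contrast, here is a minimal sketch of the form that stays on the
uncontroversial side of the line, in the spirit of the first quoted
program: both stores name the union member directly, so a compiler
should be able to see that they may overlap. The names int_or_float,
store_both, last_store_wins, and check_member_stores are hypothetical
and not from the original examples.

```c
#include <assert.h>

union int_or_float { int x; float y; };

/* Both stores go through the union type itself, in program order. */
void store_both(union int_or_float *u)
{
    u->y = 2;
    u->x = 1;   /* last store, so x holds the stored value afterwards */
}

int last_store_wins(void)
{
    union int_or_float u;

    store_both(&u);
    return u.x;
}

void check_member_stores(void)
{
    assert(last_store_wins() == 1);
}
```

Passing the union pointer rather than separate int * and float *
pointers keeps the possible overlap visible inside the callee, even
when store_both() is defined in a different translation unit.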