union arrangement

tedu

does anyone know of a platform/compiler which will place union elements
to not overlap?
as in
union u {
int a;
long b;
size_t c;
};
in my limited experience, writing to any of (a, b, or c) will affect
the value read from any other. i understand this is UB, but i'm
curious if there are any real platforms where this is not the case.
 
Vladimir S. Oka

tedu said:
does anyone know of a platform/compiler which will place union
elements to not overlap?
as in
union u {
int a;
long b;
size_t c;
};
in my limited experience, writing to any of (a, b, or c) will affect
the value read from any other. i understand this is UB, but i'm
curious if there are any real platforms where this is not the case.

Union members are meant to overlap. That's exactly what they're there
for. Having them behave otherwise would make the compiler
non-conforming.

What you're describing are structures. Look up the struct keyword in
your C manual.
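
A minimal sketch of the difference, using the OP's member list. The function names are made up for illustration; the checks rely only on the guarantees that a pointer to a union, suitably converted, points to each of its members, and that struct members are laid out in increasing address order.

```c
#include <stddef.h>

union u  { int a; long b; size_t c; };
struct s { int a; long b; size_t c; };

/* Returns 1 if every member of a union u object starts at the same
   address as the object itself (i.e. the members overlap). */
int union_members_share_address(void)
{
    union u obj;
    return (void *)&obj.a == (void *)&obj
        && (void *)&obj.b == (void *)&obj
        && (void *)&obj.c == (void *)&obj;
}

/* Returns 1 if the members of struct s sit at strictly increasing
   offsets (i.e. each member has its own storage). */
int struct_members_are_sequential(void)
{
    return offsetof(struct s, a) < offsetof(struct s, b)
        && offsetof(struct s, b) < offsetof(struct s, c);
}
```

Both functions should return 1 on a conforming implementation; only the struct gives each member separate storage.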

Cheers

Vladimir
 
Mark B

tedu said:
does anyone know of a platform/compiler which will place union elements
to not overlap?

All of them... just replace the 'union' keyword with 'struct'.
as in
union u {
int a;
long b;
size_t c;
};
in my limited experience, writing to any of (a, b, or c) will affect
the value read from any other. i understand this is UB,

no, not undefined, "Implementation defined"
 
Lew Pitcher

does anyone know of a platform/compiler which will place union elements
to not overlap?

IIRC, it is the definition of a union that /requires/ the elements to
overlap. ("A union type describes an overlapping nonempty set of member
objects, each of which has an optionally specified name and possibly
distinct type.")
as in
union u {
int a;
long b;
size_t c;
};
in my limited experience, writing to any of (a, b, or c) will affect
the value read from any other. i understand this is UB, but i'm
curious if there are any real platforms where this is not the case.

I would hope not, as by definition in the standard, union elements are
required to overlap.


--

Lew Pitcher, IT Specialist, Enterprise Data Systems
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)
 
tedu

Lew said:
I would hope not, as by definition in the standard, union elements are
required to overlap.

perhaps i'm misremembering the standard (don't have it atm), but i was
pretty sure "write to union field A, read from union field B" was not
defined. is that not correct?
 
Lew Pitcher

perhaps i'm misremembering the standard (don't have it atm), but i was
pretty sure "write to union field A, read from union field B" was not
defined. is that not correct?

Defined as "implementation defined", I believe.

However, that does not mean that
- union elements do not overlap (they do, they are supposed to), or
- write to element A, read from element B will not work (it usually
  does, but just in an "implementation defined" manner)


 
Vladimir S. Oka

tedu said:
perhaps i'm misremembering the standard (don't have it atm), but i was
pretty sure "write to union field A, read from union field B" was not
defined. is that not correct?

6.2.6.1.7
When a value is stored in a member of an object of union type, the bytes
of the object representation that do not correspond to that member but
do correspond to other members take unspecified values, but the value
of the union object shall not thereby become a trap representation.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

6.7.2.1.4
As discussed in 6.2.5, a structure is a type consisting of a sequence of
members, whose storage is allocated in an ordered sequence, and a union
is a type consisting of a sequence of members whose storage overlap.
                                                            ^^^^^^^

Cheers

Vladimir
 
Mark McIntyre

does anyone know of a platform/compiler which will place union elements
to not overlap?

No. This is what union members do. It's what they're /supposed/ to do.
(of writing through one member and reading through another)
i understand this is UB,

Actually, I recall that it's implementation-defined.
Mark McIntyre
 
tedu

Vladimir said:
6.2.6.1.7
When a value is stored in a member of an object of union type, the bytes
of the object representation that do not correspond to that member but
do correspond to other members take unspecified values, but the value
of the union object shall not thereby become a trap representation.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

6.7.2.1.4
As discussed in 6.2.5, a structure is a type consisting of a sequence of
members, whose storage is allocated in an ordered sequence, and a union
is a type consisting of a sequence of members whose storage overlap.
                                                            ^^^^^^^

thanks. now consider:
union u {
char c[8];
float f;
};
according to the above, on a machine with potential float trap
representations, the c and f fields cannot completely overlap,
otherwise i would be able to write in the trap representation bits. is
that correct?
 
Keith Thompson

tedu said:
Vladimir said:
6.2.6.1.7
When a value is stored in a member of an object of union type, the bytes
of the object representation that do not correspond to that member but
do correspond to other members take unspecified values, but the value
of the union object shall not thereby become a trap representation.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

6.7.2.1.4
As discussed in 6.2.5, a structure is a type consisting of a sequence of
members, whose storage is allocated in an ordered sequence, and a union
is a type consisting of a sequence of members whose storage overlap.
                                                            ^^^^^^^

thanks. now consider:
union u {
char c[8];
float f;
};
according to the above, on a machine with potential float trap
representations, the c and f fields cannot completely overlap,
otherwise i would be able to write in the trap representation bits. is
that correct?

No. The value *of the union object* cannot become a trap
representation; the value of a member of the union object can be.

The point of the restriction is to make sure that referring to the
value of a union (assigning it, passing it to a function, whatever)
doesn't invoke undefined behavior just because one of its members has
a trap representation. In effect, any operations that work on the
union as a whole rather than on one of its members should treat the
union as an uninterpreted bag of bits.

That's probably not *quite* true, though. The standard doesn't say
that unions can't have trap representations; it merely says that a
union's value can't become a trap representation as a result of
storing a valid value into one of its members. It's at least
conceivable that, given:

union f {
float x;
float y;
} obj;

storing a trap representation into obj.x (perhaps via memcpy()) could
cause obj itself to have a trap representation. But I'd be surprised
if any implementation other than the DS9K actually worked this way.

Digressing a bit, the OP's question wasn't entirely silly. The
question was more or less equivalent to:

Could a lazy C implementation that simply treats "union" as a
synonym for "struct" be conforming?

If all instances of storing a value in one union member and then
reading the value of a different member invoked undefined behavior,
and if the standard didn't specifically say that the members overlap,
the answer would be yes. (Programs that try to use unions for type
punning would fail, but we're assuming that would be undefined
behavior.) But they don't, and it does, so it isn't.
 
tedu

Keith said:
Could a lazy C implementation that simply treats "union" as a
synonym for "struct" be conforming?

If all instances of storing a value in one union member and then
reading the value of a different member invoked undefined behavior,
and if the standard didn't specifically say that the members overlap,
the answer would be yes. (Programs that try to use unions for type
punning would fail, but we're assuming that would be undefined
behavior.) But they don't, and it does, so it isn't.

ok, let me try to pin this down a bit more.
considering:
union u {
int x;
int y;
};
(u.x == u.y) should always evaluate true?
and with
union u {
int x;
short y;
};
{
int x = u.x;
u.y = u.y + 4; /* anything to change value of y */
x != u.x; /* this must evaluate true? */
}
or, in the last case, the value of u.x must change (because u.y must
overlap it), but there's no way to determine which bits of u.x changed.
 
Keith Thompson

tedu said:
ok, let me try to pin this down a bit more.
considering:
union u {
int x;
int y;
};
(u.x == u.y) should always evaluate true?

I believe so, yes. At least I can't think of a way a conforming
implementation could avoid it. (I'll assume u is an object of type
"union u".)
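
A small sketch of that first case, with the object and the union tag given different names so it compiles. The tag `pair` and the function name are invented here for illustration.

```c
union pair {
    int x;
    int y;
};

/* Stores a value through x and compares it with what y reads back.
   Both members have the same type and occupy the same storage, so
   the comparison is expected to yield 1. */
int same_type_members_agree(int value)
{
    union pair obj;
    obj.x = value;
    return obj.x == obj.y;
}
```

Calling `same_type_members_agree(42)` should return 1 on any implementation where the members overlap as the standard requires.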
and with
union u {
int x;
short y;
};
{
int x = u.x;
u.y = u.y + 4; /* anything to change value of y */
x != u.x; /* this must evaluate true? */
}
or, in the last case, the value of u.x must change (because u.y must
overlap it), but there's no way to determine which bits of u.x changed.

That's likely to be true on any real-world implementation, but not on
the DS9K. If adding 4 to u.y affects only bits that happen to be
padding bits of u.x, the representation of x will change, but its
value might not -- or it might become a trap representation.

For that matter, it's not inconceivable that sizeof(short) > sizeof(int).
This could happen if short has more padding bits than int. If adding
4 to u.y affects only bits that aren't part of u.x, both the
representation and value of u.x could be unchanged. (The standard
requires the range of int to include the range of short; it doesn't
actually say anything about their sizes.) But this should happen only
in a deliberately perverse implementation.
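
For the real-world case, a hedged sketch that compares the representation of x before and after a store to y. It assumes an ordinary implementation (no padding bits, short no larger than int); on the DS9K-style implementations described above the result is not guaranteed. The tag `int_short` and the function name are made up for this example.

```c
#include <string.h>

union int_short {
    int x;
    short y;
};

/* Returns 1 if storing a new value into y altered at least one byte
   of x's object representation. Bytes are compared with memcmp, so x
   itself is never read as an int after the store to y. */
int store_to_y_changes_x(void)
{
    union int_short obj;
    unsigned char before[sizeof(int)];

    obj.x = 0;
    memcpy(before, &obj.x, sizeof obj.x);   /* snapshot of x's bytes */

    obj.y = 4;                              /* anything to change y */
    return memcmp(before, &obj.x, sizeof obj.x) != 0;
}
```

On common little-endian and big-endian machines this should return 1; which bytes of x changed is exactly the part the standard leaves open.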
 
boa

Keith said:
That's likely to be true on any real-world implementation, but not on
the DS9K. If adding 4 to u.y affects only bits that happen to be
padding bits of u.x, the representation of x will change, but its
value might not -- or it might become a trap representation.

Are you sure about this?

From C99, §6.2.6.1 #7:
When a value is stored in a member of an object of union type, the bytes
of the object representation that do not correspond to that member but
do correspond to other members take unspecified values, but the value of
the union object shall not thereby become a trap representation.

Boa
 
Keith Thompson

boa said:
Are you sure about this?

From C99, §6.2.6.1 #7:
When a value is stored in a member of an object of union type, the bytes
of the object representation that do not correspond to that member but
do correspond to other members take unspecified values, but the value
of the union object shall not thereby become a trap representation.

Reasonably sure, yes. The value *of the union object* cannot become a
trap representation. The value of a member of the union can.

If you think about it, there's no way to avoid that possibility.
Given:

union u {
some_type foo;
unsigned char bar[sizeof(some_type)];
};
union u u_obj;

if some_type has any trap representations at all, it's possible to
create such a trap representation by assigning appropriate values to
u_obj.bar.
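
As a sketch of how bar gives byte-level access to foo, here is the same union with `float` standing in for some_type (an assumption made only for this example). The bytes written are copied from a valid float, so reading foo afterwards is safe; arbitrary bytes could just as easily produce a trap representation on a machine that has them.

```c
#include <string.h>

typedef float some_type;   /* assumption: stand-in for some_type */

union u {
    some_type foo;
    unsigned char bar[sizeof(some_type)];
};

/* Writes the bytes of `value` into bar, then reads them back
   through foo. */
some_type store_bytes_read_foo(some_type value)
{
    union u u_obj;
    memcpy(u_obj.bar, &value, sizeof value);
    return u_obj.foo;
}
```

With a valid input such as 1.5f the function should hand back the same value, since foo and bar share the same bytes.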

The requirement you quoted basically says that any reference to the
union as a whole (rather than to one of its members) should treat it
as an uninterpreted bag of bits.
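
A minimal sketch of the "bag of bits" reading, using tedu's char/float union. The tag `cf` and the function name are invented here. The char member is filled with arbitrary bytes, which may or may not form a valid float, and the union is then copied as a whole without ever touching f.

```c
#include <string.h>

union cf {
    char c[8];
    float f;
};

/* Fills the char member with arbitrary bytes, assigns the union as a
   whole, and returns 1 if the copy holds the same bytes. */
int whole_union_copy_preserves_bytes(void)
{
    union cf src, dst;
    size_t i;

    for (i = 0; i < sizeof src.c; i++)
        src.c[i] = (char)(i + 1);   /* may or may not be a valid float */

    dst = src;                      /* operates on the union as a whole */
    return memcmp(dst.c, src.c, sizeof src.c) == 0;
}
```

The assignment `dst = src` is the kind of whole-union operation the quoted paragraph protects; only an access to `dst.f` would have to worry about what those bytes mean as a float.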
 
boa

Keith said:
Reasonably sure, yes. The value *of the union object* cannot become a
trap representation. The value of a member of the union can.

Thanks. For some reason, I read that as "of the union member object".
Why? No idea.

boasema
[snip]
 
S.Tobias

Keith Thompson said:
I believe so, yes. At least I can't think of a way a conforming
implementation could avoid it. (I'll assume u is an object of type
"union u".)
# 6.7.2.1
# 14 [...] The value of at most one of the members can be stored in
# a union object at any time. [...]

It says that after you store a value into a union member, all other
members don't have a value, so it must be UB to read through them.
In the absence of other rules, I believe it gives the compiler license
to assume that for value reading, different member-access expressions
cannot alias. This means that a union could be treated as a struct, with
the exception that for representation purposes the members must lie
at the beginning of the object.

I'm not sure how relevant this is (ie. how much the Readers are aware of
this): In the draft n869.txt in 6.5.2.3#5, the first sentence "With one
exception, if the value of a member of a union object is used when the
most recent store to the object was to a different member, the behavior
is implementation-defined.70)" is *not* in the C99 Standard (but I see
it in C89 draft, so it must have been in C89.).

I'm aware of a point in Annex J.1, which lists as unspecified:
# -- The value of a union member other than the last one stored
# into (6.2.6.1).
IMHO it is wrong: 6.2.6.1p7 is talking about the representation of a
union object, and does not give the complete semantics of member access.

I have looked through a number of posts from recent years. I won't
give any specific references; it's enough to say that opinions varied to
the extremes. E.g. C.Feather, in exactly the same case as the one
discussed above, said it was defined (it's undefined if union members are
incompatible); at another occasion Dan Pop said one could read only the
last written-to member, with an exception of character members; at yet
another time Doug Gwyn said reading a not-last-written-to union member
was meant to be undefined (but his remark was in a context where two
fields were incompatible, so I couldn't judge how far conclusions could
be drawn).

(May I suggest we add c.s.c. to the discussion?)
 
Netocrat

Keith Thompson said:
I believe so, yes. At least I can't think of a way a conforming
implementation could avoid it. (I'll assume u is an object of type
"union u".)
# 6.7.2.1
# 14 [...] The value of at most one of the members can be stored in
# a union object at any time. [...]

It says that after you store a value into a union member, all other
members don't have a value, so it must be UB to read through them. In
the absence of other rules, I believe it gives the compiler license to
assume that for value reading, different member-access expressions
cannot alias. This means that a union could be treated as a struct,
with the exception that for representation purposes the members must lie
at the beginning of the object.

I'm not sure how relevant this is (ie. how much the Readers are aware of
this): In the draft n869.txt in 6.5.2.3#5, the first sentence "With one
exception, if the value of a member of a union object is used when the
most recent store to the object was to a different member, the behavior
is implementation-defined.70)" is *not* in the C99 Standard (but I see
it in C89 draft, so it must have been in C89.).

I'm aware of a point in Annex J.1, which lists as unspecified:
# -- The value of a union member other than the last one stored
# into (6.2.6.1).
IMHO it is wrong: 6.2.6.1p7 is talking about the representation of a
union object, and does not give the complete semantics of member access.

I have looked through a number of posts from recent years. I won't
give any specific references; it's enough to say that opinions varied to
the extremes.
[omit listing of specific opinions]

Did you come across Tim Rentsch's post to c.l.c of 6 December 2005, where
he points out DR283? Its TC is not yet present in n1124, but it reads:
| Attach a new footnote 78a to the words "named member" in 6.5.2.3#3:
|
| 78a If the member used to access the contents of a union object is
| not the same as the member last used to store a value in the object,
| the appropriate part of the object representation of the value is
| reinterpreted as an object representation in the new type as
| described in 6.2.6 (a process sometimes called "type punning"). This
| might be a trap representation.
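
A hedged sketch of the type punning that footnote describes, next to the memcpy equivalent. It assumes float and uint32_t have the same size (checked at compile time with an array-size trick, since C99 has no static assert) and that uint32_t exists on the platform. The tag `pun` and the function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Compile-time check: fails with a negative array size if the
   assumption sizeof(float) == sizeof(uint32_t) does not hold. */
typedef char float_is_32_bits[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

union pun {
    float f;
    uint32_t u;
};

/* Stores through f, reads through u: the object representation of
   the float is reinterpreted as a uint32_t. */
uint32_t bits_via_union(float value)
{
    union pun p;
    p.f = value;
    return p.u;
}

/* The same reinterpretation done by copying bytes. */
uint32_t bits_via_memcpy(float value)
{
    uint32_t u;
    memcpy(&u, &value, sizeof u);
    return u;
}
```

Under the footnote's reading the two functions should agree for any input. On an IEEE 754 platform, for instance, both would give 0x3F800000 for 1.0f, though that particular value depends on the float format.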
(May I suggest we add c.s.c. to the discussion?)

I haven't added c.s.c as it seems from that DR and the associated DR257
that this has been pretty well discussed already.
 
