Given a union of the form
union {
T1 m1;
T2 m2;
} obj;
where T1 and T2 are different scalar (non-aggregate) types.
The C99 standard states that
obj.m1 = value;
if (obj.m2 ...
invokes undefined behavior because my reference to the union is via a
member different than the last one stored into.
Right. Note that on "real world" systems (as opposed to Deathstations
or some such) the problem is most likely to occur when T1 is
some sort of integral type and T2 is some sort of floating-point
type, and you have managed to store a reserved or signalling-NaN
bit pattern into the bytes that will be examined for obj.m2. For
instance, it is easy enough to come up with bit patterns that result
in "floating point exception" crashes on Intel CPUs (provided
signalling NaNs are not being ignored) when T1 is int and T2 is
float, or when T1 is long long and T2 is double.
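To make the "reserved bit pattern" point concrete, here is a minimal sketch that classifies a 32-bit pattern without ever loading it as a float. It assumes float is IEEE 754 binary32 and that the top fraction bit is the "quiet" bit (true on Intel CPUs, not guaranteed everywhere); the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the pattern is interpreted as IEEE 754 binary32
   (1 sign bit, 8 exponent bits, 23 fraction bits). */
bool is_nan_bits(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x007FFFFFu;

    /* All-ones exponent with a nonzero fraction is a NaN;
       a zero fraction would be an infinity instead. */
    return exponent == 0xFFu && fraction != 0;
}

/* Assumption: the most significant fraction bit is the "quiet" bit,
   so a NaN with that bit clear is a signalling NaN. */
bool is_signalling_nan_bits(uint32_t bits)
{
    return is_nan_bits(bits) && (bits & 0x00400000u) == 0;
}
```

Under those assumptions, an int holding 0x7FA00000 is the kind of value that could cause trouble if its bytes were later examined as a float, while 0x3F800000 (the pattern for 1.0f) would not.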
My question is, what about the following?
memcpy(&obj, &data, sizeof data);
if (obj.m1 ...
Ignoring the pathological cases such as sizeof data > sizeof obj or
sizeof data < sizeof (T1), is this valid?
Since this copies bytes (what C99 calls "object representations")
from "data" to "obj", it is valid if and only if those bytes are
those resulting from storing a valid value into obj.m1, or the equivalent.
One obvious problem here is that "obj" has an unnamed union type,
so that it is impossible for "data" to have the same type unless
"data" is declared and defined in a separate translation unit --
but in that separate translation unit it is at least difficult, if
not impossible, to declare "obj" correctly.
If we give the union type a name so that we can consistently refer
to it:
union U { T1 m1; T2 m2; };
union U obj;
union U data;
then we can be sure about what is in "data" if, e.g., we do this:
obj.m1 = value;
memcpy(&data, &obj, sizeof data);
Now "data" is a copy of "obj", so that data.m1 is valid because
obj.m1 is valid. A subsequent memcpy() back to &obj leaves obj.m1
valid again.
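The round trip just described can be sketched as follows. The choice of int for T1 and double for T2 is only an illustration, and round_trip_m1 is a hypothetical helper name.

```c
#include <string.h>

/* Illustrative choice: T1 = int, T2 = double. */
union U { int m1; double m2; };

/* Store through m1, copy the whole union out and back, then read m1. */
int round_trip_m1(int value)
{
    union U obj;
    union U data;

    obj.m1 = value;                     /* m1 is the member last stored */
    memcpy(&data, &obj, sizeof data);   /* data now holds the same bytes */
    memcpy(&obj, &data, sizeof obj);    /* copy those bytes back */

    return obj.m1;                      /* same representation as stored */
}
```

Because both objects have the same union type and the same size, every byte of the representation travels along, including any bytes the implementation keeps outside m1 itself.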
If so and if I replace m1 with m2 above (thereby accessing something
other than the first member), is it still valid?
The conditions for whether obj.m2 is valid are basically the same as
those for whether obj.m1 is valid -- the bytes copied from &data to
&obj must be those making up a valid "object representation".
A somewhat trickier question (and the one I suspect you are really
asking) is: suppose we have union U as above, but we then do
something like this:
union U obj;
T1 data;
...
data = some_valid_value_of_type_t1;
memcpy(&obj, &data, sizeof data);
... now refer to obj.m1 ...
I think it is safe to say that most real-world C implementations
will have no problem with this; but without careful scrutiny of
the C99 standard to prove otherwise, I would assume that
Deathstation-like "evil" C implementations would be allowed to fail
if "unused" bytes of the union were not properly set. For instance,
suppose T1 is int and T2 is double, and sizeof(int) is 4 while
sizeof(double) is 8. Suppose further that the Evil Implementation
handles the union by storing a checksummed copy of the four bytes
making up the "int" in a fifth byte in the space that would otherwise
be occupied by the double. If the checksum fails to match, the
implementation delivers a runtime exception. As far as I can tell
(without careful study of the C99 wording) this is allowed.
In other words, unless you want to depend on the friendliness of
your implementation, Don't Do That.
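For what it is worth, here is a sketch of two ways to sidestep the question, again assuming T1 is int and T2 is double. The union tag and function names are invented for this example. The first simply stores through the member; the second copies a whole union that was itself filled in by an ordinary store, as in the earlier "union U data" case.

```c
#include <string.h>

/* Illustrative choice: T1 = int, T2 = double. */
union IntOrDouble { int m1; double m2; };

/* Option 1: an ordinary store through the member, no memcpy at all. */
int store_then_read_m1(int data)
{
    union IntOrDouble obj;

    obj.m1 = data;      /* the implementation sets up whatever it needs */
    return obj.m1;
}

/* Option 2: if the bytes must arrive via memcpy, copy an entire union
   of the same type rather than a bare int. */
int copy_whole_union_then_read_m1(int value)
{
    union IntOrDouble obj;
    union IntOrDouble data;

    data.m1 = value;                    /* valid store into the source */
    memcpy(&obj, &data, sizeof obj);    /* all sizeof(union) bytes copied */
    return obj.m1;
}
```

Neither version depends on what the implementation does with the bytes of the union that lie beyond the int.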
