*snap*
I'm still waiting for the compiler that actually *does* generate code that
formats your harddrive whenever it encounters UB
- Sylvester
It is interesting that union hacking is undefined behavior, whereas
reinterpret_cast has implementation-defined behavior.
In the (non-normative) example in 9.5 para 2 of my copy of the draft
standard (which has probably moved), it shows an anonymous union { int
a; char *p ; } and notes that a and p are ordinary variables which
have the same address. Local variables may be placed in registers -
sometimes even if their address is taken - in which case the compiler
could reasonably place a and p in different registers (especially if
sizeof(int) != sizeof(p)), as that would not affect any fully
compliant program. For example,
void test( bool set_int, bool get_int )
{
    union { int a; char *p ; };
    if( set_int ) a = 42;
    else p = "hello";
    if( get_int ) std::cout << a << std::endl;
    else std::cout << p << std::endl;
}
test(false, false) ["hello"] and test(true, true) ["42"] are both well-
defined. test(true, false) is undefined and likely to crash.
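To make the two well-defined cases concrete, here is a minimal sketch of the same function that writes to a caller-supplied stream so the output can be checked. The names test_to, run_matching and check_defined_cases are hypothetical, and the stream parameter and the const on p are assumptions added for this sketch rather than part of the example above.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Same logic as test(), but the output stream is a parameter
// (an assumption made here so the result can be inspected).
void test_to( std::ostream &out, bool set_int, bool get_int )
{
    union { int a; const char *p; };  // const added: string literals are const
    if( set_int ) a = 42;
    else p = "hello";
    if( get_int ) out << a << '\n';
    else out << p << '\n';
}

// Hypothetical helper: runs only the matching combinations, so the
// member that is read is always the member that was last written.
std::string run_matching( bool use_int )
{
    std::ostringstream out;
    test_to( out, use_int, use_int );
    return out.str();
}

// Hypothetical self-check covering the two well-defined calls.
void check_defined_cases()
{
    assert( run_matching( true ) == "42\n" );
    assert( run_matching( false ) == "hello\n" );
}
```

The mixed combinations are deliberately not exercised here, since nothing can be asserted about them.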
test( false, true ) may be expected to print the address of the local
string, but - while it probably won't actually format your hard drive
- there are many "reasonable" types of undefined behavior that a
conscientious (i.e. non-malicious) compiler writer may cause to
happen. These include outputting the string address, outputting "42",
or outputting a random integer; architectures which reserve 0x800..000
as an integer NaN may even helpfully tell you that a was uninitialized!
The code may have been effectively rewritten as:
void test( bool set_int, bool get_int )
{
    register int a; // = NaN
    register char *p ;
    if( set_int ) a = 42;
    else p = "hello";
    if( get_int ) std::cout << a << std::endl;
    else std::cout << p << std::endl;
}
which would give a garbage (or uninitialized) integer value.
This may have been optimized to:
void test( bool /*set_int*/, bool get_int )
{
    register int a = 42; // only value actually used for a.
    register char *p = "hello"; // only value actually used for p.
    if( get_int ) std::cout << a << std::endl;
    else std::cout << p << std::endl;
}
or equivalently:
void test( bool /*set_int*/, bool get_int )
{
    if( get_int ) std::cout << 42 << std::endl;
    else std::cout << "hello" << std::endl;
}
Using reinterpret_cast on the data should be more reliable - just read
the compiler documentation ;-).
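As a rough illustration of the reinterpret_cast route, the sketch below converts a pointer to an integer and back. It assumes the implementation provides std::uintptr_t from <cstdint> (an optional type, though widely available); the function names are hypothetical. The numeric value of the integer is implementation-defined, which is where the compiler documentation comes in, but the round trip back to the same pointer type is expected to restore the original pointer.

```cpp
#include <cassert>
#include <cstdint>

// Convert a pointer to an integer wide enough to hold it.
// The resulting value is implementation-defined.
std::uintptr_t pointer_bits( const char *p )
{
    return reinterpret_cast<std::uintptr_t>( p );
}

// Convert the integer back to the original pointer type.
const char *pointer_from_bits( std::uintptr_t bits )
{
    return reinterpret_cast<const char *>( bits );
}

// Hypothetical check: pointer -> integer -> pointer compares equal
// to the pointer we started with.
bool round_trips( const char *p )
{
    return pointer_from_bits( pointer_bits( p ) ) == p;
}
```

Unlike the union version, nothing here reads a member that was not the last one written; what the integer actually looks like is still up to the implementation.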
In practice, I must admit to using the "union hack" in low level code
- accessing hardware registers or implementing communications
protocols - without problem. However, this is library code that is
known to be non-portable, and has a suite of unit tests which get run
on each new compiler version.
I can't think of any reason, outside of maintenance situations, why
the union hack would apply to application code.
Best Regards,
David O.