gcc is helpfully telling you that your code will actually fail on
real, existing implementations (including, in some cases, some
using gcc). ...
It occurs to me that I never explained what gcc's specific problem is
(my example machine, the old Data General Eclipse, has a different
and less-subtle problem here).
The C "abstract machine" -- C as defined by the C standards -- says
that all objects are made up of "bytes" of at least 8 bits. "C
bytes" may be more than 8 bits long, but sizeof(char) is always 1,
even if char is 16 or 32 bits long. Typically, "C bytes" really
are 8-bit bytes -- there are exceptions, such as various digital
signal processing CPUs -- and objects with types like "int" and
"long" and "double" are made up of more than one byte.
Now, suppose we have one of these typical ordinary machines with
8-bit "char", 16-bit "short", 32-bit "int", and 64-bit "long".
In this case, C allows one to do this:
short s;
int i;
long l;
unsigned char *p;
...
p = (unsigned char *)&s;
... work with p[x] for x in {0, 1} ...
p = (unsigned char *)&i;
... work with p[x] for x in {0, 1, 2, 3} ...
p = (unsigned char *)&l;
... work with p[x] for x in {0, 1, 2, 3, 4, 5, 6, 7} ...
Each p[x] here accesses one of the four individual bytes in "i",
one of the eight in "l", and so on. (Remember that we have nailed
down a particular implementation with sizeof(long) being 8 and
so forth. In the general case we have to check the "sizeof"
before marching off to p[7]; if sizeof(long) is 1 or 2, only
p[0] and maybe p[1] would be allowed. Your machine at home
may well only have 4-byte "long"s.)
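Here, for what it is worth, is a complete little program along
those lines; it is just a sketch, and the bytes it prints depend
on your machine's sizes and byte order:

#include <stdio.h>

int main(void)
{
    int i = 3;
    unsigned char *p = (unsigned char *)&i;
    size_t x;

    /* Perfectly legal: an unsigned char * may inspect the bytes
       of any object, and we check sizeof before marching off. */
    for (x = 0; x < sizeof i; x++)
        printf("byte %lu of i: 0x%02x\n",
            (unsigned long)x, (unsigned)p[x]);
    return 0;
}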
The fact that changing p[2] might change i or l creates a problem
for optimizing compilers. The situation in which one so-called
"lvalue", like p[2], affects another, like i or l, is called "aliasing".
Here p[2] is an alias for just *part* of i or l -- C also allows
things like:
int *ip;
...
ip = &i;
after which *ip is another name -- i.e., an "alias" -- for *all*
of i.
C compilers are stuck with this situation. If some pointer might
be able to alias some other ordinary variable, the C compiler
still has to get the right answer:
i = 3;
*ip = 4;
printf("i is now %d\n", i);
cannot print "i is now 3" if the "*ip = 4" line changed the value
in i.
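To put it as a self-contained sketch (the function name is mine):

int i;

/* Because ip may legitimately point at i, the compiler must
   assume the store through ip can change i, and must re-read
   i before returning it. */
int demo(int *ip)
{
    i = 3;
    *ip = 4;
    return i;   /* must return 4 when called as demo(&i) */
}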
C does, however, forbid you to change i when using the wrong
*type* of pointer, other than the special case for decomposing
objects into "bytes". For instance, in this case:
short *sp;
...
sp = (short *)&i; /* highly dubious at best */
the C abstract machine makes even the conversion shaky -- it is
valid only if the alignment happens to suit -- and actually
accessing i through sp violates the aliasing rules. It is
definitely the case that writing:
i = 3;
*sp = 4; /* NOT A GOOD IDEA! */
printf("i is now %d\n", i);
does not have to print "i is now 4". In fact, on typical big-endian
machines, without optimization, i is now actually 262147 -- its
32-bit bit pattern has become 0x00040003, i.e., 4 * 65536 + 3,
because the 4 landed in the upper half. To set i to 4 via sp we
would have to write on sp[1] instead of sp[0]. Writing on sp[0]
"works" in this way on typical little-endian machines, though,
provided the compiler is sufficiently stupid (and/or run with "do
not optimize" settings).
Because the abstract machine forbids changing "int i" through the
pointer "short *sp", an optimizing compiler can *assume* that the
"*sp = 4" line did not change it. So even on a little-endian system
like the Intel x86 architecture, a smart, optimizing compiler like
gcc can rewrite those last three lines as:
i = 3;
*sp = 4; /* still not a good idea! */
puts("i is now 3"); /* puts() adds a newline */
After all, you said "set i to 3", then you said "set *sp to 4" but
that cannot possibly change i (even if it does), then you asked to
print the value of i. Since *sp "cannot" change i, it is quite
safe to assume that it *did* not change i -- even if the actual
undefined-behavior-in-abstract-machine but defined-on-x86-hardware
machine-code does change it.
In effect, you have "lied" to the compiler, and it is allowed to
get any arbitrary amount of "revenge" here.
The latest versions of gcc can apply optimizations that older
versions could not, and in particular can assume that
writing to a "short *" never changes any "int" or "double", or
indeed anything other than a "short". Writing to an "int *" cannot
change a short or a float or a double; writing to a "double *"
cannot change an int or a long; and so on. Under these assumptions
-- which the C standard allows *any* compiler to make -- an optimizing
compiler can produce faster code in many cases, because reading or
writing via a pointer can only see or change variables whose types
match the type to which that pointer points. (Well, except for
those pesky "byte pointers" -- unsigned char * -- that can access
or modify anything.)
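Here is a sketch of the kind of code this helps (the function is
made up for illustration):

/* Because a "short *" cannot legally point into a "double", the
   compiler may assume s and d do not overlap, keep *d in a
   register, and skip re-loading it after each store through s. */
void scale(short *s, double *d, int n)
{
    int x;

    for (x = 0; x < n; x++)
        s[x] = (short)(s[x] * *d);
}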
Under these strict type-aliasing rules, casting from (e.g.)
"int *" to "short *" is not only quite suspicious, it is also likely
to cause puzzling behavior, at least if you expect your "short *"
to access or modify your "int". Even the time-honored, albeit
dubious, practice of breaking a 64-bit IEEE "double" into two 32-bit
integers (int or long depending on the CPU involved) via a union
need not work, and sometimes does not. (We had a problem with
strtod() not working right because of code just like this. It
worked in older gcc compilers, and eventually failed when gcc began
doing type-specific alias analysis and optimizations.)
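For what it is worth, the sort of code meant here looks roughly
like this -- just a sketch, assuming 32-bit "int" and 64-bit
"double"; which half holds which word depends on the machine:

#include <stdio.h>

int main(void)
{
    union {
        double d;
        unsigned int half[2];  /* assumes sizeof(double) is
                                  2 * sizeof(int) */
    } u;

    u.d = 1.5;
    /* Reading a union member other than the one last written is
       exactly the kind of access alias analysis may surprise. */
    printf("halves of 1.5: 0x%08x 0x%08x\n", u.half[0], u.half[1]);
    return 0;
}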