I have a somewhat flippant question regarding undefined behaviour. Does
an operation which invokes undefined behaviour affect the whole program,
or are earlier statements guaranteed to execute correctly?
Others have made various notes, including the fact that just because
something actually happened in some particular temporal order --
and assuming that Doctor Who does not come in his TARDIS and change
the past -- does not guarantee that future actual events cannot
erase the effect before you have a chance to observe it. (For
instance, if your program really does print a message, which really
does appear on the screen, but then the monitor fails -- perhaps
because your "undefined behavior" went so far as to change the sync
frequency to something out of range, and you have a monitor built
in 1983 that fries when you do this rather than one built recently
that switches to showing "sync frequency out of range" -- before
you notice that the output has appeared, you will never know that
the output *did* appear.)
More practically, though, Standard C is all about *observable*
behavior from a conforming program. For instance, in:
    void f(void) {
        int i = 0;
        do_something();
        i++;
        do_something_else();
        i = 42;
        do_third_thing();
    }
the local variable "i" is never *used*, just set to 0,
incremented, and set to 42. A compiler can remove some or
all of these. If we add an "observation" at each point:
        int i = 0;
        printf("i = %d\n", i);
        do_something();
        i++;
        printf("i = %d\n", i);
        do_something_else();
        i = 42;
        printf("i = %d\n", i);
        do_third_thing();
our "observable" output has to include "i = 0", "i = 1", and
"i = 42" -- but a smart compiler can still remove the variable,
and just call printf with the constants; or, if it is even
smarter, it can replace the printf() calls with puts() calls:
        puts("i = 0"); /* remember, puts adds \n */
        do_something();
        puts("i = 1");
        do_something_else();
        puts("i = 42");
        do_third_thing();
Furthermore, a compiler is allowed to *assume* that undefined
behavior did not occur -- so if we do something like:
    void g(void) {
        int *ptr, *func(void);
        ptr = func();
        (void) *ptr;
    }
this must "call func" and then "use the resulting pointer"; but
even if func() happens to return NULL, and *(int *)NULL crashes on
our implementation, the compiler can *assume* that func() returned
a valid pointer, see that *ptr is not actually used for anything,
and remove the reference. (Adding a "volatile" qualifier is intended
to prevent this kind of removal, and does in real C compilers,
although the wording in the Standard is loose enough to allow a
compiler to "cheat".) Similarly, given:
    void h(void) {
        struct S *p, *z(void);
        p = z();
        (void) p->field;
        puts("z() did not return NULL");
    }
this function is not guaranteed to fail even if z() *does* return NULL,
and the output "z() did not return NULL" can mislead us. (As sane and
reasonable programmers, we can avoid misleading ourselves by
rewriting this as:
        if (p != NULL)
            puts("z() did not return NULL");
rather than relying on the "crash" that our implementation promises
us, above and beyond the requirements of Standard C; this is more
sensible, and I think more defensible, than adding a volatile
qualifier to p, even though that might also work. In any case,
*all* Standard C compilers are required to handle the comparison
the way we would like. We stop relying on a particular feature of
a particular implementation, and make the code simultaneously
clearer *and* more portable.)
Last, as some others mentioned earlier, once you step far enough
outside the Standard, "observable" behavior may well involve
rearranging things that you *thought* had a particular temporal
sequence. Hence:
    void h(void) {
        struct S *p, *z(void);
        p = z();
        puts("before access of p->field");
        use(p->field);
        puts("after access of p->field");
    }
can legitimately be compiled "as if" it read:
    void h(void) {
        struct S *z(void);
        int tmp;
        tmp = z()->field;
        puts("before access of p->field");
        use(tmp);
        puts("after access of p->field");
    }
(assuming of course that p->field has type "int"; if not, make the
obvious change). Most C compilers do, however, tend to avoid
hoisting pointer dereference operations above function calls, even
in cases where Standard C guarantees that it is OK, if only because
certain implementation-specific tricks will change contents of
memory in ways that the *implementation* guarantees (though
Standard C says nothing about it). Moreover, if we replace
puts() with a user-written function -- for instance:
        p = z();
        userfunc("string");
and userfunc() and z() cooperate, e.g.:
    struct S shared;
    struct S *z(void) { shared.field = 0; return &shared; }
    void userfunc(const char *str) { shared.field = 42; }
then the compiler *cannot* (even by the limited rules of Standard
C) hoist the access in h() above the call to userfunc(), since
userfunc() changes p->field from 0 to 42.
Getting all this stuff right is tricky, and is why compiler people
are (sometimes) well-paid, and why so many compilers have so many
bugs.