Question regarding UB

Philip Potter

I have a somewhat flippant question regarding undefined behaviour. Does
an operation which invokes undefined behaviour affect the whole program,
or are earlier statements guaranteed to execute correctly?

For example:

#include <stdio.h>

int main(void) {
    int i;
    printf("Hello, world!\n");
    fflush(stdout);
    i = i++; /* or some other undefined behaviour */
    return 0;
}

Is the printf() statement guaranteed to execute? Is "Hello, world!\n"
guaranteed to be sent to stdout? Or could a conforming implementation
refuse to compile this program, or accept it and let loose nasal demons
without even greeting the world first?
 
Jean-Marc Bourguet

Philip Potter said:
I have a somewhat flippant question regarding undefined behaviour. Does an
operation which invokes undefined behaviour affect the whole program, or
are earlier statements guaranteed to execute correctly?

I don't think so. A case where such a non-causal effect is plausible:

if (p == NULL) {
    printf("Pointer is NULL\n");
}
x = p->field;

If p is NULL we have UB here, so p must not be NULL for the program to be
conforming. That knowledge is propagated backwards and the test is then
optimized out.

I seem to remember that the value range propagation pass of gcc did this
but I couldn't trigger something similar in released versions.
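
A minimal sketch of the ordering issue, assuming a hypothetical struct item and a hypothetical helper read_field (neither comes from the code above): when the test comes before any dereference and returns early, there is no later unconditional access from which a compiler could infer that p is non-null.

```c
#include <stddef.h>

struct item { int field; };

/*
 * Hypothetical helper: tests p before touching it.
 * Returns 1 and stores the field in *out on success,
 * returns 0 and leaves *out alone when p is NULL.
 */
int read_field(const struct item *p, int *out)
{
    if (p == NULL)
        return 0;       /* no dereference is ever reached on this path */
    *out = p->field;
    return 1;
}
```

Called as read_field(NULL, &v), this returns 0 and leaves v untouched, and that behaviour is defined on every conforming implementation.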

Yours,
 
santosh

Philip said:
I have a somewhat flippant question regarding undefined behaviour.
Does an operation which invokes undefined behaviour affect the whole
program, or are earlier statements guaranteed to execute correctly?

For example:

#include <stdio.h>

int main(void) {
    int i;
    printf("Hello, world!\n");
    fflush(stdout);
    i = i++; /* or some other undefined behaviour */
    return 0;
}

Is the printf() statement guaranteed to execute? Is "Hello, world!\n"
guaranteed to be sent to stdout? Or could a conforming implementation
refuse to compile this program, or accept it and let loose nasal
demons without even greeting the world first?

I *think* a conforming implementation may do anything it wants with a
program that invokes undefined behaviour, including refusing to
translate it. If the latter, then a diagnostic is required. During
runtime, no constraint on program behaviour is imposed.

Having said that of course, one would be hard pressed to point to actual
implementations where invoking undefined behaviour affects behaviour of
the preceding program statements.
 
Micah Cowan

Philip said:
I have a somewhat flippant question regarding undefined behaviour. Does
an operation which invokes undefined behaviour affect the whole program,
or are earlier statements guaranteed to execute correctly?

For example:

#include <stdio.h>

int main(void) {
    int i;
    printf("Hello, world!\n");
    fflush(stdout);
    i = i++; /* or some other undefined behaviour */
    return 0;
}

Is the printf() statement guaranteed to execute? Is "Hello, world!\n"
guaranteed to be sent to stdout? Or could a conforming implementation
refuse to compile this program, or accept it and let loose nasal demons
without even greeting the world first?

What an interesting question!

I'm inclined to think the latter. In particular, I think a compiler
refusing to finish translating the program, saying "I don't know how to
translate multiple modifications to the same object without an
intervening sequence point", would be behaving properly.

Certainly there are "non-runtime" cases of UB that I might expect to
affect the implementation before translation had finished/execution had
begun; such as the ## preprocessing operator resulting in an invalid
preprocessing token.
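
As a sketch of that translation-time case, assuming a hypothetical CAT macro: pasting two tokens is only defined when the result is itself a single valid preprocessing token.

```c
/* Hypothetical token-pasting macro, for illustration only. */
#define CAT(a, b) a##b

/* Fine: `count` ## `1` forms the single identifier `count1`. */
int CAT(count, 1) = 5;

/*
 * Not fine, so left in a comment: CAT(-, x) would paste `-` and `x`,
 * and `-x` is not one preprocessing token. C99 6.10.3.3p3 makes that
 * undefined; a compiler may reject it or quietly do something else.
 */
```

The undefined case never reaches execution, so whatever happens to it happens while the program is being translated.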

Even for UB which could not possibly be discovered until execution had
begun (many invalid pointer dereferences, for example), though, I think
there are really no guarantees. AFAICT, an implementation that wasn't
actually evaluating statements in order, but only behaved "as if" it
were, is no longer bound by the "as if" once UB comes into play.
 
Micah Cowan

santosh said:
I *think* a conforming implementation may do anything it wants with a
program that invokes undefined behaviour, including refusing to
translate it. If the latter, then a diagnostic is required. During
runtime, no constraint on program behaviour is imposed.

FWIW, I can't find a normative reference to that effect (just notes,
footnotes).
 
vippstar

No, they are not guaranteed.
If your C code invokes undefined behavior, any conforming
implementation can produce *any* output.

Jean-Marc Bourguet said:
I don't think so. A case where such a non-causal effect is plausible:

if (p == NULL) {
    printf("Pointer is NULL\n");
}
x = p->field;

If p is NULL we have UB here, so p must not be NULL for the program to be
conforming. That knowledge is propagated backwards and the test is then
optimized out.

After p has been assigned the return value of malloc, perhaps.
It could still be undefined behavior.
Example:
/* sizeof *p == 12 */
p = malloc(11);
p->field = 0;
But a pointer's value not being NULL guarantees almost nothing.
i.e.
p = malloc(1); /* assume malloc returns non-NULL */
free(p);
/* what meaning does the value of p have now? */
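
A hedged sketch of the two points above, using a hypothetical struct rec and hypothetical helpers make_rec and destroy_rec: size the allocation from the object itself so it cannot be too small, and overwrite the pointer after free() so that no stale value is left to be tested or dereferenced.

```c
#include <stdlib.h>

struct rec { int field; int other[2]; };

/* Allocates room for a whole struct rec; returns NULL on failure. */
struct rec *make_rec(void)
{
    /* sized from the object, not from a hand-written byte count */
    struct rec *p = malloc(sizeof *p);
    if (p != NULL)
        p->field = 0;
    return p;
}

/* Frees *pp and replaces the now-indeterminate pointer value with NULL. */
void destroy_rec(struct rec **pp)
{
    free(*pp);
    *pp = NULL;
}
```

After destroy_rec(&p), a test such as p == NULL is meaningful again, which it is not for a pointer that has merely been passed to free().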

Jean-Marc Bourguet said:
I seem to remember that the value range propagation pass of gcc did this
but I couldn't trigger something similar in released versions.

I do not know about this, but I guess you could try a gcc newsgroup if you
are interested in whether it is valid.
 
Philip Potter

Jean-Marc Bourguet said:
I don't think so. A case where such a non-causal effect is plausible:

if (p == NULL) {
    printf("Pointer is NULL\n");
}
x = p->field;

If p is NULL we have UB here, so p must not be NULL for the program to be
conforming. That knowledge is propagated backwards and the test is then
optimized out.

I seem to remember that the value range propagation pass of gcc did this
but I couldn't trigger something similar in released versions.

Interestingly, you have touched on one of the points that drove me to
ask this question; I recently attended a talk by Sebastian and Antoniu
Pop which touched on this value range propagation behaviour.

Incidentally, I agree that what you say makes sense and this is the way
the Standard *should* be. I don't know how it actually is, however.

Phil
 
Eric Sosman

Philip said:
I have a somewhat flippant question regarding undefined behaviour. Does
an operation which invokes undefined behaviour affect the whole program,
or are earlier statements guaranteed to execute correctly?

For example:

#include <stdio.h>

int main(void) {
    int i;
    printf("Hello, world!\n");
    fflush(stdout);
    i = i++; /* or some other undefined behaviour */
    return 0;
}

Is the printf() statement guaranteed to execute? Is "Hello, world!\n"
guaranteed to be sent to stdout? Or could a conforming implementation
refuse to compile this program, or accept it and let loose nasal demons
without even greeting the world first?

Hard to say. Note that fflush() delivers any buffered
output "to the host environment," which may not be the same
thing as "to its final destination" -- which might, for
example, be on the far end of a TCP/IP connection currently
undergoing a network storm. fflush() will probably deliver
the data to the network stack and not wait for acknowledgment
from the other end, and then when the U.B. strikes and takes
the local machine down in flames, the remote side's socket
may just time out and give up. No greeting is received.

Also, there are undefined behaviors that are less strongly
tied to a particular execution path. For example,

/* main.c */
#include <stdio.h>
int answer = 42;
int function(void);
int main(void) {
    printf ("%d\n", answer == 0 ? function() : answer);
    return 0;
}

/* function.c */
extern double answer;
int function(void) { return answer; }

... produces undefined behavior because the two declarations of
`answer' are in conflict (6.2.7p2). It's U.B. even though function()
is not called -- maybe the linker catches the mismatch and refuses
to produce an executable program, maybe the linker succeeds but
the program's frammis mapping is whirligigged, ...
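
One common way to avoid this kind of mismatch is to put a single declaration in a header that every file includes. The sketch below folds a hypothetical header answer.h and both source files into one listing, so the file boundaries are only comments. As far as I can tell, once the same declaration is visible where the object is defined, a conflicting type becomes a constraint violation that the compiler must diagnose, rather than silent undefined behavior at link time.

```c
/* answer.h (hypothetical shared header): the only declaration of answer */
extern int answer;
int function(void);

/* main.c would #include "answer.h" and define the object */
int answer = 42;

/* function.c would #include "answer.h" too, so it cannot disagree on the type */
int function(void) { return answer; }
```

With this layout, changing the header to say `extern double answer;` should make the definition in main.c fail to compile.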
 
Keith Thompson

santosh said:
I *think* a conforming implementation may do anything it wants with a
program that invokes undefined behaviour, including refusing to
translate it. If the latter, then a diagnostic is required. During
runtime, no constraint on program behaviour is imposed.

No diagnostic is required. A note (non-normative, but obviously
intended to reflect the normative intent) in C99 3.4.3, the definition
of "undefined behavior", says:

NOTE Possible undefined behavior ranges from ignoring the
situation completely with unpredictable results, to behaving
during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution
(with the issuance of a diagnostic message).

santosh said:
Having said that of course, one would be hard pressed to point to
actual implementations where invoking undefined behaviour affects
behaviour of the preceding program statements.

Well, rejecting the translation unit certainly affects the behavior of
preceding statements. Apart from that, you're probably right, but I
certainly wouldn't count on it.
 
santosh

Keith said:
No diagnostic is required. A note (non-normative, but obviously
intended to reflect the normative intent) in C99 3.4.3, the definition
of "undefined behavior", says:

NOTE Possible undefined behavior ranges from ignoring the
situation completely with unpredictable results, to behaving
during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution
(with the issuance of a diagnostic message).

Maybe I'm not getting it, but as far as I can make out from the excerpt
above, a diagnostic is required only if translation or execution is
terminated. If translation does succeed and the program is executed,
then I suppose that the Standard imposes no further requirements on the
behaviour of such a program with the exception that *if* execution is
terminated due to undefined behaviour, then a diagnostic is needed.

Since termination of execution is only one possible outcome of undefined
behaviour, I wonder why a diagnostic has been mandated for it?

Keith said:
Well, rejecting the translation unit certainly affects the behavior of
preceding statements. Apart from that, you're probably right, but I
certainly wouldn't count on it.

I would argue that if a translation unit is rejected during translation,
then there *are* no preceding statements. Of course this may not hold
if the program is being interpreted, but I don't think that the
creators of the Standard gave much consideration to a C interpreter.
 
Keith Thompson

santosh said:
Maybe I'm not getting it, but as far as I can make out from the excerpt
above, a diagnostic is required only if translation or execution is
terminated. If translation does succeed and the program is executed,
then I suppose that the Standard imposes no further requirements on the
behaviour of such a program with the exception that *if* execution is
terminated due to undefined behaviour, then a diagnostic is needed.

I'm embarrassed to admit that I didn't read the entire paragraph
before I copy-and-pasted it; I didn't notice the final ("with the
issuance of a diagnostic message") clause.

santosh said:
Since termination of execution is only one possible outcome of undefined
behaviour, I wonder why a diagnostic has been mandated for it?

I don't think it is. I don't believe the note is intended to cover
*all* possible consequences of undefined behavior. In particular, I
believe this note is the only place in the standard other than the
description of assert() that even hints at the possibility of a
run-time diagnostic.

I'd also argue that the previous phrase, "behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message)",
covers the possibility of terminating the program's compilation or
execution *without* a diagnostic message. I'm not sure what the final
parenthesized clause adds; in particular, I don't see how the meaning
would differ if it were removed. Perhaps it's intended as a hint that
issuing a diagnostic when possible would be a nice idea.

[snip]
 
santosh

Keith said:
I don't think it is. I don't believe the note is intended to cover
*all* possible consequences of undefined behavior. In particular, I
believe this note is the only place in the standard other than the
description of assert() that even hints at the possibility of a
run-time diagnostic.

I'd also argue that the previous phrase, "behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message)",
covers the possibility of terminating the program's compilation or
execution *without* a diagnostic message. I'm not sure what the final
parenthesized clause adds; in particular, I don't see how the meaning
would differ if it were removed. Perhaps it's intended as a hint that
issuing a diagnostic when possible would be a nice idea.

As you say, if the phrase "behaving during translation or program
execution in a documented manner characteristic of the environment (with
or without the issuance of a diagnostic message)", is also intended for
cases of aborted translation or execution, then the last clause becomes
rather moot.

I suspect that the Committee recommends a diagnostic (only recommends
since notes are not normative) if translation or execution is
terminated, as a practice encouraging good QoI, since this is usually
quite deliberate on the part of the implementation, and probably done
only for extreme cases of undefined behaviour.

The previous phrase is again for those instances of UB which can be
detected at compile or runtime, and a characteristic and documented
response is surely better than doing something random. The first
phrase, of course, is the "catch all" for all instances of UB where the
implementation either cannot detect it or cannot respond meaningfully.
 
user923005

Philip said:
I have a somewhat flippant question regarding undefined behaviour. Does
an operation which invokes undefined behaviour affect the whole program,
or are earlier statements guaranteed to execute correctly?

For example:

#include <stdio.h>

int main(void) {
    int i;
    printf("Hello, world!\n");
    fflush(stdout);
    i = i++; /* or some other undefined behaviour */
    return 0;
}

Is the printf() statement guaranteed to execute? Is "Hello, world!\n"
guaranteed to be sent to stdout? Or could a conforming implementation
refuse to compile this program, or accept it and let loose nasal demons
without even greeting the world first?

Consider:

#include <stdio.h>
#include <string.h>

int main(void) {
    int i;
    printf("Hello, world!\n");
    memset(&i, 0, 1000000000); /* writes far beyond i: undefined behaviour */
    fflush(stdout);
    return 0;
}

The previous printf() call completes successfully.
Then the memset() function call causes some horrible crash and we
never see the output.
That would be an example of subsequent undefined behavior affecting
previously correct code.
I do not think it is safe even after the fflush() call because:

"2 If stream points to an output stream or an update stream in which
the most recent operation was not input, the fflush function causes
any unwritten data for that stream to be delivered to the host
environment to be written to the file; otherwise, the behavior is
undefined."

So the data has been delivered to the operating system, but I do not
think it is a guarantee that it is written (indeed, for some operating
systems you must have a successful fsync() call or some such thing to
really guarantee delivery to the output stream).

So it is clear that we cannot depend on earlier code that is correct
to produce expected output after the introduction of undefined
behavior later in the program.
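
A small sketch of what a program can and cannot check here, using a hypothetical helper write_and_flush: the return values of fputs() and fflush() report whether the data was handed to the host environment, and nothing more.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes msg to fp and flushes the stream.
 * Returns 0 if both calls succeeded, -1 otherwise.
 */
int write_and_flush(FILE *fp, const char *msg)
{
    if (fputs(msg, fp) == EOF)
        return -1;
    if (fflush(fp) == EOF)   /* success only means "delivered to the host" */
        return -1;
    return 0;
}
```

A zero return says nothing about whether the bytes ever reach a disk, a terminal or a network peer. Some systems offer stronger calls for that (fsync() on POSIX, for instance), but those are outside Standard C.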
 
pete

Philip said:
I have a somewhat flippant question
regarding undefined behaviour.
Does an operation which invokes undefined behaviour
affect the whole program,

Yes.

or are earlier statements guaranteed to execute correctly?

No.

For example:

#include <stdio.h>

int main(void) {
    int i;
    printf("Hello, world!\n");
    fflush(stdout);
    i = i++; /* or some other undefined behaviour */
    return 0;
}

Is the printf() statement guaranteed to execute?

No.

Is "Hello, world!\n" guaranteed to be sent to stdout?

No.

Or could a conforming implementation
refuse to compile this program,
or accept it and let loose nasal demons
without even greeting the world first?

Yes.

UB is only what the standard says that it is.

What the standard guarantees, and how a program is likely to run,
are two different things.

The C standard committee doesn't care about
what any program which contains any undefined behavior does.

All that is required for a kind of code to be undefined
is a lack of interest in that kind of code by the standard committee.
 
Flash Gordon

santosh wrote, On 26/02/08 18:09:
As you say, if the phrase "behaving during translation or program
execution in a documented manner characteristic of the environment (with
or without the issuance of a diagnostic message)", is also intended for
cases of aborted translation or execution, then the last clause becomes
rather moot.

I suspect this phrase was intended to cover extensions making use of
things the C standard leaves undefined, so the diagnostic (if produced)
would be expected to be along the lines of "extension X being used".

santosh said:
I suspect that the Committee recommends a diagnostic (only recommends
since notes are not normative) if translation or execution is
terminated, as a practice encouraging good QoI, since this is usually
quite deliberate on the part of the implementation, and probably done
only for extreme cases of undefined behaviour.

Yes. One data point is that gcc can be made to issue a warning on some
instances of undefined behaviour, and with an additional option it can
be made to convert the warning into an error, aborting compilation.

santosh said:
The previous phrase is again for those instances of UB which can be
detected at compile or runtime, and a characteristic and documented
response is surely better than doing something random. The first
phrase, of course, is the "catch all" for all instances of UB where the
implementation either cannot detect it or cannot respond meaningfully.

I think the list was intended to be an incomplete list of typical
behaviours. So terminating execution with a diagnostic was referring to
things like "Segmentation violation".
 
Chris Torek

I have a somewhat flippant question regarding undefined behaviour. Does
an operation which invokes undefined behaviour affect the whole program,
or are earlier statements guaranteed to execute correctly?

Others have made various notes, including the fact that just because
something actually happened in some particular temporal order --
and assuming that Doctor Who does not come in his TARDIS and change
the past -- does not guarantee that future actual events cannot
erase the effect before you have a chance to observe it. (For
instance, if your program really does print a message, which really
does appear on the screen, but then the monitor fails -- perhaps
because your "undefined behavior" went so far as to change the sync
frequency to something out of range, and you have a monitor built
in 1983 that fries when you do this rather than one built recently
that switches to showing "sync frequency out of range" -- before
you notice that the output has appeared, you will never know that
the output *did* appear.)

More practically, though, Standard C is all about *observable*
behavior from a conforming program. For instance, in:

void f(void) {
    int i = 0;
    do_something();
    i++;
    do_something_else();
    i = 42;
    do_third_thing();
}

the local variable "i" is never *used*, just set to 0,
incremented, and set to 42. A compiler can remove some or
all of these. If we add an "observation" at each point:

int i = 0;
printf("i = %d\n", i);
do_something();
i++;
printf("i = %d\n", i);
do_something_else();
i = 42;
printf("i = %d\n", i);
do_third_thing();

our "observable" output has to include "i = 0", "i = 1", and
"i = 42" -- but a smart compiler can still remove the variable,
and just call printf with the constants; or, if it is even
smarter, it can replace the printf() calls with puts() calls:

puts("i = 0"); /* remember, puts adds \n */
do_something();
puts("i = 1");
do_something_else();
puts("i = 42");
do_third_thing();

Furthermore, a compiler is allowed to *assume* that undefined
behavior did not occur -- so if we do something like:

void g(void) {
    int *ptr, *func(void);

    ptr = func();
    (void) *ptr;
}

this must "call func" and then "use the resulting pointer"; but
even if func() happens to return NULL, and *(int *)NULL crashes on
our implementation, the compiler can *assume* that func() returned
a valid pointer, see that *ptr is not actually used for anything,
and remove the reference. (Adding a "volatile" qualifier is intended
to prevent this kind of removal, and does in real C compilers,
although the wording in the Standard is loose enough to allow a
compiler to "cheat".) Similarly, given:

void h(void) {
    struct S *p, *z(void);

    p = z();
    (void) p->field;
    puts("z() did not return NULL");
}

this is not guaranteed to fail even if z() *does* return NULL, and the
output "z() did not return NULL" can mislead us. (As sane and
reasonable programmers, we can avoid misleading ourselves by
rewriting this as:

if (p != NULL)
    puts("z() did not return NULL");

rather than relying on the "crash" that our implementation promises
us, above and beyond the requirements of Standard C; this is more
sensible, and I think more defensible, than adding a volatile
qualifier to p, even though that might also work. In any case,
*all* Standard C compilers are required to handle the comparison
the way we would like. We stop relying on a particular feature of
a particular implementation, make the code simultaneously clearer
*and* more portable.)

Last, as some others mentioned earlier, once you step far enough
outside the Standard, "observable" behavior may well involve
rearranging things that you *thought* had a particular temporal
sequence. Hence:

void h(void) {
    struct S *p, *z(void);

    p = z();
    puts("before access of p->field");
    use(p->field);
    puts("after access of p->field");
}

can legitimately be compiled "as if" it read:

void h(void) {
    struct S *z(void);
    int tmp;

    tmp = z()->field;
    puts("before access of p->field");
    use(tmp);
    puts("after access of p->field");
}

(assuming of course that p->field has type "int"; if not, make the
obvious change). Most C compilers do, however, tend to avoid
hoisting pointer dereference operations above function calls, even
in cases where Standard C guarantees that it is OK, if only because
certain implementation-specific tricks will change contents of
memory in ways that the *implementation* guarantees (though
Standard C says nothing about it). Moreover, if we replace
puts() with a user-written function -- for instance:

p = z();
userfunc("string");

and userfunc() and z() cooperate, e.g.:

struct S shared;
struct S *z(void) { shared.field = 0; return &shared; }
void userfunc(const char *str) { shared.field = 42; }

then the compiler *cannot* (even by the limited rules of Standard
C) hoist the access in h() above the call to userfunc(), since
userfunc() changes p->field from 0 to 42.
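
Assembled into one listing, the cooperating functions look roughly like this. The definition of struct S, and the change of h() so that it returns the field rather than passing it to use(), are assumptions made so that the result can be checked.

```c
struct S { int field; };

static struct S shared;

struct S *z(void) { shared.field = 0; return &shared; }

void userfunc(const char *str)
{
    (void)str;            /* argument unused in this sketch */
    shared.field = 42;
}

int h(void)
{
    struct S *p = z();    /* p->field is 0 here */
    userfunc("string");   /* ...and 42 after this call */
    return p->field;      /* so the load must not be hoisted above userfunc() */
}
```

Every access here is well defined, so the "as if" rule applies in full and h() has to return 42.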

Getting all this stuff right is tricky, and is why compiler people
are (sometimes) well-paid, and why so many compilers have so many
bugs. :)
 
Philip Potter

Chris said:
Others have made various notes, including the fact that just because
something actually happened in some particular temporal order --
and assuming that Doctor Who does not come in his TARDIS and change
the past -- does not guarantee that future actual events cannot
erase the effect before you have a chance to observe it. (For
instance, if your program really does print a message, which really
does appear on the screen, but then the monitor fails -- perhaps
because your "undefined behavior" went so far as to change the sync
frequency to something out of range, and you have a monitor built
in 1983 that fries when you do this rather than one built recently
that switches to showing "sync frequency out of range" -- before
you notice that the output has appeared, you will never know that
the output *did* appear.)

Yes, I was loath to use I/O as my observable behaviour, but then in C
what other observable behaviour is there?

I've snipped the rest of your reply, but I read it all and I thank you
very much for it. I never realised quite how awkward the answer could be
- it seems there are no simple answers!

Philip
 
