Undefined Behaviour designed to be caught [Was: Books for advanced C++ debugging]

Nick Keighley

this was originally on comp.lang.c++
but discussions about Undefined Behaviour seem on-topic to comp.lang.c
as well


[snipped discussion about run-time error when OP accessed
uninitialized memory. OP complained that C++ compiler (gcc)
could not detect this even though its warning level was set to
highest]
You seem to be taking the opinion that compilers should
catch all undefined behavior. C++ is not Java. C++'s stated
primary design goals include
- runtime performance comparable with assembly
- don't pay for what you don't use
- portable
- easy to write code / programmer productivity (with less relative
emphasis on this one IMHO)
With these design goals in mind, it is not reasonable to
expect a compiler to catch all possible undefined behavior
or errors. To do that would necessarily restrict the
language so that it's less comparable to assembly in speed
and/or you start paying for things you don't use.
That's not strictly true.  Both the C and the C++ standards
were designed so that all undefined behavior can be caught.

this surprised me

At run-time, at the latest.  (I think that there is some which
can't be detected at compile time.  But probably a lot less than
one might think---compilers have gotten quite good at tracing
intermodular code flow.)  And it's scattered throughout the
standard.  Mostly in the form of "undefined behavior"---the
behavior is undefined precisely so that a checking
implementation can trap it.

I thought a fair amount of Undefined Behaviour was implicit.
That is, no behaviour was defined, therefore the behaviour was
undefined, rather than there being an explicit statement that
"this behaviour is not defined". I'm pretty sure this is true
of C if not C++.

Too hard, no.  Too expensive, perhaps: to catch all pointer
violations, you need "fat" pointers---each pointer contains a
current address, plus the limits, each modification of the
pointer value verifies that the current address stays in the
limits, and each access through the pointer verifies that it
isn't using the end pointer (and that the pointer isn't null,
but most hardware traps this already today).

Of course, a good compiler could eliminate a certain number of
these checks, or at least hoist them outside of a loop.  But I
don't think it could easily avoid the fact that the size of a
pointer is multiplied by three, which makes things like copying
significantly more expensive, and can have very negative effects
on locality.
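
As an illustration of the "fat" pointer idea described above, here is a minimal sketch. It is not taken from any real checking implementation; the class name `FatPtr` and its members are hypothetical, and a real compiler would generate such checks itself rather than rely on a library type.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical "fat" pointer: the current address plus the limits of the
// array it points into. All names here are illustrative assumptions.
template <typename T>
class FatPtr {
public:
    FatPtr(T* begin, T* end) : cur_(begin), begin_(begin), end_(end) {
        assert(begin <= end);
    }

    // Every modification verifies that the address stays within [begin, end].
    FatPtr& operator+=(std::ptrdiff_t n) {
        if (n < begin_ - cur_ || n > end_ - cur_)
            throw std::out_of_range("pointer arithmetic out of bounds");
        cur_ += n;
        return *this;
    }

    // Every access verifies that this is not the one-past-the-end pointer.
    T& operator*() const {
        if (cur_ == end_)
            throw std::out_of_range("dereference of end pointer");
        return *cur_;
    }

private:
    T* cur_;    // current address
    T* begin_;  // lower limit
    T* end_;    // upper limit (one past the last element)
};

// The space cost mentioned above: three raw pointers instead of one.
static_assert(sizeof(FatPtr<int>) >= 3 * sizeof(int*),
              "a fat pointer holds at least three raw pointers");
```

Given `int a[3]`, a `FatPtr<int> p(a, a + 3)` can be advanced up to `a + 3`, but dereferencing it there, or advancing it any further, throws instead of silently invoking undefined behavior.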


That's unspecified, not undefined behavior.


Takes a pointer.  If the pointer contains the bounds, then it
can easily check.

<snip>
 
James Kanze

this was originally on comp.lang.c++ but discussions about
Undefined Behaviour seem on-topic to comp.lang.c as well

Given that C++ just takes over the C definition here.
this surprised me

On thinking about it, I probably overstated it. The context of
the discussion was things like array bounds and pointer errors,
and that's really what I had in mind. Although I think things
like i = ++i are catchable, I don't think that the intent of
making it undefined was to allow it to be caught at runtime.

There are still large categories of behavior which are undefined
expressly to allow an implementation to catch them; arithmetic
overflow, array bounds and pointer errors are in this
category.
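
To make the arithmetic overflow case concrete, here is a minimal sketch of the kind of check a trapping implementation could perform on a signed addition. The helper name `checked_add` is hypothetical, and throwing an exception is only one assumed way of "catching" the error; a real implementation might abort or raise a signal instead.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical helper: add two ints, trapping instead of overflowing.
// The test is done before the addition, so the overflow never happens.
int checked_add(int a, int b) {
    if ((b > 0 && a > std::numeric_limits<int>::max() - b) ||
        (b < 0 && a < std::numeric_limits<int>::min() - b))
        throw std::overflow_error("signed integer overflow");
    return a + b;
}
```

Because the standard leaves signed overflow undefined, an implementation that behaves this way on every `int` addition would still be conforming.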
I thought a fair amount of Undefined Behaviour was implicit.
That is, no behaviour was defined, therefore the behaviour was
undefined, rather than there being an explicit statement that "this
behaviour is not defined". I'm pretty sure this is true of C
if not C++.

I don't think so. I think that almost all of the cases of
undefined behavior are explicitly stated as such. I think that
the rule of undefined behavior when the standard doesn't say
anything is mainly there to catch oversights. What "undefined
behaviors" did you have in mind?

(The typical examples of undefined behavior are all explicitly
stated as undefined: pointer and array bounds errors in the
specifications of the various operators on pointers, things like
i=++i in the header text for the Expressions section, illegal
operands to functions in the introductory text of the Library
section, and violations of what C++ calls the one definition
rule in section 3.2 in C++, and in section 6.2.7 in C.)
 
Eric Sosman

James said:
[...]
I thought a fair amount of Undefined Behaviour was implicit.
That is, no behaviour was defined, therefore the behaviour was
undefined, rather than there being an explicit statement that "this
behaviour is not defined". I'm pretty sure this is true of C
if not C++.

I don't think so. I think that almost all of the cases of
undefined behavior are explicitly stated as such. I think that
the rule of undefined behavior when the standard doesn't say
anything is mainly there to catch oversights. What "undefined
behaviors" did you have in mind?
[...]

The Committee's reasons for using three means to declare
behavior "undefined" are unknown to me, but there is no
difference in effect or in quality between "undefined due
to violation," "explicitly undefined" and "undefined by
omission." ISO/IEC 9899:1999, section 4 paragraph 2:

If a ‘‘shall’’ or ‘‘shall not’’ requirement that
appears outside of a constraint is violated, the
behavior is undefined. Undefined behavior is otherwise
indicated in this International Standard by the words
‘‘undefined behavior’’ or by the omission of any
explicit definition of behavior. There is no difference
in emphasis among these three; they all describe
‘‘behavior that is undefined’’.

The final sentence says it all (and says it normatively!):
All three means of un-definition are equivalent. (In C, at
any rate: I don't know That Other Language.)
 
James Kuyper

Eric said:
James said:
[...]
I thought a fair amount of Undefined Behaviour was implicit.
That is, no behaviour was defined, therefore the behaviour was
undefined, rather than there being an explicit statement that "this
behaviour is not defined". I'm pretty sure this is true of C
if not C++.

I don't think so. I think that almost all of the cases of
undefined behavior are explicitly stated as such. I think that
the rule of undefined behavior when the standard doesn't say
anything is mainly there to catch oversights. What "undefined
behaviors" did you have in mind?
[...]

The Committee's reasons for using three means to declare
behavior "undefined" are unknown to me, but there is no
difference in effect or in quality between "undefined due
to violation," "explicitly undefined" and "undefined by
omission." ISO/IEC 9899:1999, section 4 paragraph 2:

If a ‘‘shall’’ or ‘‘shall not’’ requirement that
appears outside of a constraint is violated, the
behavior is undefined. Undefined behavior is otherwise
indicated in this International Standard by the words
‘‘undefined behavior’’ or by the omission of any
explicit definition of behavior. There is no difference
in emphasis among these three; they all describe
‘‘behavior that is undefined’’.

The final sentence says it all (and says it normatively!):
All three means of un-definition are equivalent. (In C, at
any rate: I don't know That Other Language.)

The C++ standard does not mention "shall" as a method of indicating
undefined behavior. Section 1.3.13 says "Undefined behavior may also be
expected when this International Standard omits the description of any
explicit definition of behavior.", which strikes me as bad wording - the
phrase "may ... be expected" reflects and reinforces the misconception
that "undefined behavior" refers to a specific type of undesirable
behavior.

I think that the C++ wording provides more support for James Kanze's
opinion than the C wording does.
 
Nick Keighley

Given that C++ just takes over the C definition here.

which is why I thought it was a legitimate x-post


On thinking about it, I probably overstated it.  The context of
the discussion was things like array bounds and pointer errors,
and that's really what I had in mind.  Although I think things
like i = ++i are catchable, I don't think that the intent of
making it undefined was to allow it to be caught at runtime.

Ah, that's what I was disputing. Or rather, it took me by
surprise. I'd always kind of assumed they bunged in UB just to make
the implementor's job easier.
 
Jerry Coffin

"Nick Keighley" <[email protected]> wrote in message
this was originally on comp.lang.c++
but discussions about Undefined Behaviour seem on-topic to comp.lang.c
as well

in a CPU there cannot be undefined behaviour, because
if the CPU is in state X and it reads the instruction "a",
the result will always be the state X'

the same goes for the CPU-OS pair:
if the CPU-OS is in state XX and it reads the instruction "a",
the result will always be the state XX'

UB exists only in the standards

Not really.

First of all, some CPUs have instructions that cause undefined
results -- and while on a _specific_ CPU, the result of execution may
be predictable, different versions of the CPU, down to and including
different steppings, may give different behavior for that
instruction.

In other cases, the behavior even on a single CPU could be
unpredictable -- just for example, Intel has included a thermal diode
in some of their CPUs that's intended as high quality (albeit slow)
source of truly random numbers. While there are certainly defined
ways to access that diode, it's entirely possible that executing some
undefined instruction could do so as well -- and at least part of the
result state after doing so could be entirely unpredictable.
 
Jerry Coffin

[ ... ]
I'm not talking about standards, I'm talking about a real CPU.
If a 386 CPU in state X(eax=1, ebx=19, ecx=20 ...)
reads the binary of "add eax, ebx",
the result will always be the CPU state
X'(eax=20, ebx=19, ecx=20 ...)

Yes, but what if what's executed is 'add eax, [ebx]' instead? If ebx
happens to point to uninitialized memory, you don't know what you'll
get in eax, and it will probably vary from one invocation of the
program to the next. If ebx starts out set to zero (or another small
number, typically anything less than 4 million or so) quite a few
OSes will detect that you're accessing an illegal address, and halt
the program with some sort of error message about it doing something
illegal (of course, the exact message varies between OSes).
 
Barry Schwarz

Jerry Coffin said:
[ ... ]
I'm not talking about standards, I'm talking about a real CPU.
If a 386 CPU in state X(eax=1, ebx=19, ecx=20 ...)
reads the binary of "add eax, ebx",
the result will always be the CPU state
X'(eax=20, ebx=19, ecx=20 ...)

Yes, but what if what's executed is 'add eax, [ebx]' instead? If ebx
happens to point to uninitialized memory,

I see the 386 CPU hardware, together with the memory it can read, as a system.
Suppose this system has the state

State(0)={X(eax=20, ebx=19, ecx=20 ...)
Memory={1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,1 ...}
}
and the CPU reads the instruction
'add eax, [ebx]'

State(1)={X(eax=21, ebx=19, ecx=20 ...)
Memory={1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,1,...}
}
there is no UB

it is like a physical system:
if I know the position at time 0, I know the position at the next time 1

UB in that system could only come from a failure of the CPU

Except you will never know all the states. Every time you run the
program, the clock will be different. If you are running on a typical
system, you share the CPU, memory, and OS with numerous other tasks
running "simultaneously" and their states will not be the same from
one execution to the next.

If you run the program in the morning and it produces a result of 42
and in the afternoon and it produces 24 or you recompile it on an odd
numbered Thursday during a full moon and it now produces 69 or you
change compilers and it now produces 7734, there is no way for you to
define the behavior of the program. Hence, its behavior is undefined.
 
James Kanze

"Nick Keighley" <[email protected]> wrote in message
this was originally on comp.lang.c++
but discussions about Undefined Behaviour seem on-topic to
comp.lang.c as well
in a CPU there cannot be undefined behaviour, because if the CPU
is in state X and it reads the instruction "a", the result
will always be the state X'

Obviously, you've never worked on real hardware. CPUs have
certain rules which must be obeyed, and can have undefined
behavior if you violate them.
the same goes for the CPU-OS pair: if the CPU-OS is in state
XX and it reads the instruction "a", the result will always be
the state XX'
Ditto.

UB exists only in the standards

Not really. What you can say is that most of the cases which
result in undefined behavior in the standard will still be
compiled to deterministic code on most machines. What that
code does, however, is not defined, and can be pretty much
anything. Including behavior not recognized by the standard
(like generating a core dump, crashing the system or the
processor, formatting the hard disk, sending spam emails to half
the world...).
 
James Kanze

Jerry Coffin said:
it is like a physical system: if I know the position at
time 0, I know the position at the next time 1
UB in that system could only come from a failure of the CPU
Except you will never know all the states. Every time you run the
program, the clock will be different. If you are running on a typical
system, you share the CPU, memory, and OS with numerous other tasks
running "simultaneously" and their states will not be the same from
one execution to the next.
If you run the program in the morning and it produces a result
of 42 and in the afternoon and it produces 24 or you recompile
it on an odd numbered Thursday during a full moon and it now
produces 69 or you change compilers and it now produces 7734,
there is no way for you to define the behavior of the program.
Hence, its behavior is undefined.

The actual speed of the gates on the chip will depend on the
temperature. Depending on this speed, the results of accessing
"nonexistent" memory may vary. (Just to cite one case I've
actually encountered. The program caused the machine to hang or
not, depending on how long the machine had been turned on.)
 
