removing a loop cause it to go at half the speed?

Mark McIntyre

Robin isn't ignoring those important words; he specifically accounted
for them by invoking implementation-specific definitions of behavior.

Perhaps, but I interpreted the balance of his remark as implying that
it wasn't undefined if an implementation defined it. This is wrong.
So even if an implementation defines the behavior of signed integer
overflow as the usual 2's-complement wraparound, the behavior is still
undefined. Strictly speaking, it's not even implementation-defined
behavior; "implementation-defined behavior" is a subset of
"unspecified behavior", and integer overflow doesn't meet that
definition.

The areas of doubt and uncertainty around implementation-defined are
even more rigidly defined than those around UB.

Was Douglas a committee member by the way? :)

Mark McIntyre
 
Mark McIntyre

In general, you can't use the run-time behaviour of a program as a
code-validity criterion.

Absolutely. I can't see the point in any of this posting, except this
bit.
and/or under some circumstances (excluded by the spec). To say that the
standard and the newsgroup have nothing to say about such programs is to
make the standard and the newsgroup useless.

You're mistaken.

Mark McIntyre
 
Robin Haigh

Keith Thompson said:
Robin isn't ignoring those important words; he specifically accounted
for them by invoking implementation-specific definitions of behavior.
For example, the standard doesn't define the behavior on signed
integer overflow, but an implementation is free to do so.

However, I think that Mark is correct as far as the terminology is
concerned. A note on the definition of "undefined behavior" says:

NOTE Possible undefined behavior ranges from ignoring the
situation completely with unpredictable results, to behaving
during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution
(with the issuance of a diagnostic message).

(Yes, the note is non-normative, but I accept what it says unless it's
contradicted elsewhere.)

So even if an implementation defines the behavior of signed integer
overflow as the usual 2's-complement wraparound, the behavior is still
undefined. Strictly speaking, it's not even implementation-defined
behavior; "implementation-defined behavior" is a subset of
"unspecified behavior", and integer overflow doesn't meet that
definition.


Bad example, sorry. All the same, I don't see the point of "behaving... in
a documented manner..." unless it means "... and things will then continue
normally."

If I write c = a + b and a + b overflows, I'm prepared for alarm bells, but
if the behaviour takes the form of yielding some valid value for a + b, I
would expect that value and not some different value to be assigned to c,
and so on, and I wouldn't expect some expression to be misevaluated 3000
lines later on the grounds that the standard is imposing no requirements.

Otherwise, all extensions become impossible. In practice, an extension
defines some behaviour not defined by the standard and then execution
continues according to the standard.

As I said elsethread, source is polymorphic across implementations, but
behaviour can only be discussed in the context of an implementation. There
is no pure "implementation-free" environment. You can of course discuss a
minimal conforming implementation, one that does nothing it doesn't have to.
The standard says what it must do. But that's not all the standard is for,
it also forms a large part of the description of real-life implementations.
 
Keith Thompson

Mark McIntyre said:
I didn't actually. I understood that *if* all bit patterns could
represent values, then there could be no traps. I merely pointed out
that it wasn't relevant to the undefinedness in question since the
Standard doesn't require that behaviour.

The question, I suppose, is whether it makes sense to talk about
conditional undefined behavior. Is undefined behavior an absolute
attribute, or can a program exhibit undefined behavior on one
implementation but not on another? In my opinion, conditional
undefined behavior is possible.

First, here's a program that *unconditionally* exhibits undefined
behavior:

#include <limits.h>
#include <stdio.h>
int main(void)
{
    int x = INT_MAX;
    x ++;
    printf("x = %d\n", x);
    return 0;
}

Even if an implementation happens to define the behavior of signed
integer overflow, the program still exhibits UB; whatever behavior the
implementation defines is just one particular manifestation of UB.

But consider this program:

#include <stdio.h>
int main(void)
{
    int x = 32767;
    x ++;
    printf("x = %d\n", x);
    return 0;
}

On an implementation with INT_MAX == 32767, this program exhibits UB,
just like the previous one. On an implementation with INT_MAX > 32767,
though, there is no undefined behavior. In fact, on such an
implementation, the standard specifically requires the program to
print "x = 32768".

Similarly, this program:

#include <stdio.h>
#include <limits.h>
int main(void)
{
    if (INT_MAX > 32767) {
        int x = 32767;
        x ++;
        printf("x = %d\n", x);
    }
    else {
        puts("x = 32768");
    }
    return 0;
}

never exhibits undefined behavior. Even though the "x ++;"
would exhibit UB if it were executed on an implementation with
INT_MAX == 32767, it never will be, so it's not an issue. (In fact,
I think this program is even strictly conforming, though in a rather
roundabout way.)

Going back to the trap representation issue, a program that accesses
an uninitialized auto unsigned short object exhibits undefined
behavior if it's executed on an implementation where unsigned short
has one or more trap representations. It *doesn't* exhibit UB if it's
executed on an implementation where unsigned short has no trap
representations.

(Or maybe it does; see comp.std.c for discussion of whether accessing
an unspecified value that's known not to be a trap representation
can still exhibit UB.)

As a practical matter, though, there's little difference between
unconditional and conditional UB. Both are most likely programming
bugs, unless you're depending on non-portable assumptions.
 
Robin Haigh

Jordan Abel said:
Yes, but only behavior that the standard SAYS is implementation-defined
"counts" for the purpose of "defined includes implementation-defined".


That's a possible definition of "defined", I suppose, but it doesn't seem to
be a useful one. It doesn't mean portable, or predictable, or known, or
well-behaved in any way. What it comes down to is

"defined" = this fragment of the code will do this here and that there, and
may do something we haven't thought of on some platform we haven't heard of,
maybe non-existent, but any such platform must document what it does, though
it may document it as being undefined, or at any rate, the immediate
knock-on consequences may be undefined

"undefined" = the same thing with the word "must" replaced by "may"

This may be important to somebody writing compiler documentation, but in
discussing program behaviour it's neither here nor there
 
Robin Haigh

Mark McIntyre said:
Absolutely. I can't see the point in any of this posting, except this
bit.


I was trying to figure out what you meant by "Once you start relying on the
behaviour of a given platform, your code is no longer C.".

Source code is what you build binaries from. Binaries are
implementation-specific. If you compile the source on a lot of different
platforms, the executables might all do the same thing, in which case the
source is portable, or they might not, or you might not even want them to.

Even in the portable case, what the programs are doing internally might be
very different. It'll almost always be different. Even in Hello World,
you've got a string literal. The signedness of the chars, their bit-width,
their values, the representation of the address, are all
implementation-dependent. To know the code is portable you have to figure
out that none of that will actually affect the outcome. With experience,
you don't have to think about it much, but it's there. In other cases the
differences might be much bigger and you do have to work it out.

Or, as I said, you might not want portability. Suppose I have

#include <limits.h>
int main (void) {
    if ( CHAR_BIT == 8 ) {
        /* code here to output the date */
    } else if ( CHAR_BIT == 9 ) {
        /* code here to output the words of the Star-spangled Banner */
    } else {
        int i = 0;
        i++ + i++; /* deliberately undefined: i is modified twice */
    }
    return 0;
}

Compiled on platform A I get a useful tool to tell me the date. On platform
B I get a useful tool to remind me what comes after the dawn's early light.
On platform C, well who cares. Not portable source, then. But I don't see
why it's not C. If it isn't, what is?

Portability can't come into it, because the definition of portability isn't
an effective test for source code, so you only get a circularity. You have
to analyse behaviour on implementations to draw conclusions about
portability.

If you want to talk about behaviour, what happens when you run a program,
it's not only a function of the source. It's a function of the source and
the implementation and the input.
There is no non-trivial implementation-free behaviour.

If you only want to talk about what the standard says about behaviour, you
can do that as well.
But you aren't talking about behaviour, you're only talking about what the
standard says about behaviour. "The behaviour is undefined" in the standard
only means "if this code is reached, the standard imposes no requirements",
but people quote it as if it said something about the behaviour of the
program. It doesn't: the behaviour of a program might be perfectly
well-defined (on some inputs and implementations or even on all of them) in
spite of this phrase appearing in a clause applicable to a construct that
appears in the code.

A discussion at this level -- what the standard says about the behaviour --
just doesn't get you very far, when you're talking about programs rather
than statements and code fragments. In simple enough cases, it might tell
you the code is portable and tell you what it does. In other simple cases
it might tell you that the code is categorically not portable. Most of the
time in real life it can't tell you anything about anything except
portability and it can't even tell you about that.

To say "The standard says 'The behaviour [of this bit] is undefined' [with
the standard's meaning]" might be indisputably correct, but since it can be
said of most non-trivial programs,
it needs a bit more to make it interesting.
 
Chris Dollin

Robin said:
Behaviour is only undefined if it's not defined, and defined includes
implementation-defined.

Not as the term is used here, to the best of my knowledge. "Defined"
means "defined by the [relevant] C standard" - defined in such a way
that you can rely on it portably.
Therefore, it's never possible to say that any code _always_ produces
undefined behaviour.

Therefore, when we say that code produces undefined behaviour, without
reference to platform, we can't mean "always".

We can. At least one of us /does/. It's the non-reference to a platform
that allows it.
 
Chris Dollin

Robin said:
If I write c = a + b and a + b overflows, I'm prepared for alarm bells,
but if the behaviour takes the form of yielding some valid value for a +
b, I would expect that value and not some different value to be assigned
to c, and so on, and I wouldn't expect some expression to be misevaluated
3000 lines later on the grounds that the standard is imposing no
requirements.

You might not expect it, but if it happens, it happens.

If the expressions a, b, and c are complicated and appear in complicated
contexts and you turn the optimisation knobs up, the compiler may do
extensive reorganisation of your code /on the basis that it doesn't
do anything undefined/.
Otherwise, all extensions become impossible.
No.

In practice, an extension
defines some behaviour not defined by the standard and then execution
continues according to the standard.

And there you have it - you can only understand the behaviour of the
program /by using the definition of the extension/, not just the
definition in the standard. The extension is free to define things
the standard leaves undefined, and to do so in a useful way.
 
Richard G. Riley

That's incorrect. Undefined behaviour is, in the context of CLC,
behaviour "for which this international standard imposes no
requirements".

A given platform may indeed define the behaviour. That is however
nongermane to the point. Once you start relying on the behaviour of a
given platform, your code is no longer C.


Rubbish. It is no longer "portable C". And there are lots of programs
out there coded to be very, very platform specific. It might make the
baby jesus cry, but in the real world it happens a lot.

It's why C is often used: to give that ability to get at the platform when
one needs to.
 
Robin Haigh

Chris Dollin said:
Robin said:
Behaviour is only undefined if it's not defined, and defined includes
implementation-defined.

Not as the term is used here, to the best of my knowledge. "Defined"
means "defined by the [relevant] C standard" - defined in such a way
that you can rely on it portably.


I'll agree it's a useful shorthand, but it does tend to lead to people
bandying these terms around without understanding their significance, or
lack of it. People develop concepts that don't apply across the whole set
of possible C programs, but only to the type of thing they're thinking of.

Lack of undefined behaviour doesn't make a program portable. It's
impossible to avoid implementation dependencies, and once you allow in
literally one bit of implementation dependency, the effect on a program's
behaviour is potentially unbounded. You can only know that code is portable
by looking at all the dependencies and figuring out that they don't make any
difference across the set of platforms that matter.

In that process, if the standard says "undefined" and the relevant platforms
say "defined", you're as well off as if the standard said
"implementation-defined". And conversely, if the standard says
"implementation-defined" but you don't know how some relevant platform might
do that, you're as badly off as if the standard said "undefined".

And then there's the question of inputs. Most implementations of the
standard library contain code that produces undefined behaviour, for invalid
arguments. Link with the library and your program has undefined behaviour
in it, from a purely static viewpoint. To know you're ok, you have to know
that the undefined behaviour is guarded, because your running program won't
actually pass invalid arguments in any circumstances. The argument may be
non-trivial and will usually have to involve implementation dependencies
and program inputs. If the logic of your validation of inputs isn't totally
bombproof, for instance, you may think your source is clean code, and
portable, but you've got unguarded undefined unpredictable behaviour in your
executable anyway, or you will have on some other platform.

Behaviour only really makes sense in the context of a running executable.
The notion that source code has behaviour, though it seems to be appealing,
just isn't sustainable.

In fact one of the sources of bugs is that programmers don't focus on what
platforms they're writing for and just think they'll write portable source,
but then it turns out not to be as portable as they thought. Portability
doesn't come that easy.
 
Flash Gordon

Robin said:
Chris Dollin said:
Robin said:
Behaviour is only undefined if it's not defined, and defined includes
implementation-defined.
Not as the term is used here, to the best of my knowledge. "Defined"
means "defined by the [relevant] C standard" - defined in such a way
that you can rely on it portably.

I'll agree it's a useful shorthand, but it does tend to lead to people
bandying these terms around without understanding their significance, or
lack of it. People develop concepts that don't apply across the whole set
of possible C programs, but only to the type of thing they're thinking of.

The concepts are also useful because they are well defined by the same
standard that defined the C language. If someone doesn't understand them
then they should be explained to them.
Lack of undefined behaviour doesn't make a program portable. It's
impossible to avoid implementation dependencies, and once you allow in
literally one bit of implementation dependency, the effect on a program's
behaviour is potentially unbounded. You can only know that code is portable
by looking at all the dependencies and figuring out that they don't make any
difference across the set of platforms that matter.

#include <stdlib.h>
#include <stdio.h>
int main(void)
{
    if (puts("Hello world!") >= 0)
        return EXIT_SUCCESS;
    else
        return EXIT_FAILURE;
}

On any hosted implementation it will either output "Hello world!\n"
successfully and return a success code to the environment or it will
fail and return a failure code. Hardly unbounded.
In that process, if the standard says "undefined" and the relevant platforms
say "defined", you're as well off as if the standard said
"implementation-defined". And conversely, if the standard says
"implementation-defined" but you don't know how some relevant platform might
do that, you're as badly off as if the standard said "undefined".

No, with implementation defined behaviour you know that it *will* be
documented on *all* conforming implementations, because part of the
definition of "implementation defined" is that the implementation is
*required* to document it. If the standard says it is undefined
behaviour then even if your implementation defines it you know that you
will have to check whether it is documented for *every* system you want
to use it on in the future, and you may well come across a system which
leaves it completely undefined and possibly even causes random behaviour.
And then there's the question of inputs. Most implementations of the
standard library contain code that produces undefined behaviour, for invalid
arguments.

If you give the functions valid inputs they are *required* to behave in
the documented manner. Therefore, to avoid invoking undefined behaviour
with the library you just have to write your program properly so that it
only passes valid arguments (and avoid things like gets).
> Link with the library and your program has undefined behaviour
> in it, from a purely static viewpoint.

I would disagree with that. See my program above which links to the
library and clearly does *not* invoke undefined behaviour (subject to
typos and thinkos).
> To know you're ok, you have to know
> that the undefined behaviour is guarded, because your running program won't
> actually pass invalid arguments in any circumstances.

Or write your program such that it cannot generate invalid arguments in
the first place.
> The argument may be non-trivial and will usually have to involve
> implementation dependencies and program inputs. If the logic of your
> validation of inputs isn't totally bombproof, for instance, you may think
> your source is clean code, and portable, but you've got unguarded undefined
> unpredictable behaviour in your executable anyway, or you will have on some
> other platform.

If you accidentally put your car in to reverse when doing 30mph on the
road you will wreck your gearbox, does that mean it is not useful to
define the behaviour of your car when you don't do that but not define
it when you do?
Behaviour only really makes sense in the context of a running executable.
The notion that source code has behaviour, though it seems to be appealing,
just isn't sustainable.

It is perfectly sustainable. Something is either guaranteed to behave in
a certain way, or in one of a finite set of ways (sometimes with the
choice having to be documented) or it is not. It is *extremely* useful
to know whether the behaviour is something where you have a guarantee of
the behaviour or not.
In fact one of the sources of bugs is that programmers don't focus on what
platforms they're writing for and just think they'll write portable source,
but then it turns out not to be as portable as they thought.

In those cases the problem is the programmer not understanding what *is*
portable, and that IMHO is a very strong argument for letting people
know whether things are guaranteed by the standard (defined behaviour),
whether the standard guarantees that the behaviour will be documented by
the implementation (implementation defined), one of a number of
specified options but not documented which one (unspecified behaviour)
or one where the C standard places no requirements (undefined behaviour).

You know that, at one extreme, if you stick to defined behaviour you will
always get a well-defined result and that, at the other, if you invoke
undefined behaviour then all bets are off unless you can find some other
documentation that specifies it for *every* platform of interest.
> Portability doesn't come that easy.

Portability is not always easy or possible, but the starting point is
knowing what the C standard guarantees and what it doesn't. If you start
by making as much of the code as possible defined by the C standard then
you are minimising how much work you have to do when porting to a new
implementation, and remember that a new version of Windows or Linux or
your compiler (or even a service pack) might change (and in the past has
changed) the behaviour of things not defined by the C standard.
 
Keith Thompson

Robin Haigh said:
In that process, if the standard says "undefined" and the relevant platforms
say "defined", you're as well off as if the standard said
"implementation-defined".

Only if the set of "relevant platforms" never changes. If it does,
you can get nasal demons (which is, of course, a metaphor for
arbitrarily bad things happening).
And conversely, if the standard says
"implementation-defined" but you don't know how some relevant platform might
do that, you're as badly off as if the standard said "undefined".

Not quite; you can pretty much assume that any particular instance of
implementation-defined behavior is going to be more benign than the
kind of stuff that can happen with actual undefined behavior.

[...]
Behaviour only really makes sense in the context of a running executable.
The notion that source code has behaviour, though it seems to be appealing,
just isn't sustainable.

The standard places constraints on the behavior of any conforming
program. In that context, it makes perfect sense to talk about the
"behavior" of a program given only the source. That's the whole point
of having a standard.
In fact one of the sources of bugs is that programmers don't focus on what
platforms they're writing for and just think they'll write portable source,
but then it turns out not to be as portable as they thought. Portability
doesn't come that easy.

Yes, unintentionally writing non-portable code is one of the
infinitely many mistakes one can make in C.
 
Jordan Abel

I was trying to figure out what you meant by "Once you start relying on the
behaviour of a given platform, your code is no longer C.".

Source code is what you build binaries from. Binaries are
implementation-specific. If you compile the source on a lot of different
platforms, the executables might all do the same thing, in which case the
source is portable, or they might not, or you might not even want them to.

Even in the portable case, what the programs are doing internally might be
very different. It'll almost always be different. Even in Hello World,
you've got a string literal. The signedness of the chars, their bit-width,
their values, the representation of the address, are all
implementation-dependent. To know the code is portable you have to figure
out that none of that will actually affect the outcome. With experience,
you don't have to think about it much, but it's there. In other cases the
differences might be much bigger and you do have to work it out.

Or, as I said, you might not want portability. Suppose I have

#include <limits.h>
int main (void) {
    if ( CHAR_BIT == 8 ) {
        /* code here to output the date */
    } else if ( CHAR_BIT == 9 ) {
        /* code here to output the words of the Star-spangled Banner */
    } else {
        int i = 0;
        i++ + i++; /* deliberately undefined: i is modified twice */
    }
    return 0;
}

Compiled on platform A I get a useful tool to tell me the date. On
platform B I get a useful tool to remind me what comes after the
dawn's early light. On platform C, well who cares. Not portable
source, then. But I don't see why it's not C. If it isn't, what is?

I think that for something to be "not C" requires one of two things.
First, that it's identifiably a language other than C. Or second, that
it depends on extensions, other than new library functions [say, new
keywords, interpretations of preprocessor numbers that are not valid
numbers, etc], that are not historical parts of C [i.e. =+, old-style
array initializers, implicit int, -> with an integer left operand, are
all still C]
 
Old Wolf

Keith said:
I'm not convinced that it can be inferred from the normative text, and
I tentatively believe that it can't. If you can convince me
otherwise, I'd be glad to see the evidence. (Of course, if you just
don't want to take the time to do it, that's fine too.)

3.18
[#1] undefined behavior
behavior, upon use of a nonportable or erroneous program
construct, of erroneous data, or of indeterminately valued
objects, for which this International Standard imposes no
requirements

Is that normative?
 
Keith Thompson

Old Wolf said:
Keith said:
I'm not convinced that it can be inferred from the normative text, and
I tentatively believe that it can't. If you can convince me
otherwise, I'd be glad to see the evidence. (Of course, if you just
don't want to take the time to do it, that's fine too.)

3.18
[#1] undefined behavior
behavior, upon use of a nonportable or erroneous program
construct, of erroneous data, or of indeterminately valued
objects, for which this International Standard imposes no
requirements

Is that normative?

It would be if it were in the standard. I think you got that from
the N869 draft. The final C99 standard says:

3.4.3
undefined behavior
behavior, upon use of a nonportable or erroneous program construct
or of erroneous data, for which this International Standard
imposes no requirements

I knew the C90 standard had the "indeterminately valued objects"
wording; I didn't know that it had survived into N869.

In any case, even the old definition doesn't necessarily imply that
*all* uses of indeterminately valued objects cause undefined behavior;
after all, not all uses of "nonportable program constructs" cause UB.
 
Chris Dollin

Robin said:
Behaviour only really makes sense in the context of a running executable.
The notion that source code has behaviour, though it seems to be
appealing, just isn't sustainable.

Source code /defines/ behaviour: you can give semantics to it. That
the normal way of getting that behaviour is to compile and run the
source doesn't mean that the /only/ way to understand it is compile-and-run.
 
