the 'standard' is so strange

Pilcrow

This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.

-------------------- code -----------------------------------
#include <stdio.h>

int main (void)
{
    char xx[]="abcd";
    char * p1 = xx;
    char * p2 = xx;
    int i;
    for(i = 0; i < 4; i++)
        printf("%d %c %c\n", i, *p1++, *p1 );
    putchar('\n');
    for(i = 0; i < 4; i++)
        printf("%d %c %c\n", i, *p2, *p2++);
}

-------------------- output -----------------------------------

0 a a
1 b b
2 c c
3 d d

0 b a
1 c b
2 d c
3 d
 
James Kuyper

Pilcrow said:
This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.

-------------------- code -----------------------------------
#include <stdio.h>

int main (void)
{
char xx[]="abcd";
char * p1 = xx;
char * p2 = xx;
int i;
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p1++, *p1 );

That function call expression has four sub-expressions. The standard
requires that all four sub-expressions must be evaluated, with all of
their side-effects fully carried out, before the function itself is
called. However, it does not specify which order the sub-expressions are
evaluated in. Furthermore, there are no sequence points separating the
different function arguments, so the side-effects of the *p1++
expression can occur in any order relative to the other sub-expressions.

The simplest explanation for the results that you see below is that the
*p1 expression was evaluated before the *p1++ expression was.
putchar('\n');
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p2, *p2++);

You have the same problem here. Again, it seems that *p2++ was evaluated
before *p2 was.
}

-------------------- output -----------------------------------

0 a a
1 b b
2 c c
3 d d

0 b a
1 c b
2 d c
3 d

-------------------------------------------------------------

The reason why the standard leaves the order of evaluation unspecified
is that this enables compilers to choose whichever order works best for
that compiler. Do NOT assume that a given compiler will use the same
order for every function call; they're perfectly free to choose a
different order for different function calls. A compiler is even free to
choose a different order each time the same function call is made,
though I can't imagine any good reason for doing so.
 
Chris Dollin

Pilcrow said:
This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.

I can't 'explain' it, but I can explain it.
#include <stdio.h>

int main (void)
{
char xx[]="abcd";
char * p1 = xx;
char * p2 = xx;
int i;
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p1++, *p1 );

Your code just fell into the undefined behaviour swamp. You
updated `p1` (in `*p1++`) and also accessed it elsewhere
(in `*p1`). The Standard explicitly makes such behaviour
undefined, because there's no sequence point separating the
evaluation of those two arguments of `printf`.

(fx:cutnpaste)
0 a a
1 b b
2 c c
3 d d

It looks like the compiler has fetched `*p1` once, used
its value twice, and incremented `p1` after the fetches,
as it may (it can order it howsoever it likes).

DO NOT RELY ON THIS BEHAVIOUR. IT IS IMPLEMENTATION SPECIFIC
AND MAY CHANGE AT ANY TIME.

By "undefined", we don't just mean that the values you get out
might change. We mean you might not get any values out /at all/;
the program might terminate at that point:

ABORT: hit compiler-detected undefined behaviour; didn't
you read the diagnostic?

(I don't know any compilers that do this, but they'd be /allowed/
to do this.)

Or the program might continue:

0 a a
(Don't rely on recent output.)
1 b b
(Don't rely on recent output.)
2 c c
(Bored now, discarding all further output.)
putchar('\n');
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p2, *p2++);

Same swamp.
0 b a
1 c b
2 d c
3 d

And here it looks like the compiler has fetched `*p2` for
the last argument and `*(p2 + 1)` for the last-but-one,
and incremented p2 later, as it may.

DO NOT RELY ON THIS BEHAVIOUR. IT MAY CHANGE AT ANY TIME
AND IS SPECIFIC TO THE IMPLEMENTATION.

I'd /guess/ that the compiler implementation evaluates the
arguments in reverse order (as it can), probably because it's
passing arguments to `printf` on a stack and it's handy to
have the first argument on top of such a stack, and applying
the increments as it encounters them. But it would be equally
allowable to evaluate the arguments in the other order, or in
any order -- it is this freedom that the undefinedness is there
to allow.

DO NOT ON THIS BEHAVIOUR BE RELIANT. IT MAY ALTER FROM MOMENT
TO MOMENT AND MIGHT LAND YOU IN THE PACIFIC.
 
Mark Wooding

Pilcrow said:
printf("%d %c %c\n", i, *p1++, *p1 );

Undefined behaviour. 6.5#2 says quite clearly

: Between the previous and next sequence point an object shall have its
: stored value modified at most once by the evaluation of an
: expression. Furthermore, the prior value shall be read only to
: determine the value to be stored.

and 6.5.2.2#10 says

: The order of evaluation of the function designator, the actual
: arguments, and subexpressions within the actual arguments is
: unspecified, but there is a sequence point before the actual call.
printf("%d %c %c\n", i, *p2, *p2++);

This too.

On to practical matters.
0 a a
1 b b
2 c c
3 d d
0 b a
1 c b
2 d c
3 d

What's going on here is that the arguments are being evaluated from
right to left. This probably makes sense on the platform you're using:
arguments are being passed on a descending stack (most-recently pushed
at lowest address), with arguments to the right at higher addresses;
therefore it's probably best to push right-hand arguments first, and
it's easiest to evaluate them in this order.

There's no reason to expect that other platforms will do the same;
there's no especial reason to expect that the same compiler won't make
different choices for subtly different code if it detects some scope for
improving the code (or maybe even if some nondeterministic process,
e.g., a time-limited search or a purely random decision, gives a
different answer).

Indeed, because you're not allowed to write code like the above, the
compiler is at liberty to botch code generation hopelessly; Google
`nasal demons' for details.

My earlier explanation is speculative at best, and should not be
interpreted as any kind of justification for writing code like the
above. In particular, it shouldn't be taken as describing `typical'
compiler behaviour.

Bottom line: just don't write code like this.

-- [mdw]
 
Phil Carmody

Pilcrow said:
This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.

-------------------- code -----------------------------------
#include <stdio.h>

int main (void)
{
char xx[]="abcd";
char * p1 = xx;
char * p2 = xx;
int i;
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p1++, *p1 );

It is not specified when the effect "p1++" will happen. Therefore
"*p1" is meaningless in this context. The compiler can do what it
likes.
putchar('\n');
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p2, *p2++);

Again, it is not specified when the effect "p2++" will happen. Likewise,
"*p2" is meaningless in this context. Again, the compiler can do what it
likes.
}

-------------------- output -----------------------------------

0 a a
1 b b
2 c c
3 d d

It appears that the compiler chose to evaluate the arguments to printf
from the right to the left, or to not perform the incrementation of p1
until after all arguments were evaluated. You were lucky, much worse
things could have happened.
0 b a
1 c b
2 d c
3 d

It appears that the compiler chose to evaluate the arguments to printf
from the right to the left, and to perform the incrementation of p2 in
the rightmost argument before proceeding to evaluate the argument to the
left of it. Again, you were lucky, much worse things could have happened.

Combining those two data points, I think we can conclude that the
compiler chooses to evaluate its arguments from right to left and to
perform the side-effects associated with each argument as it processes
that argument. It is permitted to so do. It is not obliged to so do.

The compiler's behaviour does not in any way seem "very strange", it
simply appears undefined. The only strange behaviour is that of the
author of the code thinking he could get away with writing code with
undefined behaviour.

Phil
 
Keith Thompson

Pilcrow said:
This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.

-------------------- code -----------------------------------
#include <stdio.h>

int main (void)
{
char xx[]="abcd";
char * p1 = xx;
char * p2 = xx;
int i;
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p1++, *p1 );
putchar('\n');
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p2, *p2++);
}

-------------------- output -----------------------------------

0 a a
1 b b
2 c c
3 d d

0 b a
1 c b
2 d c
3 d

-------------------------------------------------------------

Others have done a very good job of answering your question. But I'm
curious: why do you have such a dismissive attitude regarding the C
standard (putting the words "standard" and "explain" in quotation
marks and so forth)?
 
Pilcrow

Pilcrow said:
This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.

-------------------- code -----------------------------------
#include <stdio.h>

int main (void)
{
char xx[]="abcd";
char * p1 = xx;
char * p2 = xx;
int i;
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p1++, *p1 );
putchar('\n');
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p2, *p2++);
}

-------------------- output -----------------------------------

0 a a
1 b b
2 c c
3 d d

0 b a
1 c b
2 d c
3 d

-------------------------------------------------------------

Others have done a very good job of answering your question. But I'm
curious: why do you have such a dismissive attitude regarding the C
standard (putting the words "standard" and "explain" in quotation
marks and so forth)?

Before posting my example I had already realised that I was not allowed
(at least in *this* case) to use a shortcut to modify an argument to a
function *within* the parens. I just assumed that the much-cited k&r
had fully defined the language. All these 'undefines' have me feeling
like I'm trying to herd cats. (snakes?)

To mix metaphors: just when I think I'm beginning to get a grip on C,
it turns into sea, and runs through my fingers.


Now I realise that I will have to read the 'standard', which seems to
have been written by a team composed of Philadelphia lawyers and
abstract mathematicians, neither of whom heard of Henry W Fowler. Truly
gelatinous prose. Technical writing need not be so murky. Oh, well, we
live to learn.

Maybe I'll try to make a catalog of *all* the 'undefines' and similar
gotchas in C. Written in English.

Thanks to all for a most illuminating experience.

BTW, what's a 'sequence point', and how could I recognize one in a dark
alley? Never mind, maybe the *standard* will tell me.

It's been real
 
Nate Eldredge

Pilcrow said:
Pilcrow said:
This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.

-------------------- code -----------------------------------
#include <stdio.h>

int main (void)
{
char xx[]="abcd";
char * p1 = xx;
char * p2 = xx;
int i;
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p1++, *p1 );
putchar('\n');
for(i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p2, *p2++);
}

-------------------- output -----------------------------------

0 a a
1 b b
2 c c
3 d d

0 b a
1 c b
2 d c
3 d

-------------------------------------------------------------

Others have done a very good job of answering your question. But I'm
curious: why do you have such a dismissive attitude regarding the C
standard (putting the words "standard" and "explain" in quotation
marks and so forth)?

Before posting my example I had already realised that I was not allowed
(at least in *this* case) to use a shortcut to modify an argument to a
function *within* the parens. I just assumed that the much-cited k&r
had fully defined the language. All these 'undefines' have me feeling
like I'm trying to herd cats. (snakes?)

Even K&R didn't define the issue of passing expressions with side
effects as function arguments. K&R I said "The order of evaluation of
arguments is undefined by the language; take note that the various
compilers differ." By the time K&R got around to writing a formal
description of the language, there were already different compilers
around that disagreed. Rather than demand a certain behavior and make
existing implementations wrong, and require them to switch to behavior
that could be less efficient, they decided (rightly, IMHO) that it
didn't really matter, and left it up to the compiler authors to decide
what to do.

If you want a completely specified language, where every possible piece
of code has a well-defined effect, then C is not for you. One of the
major strengths of the language is its efficiency and ability to run on
many different platforms; the tradeoff for this is that in many cases
the language will say "Programmers: don't do this if you want to be
portable. Compilers: handle this in whatever way you find most
convenient." That's what "undefined behavior" means.
To mix metaphors: just when I think I'm beginning to get a grip on C,
it turns into sea, and runs through my fingers.

That happens a lot when dealing with computers. Sorry.
Now I realise that I will have to read the 'standard', which seems to
have been written by a team composed of Philadelphia lawyers and
abstract mathematicians, neither of whom heard of Henry W Fowler. Truly
gelatinous prose. Technical writing need not be so murky. Oh, well, we
live to learn.

You don't necessarily have to read the standard, though it is the
authoritative source. Any decent textbook should have told you not to
write code that depends on the order in which function arguments are
evaluated.
Maybe I'll try to make a catalog of *all* the 'undefines' and similar
gotchas in C. Written in English.

Thanks to all for a most illuminating experience.

BTW, what's a 'sequence point', and how could I recognize one in a dark
alley? Never mind, maybe the *standard* will tell me.

Maybe you should ask again when you're more calm.
 
Keith Thompson

Pilcrow said:
Pilcrow said:
This behavior seems very strange to me, but I imagine that someone will
be able to 'explain' it in terms of the famous C standard.
[snip]

Others have done a very good job of answering your question. But I'm
curious: why do you have such a dismissive attitude regarding the C
standard (putting the words "standard" and "explain" in quotation
marks and so forth)?

Before posting my example I had already realised that I was not allowed
(at least in *this* case) to use a shortcut to modify an argument to a
function *within* the parens. I just assumed that the much-cited k&r
had fully defined the language. All these 'undefines' have me feeling
like I'm trying to herd cats. (snakes?)

To mix metaphors: just when I think I'm beginning to get a grip on C,
it turns into sea, and runs through my fingers.

There are a lot of things that the standard doesn't define (i.e., it
leaves implementations free to make their own decisions). In some
cases this is because the standard largely documents existing
practice; if existing implementations don't agree on something, the
standard can either force all implementations to conform to the same
definition (this has been done in some cases), or it can leave the
choice up to each compiler. Programmers need to be aware of these
cases, and either avoid programs that depend on such things or be
aware that their code is going to be non-portable.

My experience is that C leaves more things undefined than most (almost
all) other languages, but there are valid reasons for doing so.

The example here is that the standard doesn't define the order of
evaluation of subexpressions in many cases. If you need to impose a
particular order, you'll need to write your code differently. For
example (note that puts() returns an int):

x = puts("first") + puts("second");

can print the two strings in either order. So don't do that:

int p1 = puts("first");
int p2 = puts("second");
x = p1 + p2;
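
A self-contained sketch of the same point, for anyone who wants to try it. The `record` helper and the `order` log are hypothetical names of mine; they stand in for puts() and simply note the order in which the calls are made.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char order[8];   /* hypothetical log of the order of calls */

/* Appends tag to the log and returns value. */
static int record(char tag, int value)
{
    size_t n = strlen(order);

    if (n + 1 < sizeof order) {
        order[n] = tag;
        order[n + 1] = '\0';
    }
    return value;
}

/* The operands of + may be evaluated in either order: log is "AB" or "BA". */
static int sum_unsequenced(void)
{
    order[0] = '\0';
    return record('A', 1) + record('B', 2);
}

/* Separate statements force A before B. */
static int sum_sequenced(void)
{
    order[0] = '\0';
    int a = record('A', 1);
    int b = record('B', 2);
    return a + b;
}

int main(void)
{
    int x = sum_unsequenced();
    printf("unsequenced: x = %d, order = %s\n", x, order);
    assert(x == 3);     /* the value is the same either way */

    x = sum_sequenced();
    printf("sequenced:   x = %d, order = %s\n", x, order);
    assert(x == 3 && strcmp(order, "AB") == 0);
    return 0;
}
```
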
Now I realise that I will have to read the 'standard', which seems to
have been written by a team composed of Philadelphia lawyers and
abstract mathematicians, neither of whom heard of Henry W Fowler. Truly
gelatinous prose. Technical writing need not be so murky. Oh, well, we
live to learn.

Again, why put the word "standard" in scare quotes? It really is the
standard.

The standard is a technical document. The authors valued precision
over pleasant prose. If you have specific cases where you think the
same ideas could be expressed more clearly, by all means bring them
up; comp.std.c is probably the best place to do so. If you have
specific suggestions for better wording, that's great.
Maybe I'll try to make a catalog of *all* the 'undefines' and similar
gotchas in C. Written in English.

Take a look at Annex J of the C99 standard (or n1256 if you don't have
the official standard).
Thanks to all for a most illuminating experience.

BTW, what's a 'sequence point', and how could I recognize one in a dark
alley? Never mind, maybe the *standard* will tell me.

Maybe ``you'' should "read" the 'standard' before you *complain* about
/it/.

The standard has a good index; looking up "sequence points" will take
you to section 5.1.2.3.
 
jameskuyper

Pilcrow wrote:
....
Before posting my example I had already realised that I was not allowed
(at least in *this* case) to use a shortcut to modify an argument to a
function *within* the parens. I just assumed that the much-cited k&r
had fully defined the language. All these 'undefines' have me feeling
like I'm trying to herd cats. (snakes?)

K&R did not fully define the language. The C89 standard did define it,
but that definition included specifying, with considerable precision,
situations in which the standard does not specify the behavior.
Now I realise that I will have to read the 'standard', which seems to
have been written by a team composed of Philadelphia lawyers and
abstract mathematicians, neither of whom heard of Henry W Fowler. Truly
gelatinous prose. Technical writing need not be so murky. Oh, well, we
live to learn.

A key thing that you need to understand is that the standard is,
conceptually, a contract between implementors of C and developers of C
programs. If a developer writes strictly conforming code, a conforming
implementation has to produce exactly the behavior specified by the
standard. The "undefined behavior" that bothers you so much is simply
one of the several methods which the standard uses to identify code
which doesn't meet the contract requirements.

These ways are:
1. The standard requires that a conforming implementation must produce
at least one diagnostic message whenever given a program that contains
any syntax errors or constraint violations. An implementation is also
free to produce diagnostic messages for any other reason that it
wishes. Diagnostics are not required to provide you with any useful
information; they are not even required to be in any language that
anyone knows how to read. However, an implementation is required to
document how to identify diagnostic messages. If you don't find the
messages helpful, that's a legitimate issue to complain about to the
implementor - but it doesn't make the implementation non-conforming.

Having generated that diagnostic, an implementation is free to process
your code and produce a program that you might or might not be willing
to actually execute (I wouldn't).

2. Unspecified behavior: in some cases, the standard allows an
implementation a range of choices. This is usually not done just for
the fun of it - it's done because existing implementations handle the
situation in different ways, often because the best way to handle the
situation is different on different machines. By making the behavior
unspecified, the standard makes it easier to implement C in an
efficient fashion on a much wider variety of platforms than just about
any other language. The price we pay for this is that we have to be
careful to avoid writing code that depends upon unspecified behavior,
unless we have a very good reason to tie a program to a particular
implementation of C.

A program whose behavior is unspecified must still behave in one of the
permitted ways; an implementation is not free to make it behave in a
completely arbitrary fashion. In many cases, unspecified behavior is
also implementation-defined behavior; in that case, the implementation
is required to document which choice it made. See Annex J.3 for
implementation-defined behavior; see Annex J.1 for other unspecified
behavior.

3. Undefined behavior: the standard imposes no requirements, of any
kind, on the behavior. An implementation is free to provide its own
definition of the behavior. If you're deliberately relying upon a
particular implementation's definition of the behavior, that's fine.
However, if you intend your code to be portable, you must avoid
undefined behavior completely. See Annex J.2 for undefined behavior.
Maybe I'll try to make a catalog of *all* the 'undefines' and similar
gotchas in C. Written in English.

Virtually every sentence of the standard contains a "gotcha", many of
them contain several "gotchas". To write them out in clear English,
avoiding the technical jargon that makes the standard so difficult to
read, a complete list of the "gotchas" will have to be several times
longer than the standard itself.

....
BTW, what's a 'sequence point', and how could I recognize one in a dark
alley? Never mind, maybe the *standard* will tell me.

Here's the complete list of sequence points from Annex C:
 
CBFalconer

Pilcrow said:
This behavior seems very strange to me, but I imagine that someone
will be able to 'explain' it in terms of the famous C standard.

-------------------- code -----------------------------------
#include <stdio.h>

int main (void) {
char xx[]="abcd";
char * p1 = xx;
char * p2 = xx;
int i;
for (i = 0; i < 4; i++)
printf("%d %c %c\n", i, *p1++, *p1 );

From this point on the behavior of your code is undefined. You have
invoked undefined behavior here. The compiler is entitled to do
whatever it wishes. Look up sequence points, and restrictions on
code between them.
 
Stephen Sprunk

Pilcrow said:
Before posting my example I had already realised that I was not allowed
(at least in *this* case) to use a shortcut to modify an argument to a
function *within* the parens. I just assumed that the much-cited k&r
had fully defined the language. All these 'undefines' have me feeling
like I'm trying to herd cats. (snakes?)

To mix metaphors: just when I think I'm beginning to get a grip on C,
it turns into sea, and runs through my fingers.

C is indeed less well-defined than most other languages. The reason is
that C evolved and branched on its own long before the standards bodies
got their hands on it, and different implementors had wildly different
ideas about what different things meant. As a result, ANSI (and thus
ISO) was reduced to trying to document the areas where most or all of
them agreed and leaving the areas of disagreement undefined.

Also, C was always intended to be as efficient as possible, and what
behavior is most efficient on different systems varies. Finally, one of
C's greatest strengths is the ability to write both portable code (such
as cross-platform applications) and unportable code (such as device
drivers) in the same language; the way this is done is by allowing
"undefined" and "implementation-defined" behavior, which can be avoided
by anyone trying to write portable code but embraced by those who don't
care about portability.
Now I realise that I will have to read the 'standard', which seems to
have been written by a team composed of Philadelphia lawyers and
abstract mathematicians, neither of whom heard of Henry W Fowler. Truly
gelatinous prose. Technical writing need not be so murky. Oh, well, we
live to learn.

Maybe I'll try to make a catalog of *all* the 'undefines' and similar
gotchas in C. Written in English.

Start by translating Annex J, but I bet you'll find that your
translation is several times as long, and by the time you're complete,
you won't need it anymore because the Standard will have started to make
an odd sort of sense to you.
Thanks to all for a most illuminating experience.

BTW, what's a 'sequence point', and how could I recognize one in a dark
alley? Never mind, maybe the *standard* will tell me.

The most obvious sequence points are the ;'s that end expression
statements, and function calls (just before the call is made, once the
arguments have been evaluated). There are a few others, as noted in the
Standard, but if you're
just getting started, the simplest approach is to assume for now that no
others exist. More important is why you need to _care_ about sequence
points, and I can't even begin to explain that in English...

S
 
Ben Bacarisse

Before posting my example I had already realised that I was not allowed
(at least in *this* case) to use a shortcut to modify an argument to a
function *within* the parens. I just assumed that the much-cited k&r
had fully defined the language. All these 'undefines' have me feeling
like I'm trying to herd cats. (snakes?)

It is not as bad as you think. K&R (the original, I am rather
out-of-date) does not use the term "undefined behaviour" as far as I
can tell, but it does say that evaluation of function arguments can
occur in any order. So for at least 30 years (my copy is 1978) your
program's behaviour can be said to be unspecified at best.

All the precision about sequence points and the formal language of
undefined behaviour have been introduced to pin things down very
tightly, but your code has not had well-defined behaviour since the
very early days of C.

Thus, from a practical point of view, you would not have wanted to
write such code even after reading only K&R I.
 
Pilcrow

Again, why put the word "standard" in scare quotes? It really is the
standard.

because it scares me??
The standard is a technical document. The authors valued precision
over pleasant prose. If you have specific cases where you think the
In my experience, precision *is* pleasant.
same ideas could be expressed more clearly, by all means bring them
up; comp.std.c is probably the best place to do so. If you have
specific suggestions for better wording, that's great.


Take a look at Annex J of the C99 standard (or n1256 if you don't have
the official standard).

Turns out I have n1256. I thought that was it. Where can I get the
C99, and the other, previous ones? URL, please?
 
CBFalconer

Pilcrow said:
.... snip ...


Turns out I have n1256. I thought that was it. Where can I get
the C99, and the other, previous ones? URL, please?

n1256 is more accurate than the printed versions you can buy.
 
Pilcrow

Pilcrow wrote:
...

K&R did not fully define the language. The C89 standard did define it,
but that definition included specifying, with considerable precision,
situations in which the standard does not specify the behavior.


A key thing that you need to understand is that the standard is,
conceptually, a contract between implementors of C and developers of C
programs. If a developer writes strictly conforming code, a conforming
implementation has to produce exactly the behavior specified by the
standard. The "undefined behavior" that bothers you so much is simply
one of the several methods which the standard uses to identify code
which doesn't meet the contract requirements.

These ways are:
1. The standard requires that a conforming implementation must produce
at least one diagnostic message whenever given a program that contains
any syntax errors or constraint violations. An implementation is also
free to produce diagnostic messages for any other reason that it
wishes. Diagnostics are not required to provide you with any useful
information; they are not even required to be in any language that
anyone knows how to read. However, an implementation is required to
document how to identify diagnostic messages. If you don't find the
messages helpful, that's a legitimate issue to complain about to the
implementor - but it doesn't make the implementation non-conforming.

Having generated that diagnostic, an implementation is free to process
your code and produce a program that you might or might not be willing
to actually execute (I wouldn't).

2. Unspecified behavior: in some cases, the standard allows an
implementation a range of choices. This is usually not done just for
the fun of it - it's done because existing implementations handle the
situation in different ways, often because the best way to handle the
situation is different on different machines. By making the behavior
unspecified, the standard makes it easier to implement C in an
efficient fashion on a much wider variety of platforms than just about
any other language. The price we pay for this is that we have to be
careful to avoid writing code that depends upon unspecified behavior,
unless we have a very good reason to tie a program to a particular
implementation of C.

A program whose behavior is unspecified must still behave in one of the
permitted ways; an implementation is not free to make it behave in a
completely arbitrary fashion. In many cases, unspecified behavior is
also implementation-defined behavior; in that case, the implementation
is required to document which choice it made. See Annex J.3 for
implementation-defined behavior; see Annex J.1 for other unspecified
behavior.

3. Undefined behavior: the standard imposes no requirements, of any
kind, on the behavior. An implementation is free to provide its own
definition of the behavior. If you're deliberately relying upon a
particular implementation's definition of the behavior, that's fine.
However, if you intend your code to be portable, you must avoid
undefined behavior completely. See Annex J.2 for undefined behavior.


Virtually every sentence of the standard contains a "gotcha", many of
them contain several "gotchas". To write them out in clear English,
avoiding the technical jargon that makes the standard so difficult to
read, a complete list of the "gotchas" will have to be several times
longer than the standard itself.

...

Here's the complete list of sequence points from Annex C:

Thank you, sir. That will help.
 
Pilcrow

It is not as bad as you think. K&R (the original, I am rather
out-of-date) does not use the term "undefined behaviour" as far as I
can tell, but it does say that evaluation of function arguments can
occur in any order. So for at least 30 years (my copy is 1978) your
program's behaviour can be said to be unspecified at best.

All the precision about sequence points and the formal language of
undefined behaviour have been introduced to pin things down very
tightly, but your code has not had well-defined behaviour since the
very early days of C.

Thus, from a practical point of view, you would not have wanted to
write such code even after reading only K&R I.

Duh.. I'm such a dope.
 
Pilcrow

because it scares me??

In my experience, precision *is* pleasant.


Turns out I have n1256. I thought that was it. Where can I get the
C99, and the other, previous ones? URL, please?

nevermind... found it.
 
Nick Keighley

Before posting my example I had already realised that I was not allowed
(at least in *this* case) to use a shortcut to modify an argument to a
function *within* the parens. I just assumed that the much-cited k&r
had fully defined the language. All these 'undefines' have me feeling
like I'm trying to herd cats. (snakes?)

To mix metaphors: just when I think I'm beginning to get a grip on C,
it turns into sea, and runs through my fingers.

Now I realise that I will have to read the 'standard', which seems to
have been written by a team composed of Philadelphia lawyers and
abstract mathematicians, neither of whom heard of Henry W Fowler.

the C standard is a model of clarity. And Fowler is over rated.

Truly
gelatinous prose. Technical writing need not be so murky. Oh, well, we
live to learn.

if you're going to be precise in english then a certain murkiness
is inevitable. As Churchill almost said:

"writing a technical standard in english is the worst possible
solution, apart from all the other solutions that have been tried"

Maybe I'll try to make a catalog of *all* the 'undefines' and similar
gotchas in C. Written in English.

Thanks to all for a most illuminating experience.

BTW, what's a 'sequence point', and how could I recognize one in a dark
alley? Never mind, maybe the *standard* will tell me.

It's been real

It's been complex


--
Nick Keighley

[begin quote]
5.4.4.2. Semantics
A MOID-NEST-jump J, in an environ E, is elaborated as follows:
- let the scene yielded in E by the label-identifier of J be composed
of a series
S2 and an environ E1;
Case A:
MOID is not any procedure yielding MOID1:
- let S1 be the series of the smallest {1.1.3.2.g} serial-clause
containing
S2;
- the elaboration of S1 in E1, or of any series in E1 elaborated
in its
place, is terminated {2.1.4.3.e};
- S2 in E1 is elaborated "in place of" S1 in E1;
Case B:
MOID is some procedure yielding MOID1:
- J in E {is completed and} yields the routine composed of
(i) a new MOID-NEST-routine-text whose unit is akin {1.1.3.2.k} to
J,
(ii) E1.

[...]

10.3. Transput declarations
{ "So it does!" said Pooh, "It goes in!"
"So it does!" said Piglet, "And it comes out!"
"Doesn't it?" said Eeyore, "It goes in and out like
anything,"
Winnie-the-Pooh, A.A. Milne.}

[end quote]

Both from "Revised Report on the Algorithmic Language ALGOL 68"
 
