Proposal for Amendment to Section 6.5.3.2, Unary * Operator

Wojtek Lerch

"Quite different"?

If every instance of "undefined behavior" was replaced with "the
behavior is undefined according to the C standard" in this newsgroup,
how much longer would it take to read posts?

A lot longer, mostly because such replacements would completely massacre
the grammar in most cases. You'd be replacing a noun with a full sentence.

What exactly did you originally mean by "depend on undefined behaviour"?
Did you mean "depend on the fact that the behaviour under those
circumstances is undefined by the C standard", or did you mean "depend
on a particular behaviour, even though the C standard doesn't define the
behaviour under those circumstances"? I can see how your vague
statement can reasonably be interpreted both ways, and I have to agree
that those two interpretations are, indeed, quite different.
 
Shao Miller

Wojtek said:
A lot longer, mostly because such replacements would completely massacre
the grammar in most cases. You'd be replacing a noun with a full sentence.

Right.

What exactly did you originally mean by "depend on undefined behaviour"?
Did you mean "depend on the fact that the behaviour under those
circumstances is undefined by the C standard", or did you mean "depend
on a particular behaviour, even though the C standard doesn't define the
behaviour under those circumstances"? I can see how your vague
statement can reasonably be interpreted both ways, and I have to agree
that those two interpretations are, indeed, quite different.

Both. The former allows for different conforming implementations to
behave differently. The latter particular behaviour exists due to the
license from the former. There's good reason for it to be interpreted
both ways.

Later in that post, a suggestion to read other posters' posts is given.
There are some examples in those posts where a programmer wishes to do
something out-of-scope of a C Standard, but the program is nonetheless
written in C.

The statement was not meant to be a challenge, but an agreement. Of
course, "since we can agree that" has been removed... And rightly so,
because evidently we can't. Seriously?

Feel free to find further alternative meanings and I can attempt to
clarify if those are intended, too. Maybe I could try:

Without the undefined behaviour of "dereferencing" a null pointer
constant cast to a complete object type, some real C programs that
currently work as expected would not work as expected.

Is that better?
 
Shao Miller

Ben said:
And I answered about *vp. If you'd asked about the type of 1 + x I'd
have said you look at the declaration of x. The same is true of *vp -- it's
the declaration of vp that answers your question (unless I've
misunderstood the question, but it seems clear enough).

Absolutely. (You have understood.) :)
Yes, though I remember the history rather differently.

Not sure how that history goes. I remember having always argued that
'*vp' in the example is a void expression, based solely on the semantic
definition for the type. I remember others' posts explaining that the
result is undefined. All of that great discussion resulted in this
proposal.
Then it was a bad idea to start with an example that is already a
constraint violation.

Fair enough... If we can safely say that '*vp' has a defined type
regardless of the impossibility of the other two "if"s.
I have no opinion on any other constraints you'd
like to add, at least not unless you define what and when they might
occur.

Ok, fair enough. It's amended point #4 in the original post, which Mr.
M. Grzegorczyk suggested a slight change to. It's about application of
unary '*' to any pointer to an incomplete object type (per C1X). That
includes 'void *' and 'char(*)[]' and declared but undefined 'struct's,
etc. Should a diagnostic be mandatory for such cases?
No, and it would not be a C compiler if it did. I am not sure there is
any value in having C include extra constraints for non-C compilers to
ignore.

If the B. is U. (as has been argued), then what would make it a non-C
compiler if it chose to do so? Surely implementations can take certain
liberties that are out-of-scope of a C Standard, would you agree? Why
not turn 'void *' into 'char *' before the '*' result in this simple
example? Same representation and alignment for the pointer types...
But a surprise type for the result.

Perhaps an implementation might take advantage of the U. B. "license" in
order to directly work with 'void *'s as though they were 'char *'s in
their definitions of library functions? Just a thought.
 
Wojtek Lerch


??? You're not saying that you actually intended your words to be
ambiguous, are you?
The former allows for different conforming implementations to
behave differently.

Right. And how can a program *depend* on the fact that the C standard
makes no promises whatsoever about how implementations behave?
The latter particular behaviour exists due to the
license from the former. There's good reason for it to be interpreted
both ways.

??? I don't understand. What was your good reason to use a wording
that can be interpreted in two different ways, instead of just saying
two things separately and unambiguously? I can imagine that someone
might want that in poetry, but in a technical discussion it just seems
wrong. Were you hoping that that would save people some time reading
your post? Obviously, the result turned out quite the opposite.

....
The statement was not meant to be a challenge, but an agreement. Of
course, "since we can agree that" has been removed... And rightly so,
because evidently we can't. Seriously?

The problem is that before someone can decide whether to agree with your
statement or not, they have to know what you meant. If you actually
intended your statement to be interpreted in two different ways, then
each of those interpretations needs to be agreed or disagreed with
separately; but in a technical discussion people tend to assume that any
ambiguity in someone's statement is unintentional, and don't expect that
the answer to "Did you mean X or Y, which is quite different" will be
"Both".
Feel free to find further alternative meanings and I can attempt to
clarify if those are intended, too. Maybe I could try:

Without the undefined behaviour of "dereferencing" a null pointer
constant cast to a complete object type, some real C programs that
currently work as expected would not work as expected.

Is that better?

A little clearer for sure. But not necessarily true -- the undefined
behaviour could be removed from the standard by adding a statement
saying that it's unspecified whether the behaviour is X, Y, or Z, where
X, Y, and Z are the behaviours that different existing programs rely on.
 
Ben Bacarisse

Shao Miller said:
If the B. is U. (as has been argued), then what would make it a non-C
compiler if it chose to do so?

It depends on what you mean by "spontaneously changing the type". I
think the above is a CV. A compiler that changes the type and does not
report a CV is not a C compiler. If the type change is incidental to
reporting the CV then, yes, it's fine. I thought you were suggesting
there was permission to change the type (because of undefined behaviour)
"before the sizeof" (your phrase, not mine), thus avoiding the constraint.
Surely implementations can take
certain liberties that are out-of-scope of a C Standard, would you
agree? Why not turn 'void *' into 'char *' before the '*' result in
this simple example? Same representation and alignment for the
pointer types... But a surprise type for the result.

Perhaps an implementation might take advantage of the U. B. "license"
in order to directly work with 'void *'s as though they were 'char *'s
in their definitions of library functions? Just a thought.

Provided that constraints are reported that is of course fine.

Can I ask you a meta-question? Why do you care about this point? The
kinds of type errors and other static diagnostics that can be detected
by a C compiler are very limited. It's part of the design. Adding the
one you seem to be suggesting won't have any practical value that I can
see because in almost all cases using the incomplete type will be a CV
already.
 
Shao Miller

???  You're not saying that you actually intended your words to be
ambiguous, are you?

No, I'm not. As in, there was no such intention and no special effort
was invested for that non-existent intention.
Right.  And how can a program *depend* on the fact that the C standard
makes no promises whatsoever about how implementations behave?

If the behaviour was well-defined then the set of C programs could
shrink. Real C programs depend on undefined behaviour. If you
haven't done so, please do read the other posts in this thread.

In Syslinux (a Linux boot-loader), there are COMBOOT32 modules which
are written in C and do useful things, but those things are pretty
particular to the x86 platform. If a C Standard said that '*(char
*)0' should not translate, then a COMBOOT32 module could not examine
the first byte of memory. These programs are real. They depend on
UB.
???  I don't understand.  What was your good reason to use a wording
that can be interpreted in two different ways, instead of just saying
two things separately and unambiguously?

To me, those two things are an artificial separation of a notion which
is thin enough on its own. To me, they are akin to bifurcation of a
filamentous biomaterial.
 I can imagine that someone
might want that in poetry, but in a technical discussion it just seems
wrong.  Were you hoping that that would save people some time reading
your post?  Obviously, the result turned out quite the opposite.

Here's some poetry from 'n1256.pdf', section 4, point 2:

"If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a
constraint is violated, the
behavior is undefined. Undefined behavior is otherwise indicated in
this International
Standard by the words ‘‘undefined behavior’’ or by the omission of any
explicit definition
of behavior. There is no difference in emphasis among these three;
they all describe
‘‘behavior that is undefined’’."

That poetry describes "three" subjects, all of which apply to my
original. So if I say undefined behaviour, a reader interpreting one
or more of these three has a fair interpretation, without special
effort on my part to produce an ambiguity. Here's another bit about
it from 3.4.3p1:

"undefined behavior
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements"

So a fair interpretation could involve "nonportable" "program
construct" with "no" "Standard" "requirements", as well.
...


The problem is that before someone can decide whether to agree with your
statement or not, they have to know what you meant.  If you actually
intended your statement to be interpreted in two different ways, then
each of those interpretations needs to be agreed or disagreed with
separately; but in a technical discussion people tend to assume that any
ambiguity in someone's statement is unintentional, and don't expect that
the answer to "Did you mean X or Y, which is quite different" will be
"Both".

I have suggested in another thread that it is sometimes useful in
discussion to distinguish between "undefined by omission" and "defined
as undefined" in regards to discussion of the C Standard. Why?

Undefined by omission:
A: It's defined.
B: Where?
A: Uhh...
B: Case closed.

Defined as undefined:
A: It's defined here -> x.x.x
B: But note y.y.y, where it's subsequently undefined ("if...the
behaviour is undefined.")
A: Oh I see.
B: Case closed.

This suggestion did not meet with any agreement, as usual.
A little clearer for sure.  But not necessarily true -- the undefined
behaviour could be removed from the standard by adding a statement
saying that it's unspecified whether the behaviour is X, Y, or Z, where
X, Y, and Z are the behaviours that different existing programs rely on.

At and only at that time can it be rendered false... Correct?
 
Shao Miller

It depends on what you mean by "spontaneously changing the type".  I
think the above is a CV.  A compiler that changes the type and does not
report a CV is not a C compiler.  If the type change is incidental to
reporting the CV then, yes, it's fine.  I thought you were suggesting
there was permission to change the type (because of undefined behaviour)
"before the sizeof" (your phrase, not mine), thus avoiding the constraint.

Yes indeed that is what I am suggesting. During translation, does a
conforming implementation coming across '*vp' have an undefined-
behaviour-license to yield a result that is not a void expression?
Some folks have argued that there is this UB due to the two other
"ifs" not being met. Then '*vp' might be a complete object type for
'sizeof' or any other operator. Or maybe it'd be a surprise function
designator. As posters are so fond of pointing out, if the B. is U.,
anything's possible; your kitten can tie you up and flog you with damp
tea-bags (is a comical example).
Provided that constraints are reported that is of course fine.

So you're saying that an implementation must diagnose "as if" '*vp' in
'sizeof *vp' is a void expression, despite that an implementation
might otherwise treat it as 'char'?
Can I ask you a meta-question?  Why do you care about this point?  The
kinds of type errors and other static diagnostics that can be detected
by a C compiler are very limited.  It's part of the design.  Adding the
one you seem to be suggesting won't have any practical value that I can
see because in almost all cases using the incomplete type will be a CV
already.

Portability. I think that one way in which a C Standard can encourage
a portable programming practice would be to require a diagnostic
message for the construct. Right now, I believe that we have to hope
for a diagnostic message. It might be nice to expect, rather than
hope.

int main(void) {
    /* Pointer to incomplete object type */
    void *vp = &vp;
    /* Pointer to incomplete object type */
    double(*dap)[] = vp;
    /* Pointer to incomplete object type */
    struct s *sp = vp;

    *vp;
    *dap;
    *sp;
    return 0;
}
 
Wojtek Lerch

If the behaviour was well-defined then the set of C programs could
shrink.

Of course it could; the interesting question is whether it would have
to. When you say that a program "depends on undefined behaviour", are
you claiming that it's impossible to make the behaviour defined without
breaking the program (which is hard to agree with), or just that it's
possible to make the behaviour defined in a way that breaks the program
(which is trivially true)? Please don't say that you meant both, or
that the difference is insignificant.
Real C programs depend on undefined behaviour.

Here you go using that ambiguous wording again. That's not helping.

A real C program doesn't rely on a complete lack of restrictions on the
implementation's behaviour. At most, it relies on a particular
behaviour on a particular implementation. If the behaviour matches what
the program expects, forbidding some of the *other* possible behaviours
is not going to break the program.
If you
haven't done so, please do read the other posts in this thread.

No, thanks. I did not intend to comment on any other topics in this
thread, only the part that tried to clarify what you could have possibly
meant when you said that some programs "depend on undefined behaviour".
In Syslinux (a Linux boot-loader), there are COMBOOT32 modules which
are written in C and do useful things, but those things are pretty
particular to the x86 platform. If a C Standard said that '*(char
*)0' should not translate, then a COMBOOT32 module could not examine
the first byte of memory. These programs are real. They depend on
UB.

No, they depend on the fact that the C standard doesn't forbid *(char*)0
to access the first byte of memory. Some other programs may depend on
the fact that attempting to access the first byte of memory results in a
signal being delivered to the program. But those programs don't depend
on the fact that the behaviour is undefined -- they just depend on the
fact that one or the other behaviour is allowed by the C standard (and
that it actually happens on a particular implementation). If the C
standard were changed to say that it's unspecified whether *(char*)0
accesses the first byte of memory or generates a signal, none of those
programs would suffer. In *that* sense, those programs do *not* rely on
undefined behaviour.

....
To me, those two things are an artificial separation of a notion which
is thin enough on its own. To me, they are akin to bifurcation of a
filamentous biomaterial.

Well then I guess that's the problem. To me and apparently at least one
other person here, there's quite a big difference between those two
interpretations -- one is nonsensical and the other is quite easy to
agree with. That's why we're trying to understand which of them is
closer to what you intended to say. But if you refuse to give a clear
answer because you disagree that the difference is important, then the
discussion is not going to be easy.
Here's some poetry from 'n1256.pdf', section 4, point 2: ....
That poetry describes "three" subjects, all of which apply to my
original. So if I say undefined behaviour, a reader interpreting one
or more of these three has a fair interpretation, without special
effort on my part to produce an ambiguity.

What are you talking about? Those are not three interpretations, it's a
list of three cases. A reader interpreting only one or two of them
instead of all three is simply wrong.

....
I have suggested in another thread that it is sometimes useful in
discussion to distinguish between "undefined by omission" and "defined
as undefined" in regards to discussion of the C Standard. ....
This suggestion did not meet with any agreement, as usual.

Did it meet with any disagreement, or was it just ignored? If it's any
consolation, I do agree that it sometimes is useful.
At and only at that time can it be rendered false... Correct?

Well no, unless again the interpretation(s) that you intended differ
from the one I assumed. I assumed you meant that it's impossible to get
rid of the undefined behaviour without breaking some programs, not that
it's possible to get rid of it in a way that breaks them. Or will you
say that that's not a difference worth being concerned about again?
 
Shao Miller

Wojtek said:
Of course it could; the interesting question is whether it would have
to. When you say that a program "depends on undefined behaviour", are
you claiming that it's impossible to make the behaviour defined without
breaking the program (which is hard to agree with),
No.

or just that it's
possible to make the behaviour defined in a way that breaks the program
(which is trivially true)?

Didn't mean that either, but I fully agree. I'm getting a sense that
the lack of a temporal reference is part of the interpretation challenge.
Please don't say that you meant both, or
that the difference is insignificant.


Here you go using that ambiguous wording again. That's not helping.

There is a program. The program is real. The program includes
constructs which anyone familiar with a C Standard could reason as leading to
undefined behaviour during translation or execution via a conforming
implementation. Undefined behaviour is allowed. The program compiles.
The program works. The program does not work if these constructs are
removed. There is another such program. Real C programs depend on
undefined behaviour. Right now.
A real C program doesn't rely on a complete lack of restrictions on the
implementation's behaviour.

Which is why there is no "only". "Real C programs only depend on..."
There's also no "all". "All real C programs only depend on..."
At most, it relies on a particular
behaviour on a particular implementation. If the behaviour matches what
the program expects, forbidding some of the *other* possible behaviours
is not going to break the program.
Sure.


No, thanks. I did not intend to comment on any other topics in this
thread, only the part that tried to clarify what you could have possibly
meant when you said that some programs "depend on undefined behaviour".

The suggestion to read the other posts can be independent of a choice to
comment on those other posts. Such perusal could contribute to a shared
context that my unintentionally challenging claim of agreement could be
associated with. It might be challenging simply due to the lack of a
shared context.
No, they depend on the fact that the C standard doesn't forbid *(char*)0
to access the first byte of memory.

Since it's been removed from my previous post, here's 'n1256.pdf'
section 3.4.3, point 1, again:

"undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of
erroneous data, for which this International Standard imposes no
requirements"

I do not understand why this is different than what you have just said
above, starting with "No" in response to my claim that "They depend on UB."
Some other programs may depend on
the fact that attempting to access the first byte of memory results in a
signal being delivered to the program. But those programs don't depend
on the fact that the behaviour is undefined -- they just depend on the
fact that one or the other behaviour is allowed by the C standard (and
that it actually happens on a particular implementation).

Please note that "the fact that the" B. is U. is not included in the
original, just as "only" is not included.

Regardless of that, I disagree. "The fact that one or the other
behaviour is allowed" seems very similar to "behavior...for which
this...Standard imposes no requirements", which is the definition for
undefined behaviour (as given above).
If the C
standard were changed to say that it's unspecified whether *(char*)0
accesses the first byte of memory or generates a signal, none of those
programs would suffer. In *that* sense, those programs do *not* rely on
undefined behaviour.

I'm not so sure about that. Is that an exhaustive list of the
possibilities? Is a list of two or more possibilities provided by the
Standard (unspecified) the same as a virtual infinitude of possibilities
(undefined)?
...

Well then I guess that's the problem. To me and apparently at least one
other person here, there's quite a big difference between those two
interpretations -- one is nonsensical and the other is quite easy to
agree with. That's why we're trying to understand which of them is
closer to what you intended to say. But if you refuse to give a clear
answer because you disagree that the difference is important, then the
discussion is not going to be easy.

There has been no refusal, but continued attempts to reach an
understanding instead, so please expect the same and please do not fret
about a potential refusal.
What are you talking about? Those are not three interpretations,

Hmm... If you read carefully, I did not say that there were three
interpretations in the quotation. I said "\"three\" subjects".
it's a
list of three cases. A reader interpreting only one or two of them
instead of all three is simply wrong.

How can I make this clear? "...if ___I___ say undefined behaviour, a
reader...has a fair interpretation..." That is, _me_. That is, take
the three subjects of the 4p2 quotation above and call them X,Y,Z. Now
take:

"Real C programs depend on the undefined behaviour of "dereferencing"
a null pointer constant cast to a complete object type under certain
circumstances."

What I have said was that a reader can interpret _my_ use of the words
"undefined behaviour" as any of X,Y,Z and that such an interpretation is
fair.

Hopefully put more simply: Use the C Standard to understand undefined
behaviour. If I say something about it, I fully intend my use of the
term to be interpreted according to the C Standard. If that means that
a statement of mine is ambiguous because the Standard offers multiple
choices, so be it. That's hardly my doing! (No intentional ambiguity.)
...

Did it meet with any disagreement, or was it just ignored? If it's any
consolation, I do agree that it sometimes is useful.

It did not meet with disagreement. The situation was no more and no
less than that it did not meet with any agreement (which has been
established as typical, thus far). And yes, that is a fantastic
consolation that someone else perceives a benefit for discussion ("Is
so!" "Is not!") if a distinction is made between these cases of undef.
behav. in the discussion. With a good sort of fortune, it could save
some time. So thanks very much! :)
Well no, unless again the interpretation(s) that you intended differ
from the one I assumed. I assumed you meant that it's impossible to get
rid of the undefined behaviour without breaking some programs, not that
it's possible to get rid of it in a way that breaks them. Or will you
say that that's not a difference worth being concerned about again?

Please allow me to re-post some original material:
Sure we can throw away the amended point #3, since we can agree that:

Real C programs depend on the undefined behaviour of "dereferencing"
a null pointer constant cast to a complete object type under certain
circumstances.

That is not meant as "it's impossible to get rid of the undefined
behavior without breaking some programs". It is directly in regards to
the proposal's amended point #3. If this is too vague, _perhaps_ we can
agree that:

One has a better chance of understanding some text by reading any
surrounding material to establish context.
 
Ben Bacarisse

Shao Miller said:
Yes indeed that is what I am suggesting. During translation, does a
conforming implementation coming across '*vp' have an undefined-
behaviour-license to yield a result that is not a void expression?

Not in this context. I am only talking about your posted example.
Some folks have argued that there is this UB due to the two other
"ifs" not being met. Then '*vp' might be a complete object type for
'sizeof' or any other operator. Or maybe it'd be a surprise function
designator. As posters are so fond of pointing out, if the B. is U.,
anything's possible; your kitten can tie you up and flog you with damp
tea-bags (is a comical example).

The two "ifs" have nothing to do with it as far as I can see. The
expression is not evaluated -- only its type is needed and I see no
ambiguity at all in that part of the phrasing. The type of *vp is void
and the result is a constraint violation.
So you're saying that an implementation must diagnose "as if" '*vp' in
'sizeof *vp' is a void expression, despite that an implementation
might otherwise treat it as 'char'?

It can't treat it as something else and stay within the rules of C.
Portability. I think that one way in which a C Standard can encourage
a portable programming practice would be to require a diagnostic
message for the construct. Right now, I believe that we have to hope
for a diagnostic message. It might be nice to expect, rather than
hope.

I need a practical case where the new CV would diagnose something that
would otherwise be missed. I don't think there can be many because
there is very little you can do with the thing that would result from
the * operation.

<snip>
 
Keith Thompson

Shao Miller said:
Wojtek said:
On 03/09/2010 9:51 AM, Shao Miller wrote: [...]
Real C programs depend on undefined behaviour.

Here you go using that ambiguous wording again. That's not helping.

There is a program. The program is real. The program includes
constructs which anyone familiar with a C Standard could reason as leading to
undefined behaviour during translation or execution via a conforming
implementation. Undefined behaviour is allowed. The program compiles.
The program works. The program does not work if these constructs are
removed. There is another such program. Real C programs depend on
undefined behaviour. Right now.

You're still using the same ambiguous wording, even though you're trying
to explain what you mean by it.

A C program cannot simply "depend on undefined behavior". It can depend
on the implementation-specific behavior that a given implementation provides
for some construct whose behavior is not defined by the C standard.

[...]
 
Shao Miller

Keith said:
Shao Miller said:
Wojtek said:
On 03/09/2010 9:51 AM, Shao Miller wrote: [...]
Real C programs depend on undefined behaviour.
Here you go using that ambiguous wording again. That's not helping.
There is a program. The program is real. The program includes
constructs which anyone familiar with a C Standard could reason as leading to
undefined behaviour during translation or execution via a conforming
implementation. Undefined behaviour is allowed. The program compiles.
The program works. The program does not work if these constructs are
removed. There is another such program. Real C programs depend on
undefined behaviour. Right now.

You're still using the same ambiguous wording, even though you're trying
to explain what you mean by it.

A C program cannot simply "depend on undefined behavior". It can depend
on the implementation-specific behavior that a given implementation provides
for some construct whose behavior is not defined by the C standard.

Well, I fully disagree. Let's expand the definition into the apparent
problem:

Real C programs depend on the behaviour, upon use of a nonportable or
erroneous program construct or of erroneous data, for which the
International C Standard imposes no requirements, of "dereferencing" a
null pointer constant cast to a complete object type under certain
circumstances.

A: Hey, my program works!
B: But it has C Standard undefined behaviour.
A: Where?
B: You do '*(char *)0'.
A: Right. So what?
B: So your program depends on undefined behaviour; it mightn't be portable.

I'm very sorry to have to disagree, but I cannot understand why the
expansion above cannot simply be true. I cannot understand why the
sample conversation between A and B might have B saying something
unreasonable at the end.

Actually, perhaps if someone perceives UB as an empty set of behaviour,
then I could understand the problem in interpretation. But it's not an
empty set of behaviour, it's a set with any member you like, which of
course can include your implementation-specific behaviour members. This
follows from the C Standard definition for undefined behaviour, does it not?

"This discussion is really unreal." "This stuff is defined as undefined."
 
lawrence.jones

In comp.std.c Shao Miller said:
Well, I fully disagree. Let's expand the definition into the apparent
problem:

Real C programs depend on the behaviour, upon use of a nonportable or
erroneous program construct or of erroneous data, for which the
International C Standard imposes no requirements, of "dereferencing" a
null pointer constant cast to a complete object type under certain
circumstances.

It's simply a matter of terminology and semantics. When you say the
program "depends on undefined behavior", what you mean is that it
depends on a particular behavior that isn't required by the C
Standard, but that presumably is required or guaranteed by something
else. When Keith hears "depends on undefined behavior", he interprets
it (as would most native English speakers, I would guess) as depending
on the behavior not being defined by anything, which is generally
non-sensical (unless you're relying on random behavior to do something
like generate random numbers).

Your statement isn't exactly wrong, but it's very confusing. If you
want to be understood, it would be better to rephrase it by saying that
such programs depend on behavior that isn't guaranteed by the standard.
 
Shao Miller

It's simply a matter of terminology and semantics. When you say the
program "depends on undefined behavior", what you mean is that it
depends on a particular behavior that isn't required by the C
Standard, but that presumably is required or guaranteed by something
else. When Keith hears "depends on undefined behavior", he interprets
it (as would most native English speakers, I would guess) as depending
on the behavior not being defined by anything, which is generally
non-sensical (unless you're relying on random behavior to do something
like generate random numbers).

Your statement isn't exactly wrong, but it's very confusing. If you
want to be understood, it would be better to rephrase it by saying that
such programs depend on behavior that isn't guaranteed by the standard.

Then what is the point of the definition of "undefined behavior" in the
Standard if we can't talk about it by its name? Especially when it's
used in many posts in these newsgroups to carry exactly the same
definition that I am intending? Why should my use of the term be
interpreted any differently than another poster's? I'm sure there must
be an answer.

I'll simply drop it. Let's throw away the amended point #3 simply
because the sun is hot. I'm still interested in thoughts on the amended
#4, which Ben B. has been kind enough to offer some feedback about.
 
K

Keith Thompson

Shao Miller said:
Then what is the point of the definition of "undefined behavior" in the
Standard if we can't talk about it by its name? Especially when it's
used in many posts in these newsgroups to carry exactly the same
definition that I am intending? Why should my use of the term be
interpreted any differently than another poster's? I'm sure there must
be an answer.

Of course we can talk about "undefined behavior" by its name.
We do it all the time.

But "undefined behavior" as defined by the standard makes no
guarantees whatsoever, and no program can possibly depend on it.
What "this program depends on undefined behavior" says to me is
"this program depends on the undefinedness of this behavior",
which is nonsensical.

What programs can and do depend on is specific behavior that *is*
defined, though perhaps by something other than the C standard.

For example, the behavior of division by zero is undefined.
Suppose that I want my program to trigger a particular interrupt,
and that the simplest way to do that on a particular platform is to
divide by zero. My program is not "depending on undefined behavior".
It is depending on behavior that is defined by the particular
implementation. In the absence of a system-specific definition of
the behavior, the behavior would still be undefined according to
the C standard, but my program would break. The undefined behavior
you say I was relying on is still there, but I cannot rely on it --
because I never did.
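
The interrupt example above can be sketched in C. This is only an illustration under stated assumptions: `PLATFORM_TRAPS_ON_DIV_ZERO`, `request_fault` and the other names are hypothetical and appear nowhere in the original post. The macro stands in for a guarantee that would have to come from the implementation's own documentation. Without that guarantee the sketch falls back on `raise`, whose behaviour the C standard does define.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler so the caller can see that the signal arrived. */
static volatile sig_atomic_t fault_seen = 0;

static void on_fpe(int sig)
{
    (void)sig;
    fault_seen = 1;
}

/* Hypothetical helper: ask for a SIGFPE. Returns 0 on success. */
static int request_fault(void)
{
#ifdef PLATFORM_TRAPS_ON_DIV_ZERO
    /* Undefined according to the C standard. This branch is only
       meaningful if the platform documents what division by zero does;
       the macro name is an assumption made for this sketch. */
    volatile int zero = 0;
    volatile int sink = 1 / zero;
    (void)sink;
    return 0;
#else
    /* Defined by the C standard: raise() delivers the signal. */
    return raise(SIGFPE);
#endif
}

/* Install the handler, request the fault, report whether it arrived. */
static int fault_was_delivered(void)
{
    fault_seen = 0;
    if (signal(SIGFPE, on_fpe) == SIG_ERR)
        return 0;
    if (request_fault() != 0)
        return 0;
    return fault_seen == 1;
}

/* Usage: the portable branch must always deliver the signal. */
static void run_fault_example(void)
{
    assert(fault_was_delivered());
}
```

With the macro undefined, nothing here relies on behaviour the standard leaves undefined. With it defined, the program relies on the platform's definition, which is the distinction the paragraph above draws.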

[...]
 
W

Wojtek Lerch

Didn't mean that either, but I fully agree. I'm getting a sense that the
lack of a temporal reference is part of the interpretation challenge.

Well then it's a good thing that you have finally mentioned temporal
reference -- this way I can ask what you meant by it. :)

....
There is a program. The program is real. The program includes constructs
which anyone familiar with the C Standard could reason as leading to undefined
behaviour during translation or execution via a conforming
implementation. Undefined behaviour is allowed. The program compiles.
The program works. The program does not work if these constructs are
removed. There is another such program. Real C programs depend on
undefined behaviour. Right now.

Wait a minute. Now it sounds like you're talking about "removing"
undefined behaviour by removing constructs from the program (and
presumably replacing them with something else), rather than modifying
the C standard to make the behaviour of those constructs defined. That
sounds like a bit of a change of subject to me.

Which is why there is no "only". "Real C programs only depend on..."
There's also no "all". "All real C programs only depend on..."

Sorry, it seems that you have lost me again...

....
Since it's been removed from my previous post, here's 'n1256.pdf'
section 3.4.3, point 1, again:

"undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of
erroneous data, for which this International Standard imposes no
requirements"

I do not understand why this is different than what you have just said
above, starting with "No" in response to my claim that "They depend on UB."

Maybe the definition you quote doesn't make it sufficiently clear, but I
think the rest of the standard does: the purpose of the term "undefined
behaviour" is not to talk about behaviours, but to talk about constructs
whose behaviour the standard declines to restrict. The expression
*(char*)0 may access the first byte of memory, generate a signal, or
hang your machine; but those are not three undefined behaviours --
they're just examples of behaviours that the standard allows when it
says that a construct has "undefined behaviour". Therefore, when you
say "They depend on UB", the most natural interpretation is "They depend
on the fact that a certain construct has UB", in the sense that they'd
necessarily fail if the construct were to have a defined behaviour,
regardless of what that defined behaviour would be.

Please note that "the fact that the" B. is U. is not included in the
original, just as "only" is not included.

Regardless of that, I disagree. "The fact that one or the other
behaviour is allowed" seems very similar to "behavior...for which
this...Standard imposes no requirements", which is the definition for
undefined behaviour (as given above).

It's not. There's a very big difference between offering a choice of
two allowed behaviours and imposing no requirements whatsoever.

I'm not so sure about that. Is that an exhaustive list of the
possibilities?

It is, for those programs that I was talking about. In general no,
there are lots of other possibilities, some of them real and some purely
theoretical.

Is a list of two or more possibilities provided by the
Standard (unspecified) the same as a virtual infinitude of possibilities
(undefined)?

In general no, but to any program that doesn't care about any
possibilities other than the ones on the list, yes. If you make the list
exhaustive enough, you'll satisfy all real programs. There are only a
finite number of them.
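
A small sketch may help show what "a list of possibilities" looks like in practice. The order in which the two operands of `+` are evaluated is unspecified, so the implementation must pick one of exactly two orders. The function names below are invented for illustration. A program that uses only the sum is satisfied by either choice, and the recorded call order is always one of the two listed possibilities.

```c
#include <assert.h>

/* Records the order in which the two operand functions were called. */
static int call_log[2];
static int call_count = 0;

static int first(void)  { call_log[call_count++] = 1; return 10; }
static int second(void) { call_log[call_count++] = 2; return 20; }

/* The operands of + may be evaluated in either order (unspecified),
   but the calls do not interleave and the sum is the same either way. */
static int sum_with_unspecified_order(void)
{
    call_count = 0;
    return first() + second();
}

/* True if the recorded order is one of the two permitted orders. */
static int order_is_one_of_the_listed_choices(void)
{
    return (call_log[0] == 1 && call_log[1] == 2) ||
           (call_log[0] == 2 && call_log[1] == 1);
}

/* Usage: this check passes whichever order the implementation picks. */
static void run_order_example(void)
{
    assert(sum_with_unspecified_order() == 30);
    assert(order_is_one_of_the_listed_choices());
}
```

No comparable check can be written for a construct whose behaviour is undefined, because there is no list to check against.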
There has been no refusal, but continued attempts to reach an
understanding instead, so please expect the same and please do not fret
about a potential refusal.

OK, how about this attempt at understanding:

Do *you* understand why I and a few others are against using the words
that you keep using? Are we trying to reach an understanding about
facts of real programs and requirements of the C standard, or about how
to use the term "undefined behaviour" without unnecessary risk of being
misinterpreted?

....
Hopefully put more simply: Use the C Standard to understand undefined
behaviour. If I say something about it, I fully intend my use of the
term to be interpreted according to the C Standard. If that means that a
statement of mine is ambiguous because the Standard offers multiple
choices, so be it. That's hardly my doing! (No intentional ambiguity.)

But your ambiguity has nothing to do with multiple choices in the
standard! Your ambiguity is caused by using the term "undefined
behaviour" in a way that's inconsistent with how it's normally used in
the standard (and in discussions about the standard). You seem to use
it to refer to a behaviour; the standard uses it to talk about the lack
of requirements on how a program will behave. As a result, when you say
that a program "depends on undefined behaviour", people tend to take
that as a claim that the program would necessarily fail if the set of
requirements were made non-empty, whereas you may have simply meant that
the program depends on some particular way that some particular
implementation chose (or happened) to make it behave, and therefore also
on the fact that that particular behaviour is not forbidden by the standard.

It did not meet with disagreement. The situation was no more and no less
than that it did not meet with any agreement (which has been established
as typical, thus far). And yes, that is a fantastic consolation that
someone else perceives a benefit for discussion ("Is so!" "Is not!") if
a distinction is made between these cases of undef. behav. in the
discussion.

Um, the statement I agreed with contained the word "sometimes". :)

....
Please allow me to re-post some original material:


That is not meant as "it's impossible to get rid of the undefined
behavior without breaking some programs". It is directly in regards to
the proposal's amended point #3. If this is too vague, _perhaps_ we can
agree that:

One has a better chance of understanding some text by reading any
surrounding material to establish context.

I can definitely agree with that. However, in situations where it
becomes clear that some text is difficult to understand or ambiguous and
the author is asked for a clarification, it usually is helpful if the
clarification tries to keep dependencies on the context to a minimum,
and make them explicit. :)
 
S

Shao Miller

Keith said:
Of course we can talk about "undefined behavior" by its name.
We do it all the time.

And somehow in this particular instance, it's reasonable for multiple
people to debate what it means when I use it.

But "undefined behavior" as defined by the standard makes no
guarantees whatsoever, and no program can possibly depend on it.

There _is_ a guarantee. If there is any behaviour, it is guaranteed to
be out-of-scope of any C Standard requirements. Isn't it?

What "this program depends on undefined behavior" says to me is
"this program depends on the undefinedness of this behavior",
which is nonsensical.

What sorts of results would you get if you used this interpretation in
every other post where someone uses "undefined behavior"? Are you
suggesting that:

A: Look at my nice program.
B: Uh oh. Line 5 has undefined behaviour.
A: Then how in blazes does it compile and run and work as expected?
B: Uhh... Just by random chance.

is a meaningful conversation?

What programs can and do depend on is specific behavior that *is*
defined, though perhaps by something other than the C standard.

What is the difference between what you have typed just above and the
definition for undefined behaviour?

For example, the behavior of division by zero is undefined.
Suppose that I want my program to trigger a particular interrupt,
and that the simplest way to do that on a particular platform is to
divide by zero. My program is not "depending on undefined behavior".
It is depending on behavior that is defined by the particular
implementation. In the absence of a system-specific definition of
the behavior, the behavior would still be undefined according to
the C standard, but my program would break. The undefined behavior
you say I was relying on is still there, but I cannot rely on it --
because I never did.

All right. It should be obvious to any reader by now what is happening
in this thread.

If "undefined behavior" is so confusing, the Standard should call it
something else, like maybe "non-ISO". Imagine if I'd used "non-ISO
behaviour" instead of "undefined behaviour" in the original "problem"
statement. Could there have been any confusion, then?

Why should I be burdened with over-complicating a statement if the
problem is that readers cannot understand that the Standard definition
applies? I shouldn't be. I should be able to simply say "Note the
Standard definition" and that should be the end of it. I have said this
several times now and yet there is still argument.

What is the reason for this?
 
S

Shao Miller

Wojtek said:
Well then it's a good thing that you have finally mentioned temporal
reference -- this way I can ask what you meant by it. :)

I meant that I didn't say "now" or "current" in the original "problem"
statement. So a reader trying to add to that statement might put in
"forever".

However, it seems to me that "some" instead of "all" and "now" instead
of "forever" are pretty natural; I omitted them as I would in everyday
speech.

A: Horses eat carrots.
B: Well, some day in the future, there mightn't be carrots to eat. What
will horses eat?
C: I know of horses that have never eaten carrots.
A: ... Seriously?
...

Wait a minute. ... ... ... Now it sounds like you're talking about "removing"
undefined behaviour by removing constructs from the program (and
presumably replacing them with something else), rather than modifying
the C standard to make the behaviour of those constructs defined. That
sounds like a bit of a change of subject to me.

No. Please read the proposed amendment's point #3. Then read the
statement you have a problem with. If we defined the behaviour as in
the amendment, then we shrink the set of undefined behaviour which some
current C programs depend on. The programs would break and the
behaviour would be defined as something other than "undefined".

Sorry, it seems that you have lost me again...

Sorry about that.
...
I can definitely agree with that. However, in situations where it
becomes clear that some text is difficult to understand or ambiguous and
the author is asked for a clarification, it usually is helpful if the
clarification tries to keep dependencies on the context to a minimum,
and make them explicit. :)

Please forgive that I am not responding to each of the pieces of your
response. I did read them all and I did type responses to each one.
After reviewing them all, I decided against it. Instead, I will simply
offer that I did not intend for the text to be difficult to understand.
Please inject the definition(s) of "undefined behavior" from the C
Standard into what I said and that's what I meant.

We have "undefined" as a relative notion for some reason. I meant
"undefined" relative to the C Standard. That means, perhaps
paradoxically, two things:

1. Use the definition(s) of "undefined behavior" from the C Standard.
2. These definitions offer that the behaviour is undefined (in the
plain English meaning) relative to the C Standard.

If the program constructs change or the behaviour changes or the amended
point #3 becomes part of the Standard, the programs break.

Would you agree that "non-ISO" or some other term might be better for
the Standard to use instead of "undefined behavior"? If I'd typed
"non-ISO", would there have been any doubt about what I meant? Any
doubt about agreement?
 
K

Keith Thompson

Shao Miller said:
And somehow in this particular instance, it's reasonable for multiple
people to debate what it means when I use it.

It's not the meaning of "undefined behavior" we're debating,
it's your questionable statement that a program *depends on*
undefined behavior.

I think we all know what "undefined behavior" means. That's not the
problem.

There _is_ a guarantee. If there is any behaviour, it is guaranteed
to be out-of-scope of any C Standard requirements. Isn't it?

That's not a guarantee.

What "this program depends on undefined behavior" says to me is
"this program depends on the undefinedness of this behavior",
which is nonsensical.

What sorts of results would you get if you used this interpretation in
every other post where someone uses "undefined behavior"? Are you
suggesting that:

A: Look at my nice program.
B: Uh oh. Line 5 has undefined behaviour.
A: Then how in blazes does it compile and run and work as expected?
B: Uhh... Just by random chance.

is a meaningful conversation?


Yes, it's a meaningful conversation, though the last line is incomplete.
Random chance is certainly a possible explanation for the behavior of
the program (though not the only one).

How is it not meaningful?

What is the difference between what you have typed just above and the
definition for undefined behaviour?

Well, it's not a definition, but it's certainly consistent with the
Standard's definition of "undefined behavior".

All right. It should be obvious to any reader by now what is
happening in this thread.

Not really, no.

If "undefined behavior" is so confusing, the Standard should call
it something else, like maybe "non-ISO". Imagine if I'd used "non-ISO
behaviour" instead of "undefined behaviour" in the original "problem"
statement. Could there have been any confusion, then?


I think you've completely missed the point. It's not about undefined
behavior, it's about the meaning of the phrase "depends on undefined
behavior".

A program can depend on behavior that is defined by something other
than the C standard. Perhaps one could say that such a program
depends on a particular instance of undefined behavior (behavior
that is not defined by the C standard, but that is defined by
something else). I suppose that's what you mean by "depends on
undefined behavior", but it's not what that phrase says to me.

[...]
 
W

Wojtek Lerch

No. Please read the proposed amendment's point #3. Then read the
statement you have a problem with. If we defined the behaviour as in the
amendment, then we shrink the set of undefined behaviour which some
current C programs depend on. The programs would break and the behaviour
would be defined as something other than "undefined".

Are you referring to the proposal to make dereferencing a null pointer
constant a constraint violation? How would that define the behaviour of
anything? The behaviour is undefined with or without the constraint.
The set of constructs with undefined behaviour doesn't change. The set
of allowed "undefined behaviours" (i.e. possible behaviours of a
construct with undefined behaviour) doesn't change either.

(Actually, the above is not quite true. Currently, the presence of an
expression that dereferences a null pointer constant doesn't
automatically make the program's behaviour undefined -- only an attempt
to evaluate that expression does. Your amendment would indeed break
some programs -- not ones that "depend on UB", but ones that rely on the
fact that an expression that dereferences a null pointer is not UB if
it's not evaluated.)
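
The parenthetical point can be sketched in C. The function names are hypothetical. Both functions contain an expression that dereferences a null pointer, but neither ever evaluates it: the operand of `sizeof` is not evaluated here (it is not a variable length array), and the right operand of `&&` is not evaluated when the left operand is zero. Some compilers may still warn about the second function.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof only inspects the type of its operand, so the null pointer
   is never actually dereferenced. */
static size_t size_of_pointee(void)
{
    return sizeof *(double *)0;
}

/* The left operand is 0, so && short-circuits and the dereference
   on the right is never evaluated. */
static int never_evaluated(void)
{
    return 0 && *(char *)0;
}

/* Usage: both calls are well defined under the current rules. */
static void run_unevaluated_deref_example(void)
{
    assert(size_of_pointee() == sizeof(double));
    assert(never_evaluated() == 0);
}
```

Programs of this shape are the ones that a constraint on the expression itself, rather than on its evaluation, would presumably affect.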
 
