C Standard Regarding Null Pointer Dereferencing

Shao Miller

Hello Readers,

Please respond with the _highest_ levels of pedantry you can muster
up.

This e-mail is in regards to how a C translator/compiler should handle
the expression:

*(char *)0

Consider the following program:

int main(void) {
(void)*(char *)0;
return 0;
}

The question is: Does the above program imply undefined behaviour?

References here from the C standard draft with filename 'n1256.pdf'.

Looking at the second line of the program:

(A) "Expression and null statements", 6.8.3, Semantics 2:

"The expression in an expression statement is evaluated as a void
expression for its side effects."

The footnote 134 adds, "Such as assignments, and function calls which
have side effects."

This appears to describe the second line of the program pretty nicely.

(B) "void", 6.3.2.2, point 1:

"The (nonexistent) value of a void expression...shall not be used in
any way... If an expression of any other type is evaluated as a void
expression, its value or designator is discarded. (A void expression
is evaluated for its side effects.)"

Note that this doesn't read "_only_ evaluated for its side effects."
However, (A) doesn't read "_only_", either, but one can get that
impression due to the explicit mentioning of "side effects" in both
(A) and (B).

(C) "Address and indirection operators", 6.5.3.2, Semantics 4:

"...if [the operand] points to an object, the result is an lvalue
designating the object."

(D) "Address and indirection operators", 6.5.3.2, Semantics 4:

"If the operand has type 'pointer to type', the result has type
'type'."

(E) "Address and indirection operators", 6.5.3.2, Semantics 4:

"...If an invalid value has been assigned to the pointer, the
behavior...is undefined."

The footnote 87 adds, "Among the invalid values for dereferencing a
pointer...are a null pointer..." This footnote is referenced from
(E).

(C), (D) and (E) are in regards to the unary '*' operator, and where I
perceive a challenge in interpretation. This operator is followed by
a cast-expression, so such an expression would make up the operand, if
I'm not mistaken. The particular cast-expression in line two of the
program is '(char *)0'. _Is_this_an_assigned_value_?
_Is_"assigned"_meant_there_purposefully_or_not_?

(F) "object", 3.13, point 1:

"region of data storage in the execution environment, the contents of
which can represent values"

(G) "value", 3.17, point 1:

"precise meaning of the contents of an object when interpreted as
having a specific type"

By (G), is '(char *)0' a value? Maybe not by (G), but there are other
parts in the text which read as though expressions can have values,
without needing any objects. The "integer constant expression with
the value 0" in (H) below is such an example. Perhaps it _may_be_ a
value iff _used_ for its value?

(H) "Pointers", 6.3.2.3, points 3 and 4:

"An integer constant expression with the value 0, or such an
expression cast to type void *, is called a null pointer constant. If
a null pointer constant is converted to a pointer type, the resulting
pointer, called a null pointer, is guaranteed to compare unequal to a
pointer to any object or function."

"Conversion of a null pointer to another pointer type yields a null
pointer of that type. Any two null pointers shall compare equal."

By (H), it would appear that '(char *)0' is a pointer and a null
pointer. Also, it cannot point to an object. Thus, this operand does
not point to an object for (C), and we must forget (C)'s application
to our case.
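
A minimal sketch of what (H) guarantees, assuming a hosted C99 implementation; the function name null_pointer_properties is hypothetical. It only compares pointers and never applies '*' to the null pointer, so it does not depend on the answer to the question being asked:

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when the comparisons guaranteed by
   6.3.2.3 points 3 and 4 all hold. */
int null_pointer_properties(void) {
    char c = 'x';
    char *p = (char *)0;   /* null pointer constant converted to 'char *' */
    void *q = (void *)0;   /* itself a null pointer constant */

    return p == NULL            /* any two null pointers compare equal */
        && (void *)p == q       /* conversion yields a null pointer of that type */
        && p != &c;             /* unequal to a pointer to any object */
}
```

The function returns 1; none of the three comparisons requires '(char *)0' to designate an object.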

(I) "Cast operators", 6.5.4, Semantics 4:

"Preceding an expression by a parenthesized type name converts the
value of the expression to the named type. ..."

Another example where an expression _has_ a value. But the text reads
"value of the expression" to describe the _use_ of that particular
property of the expression. Similar to "...expression with the value
0" in (H).

The footnote 89 adds, "A cast does not yield an lvalue."

By (I), it would appear that '(char *)0' converts the value of '0' to
a 'char *' type. But is this value _assigned_to_a_pointer_ in (E)?
We do now know that the type for this operand is 'char *' for (D).
Thus the unary '*' operator should yield a result with type 'char', by
(D).

(J) "The sizeof operator", 6.5.3.4, Semantics 2:

"...The size is determined from the type of the operand. ...the
operand is not evaluated."

In this, we see that a particular property "type" for the operand is
used. "...the operand is not evaluated" suggests that there is at
least one case in the C language where an expression can yield a
result with a type while avoiding that expression's evaluation.
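
A minimal sketch of the point in (J), assuming the operand is not a variable length array; the function name size_of_null_deref is hypothetical. Because the operand of 'sizeof' is not evaluated here, only the type 'char' of the expression is used:

```c
#include <stddef.h>

/* Hypothetical helper: the operand *(char *)0 is never evaluated,
   so only its type ('char') contributes to the result. */
size_t size_of_null_deref(void) {
    return sizeof *(char *)0;
}
```

Since sizeof(char) is 1 by definition, the function returns 1 on any conforming implementation.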

But compare (J) with (A) and (B), which do describe evaluation, albeit
with "nonexistent" values. (A) and (B) both mention side effects.

(K) "Program execution", 5.1.2.3, point 2:

"Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects, which are changes in the state of the execution environment.
Evaluation of an expression may produce side effects."

From (K), '*(char *)0' does not access a volatile object, nor does it
modify an object (remember that there's no assignment!), nor does it
modify a file, nor call a function doing any of those operations. It
does not appear to have any "side effects" at all. Iff '(char *)0'
can itself be considered an object (beyond being a pointer, a null
pointer, a cast expression, and having type 'char *'), then we _still_
don't have any side effects. For example: Would it be a volatile
object? Are we modifying an object?

If we constrain (A) and (B) to mean "_only_ evaluated for any side
effects", then (K) suggests '*(char *)0' has no side effects. This
constraint is not explicitly in the text, however. One can ponder if
it is meant or not.

Now then, let us please consider how '*(char *)0' evaluates if we take
"...If an invalid value has been assigned to the pointer..." from (E)
_literally_. There is no assignment here. There is conversion of the
value of the expression '0' to a null pointer. Then we are applying
the '*' operator to that null pointer. This application yields a
result with type 'char'. According to (J), this
expression can even be an operand to 'sizeof', since it has a type.
There is no object and there is no value.

Is there undefined behaviour? Perhaps consider it in terms of
variables and constants: In '*(char *)0', everything is constant. In
'*(char *)x', x is variable. Could we suppose that "has been
assigned" from (E) is used there _intentionally_, specifically because
with constants, we have full knowledge at translation-time, but with
variables, we need objects and an execution environment? In other
words, is an implementation _allowed_ to attempt to dereference a null
pointer, knowing 100% full well at translation time that that's what
the expression _looks_ like? With variables, the execution of the
program might or might not dereference a null pointer, and that can
be trapped or not.
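
For the variable case, a minimal sketch of the usual defensive pattern, assuming a hosted implementation; the name read_char_checked and its return convention are hypothetical. The pointer is tested at run time, so '*' is applied only when the operand is not a null pointer:

```c
#include <stddef.h>

/* Hypothetical helper: copies *p into *out only when p is not a null
   pointer. Returns 1 on success and 0 when p is null; *out is left
   untouched in the latter case. */
int read_char_checked(const char *p, char *out) {
    if (p == NULL) {
        return 0;      /* never reaches the indirection below */
    }
    *out = *p;         /* p is known to be non-null here */
    return 1;
}
```

This only sidesteps the question for variables; it says nothing about whether the constant form '*(char *)0' is undefined.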

Consider the usual idea of '*x' as "object pointed-to by x" versus
splitting the idea into the more esoteric "result having a type,
possibly designating an object, and possibly having a value, depending
on properties of x".

What do you think? Thank you with sincerity for your time,

- Shao Miller
 
Shao Miller

Do your own homework.

Wow. I've not been a student since 1995, but never did homework then,
either. This isn't homework in the typical, student-applicable
sense. I'm thinking about this subject matter here at home, yes. If
you are trying to hint at something more and this is not just a wild,
inaccurate guess, please do enlighten.
 
Shao Miller

Any way it likes.

Despite the material I've offered?
Do you have any useful references to go along with that claim?

(void)*(char *)0;

is an expression. Therefore, it is evaluated.

If we take the text literally, we note the absence of "only" in "for
its side effects", as mentioned. Thus I agree that if we take the
text literally, the expression '*(char *)0' is evaluated. I'm not
sure why you mention '(void)*(char *)0;' being evaluated, however.
That doesn't seem relevant to me, since the question of UB surrounds
'*(char *)0', even beyond its now-agreed-upon evaluation for its
context as part of a void expression.

That's a violation of a "shall" outside a constraint - i.e. undefined
behaviour.

What is the violation? If there's no value, there's no value to be
used in any way or not to be used in any way.

Forget about the void expression context altogether and consider:

*(char *)0

Does this expression _have_ a value, given the text for the unary '*'
operator?

Can an implementation _even_attempt_to_get_ a value for this
expression, given that left alone, this expression has a result with a
type, but no mention of how its value can be gotten? A null pointer
does not refer to an object, so how do you get a value here?

Do you _need_ a value for this expression? Is there some kind of
requirement that during expression evaluation, a value is mandated at
some point? Back to the variables and constants perspective: We are
not comparing a value by using this expression, we are not attempting
to assign using this expression, the expression is not an argument to
a function, we are not attempting to use the expression as an lvalue,
etc. What exactly does evaluation entail?

(L) "Program execution", 5.1.2.3, point 3:

"In the abstract machine, all expressions are evaluated as specified
by the semantics. ..."

I do not see where the semantics entail that a value is required for
the expression '*(char *)0' outside of where such a value might be
_used_. For example:

(M) "Simple assignment", 6.5.16.1, Semantics 2:

"In simple assignment (=), the value of the right operand is converted
to the type of the assignment expression and replaces the value stored
in the object designated by the left operand."

Why the use of "the value of the right operand" rather than merely
"the operand"? Because it _demands_ a value. Back to the context of
a void expression, we do not _demand_ a value at all. In fact, any
value would be discarded. If there's no value, there's no value to
even discard.

By (L), does "evaluated" _really_ necessitate "the act of computing a
value" for '*(char *)0'? In the semantics (C), (D), (E), if we take
the text literally (like we did when we agreed above), there is no
assignment. The requirement for type is even satisfied.

Similarly, how about:

*(char *)58

Does that have a value? Does it not depend just the same upon
context? Used in a void expression, no value is required. Evaluate
all you like, you get something with a type, but there's no mention of
a mandatory value, is there? Essentially, I'm approaching it
constructively: The expression evaluation semantics have built us up a
thing for which a type is defined, but no value. Add to that, that
there is no requirement for a value. Is there some confusion between
"expression is evaluated" and "expression is evaluated, thus giving it
a value"?

Thanks anyway.
 
Keith Thompson

Richard Heathfield said:
Keith said:
Richard Heathfield said:
Shao Miller wrote:
Please respond with the _highest_ levels of pedantry you can muster
up. [...]
Consider the following program:

int main(void) {
(void)*(char *)0;
return 0;
}

The question is: Does the above program imply undefined behaviour?
Yes.

I tend to agree (I haven't done the research yet), but ...

[...]
(B) "void", 6.3.2.2, point 1:

"The (nonexistent) value of a void expression...shall not be used in
any way... If an expression of any other type is evaluated as a void
expression, its value or designator is discarded. (A void expression
is evaluated for its side effects.)"
That's a violation of a "shall" outside a constraint -
i.e. undefined behaviour.

But it doesn't apply here.

Yes, it does.

I believe you're mistaken. (As usual, I'm prepared to be convinced
otherwise.)
The undefined behaviour comes from the expression *(char *)0, which is
evaluated. The cast to void is neither here nor there.

I agree. But the "shall" you were referring to above was the one in
6.3.2.2p1, which applies only to void expressions. I don't argue that
the behavior of (void)*(char *)0 is well defined; I argue that it's
not 6.3.2.2p1 that causes it to be undefined.

Actually, I just noticed that there are two "shall"s in that paragraph.
Here's the whole thing:

The (nonexistent) value of a void expression (an expression
that has type void) shall not be used in any way, and implicit
or explicit conversions (except to void) shall not be applied
to such an expression. If an expression of any other type is
evaluated as a void expression, its value or designator is
discarded. (A void expression is evaluated for its side effects.)

Is either of these "shall"s violated by (void)*(char *)0?

If you didn't mean to imply that either of these particular "shall"s
is violated by this particular expression, please re-read what you
wrote above, particularly the line "Yes, it does."
Sure. But that's because 42 doesn't invoke undefined behaviour when
evaluated.

Precisely.
 
Shao Miller

Open and shut case.
Obviously it "feels" like it should be undefined behaviour, since
we've all been trying to avoid the act of a "null pointer dereference"
for a long time. But how many of these deal with _objects_ assigned a
null pointer value, versus an expression which is merely a null
pointer on its own? I believe that the vast majority are in the
former category, possibly biasing an interpretation of the referenced
text. Think of it this way: Your abstract machine plops a null
pointer value into an object 'foo' ("has been assigned to the pointer"
from (E)). The next operation is "fetch the value of the object 'foo'
points to", perhaps because it's about to be used. If 'foo' contains
a null pointer value, the behaviour is undefined. This is what we're
used to. But given a non-variable operand to '*', (E) does not apply,
since there's no assignment. We can rely on (C) only.
You provided all the necessary references yourself.
I still appreciate your feedback and would still appreciate if you
could directly address which portions of the referenced text support
your valuable discussion.
I'm glad we can share a frame of reference here, then.
<shrug> It's evaluated for its side effects, which are not overly
numerous in this case - in fact, the only side effect is UB.
But _why_ is it UB? Because of some notion that "we are attempting to
get a value using a null pointer"? Some of my argument goes, "we are
_not_ required to get a value via the '*' unary operator." We
certainly must and do yield a result with a type, as per (D).
It isn't the use that matters. It's the evaluation that matters.

char *p;
*p; /* UB, even though *p is not used */
This is a great example. Why is this UB? My conclusion would be
because of a path from (C). In an attempt to determine "if it points
to an object," we require the value of 'p', which is already UB. Note
the use of 'p' versus the non-use of '*p'. In '*(char *)0', we
already know darned well that '(char *)0' shall not point to an
object, so (C) does not apply. Neither does (E), so we're left with
(D), a type. It might help to use parentheses: *(p) First we need
the value of 'p', which is UB.
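
For contrast, a minimal sketch of the well-defined counterpart, in which 'p' has been given a valid value before '*p' is evaluated; the function name defined_indirection is hypothetical:

```c
/* Hypothetical helper: p points to an object, so evaluating *p as a
   void expression is well-defined and its value is simply discarded. */
char defined_indirection(void) {
    char c = 'A';
    char *p = &c;   /* p now holds a valid pointer to the object c */

    (void)*p;       /* evaluated, value discarded, no side effects */
    return *p;      /* here the value is actually used */
}
```

The function returns 'A'. The only difference from the quoted snippet is that 'p' is initialised.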
Syntactically, yes. Semantically, no. Hence, UB.
It syntactically has a value? Where does the syntax describe this?
The syntax says it is a "unary-expression".

How about:

struct {
int x;
} foo, *bar;
bar = &foo;
(*bar).x = 10;

Did we get any value from '*bar' there? Surely we got an object, but
a value? Did we "fetch" the "value" of the object pointed-to by
'bar', simply to toss it away because we are only setting a member?
"Structure and union members", 6.5.2.3 doesn't say anything about
requiring a value for its first operand, does it? It talks about
type. (M) is quite clear about requiring a value.
You are discovering why the behaviour is undefined.
Not really.
It depends on whether (char *)58 points into an object.
Possibly thanks to (C), yes. There's no "depends" on '0', is there?
And the moment you try, if the pointer doesn't point into an object, the
B is U.
I'd really rather read your reasoning about _why_, based on the
referenced text, than "the B is U."

(N) "undefined behavior", 3.4.3, point 1:

"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data,
for which this International Standard imposes no requirements"

Is there a nonportable program construct somewhere? Is there an
erroneous one? Is there erroneous data? I don't see them. I see us
having satisfied all requirements; '*(char *)0' has a type, as per (D).
 
Shao Miller

What does the standard say that the result is,
if the operand doesn't point to an object?

What do you think?
I don't know the standard. I can only say that the referenced text (a
standard draft) has (D). It says the result has a type ('char'),
since its operand has type "pointer-to-char". No UB. See the struct
example in another post. We get an object, not a value.
 
Ben Bacarisse

Keith Thompson said:
Hmm. Off the top of my head, I can't think of any way to violate the
"shall" in 6.3.2.2p1 without also violating some constraint. For
example:

x = (void)42;

violates the constraint specified in 6.5.16.1p1; void isn't one of
the permitted types for the right operand of a simple assignment.

Hardly important, but I think one such is:

int array[1] = {(void)42};

I think this violates no constraints so 6.3.2.2 p1 is important to
render it undefined.

Initialising a scalar does not provide a simpler example because 6.7.8
p11 imports all of the constraints attached to assignment. Thus

int x = {(void)42};

must be diagnosed.
 
Keith Thompson

Ben Bacarisse said:
Keith Thompson said:
Hmm. Off the top of my head, I can't think of any way to violate the
"shall" in 6.3.2.2p1 without also violating some constraint. For
example:

x = (void)42;

violates the constraint specified in 6.5.16.1p1; void isn't one of
the permitted types for the right operand of a simple assignment.

Hardly important, but I think one such is:

int array[1] = {(void)42};

I think this violates no constraints so 6.3.2.2 p1 is important to
render it undefined.

Initialising a scalar does not provide a simpler example because 6.7.8
p11 imports all of the constraints attached to assignment. Thus

int x = {(void)42};

must be diagnosed.

6.7.8p11:
The initializer for a scalar shall be a single expression,
optionally enclosed in braces. The initial value of the object
is that of the expression (after conversion); the same type
constraints and conversions as for simple assignment apply,
taking the type of the scalar to be the unqualified version of
its declared type.

Ah, but it does violate a constraint.

The syntax for "initializer" is:

initializer:
assignment-expression
{ initializer-list }
{ initializer-list , }

where an initializer-list is basically a list of initializers.
So {(void)42} is an initializer (for array), and (void)42 is also an
initializer (for array[0]). Since array[0] is a scalar, (void)42 is
"The initializer for a scalar", and the constraints referred to in
6.7.8p11 apply.

As I studied 6.7.8, I was relieved to discover this; otherwise a
compiler wouldn't be required to diagnose

double arr[] = { "hello" };

and that would be bad.
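
A minimal sketch of the initializer forms discussed above; the function name sum_initialized is hypothetical. The forms that, on the reading of 6.7.8p11 given here, require a diagnostic are left in comments so that the block still compiles:

```c
/* Hypothetical helper: uses only initializers that satisfy the
   constraints for simple assignment. */
int sum_initialized(void) {
    int x = {42};          /* scalar initializer, optionally braced */
    int array[1] = {42};   /* 42 is the initializer for array[0] */

    /* Constraint violations under 6.7.8p11, left commented out:
       int bad_array[1] = {(void)42};
       double arr[] = { "hello" };
    */
    return x + array[0];
}
```

The function returns 84.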
 
Tim Rentsch

Shao Miller said:
Please respond with the _highest_ levels of pedantry you can muster
up.

This e-mail

You mean newsgroup posting.
is in regards to how a C translator/compiler should handle
the expression:

*(char *)0

Consider the following program:

int main(void) {
(void)*(char *)0;
return 0;
}

The question is: Does the above program imply undefined behaviour?
[snip 165 more lines]

Yes.
 
Shao Miller

You mean newsgroup posting.
Thanks for the correction, Tim.
Is this the "_highest_" level "of pedantry you" could "muster up"? If
so, then thanks.

How about the following code:

int main(void) {
struct foo {
int x;
char y[2048];
int z;
};
return (*(struct foo*)0).z;
}

Are we attempting to get any sort of "value" when evaluating the
expression '(*(struct foo*)0)'? We have a type, as (D) suggests.

Have you any thoughts on why the text referenced in (E) reads, "has
been assigned to the pointer", Tim? Is that a mistake like my
"e-mail" above, do you suppose? If not, do you see any assignment of a
null pointer value in this code?
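
As a side note to the struct example, if all that is wanted is the position of 'z' inside 'struct foo', no null pointer is needed at all. A minimal sketch using the standard 'offsetof' macro from <stddef.h>; the function name offset_of_z is hypothetical:

```c
#include <stddef.h>

struct foo {
    int x;
    char y[2048];
    int z;
};

/* Hypothetical helper: reports where z sits inside struct foo
   without forming or dereferencing any pointer. */
size_t offset_of_z(void) {
    return offsetof(struct foo, z);
}
```

The exact result depends on padding, but it is at least sizeof(int) + 2048 because members are laid out in declaration order.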

Thanks.
 
Tim Rentsch

Shao Miller said:
Thanks for the correction, Tim.

Is this the "_highest_" level "of pedantry you" could "muster up"? If
so, then thanks.

I'd used up all my pedantry capital in the earlier "newsgroup"
statement.

How about the following code:

int main(void) {
struct foo {
int x;
char y[2048];
int z;
};
return (*(struct foo*)0).z;
}

Are we attempting to get any sort of "value" when evaluating the
expression '(*(struct foo*)0)'? We have a type, as (D) suggests.

No, but doing so is undefined behavior, for the same reason
as the '*(char*)0' example.
Have you any thoughts on why the text referenced in (E) reads, "has
been assigned to the pointer", Tim? [snip]

I think it's old, carelessly imprecise wording that's
never been revised to correct the imprecision, probably
partly because it means just what it seems to mean to
people who don't think about it too carefully.
 
Ben Bacarisse

Keith Thompson said:
Ben Bacarisse said:
Keith Thompson said:
Hmm. Off the top of my head, I can't think of any way to violate the
"shall" in 6.3.2.2p1 without also violating some constraint. For
example:

x = (void)42;

violates the constraint specified in 6.5.16.1p1; void isn't one of
the permitted types for the right operand of a simple assignment.

Hardly important, but I think one such is:

int array[1] = {(void)42};

I think this violates no constraints so 6.3.2.2 p1 is important to
render it undefined.

Initialising a scalar does not provide a simpler example because 6.7.8
p11 imports all of the constraints attached to assignment. Thus

int x = {(void)42};

must be diagnosed.

6.7.8p11:
The initializer for a scalar shall be a single expression,
optionally enclosed in braces. The initial value of the object
is that of the expression (after conversion); the same type
constraints and conversions as for simple assignment apply,
taking the type of the scalar to be the unqualified version of
its declared type.

Ah, but it does violate a constraint.

The syntax for "initializer" is:

initializer:
assignment-expression
{ initializer-list }
{ initializer-list , }

where an initializer-list is basically a list of initializers.
So {(void)42} is an initializer (for array), and (void)42 is also an
initializer (for array[0]). Since array[0] is a scalar, (void)42 is
"The initializer for a scalar", and the constraints referred to in
6.7.8p11 apply.

Obviously I thought that the paragraph about scalars was not to be taken
as applying recursively. Reading your point below, that now seems to be a
daft way to read it, but let me just say why I took it to be so (if only
for a few hours!). First, just after the paragraph about scalars, p12
reads:

12 The rest of this subclause deals with initializers for objects that
have aggregate or union type.

which I took to mean "and the previous clauses don't" for no good reason
other than I seem to have a habit of reading more than is intended into
phrases that are simply informative. Second, p20 starts:

20 If the aggregate or union contains elements or members that are
aggregates or unions, these rules apply recursively to the
subaggregates or contained unions.

as if to suggest that "these rules" are applied recursively but only
down to the level of sub-aggregates not the scalars within them. Again,
this is just reading too much into it. That the rules about aggregates
apply recursively, does not mean that the others about scalars don't.
As I studied 6.7.8, I was relieved to discover this; otherwise a
compiler wouldn't be required to diagnose

double arr[] = { "hello" };

and that would be bad.

Yes, and for that reason alone it seems clear (now) that p11 must apply
to all enclosed scalars as much as to the top-level ones.
 
Stargazer


In the many quotes you forgot one which is (along with 6.3.2.3.[3-4])
only relevant to your inquiry:

6.3.2.3.[5] An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might not
be correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.

Without the exception for integer 0 cast to a pointer and the "Except as
previously specified..." clause, *(char*)0 would qualify as undefined
behavior by 6.3.2.3.[5] and 6.5.3.2.[4]. With the addition of the null
pointer specification, (char*)0 is guaranteed NOT to point to an object,
but that is never specified as a constraint or as requiring a specific
action. So it's somewhere between "undefined" and "illegal". Most
implementations seem to choose the "undefined" side, and previous
discussion here, IIRC, ended with a similar suggestion.
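
The properties that 6.3.2.3.[3-4] give to '(char *)0' can be sketched
without applying '*' at all.  The names 'is_null' and
'null_pointer_demo' below are hypothetical, chosen only for this
illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if p is a null pointer, 0 otherwise. */
int is_null(const char *p) {
    return p == (char *)0;
}

void null_pointer_demo(void) {
    char object = 'x';
    char *x = (char *)0;        /* null pointer constant converted to a
                                   pointer type: 6.3.2.3.[3] */

    assert(is_null(x));
    assert(x == NULL);          /* any two null pointers compare equal:
                                   6.3.2.3.[4] */
    assert(!is_null(&object));  /* a null pointer compares unequal to a
                                   pointer to any object: 6.3.2.3.[3] */
}
```

Nothing here uses the unary '*' operator, so the sketch does not bear
on whether '*(char *)0' itself is undefined.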

Daniel
 
Stargazer

First * (char *) 0 is evaluated, giving an lvalue (no undefined
behaviour yet).

It is either undefined (consider 6.3.2.3.[5] with 6.5.3.2.[4]) or
illegal (consider 6.3.2.3.[3] that null pointer "is guaranteed to
compare unequal to a pointer to any object or function"). So '*' in
*(char*)0 either references address which is not defined to be legal,
or dereferences pointer to something that is necessarily not an
object.

Daniel
 
Shao Miller

I think it's old, carelessly imprecise wording that's
never been revised to correct the imprecision, probably
partly because it means just what it seems to mean to
people who don't think about it too carefully.

I'm sorry you seem to be implying that I haven't thought about this
carefully. I'm glad that you think the wording needs to be
addressed. The latter seems like an opinion worth offering.
 
Shao Miller

See 6.5.3.2.

"The unary * operator denotes indirection. If the operand points to a
function, the result is a function designator; if it points to an
object, the result is an lvalue designating the object. If the operand
has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the
unary * operator is undefined."

This reference causes me to wonder if you actually read the original
newsgroup posting. For some reason, I get the impression that few
responders actually did. I realize it was a long one, and there's
only so much time in a day. I brought the question up because I
believe that either:
1. The standard needs to be addressed due to an ambiguity, should it
be the case that it has not been already, XOR
2. There is no undefined behaviour

NULL is an invalid value - it is guaranteed not to point to any object
or function. (See 6.3.2.3.)

Yet another item referenced in the original post. Look at your
previous reference then look at this one. Where is a null pointer
value assigned? It's not. Yet the cast expression (the operand)
'(char *)0' _has_ a type, so the result of applying '*' _has_ a type.
That is all that 'sizeof' requires. That is enough for a void
expression. It is enough for the '.' postfix operator. No _value_ is
required in any of those three contexts. It would not be enough for
an assignment or a comparison.

Therefore, using * on a null pointer invokes UB.

Using '*' on a pointer that has been assigned an invalid value is UB.
'(char *)0' is not an lvalue (no cast expression is), hence it cannot
be assigned a value. It is a pointer. It is a null pointer. It is
not a pointer that has been assigned a null pointer value.

I do continue to value your feedback and am hopeful that you or
another responder may pinpoint a definitive reason for UB. So far,
Tim's suggestion that "the wording is imprecise" strikes me as most
likely, iff there really is undefined behaviour.
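
For the 'sizeof' and '.' contexts, here is a minimal sketch, assuming a
made-up 'struct pair' type and helper names that are not from the
standard.  Inside 'sizeof' the operand is not evaluated, so only the
types of '*(struct pair *)0' and of its member are needed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct pair {
    int first;
    double second;
};

/* The '.' operator is applied to the result of '*', but the whole
 * operand of sizeof is unevaluated, so only its type (double) is used. */
size_t second_member_size(void) {
    return sizeof (*(struct pair *)0).second;
}

/* Usage: the size matches that of the member's declared type. */
void member_demo(void) {
    assert(second_member_size() == sizeof(double));
}
```

The sketch covers only the unevaluated case.  Whether '.' or a void
expression may be applied to '*(struct pair *)0' when the operand is
evaluated is exactly the open question here.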
 
Tim Rentsch

Shao Miller said:
I think it's old, carelessly imprecise wording that's
never been revised to correct the imprecision, probably
partly because it means just what it seems to mean to
people who don't think about it too carefully.

I'm sorry you seem to be implying that I haven't thought about this
carefully. [snip]

Oh no, I didn't mean that at all. It's obvious you have
thought about it carefully. If you hadn't, your question
wouldn't have come up. You see what I'm saying now?
 
Shao Miller

Among the many quotes, you forgot the one which (along with
6.3.2.3.[3-4]) is the only one relevant to your inquiry:

6.3.2.3.[5] An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might not
be correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.

Without the exception for integer 0 cast to a pointer and the "Except as
previously specified..." clause, *(char*)0 would qualify as undefined
behavior by 6.3.2.3.[5] and 6.5.3.2.[4].

char *x = (char *)0;

Here we declare a pointer-to-char and initialize it with '0' cast to
pointer-to-char. There's no UB here, right? So how is application of
the unary '*' operator to "the result" of the conversion in
6.3.2.3.[3] (a null pointer, '(char *)0') impacted by these further
references? Section 6.3.2.3 has "had its say" by the time we have the
result '(char *)0'. There's no UB. Then we apply the unary '*'
operator.

With the addition of null pointer
specification, (char*)0 is guaranteed NOT to point to an object, but
it's never specified as a constraint or as requiring a specific
action.

What is "it"? The expression? The result? The value? What
constraint do you need? What action do you need?

I propose that it is well-defined:
1. The expression details the operation of a cast. The result has a
type, pointer-to-char. An object of that type would have size and
alignment.
2. The result is called "a null pointer"
3. The result can be used as a value in an assignment

So it's somewhere between "undefined" and "illegal". Most
implementations seem to choose the "undefined" side, and previous
discussion here, IIRC, ended with a similar suggestion.

You appear to be talking about '(char *)0' and '*(char *)0'
simultaneously.
 
Shao Miller

It is either undefined (consider 6.3.2.3.[5] with 6.5.3.2.[4]) or
illegal (consider 6.3.2.3.[3] that null pointer "is guaranteed to
compare unequal to a pointer to any object or function"). So '*' in
*(char*)0 either references address

I believe that you are confused, here. 6.5.3.2 is "Address and
indirection operators". Is it specified which of '&' and '*' is
"address" and which is "indirection"? Do they have a one-to-one
correspondence? It really doesn't matter at all. _Nowhere_ does the
text for '*' read "reference" nor does it read "address" at all.
Please clarify.

which is not defined to be legal,
or dereferences pointer to something that is necessarily not an
object.

The entirety of the 'n1256.pdf' draft doesn't include the word
"dereference," but that's what some of us call it, including me.
However, it might have led to "magical thinking" regarding the '*'
unary operator. It is an operator. It has operands. It doesn't need
to "dereference an object" and it doesn't need to "dereference
something that is necessarily not an object".

But I appreciate your feedback, regardless of that.
 
