Value of int NULL

Keith Thompson

Harald van Dijk said:
Programs that read uninitialised int objects on implementations where int
has no trap representations have defined behaviour. A strictly conforming
program can calculate the number of padding bits and check whether INT_MIN
< -INT_MAX. Combine the two and you find that there are situations where a
strictly conforming program can read uninitialised int objects.
[...]

That may well be true, but it's not obvious (to me) that it's
*necessarily* true.

Assume that type int has no padding bits, and therefore no trap
representations. Given an uninitialized object of type int:

{
    int x;
    printf("x = %d\n", x);
}

and assuming a simple-minded implementation that just follows the
obvious straightforward semantics of the abstract machine, the above
must print some value in the range [INT_MIN, INT_MAX].

*But* that doesn't necessarily mean that the behavior is well defined.

It's at least logically possible that the standard could say, directly
or indirectly, that the behavior of reading the value of an
uninitialized non-character object is undefined. Then an optimizing
compiler could *assume* that x is initialized, and generate code that
will do Bad Things if that assumption is violated.

I haven't (so far) been able to find wording in the standard that
settles the question either way.
 
Keith Thompson

CBFalconer said:
However the pointer to it can (assuming the pointer exists).

I'm afraid I'm missing both the point and the relevance.

Certainly an object of type ``unsigned char*'' can have a trap
representation. How is that relevant?

But a pointer to an object (here we're talking about a pointer value,
not a pointer object), whether that object is initialized or not, is a
valid value.

Concrete example:

unsigned char c;   /* c cannot have a trap representation
                      because it's of type unsigned char */
unsigned char *p;  /* p can have a trap representation */
p = &c;            /* p cannot now have a trap representation */

Can you clarify what you meant and how it applies to the current
discussion?
 
Kaz Kylheku

It's at least logically possible that the standard could say, directly
or indirectly, that the behavior of reading the value of an
uninitialized non-character object is undefined.

The behavior of accessing an uninitialized lvalue is undefined, regardless of
its effective type.

An uninitialized object is ``indeterminately valued'', and access to such
a thing is right in the definition of undefined behavior.

6.2.4 Storage Durations of objects says:

3 An object whose identifier is declared with no linkage and without the
storage-class specifier static has automatic storage duration.

4 For such an object that does not have a variable length array type, storage
is guaranteed to be reserved for a new instance of the object on each entry
into the block with which it is associated; the initial value of the object
is indeterminate.

Uninitialized auto objects: indeterminate.

Consequently, the machine can detect and diagnose accesses to uninitialized
memory even without the help of trap representations.

Valuable tools like Purify or Valgrind are examples of implementations
(or implementation components) which have this capability.

If a compiled C program is run on Valgrind, it is running on no less of a
conforming implementation, even though its uses of uninitialized memory (even
of character type) are now diagnosed.

Of course the information whether or not memory is uninitialized has to be
stored somewhere. It doesn't have to be in the form of a trap representation;
it can be in a database which is stored elsewhere, which can be queried about
the properties and state of any piece of memory, using the address as the key.
 
Kaz Kylheku

Harald van Dijk said:
Programs that read uninitialised int objects on implementations where int
has no trap representations have defined behaviour. A strictly conforming
program can calculate the number of padding bits and check whether INT_MIN
< -INT_MAX. Combine the two and you find that there are situations where a
strictly conforming program can read uninitialised int objects.
[...]

That may well be true, but it's not obvious (to me) that it's
*necessarily* true.

Assume that type int has no padding bits, and therefore no trap
representations. Given an uninitialized object of type int:

{
    int x;
    printf("x = %d\n", x);
}

0:flynux:/home/kaz/export$ cat > uninit.c
int main(void)
{
char x;
printf("x = %d\n", x);
return 0;
}
0:flynux:/home/kaz/export$ gcc -g uninit.c
0:flynux:/home/kaz/export$ valgrind --tool=memcheck ./a.out
==4447== Memcheck, a memory error detector for x86-linux.
==4447== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==4447== Using valgrind-2.2.0, a program supervision framework for x86-linux.
==4447== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==4447== For more details, rerun with: -v
==4447==
==4447== Use of uninitialised value of size 4
==4447== at 0x28CE47: _itoa_word (in /lib/tls/libc-2.3.4.so)
==4447== by 0x290291: _IO_vfprintf_internal (in /lib/tls/libc-2.3.4.so)
==4447== by 0x29644F: _IO_printf (in /lib/tls/libc-2.3.4.so)
==4447== by 0x8048395: main (uninit.c:4)
 
Keith Thompson

Kaz Kylheku said:
The behavior of accessing an uninitialized lvalue is undefined,
regardless of its effective type.

I'm willing to be convinced that that's correct, but I haven't been
yet.

An uninitialized object is ``indeterminately valued'', and access to such
a thing is right in the definition of undefined behavior.

In C90, but not in C99.

C90 3.16:

undefined behavior

Behavior, upon use of a nonportable or erroneous program
construct, of erroneous data, or of indeterminately valued
objects, for which this International Standard imposes no
requirements. Permissible undefined behavior ranges from ignoring
the situation completely with unpredictable results, to behaving
during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution
(with the issuance of a diagnostic message).

If a "shall" or "shall not" requirement that appears outside of a
constraint is violated, the behavior is undefined. Undefined
behavior is otherwise indicated in this International Standard by
the words undefined behavior or by the omission of any explicit
definition of behavior. There is no difference in emphasis among
these three: they all describe behavior that is undefined.

C99 3.4.3:

undefined behavior

behavior, upon use of a nonportable or erroneous program construct
or of erroneous data, for which this International Standard
imposes no requirements

NOTE Possible undefined behavior ranges from ignoring the
situation completely with unpredictable results, to behaving
during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution
(with the issuance of a diagnostic message).

EXAMPLE An example of undefined behavior is the behavior on
integer overflow.

I presume that the omission of "indeterminately valued objects" from
the C99 definition was deliberate.

And, as I read it, even the C90 definition doesn't necessarily imply
that accessing such a value is UB; it does so only in cases where the
standard imposes no requirements. There are plenty of "nonportable
... program construct(s)" whose use doesn't necessarily invoke UB.

And as long as I'm reading section 3:

C99 3.17.2:

indeterminate value

either an unspecified value or a trap representation

C99 3.17.3:

unspecified value

valid value of the relevant type where this International Standard
imposes no requirements on which value is chosen in any instance

NOTE An unspecified value cannot be a trap representation.

So, for a type that has no trap representations (e.g., int
on a system with CHAR_BIT*sizeof(int)==32, INT_MIN==-2**31,
INT_MAX==2**31-1), an indeterminate value must be an unspecified
value, which is *by definition* a "valid value". (C90 doesn't have
definitions for "indeterminate value" and "unspecified value".)

[snip]
Uninitialized auto objects: indeterminate.

Agreed.

Consequently, the machine can detect and diagnose accesses to
uninitialized memory even without the help of trap representations.

Certainly an implementation can issue whatever diagnostics it likes.

Valuable tools like Purify or Valgrind are examples of implementations
(or implementation components) which have this capability.

If a compiled C program is run on Valgrind, it is running on no less
of a conforming implementation, even though its uses of
uninitialized memory (even of character type) are now diagnosed.

If my interpretation of the C99 definitions is correct, then this
program:

#include <stdio.h>
#include <limits.h>
int main(void)
{
    int indeterminate;
    if (sizeof(int) * CHAR_BIT == 16 &&
        INT_MIN == -INT_MAX - 1 &&
        INT_MAX == +32767)
    {
        indeterminate;
    }
    printf("Hello, world\n");
    return 0;
}

is strictly conforming, even given the possible access of an
indeterminate object; an implementation may issue any compile-time
warnings it likes, but the program must print "Hello, world" at run
time.

Of course the information whether or not memory is uninitialized has
to be stored somewhere. It doesn't have to be in the form of a trap
representation; it can be in a database which is stored elsewhere,
which can be queried about the properties and state of any piece of
memory, using the address as the key.

Agreed. The question is what an implementation may do with that
information.
 
Keith Thompson

Kaz Kylheku said:
0:flynux:/home/kaz/export$ cat > uninit.c
int main(void)
{
char x;
printf("x = %d\n", x);
return 0;
}
0:flynux:/home/kaz/export$ gcc -g uninit.c
0:flynux:/home/kaz/export$ valgrind --tool=memcheck ./a.out
==4447== Memcheck, a memory error detector for x86-linux.
==4447== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==4447== Using valgrind-2.2.0, a program supervision framework for x86-linux.
==4447== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==4447== For more details, rerun with: -v
==4447==
==4447== Use of uninitialised value of size 4
==4447== at 0x28CE47: _itoa_word (in /lib/tls/libc-2.3.4.so)
==4447== by 0x290291: _IO_vfprintf_internal (in /lib/tls/libc-2.3.4.so)
==4447== by 0x29644F: _IO_printf (in /lib/tls/libc-2.3.4.so)
==4447== by 0x8048395: main (uninit.c:4)

You snipped the part immediately after the code fragment where I
wrote:

| and assuming a simple-minded implementation that just follows the
| obvious straightforward semantics of the abstract machine, the above
| must print some value in the range [INT_MIN, INT_MAX].

Running the program under valgrind isn't such an implementation.
 
Ben Bacarisse

Keith Thompson said:
Harald van Dijk said:
Programs that read uninitialised int objects on implementations where int
has no trap representations have defined behaviour. A strictly conforming
program can calculate the number of padding bits and check whether INT_MIN
< -INT_MAX. Combine the two and you find that there are situations where a
strictly conforming program can read uninitialised int objects.
[...]

That may well be true, but it's not obvious (to me) that it's
*necessarily* true.

Assume that type int has no padding bits, and therefore no trap
representations. Given an uninitialized object of type int:

{
    int x;
    printf("x = %d\n", x);
}

and assuming a simple-minded implementation that just follows the
obvious straightforward semantics of the abstract machine, the above
must print some value in the range [INT_MIN, INT_MAX].

*But* that doesn't necessarily mean that the behavior is well
defined.

It's at least logically possible that the standard could say, directly
or indirectly, that the behavior of reading the value of an
uninitialized non-character object is undefined.

I suppose the standard could, even after all the definitions,
introduce the new concept of "uninitialised object" with specific
undefined behaviour but I don't remember anything like that.

I think you are right about the intent, since Appendix J.2 includes:

The value of an object with automatic storage duration is used while
it is indeterminate (6.2.4, 6.7.8, 6.8).

as an example of undefined behaviour, but I can't see how to derive it
from the sections cited.

Then an optimizing
compiler could *assume* that x is initialized, and generate code that
will do Bad Things if that assumption is violated.

I haven't (so far) been able to find wording in the standard that
settles the question either way.

But, to me, it is not an open question that could go either way. If
an int can't trap, then printing an indeterminate one must print a
"valid value of the relevant type".
 
Barry Schwarz

I think this is too strong. Doing anything with an indeterminate
value is a Very Bad Idea, but it is not always UB. The reason is that
an indeterminate value is simply either a valid value or a trap
representation. On a system without trap reps., the behaviour is
entirely defined (although unpredictable). It is much simpler to say
"that's UB", because that is what it is in the most general case, but
on many systems it is not.

Realizing that Appendix J is only informative, I believe it
nonetheless expresses the intent of the committee. The 10th item in
the undefined behavior section (J.2) states "The value of an object
with automatic storage duration is used while it is indeterminate."
without reservations about traps or valid values.
 
CBFalconer

Keith said:
I'm afraid I'm missing both the point and the relevance.
... snip ...

unsigned char c; /* c cannot have a trap representation
because it's of type unsigned char */
unsigned char *p; /* p can have a trap representation */

All I said was that, right here, p can have a trap representation.
I.e.:
unsigned char *q;
q = p; /* can crash accessing p */
p = &c; /* p cannot now have a trap representation */

and now p cannot.
 
Ben Bacarisse

Barry Schwarz said:
Realizing that Appendix J is only informative, I believe it
nonetheless expresses the intent of the committee. The 10th item in
the undefined behavior section (J.2) states "The value of an object
with automatic storage duration is used while it is indeterminate."
without reservations about traps or valid values.

Yes, I have even quoted that elsethread myself, but the trouble for me
is that there should be something normative, if that is the intent. I
don't really care either way because, as a programmer, I will never
want to do such a thing -- it is as bad as undefined behaviour to
me -- but to an implementor, UB might open up extra possibilities.
 
Harald van Dijk

Harald van Dijk said:
Programs that read uninitialised int objects on implementations where
int has no trap representations have defined behaviour. A strictly
conforming program can calculate the number of padding bits and check
whether INT_MIN < -INT_MAX. Combine the two and you find that there are
situations where a strictly conforming program can read uninitialised
int objects.
[...]

That may well be true, but it's not obvious (to me) that it's
*necessarily* true.
[snip]
It's at least logically possible that the standard could say, directly
or indirectly, that the behavior of reading the value of an
uninitialized non-character object is undefined.

Agreed, that is logically possible, but...

I haven't (so far) been able to find wording in the standard that
settles the question either way.

...the behaviour is defined because the standard does not specifically say
the behaviour is undefined, and the behaviour is not undefined by
omission. There is no one sentence that explicitly states the behaviour is
defined, but it is, because the standard doesn't say otherwise.

This is not necessarily a good thing. In C90, as has been noted, reading
uninitialised objects is undefined regardless of representation, and I
recall reading about cases where the permission to trap is desirable even
for unsigned char. IIRC, the problem was that on some machine, registers
could trap, and it was a pain to make sure that reading an uninitialised
unsigned char did not read from an uninitialised register.

I've looked it up and I see that it is now a DR:
<http://open-std.org/JTC1/SC22/WG14/www/docs/dr_338.htm>

I'm curious to see what will be done.
 
Keith Thompson

CBFalconer said:
All I said was that, right here, p can have a trap representation.
[...]

I'll accept that that's what you meant, but it's not what you said.
You referred to "a pointer to it", where "it", from the context, is
"an uninitialized unsigned char object". ``p'', at that point, is not
a pointer to any object.

In any case, I think we're in agreement on the basic facts, so I'll be
glad to drop this.
 
lawrence.jones

Keith Thompson said:
(There's a nitpicking argument that ``((void*)0)'' isn't allowed, but
I'm sure the intent is to allow it, and many implementations define it
that way anyway. The problem is 6.5.1p5, which *doesn't* say that a
parenthesized null pointer constant is a null pointer constant.)

Parentheses don't change the "essence" of their contents: ((void*)0) is
still an integer constant expression with the value zero cast to type
void *.
 
Keith Thompson

Parentheses don't change the "essence" of their contents: ((void*)0) is
still an integer constant expression with the value zero cast to type
void *.

I have no doubt that that was the intent. However, C99 6.5.1p5
specifically says that a parenthesized expression retains the
following properties of the expression within the parentheses:

type
value
whether it's an lvalue
whether it's a function designator
whether it's a void expression

It doesn't mention "essence". 8-)}

Since the authors went to the trouble of listing all those properties,
it's not unreasonable to infer that the list is meant to be
exhaustive.

I would argue, based on a literal reading of the current wording, that
((void*)0) isn't an integer constant expression with the value zero
cast to type void *; it's a parenthesized expression whose
subexpression is an integer constant expression with the value zero
cast to type void *.

Conceivably that section could have said that a parenthesized
expression retains all the properties of the subexpression, but trying
to define "all the properties" would have opened a large can of worms.
As far as I know, null-pointer-constantness is the only relevant
property that's missing from the list.

IMHO, whether an expression is a null pointer constant should simply
be added to the list. (And in the meantime, I'm not going to worry
about it too much, and I'll just assume that ((void*)0) is a null
pointer constant anyway.)

Interestingly (well, sort of) (0) or ((0)) is a null pointer constant,
since it satisfies the definition of an "integer constant expression".
 
Harald van Dijk

C99 6.5.1p5
specifically says that a parenthesized expression retains the following
properties of the expression within the parentheses:

type
value
whether it's an lvalue
whether it's a function designator
whether it's a void expression
[...]
As far as I
know, null-pointer-constantness is the only relevant property that's
missing from the list.

There's another relevant property that's missing: whether it's a string
literal:

char a[] = ("hello");

One supposedly conforming compiler accepts this without any diagnostics.
Another supposedly conforming compiler rejects this in one of its
conforming modes.
 
Richard Tobin

Keith Thompson said:
I have no doubt that that was the intent. However, C99 6.5.1p5
specifically says that a parenthesized expression retains the
following properties of the expression within the parentheses:

type
value
whether it's an lvalue
whether it's a function designator
whether it's a void expression

It doesn't mention "essence". 8-)}

It also doesn't mention constantness, though we know that (0) is an
integer constant.

-- Richard
 
Keith Thompson

It also doesn't mention constantness, though we know that (0) is an
integer constant.

No, (0) isn't an integer constant; an integer constant is a token
defined by the grammar in C99 6.4.4.1p1.

It *is* an integer constant expression. This can be inferred from the
definition in C99 6.6, which defines an "integer constant expression"
as an expression with certain restrictions; there's no need to state
it separately in the description of parenthesized expressions.
 
