# Arithmetic with null pointer

Discussion in 'C Programming' started by Urs Thuermann, Jun 16, 2010.

1. ### Urs Thuermann (Guest)

I have again a question that I don't find a definitive answer for in my
draft copy of ISO 9899-99: is it allowed to add an integer to a null
pointer, and what will the result be?

See the following short example:

char *a = 0, *b = 0, *p, *q;

void f(void)
{
    p = a + 0;
    q = a + 10;
}

I could only find constraints for pointers that point to an object, in 6.5.6:

Constraints

[#2] For addition, either both operands shall have
arithmetic type, or one operand shall be a pointer to an
object type and the other shall have integer type.
(Incrementing is equivalent to adding 1.)

So the pointer must be a pointer to an object type (as opposed to
void *), but does not need to point to an object. Right? Then
(char*)0 would be OK.

[#7] For the purposes of these operators, a pointer to a
nonarray object behaves the same as a pointer to the first
element of an array of length one with the type of the
object as its element type.

[#8] When an expression that has integer type is added to or
subtracted from a pointer, the result has the type of the
pointer operand. If the pointer operand points to an
element of an array object, ...

The null pointer is neither a pointer to a nonarray object nor a
pointer to an array element. So [#8] only says that (char*)0 + 1 is
of type (char*), but the rest is irrelevant.

So is that addition allowed? And is there a guarantee that

(char*)0 + 0 == (char*)0

and

(char*)0 + 1 != (char*)0

It is clear that almost all implementations will behave this way, but
can one rely on it? This question occurred to me when I was writing a
short code snippet, similar to this:

char *s = 0;
size_t size = 0, len = 0, n;

for (...) {
    for (;;) {
        n = snprintf(s + len, size - len, fmt, ...);
        if (n < size - len)
            break;              /* output fit */
        size += n + 1024;       /* grow and retry */
        s = realloc(s, size);
        if (!s) {
            goto error;
        }
    }
    len += n;
}

The term s + len will be a null pointer plus 0 in the first iteration.
(BTW, snprintf(NULL, 0, ...) is allowed by ISO C, so no problem here.)

urs

Urs Thuermann, Jun 16, 2010

2. ### Urs Thuermann (Guest)

Richard Heathfield <> writes:

> The compiler is not required to diagnose such an addition, but the
> behaviour is undefined. See paragraph 8 of the section you quoted

I don't see where this paragraph says that such an addition yields
undefined behavior. That paragraph starts as follows:

[#8] When an expression that has integer type is added to or
subtracted from a pointer, the result has the type of the
pointer operand.

This only specifies that the type of (char*)0 + 0 is (char*). As I
read it, the whole rest of the paragraph depends on the directly
following condition:

If the pointer operand points to an element of an array
object, and the array is large enough, ...

Since (char*)0 does not point to an element of an array object, the
rest of the paragraph does not apply.

I assume you refer to the sentence

If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the
array object, the evaluation shall not produce an overflow;
otherwise, the behavior is undefined.

Are you sure this is not dependent on the first condition cited above?
This sentence refers to "the result" of which we only know how it is
derived when the pointer operand points to an element of an array
object.

Or do I misunderstand this paragraph 8 so fundamentally?

urs

Urs Thuermann, Jun 16, 2010

3. ### Eric Sosman (Guest)

On 6/16/2010 4:20 PM, Richard Heathfield wrote:
> Urs Thuermann wrote:
>> Richard Heathfield <> writes:
>>
>>> The compiler is not required to diagnose such an addition, but the
>>> behaviour is undefined. See paragraph 8 of the section you quoted

>
> <snip>
>
>> I assume you refer to the sentence
>>
>> If both the pointer operand and the result point to elements
>> of the same array object, or one past the last element of the
>> array object, the evaluation shall not produce an overflow;
>> otherwise, the behavior is undefined.

>
> You assume correctly.
>
>> Are you sure this is not dependent on the first condition cited above?

>
> Yes. Since a null pointer doesn't point to an object, neither it nor the
> result can point to "elements of the same array object, or one past the
> last element of the array object", and therefore the behaviour is
> undefined.

Further to this point: Behavior is undefined if the Standard
says so explicitly, *or* if a requirement outside a constraint
section is violated, *or* if the Standard simply doesn't describe
the behavior at all. All of these are "equally undefined" (4p1).

When attempting arithmetic on a null-valued pointer, it's the
third of these situations: The Standard describes what happens if
the pointer points at or just after an object, but a null-valued
pointer doesn't do so. Therefore, the Standard's description does
not cover the case, and the behavior is "undefined by omission."

--
Eric Sosman

Eric Sosman, Jun 16, 2010
4. ### Seebs (Guest)

On 2010-06-16, Richard Heathfield <> wrote:
> Yes. Since a null pointer doesn't point to an object, neither it nor the
> result can point to "elements of the same array object, or one past the
> last element of the array object", and therefore the behaviour is undefined.

Okay, I haven't had any caffeine today, but.

What tells us that a null pointer cannot point one past the last
element of an array object?

I agree that nothing tells us that it *does* point one past the last
element of an array object. But could it?

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach /
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Seebs, Jun 16, 2010
5. ### Keith Thompson (Guest)

Eric Sosman <> writes:
> On 6/16/2010 4:20 PM, Richard Heathfield wrote:
>> Urs Thuermann wrote:
>>> Richard Heathfield <> writes:
>>>> The compiler is not required to diagnose such an addition, but the
>>>> behaviour is undefined. See paragraph 8 of the section you quoted

>>
>> <snip>
>>
>>> I assume you refer to the sentence
>>>
>>> If both the pointer operand and the result point to elements
>>> of the same array object, or one past the last element of the
>>> array object, the evaluation shall not produce an overflow;
>>> otherwise, the behavior is undefined.

>>
>> You assume correctly.
>>
>>> Are you sure this is not dependent on the first condition cited above?

>>
>> Yes. Since a null pointer doesn't point to an object, neither it nor the
>> result can point to "elements of the same array object, or one past the
>> last element of the array object", and therefore the behaviour is
>> undefined.

>
> Further to this point: Behavior is undefined if the Standard
> says so explicitly, *or* if a requirement outside a constraint
> section is violated, *or* if the Standard simply doesn't describe
> the behavior at all. All of these are "equally undefined" (4p1).
>
> When attempting arithmetic on a null-valued pointer, it's the
> third of these situations: The Standard describes what happens if
> the pointer points at or just after an object, but a null-valued
> pointer doesn't do so. Therefore, the Standard's description does
> not cover the case, and the behavior is "undefined by omission."

Actually, as Richard pointed out, the standard says explicitly that
it's undefined:

If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; *otherwise, the
behavior is undefined*.

(Unless a null pointer points one past the last element of the array
object, as Seebs suggests.)

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Jun 17, 2010
6. ### Keith Thompson (Guest)

Seebs <> writes:
> On 2010-06-16, Richard Heathfield <> wrote:
>> Yes. Since a null pointer doesn't point to an object, neither it nor the
>> result can point to "elements of the same array object, or one past the
>> last element of the array object", and therefore the behaviour is undefined.

>
> Okay, I haven't had any caffeine today, but.
>
> What tells us that a null pointer cannot point one past the last
> element of an array object?
>
> I agree that nothing tells us that it *does* point one past the last
> element of an array object. But could it?

In theory, I think so.

C99 6.3.2.3p3:

If a null pointer constant is converted to a pointer type,
the resulting pointer, called a _null pointer_, is guaranteed
to compare unequal to a pointer to any object or function.

An implementation for which a null pointer is equal to a pointer just
past the end of some object could break some seemingly reasonable
code (but only if it happens to be operating on that one object),
but I don't see anything in the standard that forbids it.

An almost-plausible scenario: A null pointer is represented as
all-bits-zero. Automatic objects are allocated on The Stack, which
grows from high addresses to low, and starts at the top end of the
address space. Pointer arithmetic quietly wraps around from
0xffffffff to 0. The last byte of the first allocated object in the
program (perhaps argc, perhaps the string pointed to by argv[0],
perhaps something else) occupies byte 0xffffffff.

I doubt that any such implementations actually exist.


Keith Thompson, Jun 17, 2010
7. ### Peter Nilsson (Guest)

Keith Thompson <> wrote:
> An almost-plausible scenario: A null pointer is represented
> as all-bits-zero.  Automatic objects are allocated on The
> Stack, which grows from high addresses to low, and starts
> Pointer arithmetic quietly wraps around from 0xffffffff to
> 0.  The last byte of the first allocated object in the
> program (perhaps argc, perhaps the string pointed to
> by argv[0], perhaps something else) occupies byte
> 0xffffffff.
>
> I doubt that any such implementations actually exist.

Particularly as 0x00000000 would have to compare greater
than 0xffffffff for an object landing on that address.

In practice, implementations already have to allow for 'one past
the end' pointers in memory allocations.

--
Peter

Peter Nilsson, Jun 17, 2010
8. ### Peter Nilsson (Guest)

Richard Heathfield <> wrote:
> Seebs wrote:
> > What tells us that a null pointer cannot point one past
> > the last element of an array object?
> >
> > I agree that nothing tells us that it *does* point one
> > past the last element of an array object.  But could it?

>
> Who cares? Even if it did, there is no strictly conforming
> way to tell /which/ object it were pointing one past the end
> of,

Well, people do use one past the end pointers as sentinels,
e.g. in for loops. People also compare against null. Whilst
I can't think of a specific circumstance where one would
compare one past the end with null (directly or indirectly,)
it's not beyond the realms of possibility that some code
somewhere might actually rely on the equality not being
equal.


Peter Nilsson, Jun 17, 2010
9. ### Peter Nilsson (Guest)

Richard Heathfield <> wrote:
> Peter Nilsson wrote:
> > Well, people do use one past the end pointers as sentinels,
> > e.g. in for loops.

>
> Yes, but you wouldn't use (NULL + n) as your sentinel, would
> you?

No. That's not the point I was making, following on from Seebs' comment.

> > People also compare against null. Whilst
> > I can't think of a specific circumstance where one would
> > compare one past the end with null (directly or indirectly,)

Change that...

> > it's not beyond the realms of possibility that some code
> > somewhere might actually rely on the equality not being
> > equal.

>
> Either "one past the end" is guaranteed not to be a null
> pointer, or it isn't. Right now, I'm in too much of a hurry
> to look it up. If it *is* guaranteed not to be a null pointer,
> then your scenario is of no consequence and this whole
> subthread is chasing a scotch mist. If it is *not* guaranteed
> not to be a null pointer, then code that assumes it isn't a
> null pointer is relying on a broken assumption.

But it seems a reasonable assumption.

Here's a contrived example:

T *col;
T *found = NULL;

for (col = vector; col < vector + N; col++)
{
    if (something(*col))
        found = col;
    if (something_else(*col))
        break;
}

/* did we find something and break at the same spot? */
/* assumption: one past end cannot be null */
if (found == col)
    do_something();

Some might call that poor coding, some might call it
efficient.


Peter Nilsson, Jun 17, 2010
10. ### Stargazer (Guest)

On Jun 17, 4:15 am, Keith Thompson <> wrote:

[...]
> An almost-plausible scenario: A null pointer is represented as
> all-bits-zero. Automatic objects are allocated on The Stack, which
> grows from high addresses to low, and starts at the top end of the
> address space. Pointer arithmetic quietly wraps around from
> 0xffffffff to 0. The last byte of the first allocated object in the
> program (perhaps argc, perhaps the string pointed to by argv[0],
> perhaps something else) occupies byte 0xffffffff.
>
> I doubt that any such implementations actually exist.

It is forbidden for an implementation to do things this way: "If both
the pointer operand and the result point to elements of the same array
object, or one past the last element of the array object, the
evaluation shall not produce an overflow; otherwise, the behavior is
undefined." (6.5.6[8])

In your proposal, incrementing a pointer to one past the last element
of an array does produce an overflow.

Daniel

Stargazer, Jun 17, 2010
11. ### Stargazer (Guest)

On Jun 16, 5:52 pm, Seebs <> wrote:
> On 2010-06-16, Richard Heathfield <> wrote:
>
> > Yes. Since a null pointer doesn't point to an object, neither it nor the
> > result can point to "elements of the same array object, or one past the
> > last element of the array object", and therefore the behaviour is undefined.

>
> Okay, I haven't had any caffeine today, but.
>
> What tells us that a null pointer cannot point one past the last
> element of an array object?
>
> I agree that nothing tells us that it *does* point one past the last
> element of an array object. But could it?

Theoretically yes. That would be "undefined behavior", because if it
happens on one platform, the standard cannot impose a requirement that
it will be so on another.

There is however, a more interesting and practical question regarding
NULL pointers.

A null pointer is defined in 6.3.2.3[3]: "An integer constant
expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.55) If a null pointer
constant is converted to a pointer type, the resulting pointer,
called a null pointer, is guaranteed to compare unequal to a pointer
to any object or function."

An "object" is defined in 3.14: "object -- region of data storage in
the execution environment, the contents of which can represent values".

And integer-to-pointer conversions are defined in 6.3.2.3[5]: "An
integer may be converted to any pointer type. Except as previously
specified, the result is implementation-defined, might not be correctly
aligned, might not point to an entity of the referenced type, and
might be a trap representation.56)"
[56 -- The mapping functions for converting a pointer to an integer or
an integer to a pointer are intended to be consistent with the
addressing structure of the execution environment.]

Now there are architectures where address 0 represents valid and
important system memory (PC interrupt vectors, ARM exception handlers)
- that is, it holds an object. Following the desire "...to be
consistent with the addressing structure of the execution
environment", the compiler would allow accessing this memory by
dereferencing a null pointer, like "*(unsigned*)0" (GCC for ARM indeed
does it this way). Now, on a particular architecture, we have a
pointer to an object that compares equal to a null pointer. But this
is not undefined behavior that an implementation may define; it is a
violation of the requirement of 6.3.2.3[3] and makes the
implementation non-conforming.

What should an implementor do in such a case:

important to system code?
2) sacrifice consistency with the platform's addressing structure by
defining a non-zero address value corresponding to the null pointer
(what if there are no spare addresses on the architecture)?
3) artificially extend the size of data pointers in order to
accommodate a null pointer that does not correspond to any real
address, and substantially sacrifice the performance of pointer
arithmetic?
4) sacrifice standard conformance?

Daniel

Stargazer, Jun 17, 2010
12. ### Keith Thompson (Guest)

Stargazer <> writes:
> On Jun 17, 4:15 am, Keith Thompson <> wrote:
>
> [...]
>> An almost-plausible scenario: A null pointer is represented as
>> all-bits-zero. Automatic objects are allocated on The Stack, which
>> grows from high addresses to low, and starts at the top end of the
>> address space. Pointer arithmetic quietly wraps around from
>> 0xffffffff to 0. The last byte of the first allocated object in the
>> program (perhaps argc, perhaps the string pointed to by argv[0],
>> perhaps something else) occupies byte 0xffffffff.
>>
>> I doubt that any such implementations actually exist.

>
> It is forbidden for an implementation to do things this way: "If both
> the pointer operand and the result point to elements of the same array
> object, or one past the last element of the array object, the
> evaluation shall not produce an overflow, otherwise, the behavior is
> undefined." (6.5.6[8])
>
> In your proposal incrementing pointer to one past the last element of
> an array does produce an overflow.

No, it doesn't. My assumption is that pointer arithmetic acts
like unsigned integer arithmetic. 0xffffffff + 1 yields 0; there
is no overflow.

(It's not at all clear what the word "overflow" in 6.5.6 means;
perhaps it just means that the behavior is defined.)

But as somebody else pointed out, a pointer past the end of an
array must be greater than a pointer within the array; with unsigned
comparisons, 0 < 0xffffffff.

Ok, so let's assume pointer arithmetic and comparisons are treated
(on the machine level) as if addresses were *signed* integers.
A null pointer is all-bits-zero, which is right in the middle of
the address space. The Stack starts at address -1 and grows away
from 0. An object could plausibly cover a range of bytes from,
say, address -4 to address -1; then &object + 1 == NULL.

(Such an implementation could easily avoid any such problems
by reserving a few bytes before and/or after the address that
corresponds to a null pointer, but I don't think it would be required
to do so.)

Probably the best fix for this is to update 6.3.2.3p3:

If a null pointer constant is converted to a pointer type, the
resulting pointer, called a _null pointer_, is guaranteed to compare
unequal to a pointer to any object or function.

to state that a null pointer also may not compare equal to a pointer
one past the last element of the array object. This would probably
require yet another repetition of the statement that "a pointer to
an object that is not an element of an array behaves the same as
a pointer to the first element of an array of length one with the
type of the object as its element type" (6.5.6p7, 6.5.8p4, 6.5.9p7
(the latter is new in TC2)).


Keith Thompson, Jun 17, 2010
13. ### Keith Thompson (Guest)

Stargazer <> writes:
> On Jun 16, 5:52 pm, Seebs <> wrote:
>> On 2010-06-16, Richard Heathfield <> wrote:
>> > Yes. Since a null pointer doesn't point to an object, neither it
>> > nor the result can point to "elements of the same array object, or
>> > one past the last element of the array object", and therefore the
>> > behaviour is undefined.

>>
>> Okay, I haven't had any caffeine today, but.
>>
>> What tells us that a null pointer cannot point one past the last
>> element of an array object?
>>
>> I agree that nothing tells us that it *does* point one past the last
>> element of an array object. But could it?

>
> Theoretically yes. That would be "undefined behavior", because if it
> happens to one platform, the standard cannot impose a requirement that
> it will be so on another.

Where is the undefined behavior?

Remember that undefined behavior is an attribute of a program (or of a
construct within a program), not of an implementation.

Given an object "obj", the expressions
&obj
and
&&obj + 1
are both well defined; the first yields the address of obj, and the
second yields the address just past the end of obj. Similarly,
&obj + 1 > &obj
is well defined and yields 1 (true), and
&obj == NULL
is well defined and yields 0 (false). But if my interpretation is
correct then the result of
&obj + 1 == NULL
is *unspecified*, not undefined; it may yield either 0 or 1.

(Actually I suppose the result of &obj is unspecified, since the
standard doesn't require any specific result -- but it's well
defined that the result must be the address of obj.)

I'm not suggesting that it *should* be possible for &obj + 1 to
be a null pointer, merely that my current interpretation of the
wording of the standard is that it is possible (though a given
implementation can arrange for it never to happen).

> There is however, a more interesting and practical question regarding
> NULL pointers.

You mean "null pointers". NULL is a macro.

[snip]

> Now there are architectures where address 0 represents valid and
> important system memory (PC interrupt vectors, ARM exception handlers)
> - that is, it holds an object. Following the desire "...to be
> consistent with the addressing structure of the execution
> environment", the compiler would allow accessing this memory by
> dereferencing a null pointer, like "*(unsigned*)0" (GCC for ARM
> indeed does it this way). Now, on a particular architecture, we have
> a pointer to an object that compares equal to a null pointer. But
> this is not undefined behavior that an implementation may define; it
> is a violation of the requirement of 6.3.2.3[3] and makes the
> implementation non-conforming.
>
> What should an implementor do in such a case:
>
> important to system code?
> 2) sacrifice consistency with the platform's addressing structure by
> defining a non-zero address value corresponding to the null pointer
> (what if there are no spare addresses on the architecture)?
> 3) artificially extend the size of data pointers in order to
> accommodate a null pointer that does not correspond to any real
> address, and substantially sacrifice the performance of pointer
> arithmetic?
> 4) sacrifice standard conformance?

Those do seem to be the options (though I'd say 3 is just a variant
of 2).

Probably the most reasonable approach is 4. It's a minor violation
that will only affect code that deliberately accesses whatever is at
address zero. (Code that does so accidentally has undefined behavior
anyway.)


Keith Thompson, Jun 17, 2010
14. ### Stargazer (Guest)

On Jun 17, 10:06 pm, Keith Thompson <> wrote:
> Stargazer <> writes:
> > On Jun 16, 5:52 pm, Seebs <> wrote:
> >> On 2010-06-16, Richard Heathfield <> wrote:
> >> > Yes. Since a null pointer doesn't point to an object, neither it
> >> > nor the result can point to "elements of the same array object, or
> >> > one past the last element of the array object", and therefore the
> >> > behaviour is undefined.

>
> >> Okay, I haven't had any caffeine today, but.

>
> >> What tells us that a null pointer cannot point one past the last
> >> element of an array object?

>
> >> I agree that nothing tells us that it *does* point one past the last
> >> element of an array object. But could it?

>
> > Theoretically yes. That would be "undefined behavior", because if it
> > happens to one platform, the standard cannot impose a requirement that
> > it will be so on another.

>
> Where is the undefined behavior?

Comparison of a null pointer to a pointer one past the last element
of an array. "undefined behavior... behavior, upon use of a
nonportable or erroneous program construct, of erroneous data, or of
indeterminately valued objects, for which this International Standard
imposes no requirements" (3.18).

Unless somebody can quote a relevant requirement, I assume there are
none imposed by the standard. A further note permits that it may be
something valid for a specific environment, so if on some
implementation a null pointer happens to compare equal to a
one-past-the-last pointer, it would fall under the "nonportable
construct" term, but won't be a violation.

> Remember that undefined behavior is an attribute of a program (or of a
> construct within a program), not of an implementation.

It is both. An implementation may have a definition for a
non-portable construct, but the latter will remain undefined behavior
as long as the standard doesn't put some relevant requirement on all
conforming implementations.

> Given an object "obj", the expressions
> &obj
> and
> &&obj + 1
> are both well defined; the first yields the address of obj, and the
> second yields the address just past the end of obj. Similarly,
> &obj + 1 > &obj
> is well defined and yields 1 (true), and
> &obj == NULL
> is well defined and yields 0 (false). But if my interpretation is
> correct then the result of
> &obj + 1 == NULL
> is *unspecified*, not undefined; it may yield either 0 or 1.

May a conforming implementation diagnose an error during translation
of such a construct? May &obj + 1 == NULL produce sometimes 1 and
sometimes 0 when "obj" is the result of dereferencing a pointer to an
object that is being realloc()ed? I think that both "additional"
behaviors are permitted, and there are no "two or more possibilities"
provided by the standard to choose from. But IMO it's a rather
scholastic question: pointers to one past the last element of an
array were introduced for comparisons and pointer arithmetic
involving pointers to elements of that array, and are probably not
useful for anything else.

Daniel

Stargazer, Jun 17, 2010
15. ### Stargazer (Guest)

On Jun 17, 9:15 pm, Keith Thompson <> wrote:
> Stargazer <> writes:
> > On Jun 17, 4:15 am, Keith Thompson <> wrote:

>
> > [...]
> >> An almost-plausible scenario: A null pointer is represented as
> >> all-bits-zero. Automatic objects are allocated on The Stack, which
> >> grows from high addresses to low, and starts at the top end of the
> >> address space. Pointer arithmetic quietly wraps around from
> >> 0xffffffff to 0. The last byte of the first allocated object in
> >> the program (perhaps argc, perhaps the string pointed to
> >> by argv[0], perhaps something else) occupies byte 0xffffffff.

>
> >> I doubt that any such implementations actually exist.

>
> > It is forbidden for an implementation to do things this way: "If both
> > the pointer operand and the result point to elements of the same array
> > object, or one past the last element of the array object, the
> > evaluation shall not produce an overflow, otherwise, the behavior is
> > undefined." (6.5.6[8])

>
> > In your proposal incrementing pointer to one past the last element of
> > an array does produce an overflow.

>
> No, it doesn't.  My assumption is that pointer arithmetic acts
> like unsigned integer arithmetic.  0xffffffff + 1 yields 0; there
> is no overflow.
>
> (It's not at all clear what the word "overflow" in 6.5.6 means;
> perhaps it just means that the behavior is defined.)

In the absence of a distinct definition, I think that "overflow" has
its common meaning: the result cannot fit within the destination
operand. In your case the result of 0xFFFFFFFF + 1 would require 33
bits, which will not fit within the architecture's pointer operand.

At the architecture level, overflow is not specific to signed or
unsigned arithmetic; both may overflow. 2's complement architectures
usually don't even have separate signed and unsigned operations -
there are the same instructions but different sets of condition flags
to interpret the result. The C language happens to define behavior
for overflow of unsigned integer types; but it would be wrong to
derive requirements for pointer types from that, even if pointers
happen to be the same size as some unsigned integer type and yield
the same results as that type when there's no overflow. The standard
treats them differently: it requires that pointer overflow doesn't
occur, while for unsigned types it defines what happens when there's
an overflow.

The machine level of an implementation may take the overflow/carry
flag into account when dealing with pointers (in order to ensure that
overflow doesn't happen and that, for example, a segment limit
violation or stack overflow/underflow exception will not occur),
while it would ignore carry/overflow with ordinary unsigned
arithmetic, as required.

[Side note: I don't see any reason to treat addresses as signed
integer types at the machine level. While addresses may correspond to
negative or positive numbers, there's no meaning for an address's
"absolute value" as there is for numbers.]

Daniel

Stargazer, Jun 18, 2010
16. ### Keith Thompson (Guest)

Stargazer <> writes:
> On Jun 17, 10:06 pm, Keith Thompson <> wrote:
>> Stargazer <> writes:
>> > On Jun 16, 5:52 pm, Seebs <> wrote:
>> >> On 2010-06-16, Richard Heathfield <> wrote:
>> >> > Yes. Since a null pointer doesn't point to an object, neither it
>> >> > nor the result can point to "elements of the same array object, or
>> >> > one past the last element of the array object", and therefore the
>> >> > behaviour is undefined.

>>
>> >> Okay, I haven't had any caffeine today, but.

>>
>> >> What tells us that a null pointer cannot point one past the last
>> >> element of an array object?

>>
>> >> I agree that nothing tells us that it *does* point one past the last
>> >> element of an array object. But could it?

>>
>> > Theoretically yes. That would be "undefined behavior", because if it
>> > happens to one platform, the standard cannot impose a requirement that
>> > it will be so on another.

>>
>> Where is the undefined behavior?

>
> Comparison of null pointer to pointer to one past the last member of
> an array. "undefined behavior... behavior, upon use of a nonportable
> or erroneous program construct, of erroneous data, or of
> indeterminately valued objects, for which this International Standard
> imposes no requirements" (3.18).

Yes, thank you, I know what "undefined behavior" means.

By comparison, do you mean "=="? The behavior of
&obj + 1 == NULL
isn't undefined; it's at worst unspecified. It yields either 0
or 1. It can't yield 42, a suffusion of yellow, or nasal demons.
The behavior is defined by the semantics of "==", C99 6.5.9.

> Unless somebody can quote a relevant requirement, I assume there are
> none imposed by the standard. Further note permits that it may be
> something valid for a specific environment, so if on some
> implementation null pointer happens to compare equal to one-past-the-
> last pointer, it would fall under "nonportable construct" term, but
> won't be a violation.

The *validity* of (&obj + 1 == NULL) isn't in question (or it
shouldn't be); the only question is what result it yields.

>> Remember that undefined behavior is an attribute of a program (or of a
>> construct within a program), not of an implementation.

>
> It is both. An implementation may have a definition for a
> non-portable construct, but the latter will remain undefined
> behavior as long as the standard doesn't put some relevant
> requirement on all conforming implementations.

Right. So the behavior of a construct can be undefined, regardless of
the implementation.

I emphasized the distinction because it wasn't clear to me what
construct's behavior you were claiming to be undefined.

>> Given an object "obj", the expressions
>> &obj
>> and
>> &&obj + 1

Sorry, that was a typo; it should be a single '&'.

>> are both well defined; the first yields the address of obj, and the
>> second yields the address just past the end of obj. Similarly,
>> &obj + 1 > &obj
>> is well defined and yields 1 (true), and
>> &obj == NULL
>> is well defined and yields 0 (false). But if my interpretation is
>> correct then the result of
>> &obj + 1 == NULL
>> is *unspecified*, not undefined; it may yield either 0 or 1.

> May a conforming implementation diagnose an error during translation
> of such a construct?

An implementation may produce any diagnostics it likes. It may
not reject the translation unit because of it (unless it happens
to exceed some capacity limit). Any such diagnostic is not part
of the behavior of the program.

> May &obj + 1 == NULL sometimes produce 1 and
> sometimes 0 when "obj" is the result of dereferencing a pointer to an
> object that is being realloc()ed?

obj is a single identifier, not a larger expression. My intent is
that obj is the name of a single declared object.

But if you replace "obj" by some lvalue that designates an object
allocated by realloc, then sure, it might sometimes yield 1 and
sometimes yield 0. Like any expression, its value can vary depending
on the values of its subexpressions. So what?

> I think that both "additional" behaviors
> are permitted and there are no "two or more possibilities" provided by
> the standard to choose from.

The same argument would imply that the behavior of evaluating
i == 0
is undefined because i might have different values at different times.

> But IMO it's a rather scholastic
> question: pointers to one-past the last element of an array were
> introduced for comparisons and pointer arithmetic involving pointers
> to elements of that array and are probably not useful for anything
> else.

I agree that it's largely academic; I never said otherwise.
Real-world implementations are unlikely to allocate anything
immediately before the address corresponding to a null pointer,
and even if they did, non-contrived programs are unlikely to compute
the address just past the end of the single object that happens to
live there.

In my opinion, it's a minor flaw in the Standard's definition of
"null pointer". (A bigger flaw in the definition is that it's
not exhaustive, and therefore IMHO not truly a definition; null
pointers can be created by means other than converting a null
pointer constant.)

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Jun 18, 2010
17. ### Keith ThompsonGuest

Stargazer <> writes:
> On Jun 17, 9:15 pm, Keith Thompson <> wrote:
>> Stargazer <> writes:
>> > On Jun 17, 4:15 am, Keith Thompson <> wrote:

>> > [...]
>> >> An almost-plausible scenario: A null pointer is represented as
>> >> all-bits-zero.  Automatic objects are allocated on The Stack, which
>> >> grows from high addresses to low, and starts at the top end of the
>> >> address space, wrapping around from 0xffffffff to 0.  The last byte
>> >> of the first allocated object in the program (perhaps argc, perhaps
>> >> the string pointed to by argv[0], perhaps something else) occupies
>> >> byte 0xffffffff.

>> >> I doubt that any such implementations actually exist.

>> > It is forbidden for an implementation to do things this way: "If both
>> > the pointer operand and the result point to elements of the same array
>> > object, or one past the last element of the array object, the
>> > evaluation shall not produce an overflow, otherwise, the behavior is
>> > undefined." (6.5.6[8])

>> > In your proposal incrementing pointer to one past the last element of
>> > an array does produce an overflow.

>> No, it doesn't.  My assumption is that pointer arithmetic acts
>> like unsigned integer arithmetic.  0xffffffff + 1 yields 0; there
>> is no overflow.
>>
>> (It's not at all clear what the word "overflow" in 6.5.6 means;
>> perhaps it just means that the behavior is defined.)

> In the absence of a distinct definition I think that "overflow" has
> its common meaning: that the result cannot fit within the destination
> operand. In your case the result of 0xFFFFFFFF + 1 would require 33
> bits, which will not fit within the architecture's pointer operand.

That's not how C defines the term "overflow" -- and we're talking about
C here, after all.

C99 6.2.5p9:
A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting
unsigned integer type is reduced modulo the number that is one
greater than the largest value that can be represented by the
resulting type.

[snip]

> [Side note: I don't think that treating addresses as signed integer
> types on machine level has any reason. While addresses may correspond
> to negative or positive numbers, there's no meaning for addresses'
> "absolute value" as there is for numbers.]

You might want to reserve positive addresses for one purpose and
negative addresses for another (kernel vs. user, code vs. data,
whatever).

Of course C doesn't treat addresses as numbers at all (except that
pointer/integer conversions "are intended to be consistent with
the addressing structure of the execution environment", which is
extremely and appropriately vague).

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Jun 18, 2010
18. ### StargazerGuest

On Jun 18, 3:11 am, Keith Thompson <> wrote:
[...]
> > But IMO it's a rather scholastic
> > question: pointers to one-past the last element of an array were
> > introduced for comparisons and pointer arithmetic involving pointers
> > to elements of that array and are probably not useful for anything
> > else.

> I agree that it's largely academic; I never said otherwise.
> Real-world implementations are unlikely to allocate anything
> immediately before the address corresponding to a null pointer,
> and even if they did, non-contrived programs are unlikely to compute
> the address just past the end of the single object that happens to
> live there.

I meant that the pitfalls of a pointer to one past the last element of
an array, and its semantics, are the more academic part; the situation
where objects sit just "before" the null pointer, wrapping past the
highest addresses, with the null pointer represented as all-bits-zero,
was very real: at least the mentioned x86 and ARM architectures have
their boot address in the highest 64K of the address space. The boot
monitor/BIOS may occupy all the space up to 0xffffffff (it's even
natural) and keep some array at the highest addresses; walking through
such an array to its end would be exactly your example. I am glad that
the issue was discussed.

Daniel

Stargazer, Jun 18, 2010
19. ### sandeepGuest

Eric Sosman writes:
> When attempting arithmetic on a null-valued pointer, it's the
> third of these situations: The Standard describes what happens if the
> pointer points at or just after an object, but a null-valued pointer
> doesn't do so. Therefore, the Standard's description does not cover the
> case, and the behavior is "undefined by omission."

Would it be a useful extension if the Standard stated that all
arithmetic operations on null pointers were well defined and yielded
another null pointer? Maybe some code could then save on checks for
NULL?

sandeep, Jun 18, 2010
20. ### Tim RentschGuest

Seebs <> writes:

> On 2010-06-16, Richard Heathfield <> wrote:
>> Yes. Since a null pointer doesn't point to an object, neither it nor the
>> result can point to "elements of the same array object, or one past the
>> last element of the array object", and therefore the behaviour is undefined.

> Okay, I haven't had any caffeine today, but.
>
> What tells us that a null pointer cannot point one past the last
> element of an array object?
>
> I agree that nothing tells us that it *does* point one past the last
> element of an array object. But could it?

In comp.std.c, I would answer that question this way:
"Clearly that point of view is at least debatable, and it
seems like the plain text of the Standard allows it.
However, if it does, it's very likely that it was just
an oversight."

In comp.lang.c, I have a different answer: Of course
it's not allowed. Even if the Standard doesn't come
right out and say it, the wording used in various
places clearly indicates that pointers past the end
of an array aren't supposed to be null pointers.
Furthermore there are simple and obvious code patterns,
undoubtedly in relatively common use, that would
break if one past the end of an array were allowed to
be a null pointer.

[Here I was going to put in a clever statement about
what would happen to an implementer who dared to do
such a thing, but I decided it wasn't appropriate for