Neatest way to get the end pointer?

  • Thread starter Tomás Ó hÉilidhe

Tomás Ó hÉilidhe

I commonly use pointers to iterate thru an array. For example:

int my_array[X];

int *p = my_array;
int const *const pend = my_array + sizeof my_array/sizeof*my_array;

do *p++ = 42;
while (pend != p);

(Yes I realise the lack of spaces in the sizeof thing above is
disgusting, but I've gotten so sick of writing it out that I make it as
compact as possible)

I can't count how many times I use this construct in my code every day.
It's a right pain in the ass to always have to write out the long-winded
initialiser for pend, so I'm considering switching to initialising pend
as follows:

int const *const pend = *(&my_array+1);

1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located after
the current one.
4) *(&my_array+1) decays to the address of the first element in the
non-existent array after the current one, which is also the "pend"
address for the array that actually exists.
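
For anyone who wants to try the two forms side by side, here is a minimal sketch. ARRAY_LEN and fill_and_sum are made-up names, not from the post. The second form uses a cast on &my_array + 1 instead of applying * to it, which sidesteps the question of whether that dereference is allowed.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN 8  /* hypothetical length standing in for X */

/* Fills a local array with 42 and returns the sum of its elements.
   use_address_form selects how the end pointer is computed. */
long fill_and_sum(int use_address_form)
{
    int my_array[ARRAY_LEN];
    int *p = my_array;
    int const *pend;
    long sum = 0;

    if (use_address_form) {
        /* &my_array has type int (*)[ARRAY_LEN]; adding 1 steps over the
           whole array. The cast turns that into an int pointer without
           applying * to it. */
        pend = (int const *)(&my_array + 1);
    } else {
        /* The long-winded form from the post. */
        pend = my_array + sizeof my_array / sizeof *my_array;
    }
    assert(pend == my_array + ARRAY_LEN);   /* both forms should land here */

    do *p++ = 42;
    while (pend != p);

    for (p = my_array; p != pend; ++p)
        sum += *p;
    return sum;
}
```

Calling fill_and_sum(0) and fill_and_sum(1) should give the same result, since both end pointers are expected to refer to the same address.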

It's a hell of a lot shorter to write, and also I think it's a little
less vulnerable to typos because you'll most likely get a type mismatch
if it's written wrongly.

Anyway, just wondering what people think of the alternative. Saves me
that little rush of pissed-off-ness every time I've to write out the
tedious sizeof thing.

(Oh and by the way, I wouldn't use a macro such as ARRLEN(my_array)
because then I'd have to worry about including the necessary header
file... which is also the reason why I don't use NULL.)
 

Army1987

Tomás Ó hÉilidhe said:
I commonly use pointers to iterate thru an array. For example:

int my_array[X];

int *p = my_array;
int const *const pend = my_array + sizeof my_array/sizeof*my_array;

do *p++ = 42;
while (pend != p);
Use an understandable macro for the array size, then you can have pend =
my_array + X.
(Also, I'd write it as `for (p = my_array; p < pend; p++) *p = 42;`. Or
even directly p < my_array + X, since a decent compiler should optimize
that, shouldn't it?)
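
A rough sketch of that suggestion, assuming a made-up macro name MY_ARRAY_LEN in place of X and a hypothetical wrapper function so the loop can be run on its own:

```c
#include <assert.h>

#define MY_ARRAY_LEN 16   /* hypothetical name and value for X */

/* Fills a local array with 42 using the for-loop form and returns the sum. */
int fill_and_sum_for(void)
{
    int my_array[MY_ARRAY_LEN];
    int *p;
    int sum = 0;

    /* The bound is written directly as my_array + MY_ARRAY_LEN. */
    for (p = my_array; p < my_array + MY_ARRAY_LEN; p++)
        *p = 42;
    assert(p == my_array + MY_ARRAY_LEN);   /* loop stops one past the end */

    /* Walk back down, decrementing before each read. */
    while (p != my_array)
        sum += *--p;
    return sum;
}
```

Unlike the do/while version, the for loop also behaves when the range is empty, because the test happens before the first store.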
 

vippstar

I commonly use pointers to iterate thru an array. For example:

int my_array[X];

int *p = my_array;
int const *const pend = my_array + sizeof my_array/sizeof*my_array;

do *p++ = 42;
while (pend != p);

(Yes I realise the lack of spaces in the sizeof thing above is
disgusting, but I've gotten so sick of writing it out that I make it as
compact as possible)

I can't count how many times I use this construct in my code every day.
It's a right pain in the ass to always have to write out the long-winded
initialiser for pend, so I'm considering switching to initialising pend
as follows:

int const *const pend = *(&my_array+1);

1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located after
the current one.
And you invoke undefined behavior.
It's a hell of a lot shorter to write, and also I think it's a little
less vulnerable to typos because you'll most likely get a type mismatch
if it's written wrongly.
You could have a macro, or not use pointers (use that X).
Anyway, just wondering what people think of the alternative. Saves me
that little rush of pissed-off-ness every time I've to write out the
tedious sizeof thing.
But it invokes undefined behavior.
(Oh and by the way, I wouldn't use a macro such as ARRLEN(my_array)
because then I'd have to worry about including the necessary header
file... which is also the reason why I don't use NULL.)
NULL is defined in several standard C headers, among them
<stdio.h>.
I cannot believe you don't use NULL because you worry about it not
being defined.
 

Keith Thompson

I commonly use pointers to iterate thru an array. For example:

int my_array[X]; [snip]
1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located after
the current one.
And you invoke undefined behavior.
[...]

No, he doesn't. &my_array+1 is a valid address, just past the end of
my_array. Computing this address is ok; attempting to dereference it
would invoke UB.
 

William Ahern

On Feb 6, 12:40 am, "Tomás Ó hÉilidhe" wrote:
NULL is defined in several standard C headers, among them
<stdio.h>.
I cannot believe you don't use NULL because you worry about it not
being defined.

I don't use NULL 'cause I don't know to what it's defined. Usually the NULL
macro expands to `0' or `(void *)0'. The former, typical on *BSD, needs to
be cast when used from a comma expression and the return type of your
function is a pointer, otherwise it might be evaluated as an int, which
might not be wide enough if (sizeof (int) != sizeof (void *)).

if (some_test_fails)
    return (errno = ESOMETHING), (void *)NULL;

Likewise, you have to cast it when passing as a vararg.

execl("/bin/true", "true", (char *)NULL);
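
To see why the cast matters for varargs without running execl itself, here is a small sketch with a hypothetical variadic function, count_strings, that walks its arguments until it reads a null char pointer. Arguments in the variable part get no conversion to a parameter type, so a bare 0 would be passed as an int.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts string arguments up to a null char *
   sentinel. The sentinel must really be a char *, because va_arg below
   reads each argument as one. */
size_t count_strings(const char *first, ...)
{
    va_list ap;
    size_t n = 0;
    const char *s;

    va_start(ap, first);
    for (s = first; s != 0; s = va_arg(ap, char *))
        ++n;
    va_end(ap);
    return n;
}
```

A call then looks like count_strings("true", "-x", (char *)0), in the same shape as the execl call above.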

Usually I just use an unadorned `0' in my code. I do this in part because,
like the OP, I don't feel like including <stddef.h> or <stdio.h> or any
other header that I don't need. I comment my include statements to describe
what I'm [trying] to import; it got really old doing:

#include <stddef.h> /* NULL */

But, to each his own. I don't know very many people who do this. I'm okay
being alone ;)
 

vippstar

I commonly use pointers to iterate thru an array. For example:
int my_array[X]; [snip]
1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located after
the current one.
And you invoke undefined behavior.

[...]

No, he doesn't. &my_array+1 is a valid address, just past the end of
my_array. Computing this address is ok; attempting to dereference it
would invoke UB.
I think that when the result of an expression is a non-valid pointer,
the behavior is undefined.
Correct me if I am wrong, but I have also seen comments in GNU code
like this:
 

William Ahern

Tomás Ó hÉilidhe said:
(Oh and by the way, I wouldn't use a macro such as ARRLEN(my_array)
because then I'd have to worry about including the necessary header
file... which is also the reason why I don't use NULL.)

I keep all my macros in a single file, e.g. `eval.h'. It has ARRAYLEN,
STRINGIFY, PASTE, XPASTE, PP_NARG (thanks Laurent!), most of *BSD's
<sys/param.h> (MIN, MAX, howmany, etc), and a few other gems I've concocted.

It's rare that I don't use at least one of these in a project. I've never
satisfactorily packaged my other tools, but this seems to work well (though
I'm still torn between `arraylen' and `ARRAYLEN').
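
As a rough idea of what such a header might contain, here is a sketch. The definitions are guesses written for illustration (the actual eval.h is not shown in the post), with MIN, MAX and howmany following the usual *BSD <sys/param.h> shapes. sum_table is a made-up function showing ARRAYLEN used for an end pointer.

```c
#include <stddef.h>

/* Guessed definitions; the actual contents of eval.h are not shown. */
#define ARRAYLEN(a)    (sizeof (a) / sizeof (a)[0])
#define STRINGIFY_(x)  #x
#define STRINGIFY(x)   STRINGIFY_(x)          /* expands x, then stringifies */
#define PASTE(a, b)    a##b
#define XPASTE(a, b)   PASTE(a, b)            /* expands a and b, then pastes */
#define MIN(a, b)      (((a) < (b)) ? (a) : (b))
#define MAX(a, b)      (((a) > (b)) ? (a) : (b))
#define howmany(x, y)  (((x) + ((y) - 1)) / (y))

/* Sums a table without repeating its length anywhere. */
int sum_table(void)
{
    static const int table[] = { 3, 1, 4, 1, 5 };
    int const *p = table;
    int const *const pend = table + ARRAYLEN(table);
    int sum = 0;

    while (p != pend)
        sum += *p++;
    return sum;
}
```

Note that ARRAYLEN, like the raw sizeof idiom, only works on a real array and silently gives the wrong answer on a pointer.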
 

vippstar

I don't use NULL 'cause I don't know to what it's defined. Usually the NULL
macro expands to `0' or `(void *)0'. The former, typical on *BSD, needs to
be cast when used from a comma expression and the return type of your
function is a pointer, otherwise it might be evaluated as an int, which
might not be wide enough if (sizeof (int) != sizeof (void *)).
I don't think that can happen.
if (some_test_fails)
    return (errno = ESOMETHING), (void *)NULL;
As I said, NULL does not need to be cast here.
You don't need the comma operator either. The expression '(x, y, z)'
evaluates to the type of z with value z.
Likewise, you have to cast it when passing as a vararg.
You have to cast 0 to (char *)0 too.
execl("/bin/true", "true", (char *)NULL);

Usually I just use an unadorned `0' in my code. I do this in part because,
like the OP, I don't feel like including <stddef.h> or <stdio.h> or any
other header that I don't need.
That in my opinion is very bad practice for a project.
I comment my include statements to describe
what I'm [trying] to import; it got really old doing:

#include <stddef.h> /* NULL */

But, to each his own. I don't know very many people who do this. I'm okay
being alone ;)
Other programmers in your project might not be okay with that thought,
ask them first.
As for me, I wouldn't. Consider
--
char *p;
/* ... */
if(p == 0) /* did you really mean p == 0 or *p == 0 ? */
if(p == NULL) /* clearly meant p == NULL */
--
 

Walter Roberson

William Ahern said:
I don't use NULL 'cause I don't know to what it's defined. Usually the NULL
macro expands to `0' or `(void *)0'. The former, typical on *BSD, needs to
be cast when used from a comma expression and the return type of your
function is a pointer, otherwise it might be evaluated as an int, which
might not be wide enough if (sizeof (int) != sizeof (void *)).
if (some_test_fails)
return (errno = ESOMETHING), (void *)NULL;

No, if the return type of your function is a pointer then the
return value will be converted to the appropriate type as if by
assignment. Assigning a compile-time 0 to a pointer type is guaranteed
to result in a null pointer.
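
A small sketch of the idiom under discussion, using a hypothetical function find_negative and EDOM as a stand-in for ESOMETHING. One detail worth keeping in mind: a comma expression is not an integer constant expression (C99 6.6p3), so `(errno = EDOM), 0` is a plain int rather than a null pointer constant. The sketch therefore keeps a (void *) cast on the right-hand operand.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical lookup: returns a pointer to the first negative element,
   or a null pointer with errno set to EDOM if there is none. */
int *find_negative(int *a, size_t n)
{
    size_t i;

    assert(a != 0 || n == 0);
    for (i = 0; i < n; i++)
        if (a[i] < 0)
            return &a[i];

    /* The comma expression has type void *, which the return statement
       converts to int * as if by assignment. */
    return (errno = EDOM), (void *)0;
}
```

Splitting this into `errno = EDOM; return 0;` inside a block does the same job without the comma operator.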
 

Peter Nilsson

Keith Thompson said:
I commonly use pointers to iterate thru an array.
For example:
    int my_array[X]; [snip]
1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent
array located after the current one.

And you invoke undefined behavior.

No, he doesn't.  &my_array+1 is a valid address,

Though it requires a conversion back to int * if used
as a pointer 'index' compared against an int * iterator.
I think that when the result of an expression is a non-
valid pointer, the behavior is undefined.

True, but one byte past the end of an array _is_ a valid
pointer.
Correct me if I am wrong,

Keith already has.
but I have also seen comments in GNU code like this:

One byte beyond is fine (though you can't dereference it),
but there are no special rights for one (or more) byte(s)
_before_.

Doesn't matter. Implementations do not have to cope with
'one byte before'. An allocation can be made at the start
of a memory page. Merely calculating an address prior to
such a page may trap. In contrast, implementations _must_
cope with one byte beyond, which simply means that at
least one byte (but typically _at most_ one byte) is
reserved at the end of the memory space.
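
To illustrate the asymmetry, here is a sketch of walking an array backwards without ever forming a pointer before its first element. reverse_copy is a made-up helper; the point is only the shape of the loop, which starts at one past the end and decrements before each read.

```c
#include <assert.h>
#include <stddef.h>

/* Copies src[0..n-1] into dst in reverse order without ever forming
   src - 1. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    const int *p;

    assert(dst != 0 && src != 0);
    p = src + n;                /* one past the end: valid to compute */

    while (p != src)
        *dst++ = *--p;          /* step back first, then read */
}
```

A loop written as `for (p = src + n - 1; p >= src; p--)` would instead compute src - 1 on its final decrement, which is the case the standard does not cover.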
 

vippstar

Keith Thompson said:
(e-mail address removed) writes:
I commonly use pointers to iterate thru an array.
For example:
int my_array[X];
[snip]
1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent
array located after the current one.
And you invoke undefined behavior.
No, he doesn't. &my_array+1 is a valid address,

Though it requires a conversion back to int * if used
as a pointer 'index' compared against an int * iterator.
I think that when the result of an expression is a non-
valid pointer, the behavior is undefined.

True, but one byte past the end of an array _is_ a valid
pointer.
Ah I see, thanks a lot.
Correct me if I am wrong,

Keith already has.
Hehe, sorry mr Keith for doubting :p.

However, as you said mr Nilsson, he (the OP) is then dereferencing it,
and invoking undefined behavior.
Also, the whole topic is moot.
If sizeof foo / sizeof *foo gives you the number of elements of foo,
then foo must be an array.
If it's an array it must be defined as 'foo[X]' or similar.
If X is a constant, foo + X is the one-past-the-end pointer (which is
not valid to dereference); if X is a variable, it is still foo + X,
but foo is then a variable length array.
Therefore, there is no need for foo + sizeof foo / sizeof *foo.
 

A. Sinan Unur

(e-mail address removed) wrote in

I commonly use pointers to iterate thru an array. For example:
int my_array[X]; [snip]
1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located
after the current one.
And you invoke undefined behavior.

[...]

No, he doesn't. &my_array+1 is a valid address, just past the end of
my_array. Computing this address is ok; attempting to dereference it
would invoke UB.
I think that when the result of an expression is a non-valid pointer,
the behavior is undefined.
Correct me if I am wrong, but I have also seen comments in GNU code
like this:

That is different, though. Pointing one past the end of the array is
valid whereas pointing one before the start of the array is not.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf

§6.5.6, page 83

Moreover, if the expression P points to the last
element of an array object, the expression (P)+1 points one past the
last element of the array object, and if the expression Q points one
past the last element of an array object, the expression (Q)-1 points to
the last element of the array object. If both the pointer operand and
the result point to elements of the same array object, or one past the
last element of the array object, the evaluation shall not produce an
overflow; otherwise, the behavior is undefined. If the result points one
past the last element of the array object, it shall not be used as the
operand of a unary * operator that is evaluated.


Sinan
 

Ben Bacarisse

I commonly use pointers to iterate thru an array. For example:

int my_array[X];

int *p = my_array;
int const *const pend = my_array + sizeof my_array/sizeof*my_array;
It's a right pain in the ass to always have to write out the long-winded
initialiser for pend, so I'm considering switching to initialising pend
as follows:

int const *const pend = *(&my_array+1);

1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located after
the current one.
And you invoke undefined behavior.

Presumably not from the pointer arithmetic alone? Is it the dereference
that you think causes UB?

That is my worry. One can construct a pointer "one past the end" of
any object, but * can't be applied "if it is evaluated" (6.5.6 p8). I
take that to refer to the rule that says that when & is applied to *,
neither is evaluated (6.5.3.2 p3).

I can't see anything wrong with:

int *pend = (void *)(&my_array + 1);

from a language point of view, but it is fragile in that it breaks
when the code moves to a function and my_array becomes a pointer.
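
A sketch of that fragility and one way around it. SAMPLE_LEN, sum_range and sum_whole_array are hypothetical names. Inside sum_range the parameters are plain pointers, so neither sizeof nor &first + 1 could recover the caller's array bounds there; the end pointer has to be computed where the array is still visible and passed in.

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLE_LEN 4   /* hypothetical array length */

/* Works on a half-open range [first, last); knows nothing about the
   array the pointers came from. */
long sum_range(const int *first, const int *last)
{
    long total = 0;

    assert(first != 0 && last != 0);
    while (first != last)
        total += *first++;
    return total;
}

long sum_whole_array(void)
{
    int my_array[SAMPLE_LEN] = { 1, 2, 3, 4 };
    /* Here my_array is a real array, so &my_array + 1 is one past its
       end; the conversion through void * avoids applying * to it. */
    const int *pend = (void *)(&my_array + 1);

    return sum_range(my_array, pend);
}
```

If my_array were a function parameter instead, &my_array + 1 would point just past the pointer variable itself, and the code would still compile without complaint.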

Since "C has no array values" is one way of looking at the whole
array/pointer relationship in C, one could argue that the * is not
evaluated in

int *pend = *(&my_array + 1);

Is the utility of this code sufficient to warrant a change to the
meaning of * (or definition of pointer arithmetic) to permit its use
when the "value" would be an array and would therefore be immediately
converted to a pointer?
 

Richard Tobin

if (some_test_fails)
return (errno = ESOMETHING), (void *)NULL;
No, if the return type of your function is a pointer then the
return value will be converted to the appropriate type as if by
assignment.

But the code is revolting anyway. I can see the temptation to
do something like

if(condition) a=1, b=2;

to avoid a block, but shoving an extra assignment into a return
statement is an unprovoked assault on readability.

-- Richard
 

Keith Thompson

I commonly use pointers to iterate thru an array. For example:
int my_array[X]; [snip]
1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located after
the current one.
And you invoke undefined behavior.

[...]

No, he doesn't. &my_array+1 is a valid address, just past the end of
my_array. Computing this address is ok; attempting to dereference it
would invoke UB.
I think that when the result of an expression is a non-valid pointer,
the behavior is undefined.
Correct me if I am wrong, but I have also seen comments in GNU code
like this:

Computing a pointer before the beginning of an array invokes UB;
computing a pointer just past the end of an array does not.

This is mentioned in passing in the answer to question 6.17 in the
comp.lang.c FAQ, <http://www.c-faq.com/> (can anyone find a more
explicit reference?).

The standard's rather long-winded explanation of this is in C99
6.5.6p8 (quoting from n1256; there are no change bars on this
paragraph):

When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If
the pointer operand points to an element of an array object, and
the array is large enough, the result points to an element offset
from the original element such that the difference of the
subscripts of the resulting and original array elements equals the
integer expression. In other words, if the expression P points to
the i-th element of an array object, the expressions (P)+N
(equivalently, N+(P)) and (P)-N (where N has the value n) point
to, respectively, the i+n-th and i−n-th elements of the array
object, provided they exist. Moreover, if the expression P points
to the last element of an array object, the expression (P)+1
points one past the last element of the array object, and if the
expression Q points one past the last element of an array object,
the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to
elements of the same array object, or one past the last element of
the array object, the evaluation shall not produce an overflow;
otherwise, the behavior is undefined. If the result points one
past the last element of the array object, it shall not be used as
the operand of a unary * operator that is evaluated.
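
The identities in that paragraph can be spelled out on a small array. This is only an illustrative sketch; DEMO_LEN and check_pointer_arithmetic are made-up names, and the one-past pointer q is computed and compared but never dereferenced.

```c
#include <stddef.h>

#define DEMO_LEN 5   /* hypothetical array length */

/* Returns 1 if the identities quoted from 6.5.6p8 hold for a sample
   array, 0 otherwise. */
int check_pointer_arithmetic(void)
{
    int a[DEMO_LEN] = { 10, 20, 30, 40, 50 };
    int *p = &a[1];               /* P points to the i-th element, i = 1 */
    int *last = &a[DEMO_LEN - 1];
    int *q = last + 1;            /* one past the last element */

    return *(p + 2) == a[3]       /* (P)+N is the i+n-th element */
        && *(p - 1) == a[0]       /* (P)-N is the i-n-th element */
        && q - 1 == last          /* (Q)-1 points back at the last element */
        && q - a == DEMO_LEN;     /* distance from start to one past */
}
```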
 

Peter Nilsson

Hehe, sorry mr Keith for doubting :p.

However, as you said mr Nilsson, he (the OP) is then
dereferencing it, and invoking undefined behavior.

I never said the OP dereferenced it. The actual code had
been snipped by that point. Here it is...

int *p = my_array;
int const *const pend = my_array + sizeof my_array/sizeof*my_array;
do *p++ = 42;
while (pend != p);

At no stage is pend dereferenced; and the loop exits on
p == pend, so the address is not dereferenced by p
either.
 

vippstar

I never said the OP dereferenced it. The actual code had
been snipped by that point. Here it is...

int *p = my_array;
int const *const pend = my_array + sizeof my_array/sizeof*my_array;
do *p++ = 42;
while (pend != p);

At no stage is pend dereferenced; and the loop exits on
p == pend, so the address is not dereferenced by p
either.
Ah, I was a bit confused I guess.
However, OP _does_ dereference it. Not in that snippet, but in 4).
 

vippstar

(e-mail address removed) writes:
I commonly use pointers to iterate thru an array. For example:
int my_array[X];
[snip]
1) my_array is an int[X]
2) &my_array is an int(*)[X]
3) &my_array+1 is the address of the non-existent array located after
the current one.
And you invoke undefined behavior.
[...]
No, he doesn't. &my_array+1 is a valid address, just past the end of
my_array. Computing this address is ok; attempting to dereference it
would invoke UB.
I think that when the result of an expression is a non-valid pointer,
the behavior is undefined.
Correct me if I am wrong, but I have also seen comments in GNU code
like this:

Computing a pointer before the beginning of an array invokes UB;
computing a pointer just past the end of an array does not.

This is mentioned in passing in the answer to question 6.17 in the
comp.lang.c FAQ, <http://www.c-faq.com/> (can anyone find a more
explicit reference?).

The standard's rather long-winded explanation of this is in C99
6.5.6p8 (quoting from n1256; there are no change bars on this
paragraph):

When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If
the pointer operand points to an element of an array object, and
the array is large enough, the result points to an element offset
from the original element such that the difference of the
subscripts of the resulting and original array elements equals the
integer expression. In other words, if the expression P points to
the i-th element of an array object, the expressions (P)+N
(equivalently, N+(P)) and (P)-N (where N has the value n) point
to, respectively, the i+n-th and i−n-th elements of the array
object, provided they exist. Moreover, if the expression P points
to the last element of an array object, the expression (P)+1
points one past the last element of the array object, and if the
expression Q points one past the last element of an array object,
the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to
elements of the same array object, or one past the last element of
the array object, the evaluation shall not produce an overflow;
otherwise, the behavior is undefined. If the result points one
past the last element of the array object, it shall not be used as
the operand of a unary * operator that is evaluated.
I have thought about this, and clearly the standard talks about
arrays.
If we have foo[N], then foo + N is a valid pointer that cannot be
dereferenced.
However, in foo = &bar; foo+1 is *not* a valid pointer because &bar is
a pointer, not an array.
Therefore, in the OP's example, the expression cannot be computed and does
invoke undefined behavior.
Here is an example of what I am trying to say:
--
int * foo;
int bar;
int baz[N];
foo = baz + N; /* valid */
foo = &bar + 1; /* invalid */
foo = &bar; /* valid */
foo++; /* invalid */
--
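
For comparison, C99 6.5.6p7 says that, for the purposes of the additive operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one. Read that way, the lines marked invalid above would be fine to compute (though not to dereference). A minimal sketch under that reading, with a hypothetical function name:

```c
/* Returns 1 if a one-past pointer to a lone int behaves as 6.5.6p7
   describes: it can be computed, compared and stepped back, as long as
   it is never dereferenced itself. */
int one_past_scalar(void)
{
    int bar = 7;
    int *foo = &bar + 1;     /* computed, never dereferenced */

    return foo - &bar == 1   /* distance of exactly one element */
        && foo - 1 == &bar   /* stepping back lands on bar again */
        && *(foo - 1) == 7;  /* only the in-range pointer is read */
}
```

The same paragraph would apply to &my_array, since my_array as a whole is a single object of type int[X].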
 

Peter Nilsson

Ah, I was a bit confused I guess.
However, OP _does_ dereference it. Not in that snippet,
but in 4)

Again, it is not dereferenced...

The type of my_array is int[]
The type of &my_array is int (*)[]
The type of &my_array + 1 is int (*)[]
The type of *(&my_array + 1) is int []

The last expression will decay to an int * when used in the
assignment to pend. At no stage is that pointer dereferenced.
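
One way to check those types without touching any memory is sizeof, which does not evaluate its operand when no variable length array is involved. The sketch below is an illustration only; TYPES_LEN and check_types are made-up names, and a fixed length is used so the array types are complete.

```c
#include <stddef.h>

#define TYPES_LEN 6   /* hypothetical array length */

/* Returns 1 if the sizes match the types listed above. Every operand of
   sizeof here is unevaluated, so nothing is dereferenced at run time. */
int check_types(void)
{
    int my_array[TYPES_LEN];

    return sizeof my_array == TYPES_LEN * sizeof(int)            /* int[6]     */
        && sizeof &my_array == sizeof(int (*)[TYPES_LEN])        /* int (*)[6] */
        && sizeof (&my_array + 1) == sizeof(int (*)[TYPES_LEN])  /* int (*)[6] */
        && sizeof *(&my_array + 1) == sizeof(int[TYPES_LEN]);    /* int[6]     */
}
```

This only confirms the types; it says nothing either way about whether the evaluated expression *(&my_array + 1) is permitted.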
 

vippstar

Ah, I was a bit confused I guess.
However, OP _does_ dereference it. Not in that snippet,
but in 4)

Again, it is not dereferenced...

The type of my_array is int[]
The type of &my_array is int (*)[]
The type of &my_array + 1 is int (*)[]
The type of *(&my_array + 1) is int []

The last expression will decay to an int * when used in the
assignment to pend. At no stage is that pointer dereferenced.
What I meant is that (&my_array + 1) is dereferenced, which would be
invalid, and if my reply to Mr Thompson is correct, then even
computing &my_array+1 is invalid.
Remember: we are no longer talking about arrays but pointers, so the
pointer-after-the-last-element rule does not apply.
 
