Pointer arithmetics in 2D arrays

Army1987 · May 13, 2007

Is this code legal (according to the strictest possible sane[1]
interpretation of the standard, regardless of wheter it does work
on all implementations in the known universe)?

#include <stdio.h>
void print_square_matrix(int *ptr, int order)
{
int i, j;
for (i = 0; i < order; i++) {
for (j = 0; j < order; j++)
printf("%3d ", *ptr++);
putchar('\n');
}
}

int main(void)
{
int matrix[3][3] = { {157, 64, 13},
{ 0, -16, 128},
{ 54, 42, -23} }
int *p = &matrix[0][0]
print_square_matrix(*matrix, 3);
return 0;
}

I never directly overflow an array. After 13 is printed, ptr points
to an element past the end of the first row, which is explicitly
allowed. That element happens to exist and is matrix[1][0]. So,
after the newline, the printf prints " 0 " and then *ptr is -16.
The question is "Am I missing something, or (p + 3) + 1 is ok and
points to -16, while p + (3 + 1) causes UB?"

[1] "sane" here means "excluding those who say int main(void)
{ return 0; } is conforming but not strictly conforming because an
implementation-defined form of the status successful termination is
returned to the host environment".

Richard Heathfield · May 13, 2007

Army1987 said:

Is this code legal (according to the strictest possible sane[1]
interpretation of the standard, regardless of wheter it does work
on all implementations in the known universe)?

Er, no. It doesn't compile. You missed a semicolon. But let's take a
closer look anyway...

#include <stdio.h>
void print_square_matrix(int *ptr, int order)
{
int i, j;
for (i = 0; i < order; i++) {
for (j = 0; j < order; j++)
printf("%3d ", *ptr++);
putchar('\n');
}
}

int main(void)
{
int matrix[3][3] = { {157, 64, 13},
{ 0, -16, 128},
{ 54, 42, -23} }
int *p = &matrix[0][0]

Here's where you missed your semicolon. Incidentally, what is p for?

print_square_matrix(*matrix, 3);

matrix has type int[3][3], which decays to int(*)[3] when used in a
value context. So *matrix has type int[3] - but you use this in a value
context too, so it becomes int *. That's legal to pass to
print_square_matrix, which takes int *. But in print_square_matrix, you
are, strictly speaking, breaking the rules. Why? Well, you pass a
pointer that points to the first element in an array of three ints, but
you roll through nine ints in your function. Therefore, you're
overstepping the bounds of an array. The fact that you know perfectly
well that you have six more ints immediately following is what has
tempted you to do this, and of course you are overwhelmingly likely to
get away with it, but I believe that the behaviour is nevertheless,
strictly speaking, undefined for the same reason that any array bounds
violation leads to undefined behaviour.

<snip>

Army1987 · May 13, 2007

Richard Heathfield said:
Army1987 said: [snip]

int main(void)
{
int matrix[3][3] = { {157, 64, 13},
{ 0, -16, 128},
{ 54, 42, -23} }
int *p = &matrix[0][0]

Click to expand...

Here's where you missed your semicolon. Incidentally, what is p for?

What's worse is that I missed *two* semicolons...
(p is for the example below.)

Suppose I have int *p, *q; p = &matrix[0][0]; q = &matrix[1][0];
p += 3;
Now p and q are equal. PROOF: p + 3 equals (int *)((unsigned char *)p + 3 *
sizeof *p);
q equals &*(matrix[1] + 0) = matrix[1] = *(matrix + 1) = (int *)((unsigned
char *)matrix + sizeof *matrix)
= (int *)((unsigned char *)matrix + sizeof (int [3])) = (int *)((unsiged
char *)matrix + 3 * sizeof(int))
But, IIUC, q++ now is legal and p++ isn't.
Right?

Richard Heathfield · May 13, 2007

Army1987 said:

Richard Heathfield said:
Richard Heathfield said:

Army1987 said: [snip]

int main(void)
{
int matrix[3][3] = { {157, 64, 13},
{ 0, -16, 128},
{ 54, 42, -23} }
int *p = &matrix[0][0]

Click to expand...

Here's where you missed your semicolon. Incidentally, what is p for?

Click to expand...

What's worse is that I missed *two* semicolons...

So you did.

(p is for the example below.)

Suppose I have int *p, *q; p = &matrix[0][0]; q = &matrix[1][0];
p += 3;

Um, yeah, you can do that, and you're now pointing one beyond the end of
matrix[0], which is legal as long as you don't dereference.

Now p and q are equal.

I don't think you're allowed to compare them, are you? After all, one of
them is pointing into-or-one-past-the-end-of matrix[0], whereas the
other is pointing into-or-one-past-the-end-of matrix[1], and these are
different objects. (I must say I find the Standard rather silly at this
point, given that arrays are guaranteed to be contiguous.)

Bart van Ingen Schenau · May 13, 2007

Army1987 said:
Richard Heathfield said:

Army1987 said: [snip]

int main(void)
{
int matrix[3][3] = { {157, 64, 13},
{ 0, -16, 128},
{ 54, 42, -23} }
int *p = &matrix[0][0]

Click to expand...

Here's where you missed your semicolon. Incidentally, what is p for?

Click to expand...

What's worse is that I missed *two* semicolons...
(p is for the example below.)

Suppose I have int *p, *q; p = &matrix[0][0]; q = &matrix[1][0];
p += 3;
Now p and q are equal.

No, they are not guaranteed to be equal. This just happens to be the
case if the implementation does not keep bounds information in pointer
values.
Suppose I have an implementation that DOES store bounds information, the
array matrix is store in location 0x100, and we have 2-byte ints.
Initially we have:
p = &matrix[0][0]; /* valueof(p) = {.address=0x100, .low=0, .high=3} */
q = &matrix[1][0]; /* valueof(q) = {.address=0x106, .low=0, .high=3} */
p += 3; /* valueof(p) = {.address=0x106, .low=-3, .high=0} */

As you can see, p and q refer to the same address, but with different
bounds information. Therefor p != q.

PROOF: p + 3 equals (int *)((unsigned char *)p
+ 3 * sizeof *p);
q equals &*(matrix[1] + 0) = matrix[1] = *(matrix + 1) = (int
*)((unsigned char *)matrix + sizeof *matrix)
= (int *)((unsigned char *)matrix + sizeof (int [3])) = (int
*)((unsiged char *)matrix + 3 * sizeof(int))
But, IIUC, q++ now is legal and p++ isn't.
Right?

Your proof is not, as it assumes an implementation without bounds
information. The C standard does not make such an assumption.

Bart v Ingen Schenau

Pierre Asselin · May 13, 2007

Richard Heathfield said:
Army1987 said:

matrix has type int[3][3], which decays to int(*)[3] when used in a
value context. So *matrix has type int[3] - but you use this in a value
context too, so it becomes int *. That's legal to pass to
print_square_matrix, which takes int *. But in print_square_matrix, you
are, strictly speaking, breaking the rules. Why? Well, you pass a
pointer that points to the first element in an array of three ints, but
you roll through nine ints in your function. Therefore, you're
overstepping the bounds of an array.

Such a strict interpretation would cripple the use of C in numerical
computations. I don't think that was the intent, based in part on a
a similar example I noticed in the C99 Rationale v5.10, end of
section 6.7.5.3 .

<quote>
void g(double *ap, int n)
{
double (*a)[n] = (double (*)[n]) ap;
/* ... */ a[1][2] /* ... */
}

[ ... ] the function g can be called as in

{
double x[10][10];
g(&x[0][0], 10);
}
</quote>

The cited example has "&x[0][0]" as opposed to "*matrix" in the
orignal post, but that seems like a very subtle distinction. In
the Rationale code, "x[0][0]" is a mere scalar and "&x[0][0]" is
its address, but it is clear that g() is allowed to treat this
address as the start of an array of 100 doubles. (The example
goes on to show how to view said array as a two-dimensional object
by means of a VLA, but that's beside the point here.)

So in my view, the OP's code is de-facto valid C (C89 since he
doesn't use a VLA). If the OP needs to survive a code review, he
could write

print_square_matrix(&matrix[0][0], 3);

and cite the C99 rationale 6.7.5.3 in a comment.

Eric Sosman · May 13, 2007

Richard said:
Army1987 said:

Richard Heathfield said:

Army1987 said: [snip]
int main(void)
{
int matrix[3][3] = { {157, 64, 13},
{ 0, -16, 128},
{ 54, 42, -23} }
int *p = &matrix[0][0]
Here's where you missed your semicolon. Incidentally, what is p for?

Click to expand...

What's worse is that I missed *two* semicolons...

Click to expand...

So you did.

(p is for the example below.)

Suppose I have int *p, *q; p = &matrix[0][0]; q = &matrix[1][0];
p += 3;

Click to expand...

Um, yeah, you can do that, and you're now pointing one beyond the end of
matrix[0], which is legal as long as you don't dereference.

Now p and q are equal.

Click to expand...

I don't think you're allowed to compare them, are you? After all, one of
them is pointing into-or-one-past-the-end-of matrix[0], whereas the
other is pointing into-or-one-past-the-end-of matrix[1], and these are
different objects.

It's all right to compare them for equality or inequality,
using == or !=. p==q is true, but p-q is undefined behavior!

> (I must say I find the Standard rather silly at this
> point, given that arrays are guaranteed to be contiguous.)

I imagine that the Standard is trying to leave some leeway
for implementations that are able to check array bounds. At
least one implementation has done such checking, essentially by
giving each pointer value a "pedigree" describing the legitimate
bounds within which it points. When you derive a new pointer from
an old one (as in p+=3), the pedigree is carried along unchanged.
Thus, although p and q compare equal they have different pedigrees:
p is derived from matrix[0] and q from matrix[1], and the allowed
ranges of the two pointers are different even though their values
are identical.

Joe Wright · May 13, 2007

CBFalconer said:
Army1987 said:

Is this code legal (according to the strictest possible sane[1]
interpretation of the standard, regardless of wheter it does work
on all implementations in the known universe)?

#include <stdio.h>
void print_square_matrix(int *ptr, int order)
{
int i, j;
for (i = 0; i < order; i++) {
for (j = 0; j < order; j++)
printf("%3d ", *ptr++);
putchar('\n');
}
}

int main(void)
{
int matrix[3][3] = { {157, 64, 13},
{ 0, -16, 128},
{ 54, 42, -23} }
int *p = &matrix[0][0]
print_square_matrix(*matrix, 3);
return 0;
}

Click to expand...

Looks OK to me.

Even without the semicolons?

Army1987 · May 18, 2007

Pierre Asselin said:
Richard Heathfield said:

Army1987 said:

Click to expand...

matrix has type int[3][3], which decays to int(*)[3] when used in a
value context. So *matrix has type int[3] - but you use this in a value
context too, so it becomes int *. That's legal to pass to
print_square_matrix, which takes int *. But in print_square_matrix, you
are, strictly speaking, breaking the rules. Why? Well, you pass a
pointer that points to the first element in an array of three ints, but
you roll through nine ints in your function. Therefore, you're
overstepping the bounds of an array.

Click to expand...

Such a strict interpretation would cripple the use of C in numerical
computations. I don't think that was the intent, based in part on a
a similar example I noticed in the C99 Rationale v5.10, end of
section 6.7.5.3 .

<quote>
void g(double *ap, int n)
{
double (*a)[n] = (double (*)[n]) ap;
/* ... */ a[1][2] /* ... */
}

[ ... ] the function g can be called as in

{
double x[10][10];
g(&x[0][0], 10);
}
</quote>

The cited example has "&x[0][0]" as opposed to "*matrix" in the
orignal post, but that seems like a very subtle distinction. In
the Rationale code, "x[0][0]" is a mere scalar and "&x[0][0]" is
its address, but it is clear that g() is allowed to treat this
address as the start of an array of 100 doubles. (The example
goes on to show how to view said array as a two-dimensional object
by means of a VLA, but that's beside the point here.)

So in my view, the OP's code is de-facto valid C (C89 since he
doesn't use a VLA). If the OP needs to survive a code review, he
could write

print_square_matrix(&matrix[0][0], 3);

and cite the C99 rationale 6.7.5.3 in a comment.

Here the Standard quietly whispers what the behaviour is, but officially
says it is undefined. Annex J, under "Undefined behaviour":
An array subscript is out of range, even if an object is apparently
accessible with the

given subscript (as in the lvalue expression a[1][7] given the declaration
int

a[4][5]) (6.5.6).

(I wasn't aware that there has been an implementation which actually checks
array bounds, since I thought that would defeat the purpose why array
overflow was declared UB in the first place (i.e. speeding up run time by
trusting the program). But if there is one, I understand why even
overflowing the innermost array is UB.)

Richard Heathfield · May 18, 2007

Army1987 said:

Here the Standard quietly whispers what the behaviour is, but
officially says it is undefined. Annex J, under "Undefined behaviour":
An array subscript is out of range, even if an object is apparently
accessible with the

given subscript (as in the lvalue expression a[1][7] given the
declaration int

a[4][5]) (6.5.6).

(I wasn't aware that there has been an implementation which actually
checks array bounds, since I thought that would defeat the purpose why
array overflow was declared UB in the first place (i.e. speeding up
run time by trusting the program).

The point is that the implementor gets to *choose* - and may even pass
that choice on to the programmer, who may opt for bounds-checking
during development, switching it off for performance reasons once he's
satisfied that no bounds are being violated.

But if there is one, I understand
why even overflowing the innermost array is UB.)

Right.

David Thompson · Jun 30, 2007

Such a strict interpretation would cripple the use of C in numerical
computations. I don't think that was the intent, based in part on a

I mostly agree that preventing this would be bad for numerics, but I
believe it was indeed the intent to allow that. C wasn't originally
intended for numerics and for quite a while couldn't dream of
displacing Fortran. Only fairly recently has it even made an effort,
and only in C99 were some features standardized (complex/imag,
restrict, more mathlib and generics) to help this, but C99 did NOT
change the pointer arithmetic and hence subscripting rules. In
practice all reasonable implementations do allow it, but I think it's
clear the standard writers were careful not to actually require it.

a similar example I noticed in the C99 Rationale v5.10, end of
section 6.7.5.3 .

<quote>
void g(double *ap, int n)
{
double (*a)[n] = (double (*)[n]) ap;
/* ... */ a[1][2] /* ... */
}

[ ... ] the function g can be called as in

{
double x[10][10];
g(&x[0][0], 10);
}
</quote>

The cited example has "&x[0][0]" as opposed to "*matrix" in the
orignal post, but that seems like a very subtle distinction. In
the Rationale code, "x[0][0]" is a mere scalar and "&x[0][0]" is
its address, but it is clear that g() is allowed to treat this
address as the start of an array of 100 doubles. (The example
goes on to show how to view said array as a two-dimensional object
by means of a VLA, but that's beside the point here.)

But this example doesn't help your argument. It very carefully does
NOT access the 2D as 1D; the temporary 1D pointer is carefully
described as just a method to get around the lexical ordering of
parameter declaration to get from one 2D pointer to another
(correctly-sized) 2D pointer which is actually used. And is better
style in C99 because it more accurately shows in the source the actual
intent of what you are doing.

(Aside from the Rationale not being normative anyway.)

So in my view, the OP's code is de-facto valid C (C89 since he
doesn't use a VLA). If the OP needs to survive a code review, he
could write

print_square_matrix(&matrix[0][0], 3);

and cite the C99 rationale 6.7.5.3 in a comment.

I would just not cite anything and rely on the fact that 'everyone
knows' 2D arrays are contiguous and you will not in practice run into
any problems -- and if you do, it will be on an implementation so
weird that plenty of other things will be broken as well and this one
will get lost in the noise.

- formerly david.thompson1 || achar(64) || worldnet.att.net

How to accept text and put each letter into a 2d matrix?	0	Jun 3, 2022
Array of structs function pointer	10	Jul 16, 2023
Portable pointer arithmetics?	6	Jul 24, 2010
Comparison of Integer and Pointer (that's supposed to be an Integer). Where did I go wrong?	0	Nov 19, 2022
typedef, 2D arrays, design: question	11	Sep 1, 2012
2D arrays, pointers, and pointer arithmetic: question	32	Aug 24, 2012
Pointer arithmetics in Solaris	8	Mar 30, 2007
Return pointer from void only gives the memory address	0	Nov 23, 2024

Pointer arithmetics in 2D arrays

Army1987

Richard Heathfield

Army1987

Richard Heathfield

Bart van Ingen Schenau

Pierre Asselin

Eric Sosman

Joe Wright

Army1987

Richard Heathfield

David Thompson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads