pointer arithmetic and multi-dimensional arrays

B

Bernd Gaertner

Dear experts,

according to my interpretation of the Standard, loop 1 in the
following program is legal, while loop 2 is not (see explanation
below). This looks a bit counterintuitive, though; do you know
the truth?

Thanks a lot in advance,
Bernd Gaertner.

#include<iostream>

int main()
{
int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
int* first = a[0]; // pointer to a[0][0]

// loop 1
for (int* p = first; p < first+6; ++p)
std::cout << *p; // 012345

// loop 2
for (int* p = first; p < first+6; p+=2)
std::cout << *p; // 024

return 0;
}

Explanation: [decl.array] 7-9 implies that the memory layout of
"int a[3][2]" is as for "int a[6]" - the six ints are consecutive
in memory. This means that during any iteration of loop 1, both p
and ++p are pointers to elements (or past-the-end pointers) of the
*same* three-element array, namely either a[0] or a[1]. In this
case, [expr.add] 5 guarantees well-defined behavior. In loop 2,
if p == first+2 (pointer to last element of a[0]), p+=2 points to
the second element of a[1], so p and p+=2 do not refer to the same
array. In this case, [expr.add] 5 stipulates undefined behavior.
 
A

Alf P. Steinbach

* Bernd Gaertner:
Dear experts,

according to my interpretation of the Standard, loop 1 in the
following program is legal, while loop 2 is not (see explanation
below). This looks a bit counterintuitive, though; do you know
the truth?

Answer's formal no to both, in-practice yes to first (unless a very
unusual computer and C++ implementation) and in-practice no to second.

Thanks a lot in advance,
Bernd Gaertner.

#include<iostream>

int main()
{
int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
int* first = a[0]; // pointer to a[0][0]

// loop 1
for (int* p = first; p < first+6; ++p)
std::cout << *p; // 012345

// loop 2
for (int* p = first; p < first+6; p+=2)
std::cout << *p; // 024

return 0;
}

Explanation: [decl.array] 7-9 implies that the memory layout of
"int a[3][2]" is as for "int a[6]" - the six ints are consecutive
in memory. This means that during any iteration of loop 1, both p
and ++p are pointers to elements (or past-the-end pointers) of the
*same* three-element array, namely either a[0] or a[1]. In this
case, [expr.add] 5 guarantees well-defined behavior.

You have no guarantee that a pointer to the first array element can be
treated as a pointer to the array. On a segmented architecture the
pointers might theoretically (although not in practice) be completely
different beast, and the two inner arrays might reside in different
segments, "you can't get there from here". Interestingly, if instead of
inner arrays you had inner POD structs, there would be a relevant
guarantee for the PODs (from the rule for reinterpret_cast of pointer to
first POD element), but then there could be padding...

In loop 2,
if p == first+2 (pointer to last element of a[0]), p+=2 points to
the second element of a[1], so p and p+=2 do not refer to the same
array. In this case, [expr.add] 5 stipulates undefined behavior.

The reasoning here seems to not be meaningful in any way, but the
conclusion is just about right.

On the suspicion that this is HOMEWORK, I'll leave it as an exercise to
figure out why the second loop has formally Undefined Behavior in
addition to the UB considerations that apply to the first loop.

Cheers, & hth.,

- Alf
 
J

James Kanze

according to my interpretation of the Standard, loop 1 in the
following program is legal, while loop 2 is not (see explanation
below). This looks a bit counterintuitive, though; do you know
the truth?

Neither are legal, although both are likely to work on most
implementations.
#include<iostream>

int main()
{
int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
int* first = a[0]; // pointer to a[0][0]

More precisely: a pointer to the first element in a[0]. More
precisely, a pointer to an int which is the first element of an
array of 3 ints. Legal values for this pointer a thus first,
first+1, first+2 and first+3; the last may not be dereferenced.

In practice, the only time this will fail is with a bounds
checking implementation using fat pointers. (CenterLine once
sold such a compiler; I don't know what the current status is,
but ICS is still selling a compiler under the CenterLine mark.)
// loop 1
for (int* p = first; p < first+6; ++p)
std::cout << *p; // 012345
// loop 2
for (int* p = first; p < first+6; p+=2)
std::cout << *p; // 024
return 0;
}
Explanation: [decl.array] 7-9 implies that the memory layout of
"int a[3][2]" is as for "int a[6]" - the six ints are consecutive
in memory.

Yes, but it's not too clear what you can do with that. The
standard has been very carefully worded to allow compiler bounds
checking (even if no one, or almost no one, does it). When you
assign a[0] to first, the bounds are a[0][0]...a[0][3]. A
compiler is allowed to maintain this information with the
pointer (i.e. a fat pointer), and check it each time you modify
the pointer.
This means that during any iteration of loop 1, both p
and ++p are pointers to elements (or past-the-end pointers) of the
*same* three-element array, namely either a[0] or a[1]. In this
case, [expr.add] 5 guarantees well-defined behavior. In loop 2,
if p == first+2 (pointer to last element of a[0]), p+=2 points to
the second element of a[1], so p and p+=2 do not refer to the same
array. In this case, [expr.add] 5 stipulates undefined behavior.

That's a different problem. Obviously, a bounds checking
implementation would fail here; supposedly, at least, there have
also been cases where it would fail in special cases without
bounds checking. (I don't think it would ever fail without
bounds checking for an on stack array of small elements like
int. It might fail if the array were dynamically allocated,
however, or if the elements were significantly larger---things
that could cause the pointer to point to memory that didn't
exist or that wasn't mapped.)
 
B

Bernd Gaertner

Thanks a lot, for your answer, Alf! There are some things I still don't
understand, though (and yes, it's homework-related, but not my homework
but that of my students).
#include<iostream>

int main()
{
int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
int* first = a[0]; // pointer to a[0][0]

// loop 1
for (int* p = first; p < first+6; ++p)
std::cout << *p; // 012345

// loop 2
for (int* p = first; p < first+6; p+=2)
std::cout << *p; // 024

return 0;
}
You have no guarantee that a pointer to the first array element can be
treated as a pointer to the array. On a segmented architecture the
pointers might theoretically (although not in practice) be completely
different beast, and the two inner arrays might reside in different
segments, "you can't get there from here".

This seems to be in contradiction to [dcl.array] 1 where it is stated
that "An object of array type contains a contiguously allocated nonempty
subset of N subobjects of type T". And a twodimensional array is simply
an array where T is another array type.
In loop 2,
if p == first+2 (pointer to last element of a[0]), p+=2 points to
the second element of a[1], so p and p+=2 do not refer to the same
array. In this case, [expr.add] 5 stipulates undefined behavior.

The reasoning here seems to not be meaningful in any way, but the
conclusion is just about right.

The reasoning was based on the assumed fact that the two subarrays are
contiguoulsy allocated in memory.
On the suspicion that this is HOMEWORK, I'll leave it as an exercise to
figure out why the second loop has formally Undefined Behavior in
addition to the UB considerations that apply to the first loop.

I figured one thing out that also applies to loop 1 (stupid me): the
behavior of the operation "first+6" is already undefined since first and
the result first+6 definitely do not point to elements (or past the end)
of the same array. But what if we loop as follows?

// loop 1
int* p = first;
for (int i=0; i<6; ++i)
std::cout << *p++; // 012345
 
K

Kai-Uwe Bux

Bernd said:
Thanks a lot, for your answer, Alf! There are some things I still don't
understand, though (and yes, it's homework-related, but not my homework
but that of my students).
#include<iostream>

int main()
{
int a[2][3] = { {0, 1, 2}, {3, 4, 5} };
int* first = a[0]; // pointer to a[0][0]

// loop 1
for (int* p = first; p < first+6; ++p)
std::cout << *p; // 012345

// loop 2
for (int* p = first; p < first+6; p+=2)
std::cout << *p; // 024

return 0;
}
You have no guarantee that a pointer to the first array element can be
treated as a pointer to the array. On a segmented architecture the
pointers might theoretically (although not in practice) be completely
different beast, and the two inner arrays might reside in different
segments, "you can't get there from here".

This seems to be in contradiction to [dcl.array] 1 where it is stated
that "An object of array type contains a contiguously allocated nonempty
subset of N subobjects of type T". And a twodimensional array is simply
an array where T is another array type.

That only concerns the layout of the arrays in memory. It does not concern
what you can do with pointer arithmetic. The standard does not preclude an
implementation to perform bounds checking. In particular, an implementation
is allowed to use decorated pointers that keep track of the range of
validity for pointer arithmetic through extra data not used in
dereferencing. Although this may not correspond to the intuitive notion of
a pointer, it is a perfectly valid implementation. The compiler would
deduce such ranges of validity from type information. In the above cases,
range checking would find an access violation once you increment p past the
bound of the 1D-array used to initialize it.


[snip]

Best

Kai-Uwe Bux
 
B

Bernd Gaertner

Thanks for all your in-depth answers! I understand the issue much better
now, and in retrospect it was somewhat naive to expect defined behavior
for the loops in my original posting.

Best regards,
Bernd.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top