C99 multidimensional arrays contiguous?

pemo

Maybe this was in a nightmare, but I seem to remember reading
something that said in C99 multidimensional arrays *need not* be laid
out contiguously in memory, i.e., that one should not treat them as a
single contiguous blob of memory!

Was this just a bad dream of mine please??
 
Seebs

Maybe this was in a nightmare, but I seem to remember reading
something that said in C99 multidimensional arrays *need not* be laid
out contiguously in memory, i.e., that one should not treat them as a
single contiguous blob of memory!
Was this just a bad dream of mine please??

Sorta yes, sorta no.

1. They must indeed be laid out contiguously in memory.
2. If you derive a pointer from one of the sub-arrays, you should not
then try to derive pointers outside that sub-array from it.

So:

int a[10][10];
int *p = &a[0][0];
p = (int *) a;
p[11] = 0; /* fine, writes to the 12th of 100 members */
p = &a[0][0];
p[11] = 0; /* bad, oversteps boundaries of a[0] */
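A minimal sketch of an access pattern that stays inside each sub-array whichever reading of the standard is right: index with both subscripts, or map a flat index onto a row and a column. The names ROWS, COLS, sum_all and get_flat are illustrative assumptions, not something from the thread.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 10, COLS = 10 };

/* Sums every element using two subscripts, so no pointer ever
   leaves the row it was derived from. */
int sum_all(int a[ROWS][COLS])
{
    int total = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            total += a[i][j];
    return total;
}

/* Reads the k-th of the 100 ints (row-major order) by splitting the
   flat index into a row and a column instead of overrunning a[0]. */
int get_flat(int a[ROWS][COLS], size_t k)
{
    assert(k < (size_t)ROWS * COLS);
    return a[k / COLS][k % COLS];
}
```

Here get_flat(a, 11) reads a[1][1], the same int that p[11] is aiming at in the example above.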

-s
 
Andrew Poelstra

Maybe this was in a nightmare, but I seem to remember reading
something that said in C99 multidimensional arrays *need not* be laid
out contiguously in memory, i.e., that one should not treat them as a
single contiguous blob of memory!

Well, each dimension needs to be contiguous on its own,
and contiguity is transitive, so the entire array /does/ in
fact need to be laid out contiguously in memory.

HOWEVER, it is undefined behavior to rely on this by, say,
accessing array members with out-of-range indices. Basically, the
compiler is allowed to optimize assuming you won't ever do
this, and conversely, it is allowed to do bounds checking
to make sure you don't ever do this.

Was this just a bad dream of mine please??

Why would it be a bad dream? A boring one, perhaps, and
maybe even a waste of a night, if you are that sort of
person.

But not a bad dream, IMHO.
 
pemo

Why would it be a bad dream?

Well because I'd perhaps like to be able to do this, and now I can't -
seems like 'health and safety' or the 'nanny state' creeping up on me.

pemo
 
pemo

Why the initialiser and then the assignment please?
int *p = &a[0][0];
p = (int *) a;
pemo

 
pemo

Ah, ok, I've got it - changing the type to a plain ol' int * ?

pemo

 
Seebs

Why the initialiser and then the assignment please?
int *p = &a[0][0];
p = (int *) a;

I was originally just going to demonstrate the &a[0][0] thing, and then
show the other one. But then I realized that if I did the "(int *) a"
after dereferencing p[11] (aka a[0][11]), it would still be undefined
behavior, because undefined behavior had already occurred.

Please don't top-post. Quote the material you're responding to, and
put your response under the specific thing you're responding to. Or,
as the .sig goes:

A: Because it breaks the normal flow of communication.
Q: What's wrong with top-posting?

-s
 
pemo

Why the initialiser and then the assignment please?
        int *p = &a[0][0];
        p = (int *) a;

I was originally just going to demonstrate the &a[0][0] thing, and then
show the other one. But then I realized that if I did the "(int *) a"
after dereferencing p[11] (aka a[0][11]), it would still be undefined
behavior, because undefined behavior had already occurred.

Please don't top-post.  Quote the material you're responding to, and
put your response under the specific thing you're responding to.  Or,
as the .sig goes:

A:  Because it breaks the normal flow of communication.
Q:  What's wrong with top-posting?

-s

It's been a while since I've posted: damn it - I deliberately went and
top-posted because I seemed to remember that it *should* be that way
around, rather than the reverse!

Maybe I should have trusted Google's reader's correctness then!

Cheers,

pemo
 
Andrew Poelstra

Maybe I should have trusted Google's reader's correctness then!

No, no no no no no no no no. Don't trust Google.

It happens to be correct on this point because we (meaning, other
people on Usenet) bitched at them for a few solid years before
they noticed and fixed it.
 
Johannes Schaub (litb)

Seebs said:
Maybe this was in a nightmare, but I seem to remember reading
something that said in C99 multidimensional arrays *need not* be laid
out contiguously in memory, i.e., that one should not treat them as a
single contiguous blob of memory!
Was this just a bad dream of mine please??

Sorta yes, sorta no.

1. They must indeed be laid out contiguously in memory.
2. If you derive a pointer from one of the sub-arrays, you should not
then try to derive pointers outside that sub-array from it.

So:

int a[10][10];
int *p = &a[0][0];
p = (int *) a;
p[11] = 0; /* fine, writes to the 12th of 100 members */
p = &a[0][0];
p[11] = 0; /* bad, oversteps boundaries of a[0] */

-s

I must be completely blind, but aren't both cases exactly the same? In fact,
the first may be worse, since the cast may yield a different address (int(*)[10]
-> int*; the standard doesn't require the pointer value resulting from
the cast to stay the same, I think). And if it stays the same, it looks like
it does exactly the same thing as the second case: both are undefined behavior
in C.

What do you say is different here?
 
Seebs

Seebs said:
int a[10][10];
int *p = &a[0][0];

(the initializer here is irrelevant).
p = (int *) a;
p[11] = 0; /* fine, writes to the 12th of 100 members */
p = &a[0][0];
p[11] = 0; /* bad, oversteps boundaries of a[0] */
I must be completely blind, but aren't both cases exactly the same?

No.

In one, p is being derived from the address of an object known to contain
at least 100*sizeof(int) bytes of storage; in the other, it's being derived
from the address of an object known to contain at least 10*sizeof(int) bytes
of storage.

In fact,
the first may be worse, since the cast may yield a different address (int(*)[10]
-> int*; the standard doesn't require the pointer value resulting from
the cast to stay the same, I think).

It does. Pointers to the first member of an aggregate must compare equal.

Two pointers compare equal if and only if both are null pointers,
both are pointers to the same object (including a pointer to an
object and a subobject at its beginning) or function, both are
pointers to one past the last element of the same array object, or
one is a pointer to one past the end of one array object and the
other is a pointer to the start of a different array object that
happens to immediately follow the first array object in the address
space.

And if it stays the same, it looks like
it does exactly the same thing as the second case: both are undefined behavior
in C.

I don't think so.

What do you say is different here?

a is an array of 10 arrays of ints. The integers in each array are
necessarily contiguous, and the arrays are necessarily contiguous, so
it's going to point to a region containing 100 integers. If you derive
a pointer from a, it is a pointer into that whole object.

However, each of the sub-objects (a[0], a[1], etcetera) is also an object,
so if you derive a pointer from a[0], it is a pointer only into that
subobject.

The only way they can be in any way different is bounds checking; as
noted above, the standard does guarantee that &(a[0][10]) == &(a[1][0]).

But two pointers can be equal, and not be fully interchangeable, in that
they can have different bounds.
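The equality mentioned above can be checked directly. This is a small sketch, assuming a local 10x10 array; the function name is made up for illustration. It only forms and compares the one-past-the-end pointer of a[0], and never dereferences it.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the one-past-the-end pointer of row 0 compares
   equal to the pointer to the first element of row 1. */
bool row_boundary_pointers_equal(void)
{
    int a[10][10] = {{0}};
    int *end_of_row0 = a[0] + 10;   /* same value as &a[0][10]; formed, not dereferenced */
    int *start_of_row1 = &a[1][0];
    return end_of_row0 == start_of_row1;
}
```

The comparison says nothing about which of the two pointers may be dereferenced or indexed further; that is exactly the "equal but not interchangeable" point.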

-s
 
Luca Forlizzi

a is an array of 10 arrays of ints.  The integers in each array are
necessarily contiguous, and the arrays are necessarily contiguous, so
it's going to point to a region containing 100 integers.  If you derive
a pointer from a, it is a pointer into that whole object.

This conclusion is clear and logical, but is it really deducible from
the standard?
The only sentence in the standard I was able to find to support the
idea that a pointer to an object can be considered to point into the
largest possible array containing the pointed-to object is 7.20.3, but
this explicitly refers to objects returned from a memory allocation
function.
On the other hand, the int object p points to is an element of the
array a[0], not of a, which is not an array of ints.

Some days ago, in a thread about exiting from "double loops", Ben
Bacarisse and Tim Rentsch expressed the same opinion as you, so I am
pretty sure that there is something in the standard that I do not fully
understand. Could you please enlighten me?

-- Luca Forlizzi
 
Luca Forlizzi

Yes.

sizeof(<type> [10]) == sizeof(<type>) * 10

There's no room for any padding.  The array of 10 ints must have size
precisely 10*sizeof(int).  The array of 100 ints must have size
precisely 10*(10*sizeof(int)).

So a is a region of 100*sizeof(int) bytes, while a[0] is a region of
10*sizeof(int) bytes.
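The size relationships are easy to confirm with sizeof. A short sketch; the helper names are hypothetical and exist only so the values can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one row, i.e. of the sub-array a[0]. */
size_t row_bytes(void)
{
    int a[10][10];
    return sizeof a[0];
}

/* Size in bytes of the whole two-dimensional array. */
size_t whole_bytes(void)
{
    int a[10][10];
    return sizeof a;
}
```

On any conforming implementation row_bytes() is 10*sizeof(int) and whole_bytes() is 100*sizeof(int), which leaves no room for padding between rows.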

Yes, that is clear. I apologise for my bad English, I didn't express my
question well. Let me try again!

According to my knowledge of the standard, the semantics of indexing
are defined in 6.5.6 p. 8 for "a pointer to an element of an array
object". It says nothing about how the pointer is obtained.
So let's take the example again:

int a[10][10];
int *p;
p = (int *) a;
p[11] = 0; /* fine, writes to the 12th of 100 members */

I agree that p points to an integer object which is the first of a
series of 100 contiguously allocated integer objects.
But is that series of objects, from the "legal" point of view (i.e.
according to the standard), an array?
Does the standard say that *any* collection of N contiguously
allocated objects of type T is an array of size N of type T?
The int object pointed to by p is indeed an element of an array, but of
a[0]. Can it be at the same time an element of a[0] and of the unnamed
array of 100 ints "derived" from the memory allocated for object a
(which is not an array of 100 ints)?
IMHO, no, and I don't see why the way p is obtained changes the fact
that the object pointed to by p is an element of array a[0]. Therefore
the array bound for p should still be 10.

I hope I have clarified my reasoning. Where is my mistake?

Luca
 
Seebs

According to my knowledge of the standard, the semantics of indexing
are defined in 6.5.6 p. 8 for "a pointer to an element of an array
object". It says nothing about how the pointer is obtained.
So let's take the example again:

int a[10][10];
int *p;
p = (int *) a;
p[11] = 0; /* fine, writes to the 12th of 100 members */

I agree that p points to an integer object which is the first of a
series of 100 contiguously allocated integer objects.
But is that series of objects, from the "legal" point of view (i.e.
according to the standard), an array?

It's not an array of integers, but it's an array -- of arrays of integers.
The key is that the object's size is 100*sizeof(int), so pointing inside
it is okay.

Does the standard say that *any* collection of N contiguously
allocated objects of type T is an array of size N of type T?

No. However, it doesn't have to be.

The int object pointed to by p is indeed an element of an array, but of
a[0]. Can it be at the same time an element of a[0] and of the unnamed
array of 100 ints "derived" from the memory allocated for object a
(which is not an array of 100 ints)?

Yes.

I hope I have clarified my reasoning. Where is my mistake?

Bounds checking only allows you to check the bounds of the actual object
you're looking at. If you're looking at a, its bounds are the whole
range from a[0][0] to a[9][9], and anything that is inside that range
is fair game.
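One way to get flat indexing without relying on either reading of the pointer rules is to copy the whole object into a genuine array of 100 ints. This is only a sketch; flatten is a hypothetical helper, not something proposed in the thread.

```c
#include <assert.h>
#include <string.h>

/* Copies the 100 contiguous ints of src into a real one-dimensional
   array. The byte count is spelled out as sizeof(int[10][10]) because
   src is a pointer inside the function, so sizeof src would be wrong. */
void flatten(int dst[100], int src[10][10])
{
    memcpy(dst, src, sizeof(int[10][10]));
}
```

After the copy, dst[11] holds the value of src[1][1], and dst can be indexed from 0 to 99 with no question about which array a pointer belongs to.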

-s
 
Ben Bacarisse

Seebs said:
According to my knowledge of the standard, the semantics of indexing
are defined in 6.5.6 p. 8 for "a pointer to an element of an array
object". It says nothing about how the pointer is obtained.
So let's take the example again:

int a[10][10];
int *p;
p = (int *) a;
p[11] = 0; /* fine, writes to the 12th of 100 members */

I agree that p points to an integer object which is the first of a
series of 100 contiguously allocated integer objects.
But is that series of objects, from the "legal" point of view (i.e.
according to the standard), an array?

It's not an array of integers, but it's an array -- of arrays of integers.
The key is that the object's size is 100*sizeof(int), so pointing inside
it is okay.

I hope the following clarification helps, because I am not sure we've
got to the bottom of Luca's question. I'm not replying to you to tell
you stuff, I am hanging this post here because this is the place with
the right context.

There are three situations:

int X[10][10];
int *p1 = &X[0][0]; /* or = X[0]; since X[0] gets converted */
int *p2 = (void *)&X[0]; /* or = X; since X gets converted */
int *p3 = (void *)&X;

(I've used void * simply to avoid questions about implementation
defined conversions and I've written the addresses in the most
explicit form I can, without any array to pointer conversions. I've
also used X as the array name because 'a' is confusing in English
text).

I would summarise the majority view as being that p1[10] is an invalid
access and that p3[10] (indeed p3[99]) is valid. Luca's example is
the same as p2 and I think the majority view is that p2[10] is also
fine.

The arguments all revolve around 6.5.6 p8 about adding to a pointer.
That clause defines the result of the addition only when the result is
within the array pointed "into" by the pointer. Specifically: "if the
pointer operand points to an element of an array object, and the array
is large enough...".

"Large enough" can also mean "one past the end" but that pointer can't
be dereferenced and, since array access using []s has an implied
dereference, we can ignore these special "one past the end" pointers
in this discussion.
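The one-past-the-end pointer mentioned here is the usual loop bound when walking a single row. A sketch with a hypothetical helper, sum_row:

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints starting at row. The pointer end is one past the last
   element: it is compared against but never dereferenced. */
int sum_row(const int *row, size_t n)
{
    const int *end = row + n;
    int total = 0;
    for (const int *p = row; p != end; p++)
        total += *p;
    return total;
}
```

Called as sum_row(X[0], 10), the pointer stays within X[0] plus its one-past-the-end position, so the access is valid under every reading discussed in this thread.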

The array X consists of 11 arrays -- the whole one and 10 sub-arrays.
The central question is what is the array into which the various
p[1-3] pointers point?

p1 points to an element of X[0], so it is natural to deduce that the array
over which it ranges is just X[0] and not X as a whole. I don't think
there is any support in the standard for the idea that p1 points to an
element of X, at least not formally.

p2 is converted from a pointer that clearly points to an element of
X (the first one), so here one can reasonably say that the converted
pointer may range over the whole of X.

p3 is interesting. At first sight it seems to fall outside of the
wording altogether. The pointer from which it is converted does not
point to an element of an array -- it points to the single object X.
Here paragraph 7 comes into play:

"For the purposes of these operators, a pointer to an object that is
not an element of an array behaves the same as a pointer to the
first element of an array of length one with the type of the object
as its element type."

So &X is considered (for this purpose) to be a pointer to the
first element of a one-element array. Thus "the array" referred to in
paragraph 8 is the whole of X.
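The "array of length one" rule in paragraph 7 can be seen in a small sketch. The function names are illustrative assumptions. The first function applies the rule to a plain int, the second to the whole of X through a pointer of type int (*)[10][10].

```c
#include <assert.h>
#include <stddef.h>

/* &x behaves like a pointer to the first element of a one-element
   array, so &x + 1 is a valid one-past-the-end pointer. */
ptrdiff_t single_object_extent(void)
{
    int x = 42;
    int *first = &x;
    int *past = &x + 1;   /* may be formed and compared, not dereferenced */
    return past - first;
}

/* Stepping a pointer to the whole array by one moves it past all
   100 ints; the distance is measured in bytes via char pointers. */
size_t whole_array_step_bytes(void)
{
    int X[10][10];
    int (*whole)[10][10] = &X;
    return (size_t)((char *)(whole + 1) - (char *)whole);
}
```

The second function returns sizeof X, that is 100*sizeof(int), which is the extent of "the array" that p3 is said to be derived from.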

I hope this helps rather than hinders. If I've made a mistake in
summarising the majority view I will have complicated matters so I
hope I have it straight.

<snip>
 
Luca Forlizzi

I'm redirecting this to comp.std.c; maybe it's more appropriate.

According to my knowledge of the standard, the semantics of indexing
are defined in 6.5.6 p. 8 for "a pointer to an element of an array
object". It says nothing about how the pointer is obtained.
So let's take the example again:

        int a[10][10];
        int *p;
        p = (int *) a;
        p[11] = 0; /* fine, writes to the 12th of 100 members */

I agree that p points to an integer object which is the first of a
series of 100 contiguously allocated integer objects.
But is that series of objects, from the "legal" point of view (i.e.
according to the standard), an array?

It's not an array of integers, but it's an array -- of arrays of integers.
The key is that the object's size is 100*sizeof(int), so pointing inside
it is okay.

I can't deduce this from my reading of the standard. Since, as you
say, the element type of a is not int, the object pointed to by p does
not qualify as an element of a, although its address is inside a. On
the other hand, the object pointed to by p is an element of array a[0],
therefore we are in the case defined by 6.5.6 p. 8, where the array is
a[0], which has 10 elements.

The int object pointed to by p is indeed an element of an array, but of
a[0]. Can it be at the same time an element of a[0] and of the unnamed
array of 100 ints "derived" from the memory allocated for object a
(which is not an array of 100 ints)?

Yes.

You have to convince me that this is implied by the standard.

I hope I have clarified my reasoning. Where is my mistake?

Bounds checking only allows you to check the bounds of the actual object
you're looking at.  If you're looking at a, its bounds are the whole
range from a[0][0] to a[9][9], and anything that is inside that range
is fair game.

I really would like to share this conclusion.

regards, Luca
 
Richard Bos

Seebs said:
This conclusion is clear and logical, but is it really deducible from
the standard?

Yes.

sizeof(<type> [10]) == sizeof(<type>) * 10

There's no room for any padding. The array of 10 ints must have size
precisely 10*sizeof(int). The array of 100 ints must have size
precisely 10*(10*sizeof(int)).

So a is a region of 100*sizeof(int) bytes, while a[0] is a region of
10*sizeof(int) bytes.

Yes, so you have the sizes stitched up neatly. Now consider fat
pointers.
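To make the fat-pointer remark concrete, here is a toy model of what a bounds-checking implementation might carry alongside an address. It is purely hypothetical: struct fat_ptr and fat_index_ok are invented for illustration and do not describe any real compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy "fat pointer": an address plus the number of ints in the
   object the pointer was derived from. */
struct fat_ptr {
    const void *base;
    size_t count;
};

/* Models the check a bounds-checking implementation could perform
   before allowing an access at the given int index. */
bool fat_index_ok(struct fat_ptr fp, size_t index)
{
    return fp.base != NULL && index < fp.count;
}
```

With int a[10][10], a pointer derived from a[0] would be modelled as { a[0], 10 } and one derived from a as { a, 100 }. Both refer to the start of the same storage, but index 11 passes the check only for the second, even though the sizes add up identically.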

Richard
 
Luca Forlizzi

No, you didn't; there was no Followup-To: header in your article.

Ouch... I am a Usenet newbie :)

Do you think it's appropriate to redirect the topic to comp.std.c?
 
Luca Forlizzi

I hope the following clarification helps, because I am not sure we've
got to the bottom of Luca's question.  I'm not replying to you to tell
you stuff, I am hanging this post here because this is the place with
the right context.

It does help me! But I am still not convinced that p2 and p3 can
legally access any int inside X (see below).

There are three situations:

  int X[10][10];
  int *p1 = &X[0][0];      /* or = X[0]; since X[0] gets converted */
  int *p2 = (void *)&X[0]; /* or = X; since X gets converted */
  int *p3 = (void *)&X;

<snip>

I would summarise the majority view as being that p1[10] is an invalid
access and that p3[10] (indeed p3[99]) is valid.  Luca's example is
the same as p2 and I think the majority view is that p2[10] is also
fine.

The arguments all revolve around 6.5.6 p8 about adding to a pointer.
That clause defines the result of the addition only when the result is
within the array pointed "into" by the pointer.  Specifically: "if the
pointer operand points to an element of an array object, and the array
is large enough...".

The problem that I have is not exactly the array size. You informally
say *array pointed into*, but the standard says "pointer operand points
to an element of an array object".
What exactly is an element of an array? In my mind, since X is an
array of arrays, the elements of X are arrays, so p2 and p3 do not
point to an element of X (so 6.5.6 p8 does not apply to them and X).
You (and Peter, too) seem to imply that the int objects that are
elements of the elements of X are also elements of X (i.e. "being an
element" is a transitive relation). But I can't find this in the standard.

Please note that I would love yours to be the right interpretation of
the standard; I find it more comfortable and closer to real usage of
the language.

Luca
 
