It does if you interpret "same object" as being "same location".
What is an object? "A region of data storage in the execution environment,
the contents of which can represent values".
Pointers to the same object are pointers to the same region of data storage.
A nice conclusion. There is a fellow in another C-devoted forum who
insists on using the word "into" when discussing pointers. Pointers
always point "into" something, and that something would make sense as
"region of data storage."
Surely in:
int i = 42;
char * p = (char *) &i;
the pointer 'p' doesn't point "to" the 'int'-typed object designated by
'i'. After all, a 'char *' points "to" a 'char'. But it certainly
points "into" the object designated by 'i'.
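To make the "into" idea concrete, here is a minimal sketch. The helper name 'points_into' is made up for illustration and is not from the Standard or from this thread. It only uses '==' against each byte address of the object, so it avoids relational comparison of unrelated pointers. Note that the conversion from 'int *' to 'char *' needs a cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if 'p' addresses one of the 'size'
   bytes of the object that starts at 'base', 0 otherwise. */
int points_into(const void *base, size_t size, const void *p)
{
    const unsigned char *b = base;
    const unsigned char *q = p;

    for (size_t k = 0; k < size; k++)
        if (b + k == q)
            return 1;
    return 0;
}

/* Returns 1 if the first and last bytes of 'i' are "inside" the
   object and the one-past-the-end address is not. */
int demo_points_into(void)
{
    int i = 42;
    char *p = (char *) &i;   /* cast required: 'int *' to 'char *' */

    int first = points_into(&i, sizeof i, p);
    int last  = points_into(&i, sizeof i, p + sizeof i - 1);
    int past  = points_into(&i, sizeof i, p + sizeof i);

    assert(first && last);
    return first && last && !past;
}
```

Under these assumptions, 'demo_points_into()' returns 1: 'p' points into the object designated by 'i' without pointing "to" an 'int'.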
I don't think there can be any expectation of bounds checking between two views
of the same storage through a union. There is just no sense there of a pointer
from one straying out of bounds and into the other.
I'm sorry that the demonstration failed. I intended to highlight that
if we are talking about what a pointer points to (or into), the notion
ought to be consistent throughout an interpretation of the definitions
of Standard C.
If we are going to say that the pointers compare as equal, it's because
they point to the same object (or subobject). But then with pointer
arithmetic, we have the vague condition ("if the array is large enough"). Well
what array? All [non-bit-field] objects are an array of bytes. That
array? For single-dimensional arrays, that array? For
multi-dimensional arrays, which dimension do we pick for our notion of
"the array" in order to determine if it's large enough for the pointer
arithmetic to be defined?
Essentially, given:
int a[2][2] = { { 0 } };
if we say that 'a[1] + 1' yields a pointer value X and that pointer
arithmetic is only defined for { X - 1, X, X + 1 }, we can potentially
justify that conclusion by saying "the array object" is not the larger
containing array, but the second 'int[2]' of the larger containing
array, only.
But if _that's_ the object being pointed into, I think we ought to stick
to that for pointer equality. So then the two pointers in the original
code do _not_ point to the same object or a sub-object at its beginning.
On the other hand, if we justify the pointer equality by saying that the
objects occupy the same location or that the pointers point into the
same region of data storage, well then pointer arithmetic should be
defined across that region of data storage, not just a particular
partition of it.
That is, it seems odd if two pointers with the same type, and pointing
into the same region of data storage, and pointing at the same byte in
that region of data storage, and comparing as equal, have different
defined boundaries for "the array" when under consideration for pointer
arithmetic.
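As a sketch of the tension (this is not necessarily the original code from the thread, and the function name is made up), the following stays within what I believe everyone agrees is defined: arithmetic within one 'int[2]' row, plus an equality comparison between a one-past pointer and the start of the next row.

```c
#include <assert.h>

/* Returns 1 if "one past the end of a[0]" compares equal to
   "the first element of a[1]". */
int one_past_equals_next_row(void)
{
    int a[2][2] = { { 0 } };
    int *x  = a[1] + 1;   /* X: points at a[1][1] */
    int *lo = x - 1;      /* X - 1: points at a[1][0] */
    int *hi = x + 1;      /* X + 1: one past the end of a[1] */

    assert(lo == a[1]);
    assert(hi == a[1] + 2);

    /* Same type, same byte address, yet (under the narrow reading)
       a different "array" governs each pointer's arithmetic. */
    return a[0] + 2 == a[1];
}
```

The function returns 1 on the implementations I would expect people to try it on. The open question is whether 'a[0] + 2' may then be dereferenced or incremented the way 'a[1]' may.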
And the pointer equality definition is so specific with its "if and only
if."
Add to that the definition of all object representations being
accessible via 'unsigned char' type and it seems that any bounds are at
the beginning of the contiguous region of memory and at one byte past
the end.
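A minimal sketch of that byte-wise view, assuming the usual reading that an 'unsigned char *' derived from '&a' may range over the whole object representation of 'a' (the function name is made up):

```c
#include <assert.h>
#include <string.h>

/* Reads the fourth 'int' of an 'int[2][2]' through its bytes. */
int fourth_int_via_bytes(void)
{
    int a[2][2] = { { 1, 2 }, { 3, 4 } };
    unsigned char *first = (unsigned char *) &a;  /* first byte of 'a' */
    unsigned char *past  = first + sizeof a;      /* one byte past the end */
    int v;

    /* The byte array spans both rows; no row boundary is visible here. */
    assert((size_t) (past - first) == 4 * sizeof(int));

    memcpy(&v, first + 3 * sizeof(int), sizeof v);
    return v;
}
```

Here the only bounds in sight are 'first' and 'past', and the value copied out is that of 'a[1][1]', namely 4.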
If they are not distinct objects, then they are the same object, right?
Well what type is the object? Is it an 'int'? If so, is it treated
as an 'int[1]' for the purposes of pointer arithmetic? Or is it an
array object, instead? If so, how many elements in the array? Could
you do:
It is all these things simultaneously. You could ask these same questions
of a non-union object. Given int x[5], what is x[1]? Is it just an int?
Or is it a portion of the whole array? So what type is it?
Agreed. I'm glad you find them analogous.
ISO C uses the term "subobject" for embedded objects. x[1] is an int object
which is a subobject of the int[5] array.
Both arrays involved in the union contain subobjects of type int which
correspond to each other.
Agreed. And it seems that an 'int[4][5]' array has 20 sub-objects of
type 'int' that correspond to each of the combinations of index that can
be used with the 'int[4][5]' and that point to 'int' objects (not one past).
If two views of a different type are aliased through a union, then
the type is whichever one is the last through which a value is stored.
This is from the special rules about unions.
And of course there's "type punning." In your example below, is
'a[0][3]' not similarly type punning the 'int[2][2]' as an 'int[4]' and
designating/accessing the fourth element?
I think it should hold for this kind of array aliasing. (But you're not even
asking about if it's okay to store to one array and read the other; just
whether certain pointers are reliably equal.)
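For reference, here is a sketch of the kind of union aliasing being discussed. The union tag and member names are my own assumptions, not taken from the original post. Both members have element type 'int', so reading one view after storing through the other reinterprets the same representation.

```c
#include <assert.h>

/* Hypothetical union: two "geometries" over the same four ints. */
union grid {
    int a[2][2];
    int b[4];
};

/* Returns 1 if corresponding elements of the two views have equal
   addresses and equal values. */
int union_views_agree(void)
{
    union grid u = { .a = { { 1, 2 }, { 3, 4 } } };

    /* Store through 'a', read through 'b'. */
    assert(u.b[3] == 4);

    return &u.a[1][0] == &u.b[2] && u.a[1][1] == u.b[3];
}
```

I would expect this to return 1 on any implementation. Whether the Standard's "if and only if" wording actually guarantees the pointer equality is exactly what is being argued about.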
Also consider a[0][3]. This kind of "wrong geometry" access is in a kind of
gray area, where the standard isn't of a lot of help.
Well actually, your example there was just given in another thread,
which is why this thread was "more bounds-checking." The gray area is
what I'm trying to explore... Definitions, consequences, consistency, etc.
Right or wrong, some C programs are going to do this and find it to be quite
portable. So if you're doing bounds-checking, it's not entirely realistic to
insist on diagnosing such things. A programmer who doesn't consider that to be a
bounds error will be irked by the diagnostics, perceived to be a nuisance.
On the other hand, the "wrong geometry" access to a[0][3] could be unintended,
and indicate a bug. Another programmer might be thankful to have that diagnosed
(maybe even the same programmer, in a different programming situation).
The clearest case I can think of is where a 'for' loop can be known at
translation-time to allow for an array index to go out-of-bounds without
any fancy business happening to the index. This seems worth warning about!
A run-time check might set up traps at one byte before and one byte
after a range of data storage. That seems sensible, too!
But for any stricter run-time bounds-checking, such as catching
'a[i][j]' where the ranges for 'i' and 'j' aren't known at
translation-time and go out-of-bounds, there could be checks for each
dimension of a multi-dimensional array, but is that consistent with C?
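To pin down the two candidate policies for 'a[i][j]' on an 'int[2][2]', here is a sketch. The names 'ROWS', 'COLS' and both function names are hypothetical, and neither function is claimed to be what the Standard requires.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS ((size_t) 2)
#define COLS ((size_t) 2)

/* Strict policy: every index must be in range for its own dimension. */
int in_bounds_per_dimension(size_t i, size_t j)
{
    return i < ROWS && j < COLS;
}

/* Lenient policy: only the flattened offset i * COLS + j must fall
   inside the whole array. Written without computing i * COLS so that
   huge indices cannot wrap around. */
int in_bounds_whole_storage(size_t i, size_t j)
{
    const size_t total = ROWS * COLS;

    assert(total > 0);
    return j < total && i <= (total - 1 - j) / COLS;
}
```

For 'a[0][3]' the strict check fails and the lenient check passes. For 'a[1][2]' both fail, since that is one past the end of the whole array.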
There have been c.l.c discussions about this once in a while; I remember some
from many years ago. Some of the arguments hinged on whether such an access is
done blatantly with array indexing, or displacement of a pointer directly obtained
from "array" decay. Under that kind of hair-splitting, &a[0][0] + 3 seems less
wrong than a[0][3] because the former has a pointer to just an int, which
is then being displaced, in a way that is disconnected from the geometry of the
array.
Yeah, but I wouldn't call it hair-splitting... The array subscripting
operator is defined with an identity given in terms of the binary
addition operator and the unary indirection operator (and parentheses).
I think that's rather important and worth discussion if it's at all a
"gray area."
It would seem unfair to give the array subscripting notation 'a[0][3]'
some kind of bounds-preferential treatment versus the identical '*(*(a +
0) + 3)' notation.
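The identity itself is easy to demonstrate with indices that are in bounds under any reading, which sidesteps the gray area (the function name is made up for illustration):

```c
#include <assert.h>

/* Returns 1 if subscript notation and the pointer-arithmetic spelling
   designate the same elements. */
int subscript_identity_holds(void)
{
    int a[2][2] = { { 1, 2 }, { 3, 4 } };

    /* E1[E2] is defined as (*((E1)+(E2))), so these are the same lvalue. */
    assert(&a[1][0] == &*(*(a + 1) + 0));

    return a[1][1] == *(*(a + 1) + 1)
        && &a[0][1] == *(a + 0) + 1;
}
```

Since the two spellings are the same expression by definition, a bounds rule that treated them differently would have to look at something other than the expression's meaning.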
But we do see references to "provenance" in at least one defect report,
and this seems related to the "provenance" of a pointer. If it "came
from" an array with certain boundaries, then pointer arithmetic is only
defined for its use with those boundaries, despite the fact that another
pointer with different boundaries is identical in every other way.
Of course "provenance" seems like a "gray area," since you can combine
things (such as via bit-wise operators). Then whence did they come? An
example would be combining two objects into a destination such that the
effective type of the destination cannot be determined.
a[0][3]
*( *( a + 0 ) + 3 )
*( *( 'int[2][2]' + 0 ) + 3 )
*( *( 'int (*)[2]' + 0 ) + 3 )
*( *( 'int (*)[2]' ) + 3)
*( 'int[2]' + 3 )
*( 'int *' + 3 )
*( 'int *' )
'int'
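The types in the steps above can be checked mechanically. This sketch assumes a C11 compiler for '_Generic'; the macro 'TYPE_ID' is my own invention. Neither '_Generic' nor 'sizeof' evaluates its operand here, so the "wrong geometry" expression is only typed, never executed.

```c
#include <assert.h>

/* Hypothetical macro: maps the type of an (unevaluated) expression
   to a small integer code. */
#define TYPE_ID(e) _Generic((e), \
    int (*)[2]: 1,               \
    int *:      2,               \
    int:        3,               \
    default:    0)

/* Returns 1 if each step has the type shown in the derivation. */
int types_match_derivation(void)
{
    int a[2][2] = { { 0 } };

    /* *(a + 0) is an 'int[2]' before it decays to 'int *'. */
    assert(sizeof *(a + 0) == sizeof(int[2]));

    return TYPE_ID(a + 0) == 1                /* int (*)[2] */
        && TYPE_ID(*(a + 0) + 3) == 2         /* int *      */
        && TYPE_ID(*(*(a + 0) + 3)) == 3;     /* int        */
}
```

This only confirms the types. It says nothing about which array the final 'int *' is bounded by, which is the part in dispute.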