contiguity of arrays

James Kuyper

David Hopwood said:
James Kuyper wrote: ....

Since it's a *definition* of "array type" and "array of T", that's
effectively what it does say.

No - as a definition, it says that every array type describes a piece
of memory with those characteristics. That is quite different from
saying that every piece of memory with those characteristics can be
described using an array type.
For that to be the case, there would have to be text that explicitly
excludes some "contiguously allocated non-empty sets of objects with
a particular member object type" from being arrays. I can't see any
such text.

A zip code describes a particular geographic subset of the United
States. Does that mean that every geographic subset of the United
States can be described with a zip code?
 
pete

James said:
No - as a definition, it says that every array type describes a piece
of memory with those characteristics. That is quite different from
saying that every piece of memory with those characteristics can be
described using an array type.

At best, that could only be considered
as a partial definition of an array type.
A zip code describes a particular geographic subset of the United
States. Does that mean that every geographic subset of the United
States can be described with a zip code?

It means that
"a code which describes a particular
geographic subset of the United States"
is not the definition of "zip code"

Definitions are reversible.
If an X is defined as a red A,
then every X is a red A, and every red A is an X.
I learned that in math.
 
Dan Pop

In said:
No, the standard does not resort to the notion of an object's
address space, but rather it guarantees what a strictly conforming program
can do (thus what a conforming implementation must support)
with regard to pointer arithmetic.

This notion is derived from what the standard actually says.

The standard doesn't explicitly define the notions of operator precedence
and associativity either, yet they can be derived from the actual text
of the standard (and even the standard itself uses them in examples and
footnotes).
It would be consistent
for an implementation to take advantage of the guarantees
when generating code to access a *declared* type, as
described earlier in this thread.

Only if the standard allowed that. Which it doesn't. 6.5.6p8 says:
"If the pointer operand points to an element of an array object" but it
doesn't require that array object to be declared as such. This is
precisely what makes pointer arithmetic work inside dynamically allocated
memory blocks. The standard doesn't have one rule for pointers
inside dynamically allocated memory and another for pointers inside
statically/automatically allocated memory. This is why it is only the outermost
object that matters when deciding whether pointer arithmetic has a well
defined result or not. If this outermost object also satisfies the
definition of an array of type T, then it is this unnamed and undeclared
array that counts when pointer arithmetic is performed. Note that
footnote 88 (original C99 numbering) is consistent with this view.

The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

When a pointer to an object is converted to a pointer
to a character type, the result points to the lowest addressed
byte of the object. Successive increments of the result, up to
the size of the object, yield pointers to the remaining bytes
of the object.
You can't really see this
for small examples, which is why I keep urging consideration
of the case when the subarray is nearly the size that can be
spanned by an offset field of a composite address.

It doesn't matter: the outermost object has created an address space that
is large enough for any subarray. There is no danger of pointer
arithmetic overflow as long as the result stays within the outermost
object.

Dan
 
David Hopwood

James said:
No - as a definition, it says that every array type describes a piece
of memory with those characteristics. That is quite different from
saying that every piece of memory with those characteristics can be
described using an array type.

So what is the necessary and sufficient definition of an array type,
then? Same question for "array".
A zip code describes a particular geographic subset of the United
States. Does that mean that every geographic subset of the United
States can be described with a zip code?

"A zip code describes a particular geographic subset of the United
States." is not a definition in the precise sense that should be
required of definitions of technical terms in a language standard.
 
Dan Pop

In said:
But it's possible to detect whether a, b, and c happen to be
contiguous; this is specifically mentioned in C99 6.5.9p6, discussing
equality operators on pointers. So one could argue that this program:

#include <stdio.h>

int main(void)
{
    int a, b, c;
    int *ptr;
    a = c = 12345;

    if (&a + 1 == &b && &b + 1 == &c) {
        ptr = &a;
        printf("ptr[2] = %d\n", ptr[2]);
    }
    else if (&c + 1 == &b && &b + 1 == &a) {
        ptr = &c;
        printf("ptr[2] = %d\n", ptr[2]);
    }
    else {
        printf("The objects are not contiguous\n");
    }

    return 0;
}

will print either "ptr[2] = 12345" or "The objects are not
contiguous", but in this case I think a bounds-checking implementation
can put its foot down and trap on the evaluation of ptr[2]. (I don't
have chapter and verse for this.)

This is a borderline case that requires an official judgment from the
committee. One could find wording in the standard supporting both views.
One cannot invoke bounds-checking implementations as an argument *before*
establishing that the standard unambiguously rules out the scenario in
question. This is what makes such implementations practically unfeasible.

If the program has established that the three integers are contiguous,
then they do satisfy the definition of an array, according to the
standard:

- An array type describes a contiguously allocated nonempty set
of objects with a particular member object type, called
the element type.

All objects (a, b and c) have the same type and are contiguously
allocated, so they do compose an (ad hoc) array of 3 int. We know,
from the malloc case, that an explicit declaration is not needed for this
array in order for pointer arithmetic to have well defined behaviour
inside it.

The opposing view has been amply expressed in this thread, so I'm not
going to reiterate it.

Since this issue is very much similar to the struct hack, one could easily
guess that the committee would reject my argument above. OTOH, as long
as the actual wording of the standard allowed me to build such an
argument, we *do* have a problem.

Dan
 
Douglas A. Gwyn

pete said:
Definitions are reversible.
If an X is defined as a red A,
then every X is a red A, and every red A is an X.
I learned that in math.

Then you learned wrong.
 
Douglas A. Gwyn

Dan said:
The particular case when the aliasing array is of character type is even
*explicitly* mentioned in the standard:

Aliasing as array of character type is a special dispensation,
not applicable to aliasing as array of some other type.
 
Keith Thompson

Douglas A. Gwyn said:
Then you learned wrong.

Then I guess a lot of people "learned wrong", myself included.

Would you at least agree that a definition of an X that lets you
determine that any given entity either is an X or is not an X is more
useful than a definition that doesn't do so?

In the absence of such a definition, how is one to determine whether
something is an X?
 
Dan Pop

In said:
Aliasing as array of character type is a special dispensation,
not applicable to aliasing as array of some other type.

The dynamic memory allocation provides the dispensation for aliasing
with other array types. Otherwise, one couldn't use dynamically
allocated arrays of arbitrary types.

Dan
 
Keith Thompson

The dynamic memory allocation provides the dispensation for aliasing
with other array types. Otherwise, one couldn't use dynamically
allocated arrays of arbitrary types.

It's an open question whether the special dispensation for dynamically
allocated memory applies to declared objects.

I get the impression that the standard is not entirely self-consistent
in this area.

It would have been reasonable, IMHO, to have an explicit guarantee
that proper alignment (along with read/write access) is the only
necessary criterion for accessing an object as a given type. It might
be possible to infer such a principle from the description in 7.20.3:

The pointer returned if the allocation succeeds is suitably
aligned so that it may be assigned to a pointer to any type of
object and then used to access such an object or an array of such
objects in the space allocated (until the space is explicitly
deallocated).

but that looks more like a consequence of such a principle than a
statement of it. (Specifically, the alignment guarantee is
significant; the ability to access objects is a consequence of the
alignment and the suggested principle.)

The guarantee that any object can be aliased as an array of unsigned
char would also be a consequence of the principle (since unsigned char
has no alignment requirement above the single byte level), and could
have been relegated to a footnote.
 
pete

Keith said:
Then I guess a lot of people "learned wrong", myself included.

Would you at least agree that a definition of an X that lets you
determine that any given entity either is an X or is not an X is more
useful than a definition that doesn't do so?

In the absence of such a definition, how is one to determine whether
something is an X?

If every A be a B, then a baby A be a baby B ;)
 
Douglas A. Gwyn

Keith said:
It would have been reasonable, IMHO, to have an explicit guarantee
that proper alignment (along with read/write access) is the only
necessary criterion for accessing an object as a given type.

Well, no, only for write access. Read access is constrained
by the previously written type. The rule for unions captures
this aspect of the situation.
The guarantee that any object can be aliased as an array of unsigned
char would also be a consequence of the principle (since unsigned char
has no alignment requirement above the single byte level), and could
have been relegated to a footnote.

No, it's special since array-of-char type need not have
been previously impressed upon the object being accessed
as an array of char (actually unsigned char for C99, but
let's not get into that).
 
Dan Pop

In said:
Well, no, only for write access. Read access is constrained
by the previously written type. The rule for unions captures
this aspect of the situation.

Actually, the rule for unions is gone in C99, being replaced by a more
general rule:

7 An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:73)

- a type compatible with the effective type of the object,

- a qualified version of a type compatible with the effective
type of the object,

- a type that is the signed or unsigned type corresponding to
the effective type of the object,

- a type that is the signed or unsigned type corresponding to
a qualified version of the effective type of the object,

- an aggregate or union type that includes one of the
aforementioned types among its members (including, recursively,
a member of a subaggregate or contained union), or

- a character type.

____________________

73) The intent of this list is to specify those circumstances
in which an object may or may not be aliased.

But this is not really relevant to a discussion focused on pointer
arithmetic.
 
Dan Pop

In said:
It's an open question whether the special dispensation for dynamically
allocated memory applies to declared objects.

The one and only special property of dynamically allocated memory is the
universal alignment.
I get the impression that the standard is not entirely self-consistent
in this area.

The standard *is* self-consistent, some of its interpretations aren't.
Including the one rejecting the struct hack.

My interpretation, posted upthread, is perfectly consistent with itself
and with the actual wording of the standard.

Other interpretations require the definition of array to be a one-way
definition (which is sheer nonsense) and make pointer arithmetic inside
dynamically allocated objects follow other (unwritten) rules than pointer
arithmetic inside declared objects.

And, for those insisting on declared arrays being the only relevant arrays
in the context of 6.5.6p8 (and thus leaving pointer arithmetic inside
dynamically allocated arrays invoking undefined behaviour), how about
the following example:

int array[3] = { 0 };
unsigned *p = (unsigned *)array;

is p[2] legal or not? p is certainly not pointing in any declared array
of unsigned int.

Dan
 
Michael Mair

Dan said:
It's an open question whether the special dispensation for dynamically
allocated memory applies to declared objects.


The one and only special property of dynamically allocated memory is the
universal alignment.

I get the impression that the standard is not entirely self-consistent
in this area.


The standard *is* self-consistent, some of its interpretations aren't.
Including the one rejecting the struct hack.

My interpretation, posted upthread, is perfectly consistent with itself
and with the actual wording of the standard.

Other interpretations require the definition of array to be a one-way
definition (which is sheer nonsense) and make pointer arithmetic inside
dynamically allocated objects follow other (unwritten) rules than pointer
arithmetic inside declared objects.

And, for those insisting on declared arrays being the only relevant arrays
in the context of 6.5.6p8 (and thus leaving pointer arithmetic inside
dynamically allocated arrays invoking undefined behaviour), how about
the following example:

int array[3] = { 0 };
unsigned *p = (unsigned *)array;

is p[2] legal or not? p is certainly not pointing in any declared array
of unsigned int.

Legal, as unsigned is the corresponding unsigned type to int.
Someone posted the section with accessibility related to the effective
types recently, but I won't bother to look it up as I think this example
is completely beside the point.


Cheers,
Michael
 
Dan Pop

In said:
Dan said:
(e-mail address removed) (Dan Pop) writes:
And, for those insisting on declared arrays being the only relevant arrays
in the context of 6.5.6p8 (and thus leaving pointer arithmetic inside
dynamically allocated arrays invoking undefined behaviour), how about
the following example:

int array[3] = { 0 };
unsigned *p = (unsigned *)array;

is p[2] legal or not? p is certainly not pointing in any declared array
of unsigned int.

Legal, as unsigned is the corresponding unsigned type to int.

You completely missed the point. This applies to dereferencing p itself,
but says nothing about pointer arithmetic on p.

Dan
 
Michael Mair

Legal, as unsigned is the corresponding unsigned type to int.
You completely missed the point. This applies to dereferencing p itself,
but says nothing about pointer arithmetic on p.

You are right. Maybe I have some time later on for a peek into
the standard...


Cheers
Michael
 
