On Wed, 03 Nov 2004 11:57:44 +0100, Michael Mair wrote:
Since I'm replying to a fairly old article I won't snip.
buda said:
See 6.5.6/8.
'&array[0][0]' is a pointer to the first element of the array
'array[0]'. This array has NCOLUMNS elements in it.
'(&array[0][0])[x]' is equivalent to '*(&array[0][0] + x)'. According to
the rules of pointer arithmetic (6.5.6/8), the expression '&array[0][0]
+ x' may only form pointers to elements of the array 'array[0]' and to
one imaginary element immediately after the last one; otherwise the
behavior is undefined. In this case that means negative values of 'x'
and values greater than NCOLUMNS lead to UB (and since the full
expression also dereferences the pointer, x == NCOLUMNS is undefined
as well).
--
Best regards,
Andrey Tarasevich
P.S. BTW, '(&array[0][0])[x]' is equivalent to 'array[0][x]'.
OK, this is perfectly clear and logical to me, but I still see this kind
of code fairly often (even the FAQ uses it, and only mentions as a side
note that it is not strictly conforming). The usual case is passing a
"flattened" array to a function and then doing manual subscripting like
a[i*NCOL+j], where a is the "flattened" array passed to the function as
f(&array[0][0]). AFAIK, the Standard requires that multidimensional
arrays are stored in row-major order (last subscript varies fastest), so
people expect this to work. Doesn't the above mean that multidimensional
arrays are required by the Standard to be stored in one contiguous block
of memory, like [[row0][row1]...[rowN-1]], or is that a misconception on
my part?
Hint: Read comp.lang.c ;-)
Use this link:
http://groups.google.com/groups?thl...94027,883915280,883890862,883862102,883726282
or search for the thread "contiguity of arrays" to find a
discussion/debate about that issue.
In short: some people think it is as you describe, while others say
that the wording of the standard permits padding or debug info to be
inserted between array[0][NCOLUMNS-1] and array[1][0]. In other words:
the memory for the array is contiguous, but the effectively used parts
of it are not necessarily contiguous.
That doesn't make much sense to me. The standard requires there to be no
padding between array elements. When you have an array of arrays this
applies at all levels. Consider this:
ELTYPE (*p)[NCOLUMNS] = malloc(sizeof(ELTYPE) * NROWS * NCOLUMNS);
If the malloc() succeeds (and the byte size of the desired array is
representable as a size_t) is it guaranteed to allocate enough space
to hold the array? I would be worried if not. If it is then there
is exactly enough space for NROWS * NCOLUMNS elements and no
room for extra padding. Don't worry about the possibility of malloc()
allocating more than the requested amount; there are similar examples
using memcpy(), fread()/fwrite() etc., as well as accessing an object
as an array of unsigned char. Essentially, given
ELTYPE array[NELEMS];
then sizeof array == sizeof(ELTYPE) * NELEMS
which applies to every level of an array of arrays.
By playing tricks with pointers in a way not permitted by the
standard, you might hide the access to not used parts from the
compiler, ending up with strange effects in your program.
The question is whether "not used parts" can exist at all. There
can certainly be padding in elements but I would suggest not
between them anywhere in the array.
However, nobody has given an example of a conforming compiler
on which the flattened access does not work.
A compiler that implements strict bounds checking would catch this.
There is a bounds-checking gcc; I'm not sure whether it catches this
case, however.
It is a legitimate thing to want to catch: bounds overflow
resulting in data in a different row being corrupted would
be as much a problem as any other bounds overflow.
There are also code generation/optimisation issues. Let's say you
have an architecture that can calculate shorter offsets more
efficiently. An example of this might be the "huge" memory model
used on 16 bit x86 architectures. This permits objects larger
than 64K but since registers are 16 bits wide offset calculations
on large objects i.e. >= 64K take more effort. If you have an array
of arrays such as
int values[100][1000];
then given that sizeof(int) is 2 on the implementation the overall
object is 200000 bytes but each subarray is 2000 bytes. So when
calculating the address of
values[x][y]
it needs internally to evaluate effectively (char *)&values + 2000*x +
2*y. 2*y can be evaluated using 16 bit arithmetic because the result must
always be in the range 0-2000 (allowing for 1 past the end pointers). So
if I write
values[0][40000]
this will not evaluate as values[40][0] because 2*40000, i.e. 80000, is
not representable in 16 bits.
(This is a description of a potentially conforming implementation,
not of real-word huge memory models which are much weirder and generally
not conforming).
Personally, I tell my students: Do not do this (if you can
avoid it ;-)).
It is always avoidable. The next question then is given
int *p = (int *)values; /* or (int *)&values */
whether p[40000] is valid C.
Lawrence