multi dimensional arrays as one dimension array

vippstar

The only way I can think of to get overlapping arrays in the way you're
looking for is by using a union:

union {
double singledim[6];
double multidim[2][3];

} x;

but this is only possible if you know the length of the array at compile
time. Do you?

What about C99's VLAs?

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int x = 5;
    int y = 10;

    union { double my1D[x * y], my2D[x][y]; } *p = malloc(sizeof *p);
    if(p == NULL) return EXIT_FAILURE;
    free(p);
    return 0;
}
 
Harald van Dijk

The only way I can think of to get overlapping arrays in the way you're
looking for is by using a union:

union {
double singledim[6];
double multidim[2][3];

} x;

but this is only possible if you know the length of the array at
compile time. Do you?

What about C99's VLAs?

VLAs aren't allowed as structure or union members. I can't find the
normative wording in the standard (references are appreciated), but the
example in 6.7.5.2p10 is explicit enough that I trust it to correctly
reflect the intent.
 
Harald van Dijk

VLAs aren't allowed as structure or union members. I can't find the
normative wording in the standard (references are appreciated),

Never mind. It's 6.7.2.1p8:
"A member of a structure or union may have any object type other than a
variably modified type."
 
James Tursa

James Tursa said:
[...]
No. There cannot be padding between array elements; in particular,
given:

double arr[10][10];

the size of arr is guaranteed to be exactly 100*sizeof(double).

Padding isn't the issue.

Well, I didn't really believe that padding was an issue, but that's
what seemed to be implied by the response.
The issue is that the standard doesn't
require implementations to support indexing past the end of an array.
So if I write

arr[0][15]

I'm trying to refer to an element of arr[0] that doesn't exist.
There's a valid object, accessible as arr[1][5], at the intended
location in memory -- and *most* C compilers will let you access that
object either as arr[0][15] or as arr[1][5]. But arr[1][5] is
guaranteed to work, and arr[0][15] isn't, because it attempts to index
beyond the end of the double[10] array arr[0].

In other words, implementations are allowed, but not required, to
perform bounds checking.

Well, I am still trying to understand how that argument applies to the
original OP posted code. Your argument is based on using arr directly.
But you seem to be saying that OP can't do this:

double *dp = (double *) arr;

and then traverse the entire array using dp. Is that what you are
saying?

Close. I'm saying that you most likely *can* get away with that
(treating an array of array of double as if it were an array of
double), but the standard doesn't require an implementation to make it
work. The most likely ways it can fail are if an implementation
performs run-time or compile-time bounds checking, or if an optimizer
assumes (as it's permitted to do) that you're not doing something like
this, causing the generated code not to do what you expected it to do.

Harald's explanation elsewhere in this thread makes the point more
clearly than I did, I think:

The fact that double[2][3] doesn't have elements such as
x[0][5]. There must be a valid double, 5*sizeof(double) bytes into
x. However, x[0][5] doesn't mean just that. x[0][5] (or
((double*)x)[5]) means you're looking 5*sizeof(double) bytes into
x[0]. x[0] doesn't have that many elements.

I get the part about x[0][5] maybe not working.

But the only way I can make sense of everything else that everyone is
trying to tell me is that there are special rules for unsigned char
that prevent bounds checking for this type of code that don't apply to
other types. For example:

double d[2][3];
double *dp = (double *) d;
unsigned char *ucp = (unsigned char *) d;
double *xdp = malloc(sizeof(d));
unsigned char *xucp = malloc(sizeof(d));
mydoublecopy(xdp,dp,6); // assume prototype present
myucharcopy(xucp,ucp,6*sizeof(double));

where

void mydoublecopy(double *t, double *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

void myucharcopy(unsigned char *t, unsigned char *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

What you are saying is that mydoublecopy isn't guaranteed to work
because the compiler is free to bounds check the result of s++ and see
if it violates the bounds of the original d[0] row. i.e., it is free
to recognize that the current value of s is within the d[0] row and
then bounds check the s++ result and cause a fault if it is beyond the
end of that row and you try to dereference it. And this may happen
regardless of any typecasting that I may have done in an explicit
attempt to avoid this d[0] check, and regardless of the fact that the
compiler may know for certain that the memory for d[0] and d[1] is
contiguous.

But myucharcopy *is* guaranteed to work even though it also violates
the bounds of the original d[0] row in the s++ calculation. It seems
the standard must explicitly forbid bounds checking like this for
unsigned char type in order for this to work.

Is this what you are saying? There are special rules for unsigned char
here (which would seem to then raise other questions)?

James Tursa
 
vippstar

On Sep 1, 2:03 am, James Tursa wrote:

But the only way I can make sense of everything else that everyone is
trying to tell me is that there are special rules for unsigned char
that prevent bounds checking for this type of code that don't apply to
other types. For example:

Yes there is. Because you can treat any ptr to object as an array of
unsigned char, to observe the object's representation.
 
James Tursa

On Sep 1, 2:03 am, James Tursa wrote:



Yes there is. Because you can treat any ptr to object as an array of
unsigned char, to observe the object's representation.

So is this conforming?

double d[2][3];
double *dp = (double *) d;
int i;
for( i=0; i<6; i++ )
dp = (double *)(((unsigned char *)dp) + sizeof(double));

James Tursa
 
vippstar

Because you can treat any ptr to object as an array of
unsigned char, to observe the object's representation.

So is this conforming?

double d[2][3];
double *dp = (double *) d;
int i;
for( i=0; i<6; i++ )
dp = (double *)(((unsigned char *)dp) + sizeof(double));

I don't know, but it's a good question; I'd like to know as well.
 
Old Wolf

I get the part about x[0][5] maybe not working.

But the only way I can make sense of everything else that everyone is
trying to tell me is that there are special rules for unsigned char
that prevent bounds checking for this type of code that don't apply to
other types.

There is no special case for unsigned char
w.r.t bounds checking. You can't create
a pointer into an object, and then use that
pointer to access beyond the object's bounds,
regardless of the pointer type.

For example:
double d[2][3];
double *dp = (double *) d;
unsigned char *ucp = (unsigned char *) d;
double *xdp = malloc(sizeof(d));
unsigned char *xucp = malloc(sizeof(d));
mydoublecopy(xdp,dp,6);  // assume prototype present
myucharcopy(xucp,ucp,6*sizeof(double));

where

void mydoublecopy(double *t, double *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

void myucharcopy(unsigned char *t, unsigned char *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

Note that the expression "d" is defined to
mean &d[0], in a value context (which is what
we have here). That is, the address of an array
of three doubles.

'mydoublecopy' and 'myucharcopy' both cause
undefined behaviour because they try to access
beyond the bounds of this array.


Now, if you had written:
double *dp = (double *) &d;
unsigned char *ucp = (unsigned char *) &d;

then there would be no undefined behaviour,
because d is an object containing six doubles
(as opposed to d[0] which is an object
containing three doubles).
 
James Tursa

I get the part about x[0][5] maybe not working.

But the only way I can make sense of everything else that everyone is
trying to tell me is that there are special rules for unsigned char
that prevent bounds checking for this type of code that don't apply to
other types.

There is no special case for unsigned char
w.r.t bounds checking. You can't create
a pointer into an object, and then use that
pointer to access beyond the object's bounds,
regardless of the pointer type.

Don't memcpy and the like do just that? See below.

For example:
double d[2][3];
double *dp = (double *) d;
unsigned char *ucp = (unsigned char *) d;
double *xdp = malloc(sizeof(d));
unsigned char *xucp = malloc(sizeof(d));
mydoublecopy(xdp,dp,6);  // assume prototype present
myucharcopy(xucp,ucp,6*sizeof(double));

where

void mydoublecopy(double *t, double *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

void myucharcopy(unsigned char *t, unsigned char *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

Note that the expression "d" is defined to
mean &d[0], in a value context (which is what
we have here). That is, the address of an array
of three doubles.

'mydoublecopy' and 'myucharcopy' both cause
undefined behaviour because they try to access
beyond the bounds of this array.


Now, if you had written:
double *dp = (double *) &d;
unsigned char *ucp = (unsigned char *) &d;

then there would be no undefined behaviour,
because d is an object containing six doubles
(as opposed to d[0] which is an object
containing three doubles).

(sigh) I thought I was finally beginning to understand this based on
the other posted replies and then you go and post this reply. I've got
more learnin' to do ...

When I increment dp, how can the compiler possibly know where it
originally came from with certainty in order to do meaningful bounds
checking? How can one behavior be undefined and the other be defined?

For example:

double d[2][3];
double *dp;
int i;
bool b;
(some code to determine b)
if( b )
dp = (double *) &d; // (1)
else
dp = (double *) d; // (2)
for( i=0; i<6; i++ )
dp++;

So how does the compiler know how to bounds check the dp++ value if
you say (1) is OK but (2) is not OK?

And then I would wonder about memcpy and the like. How does that work
with regards to passing it d? Are you saying that I would have to pass
&d instead of d to ensure that I didn't invoke undefined behavior
inside the routine? How would memcpy even know the difference? How
would *any* function that could be passed (double *)d or (double *)&d
be able to tell the difference ... and more to the point how can the
compiler possibly set up bounds checking inside the function at
compile time if it has no way of knowing what version will be passed?

James Tursa
 
Ben Bacarisse

Old Wolf said:
I get the part about x[0][5] maybe not working.

But the only way I can make sense of everything else that everyone is
trying to tell me is that there are special rules for unsigned char
that prevent bounds checking for this type of code that don't apply to
other types.

There is no special case for unsigned char
w.r.t bounds checking. You can't create
a pointer into an object, and then use that
pointer to access beyond the object's bounds,
regardless of the pointer type.

For example:
double d[2][3];
double *dp = (double *) d;
unsigned char *ucp = (unsigned char *) d;
double *xdp = malloc(sizeof(d));
unsigned char *xucp = malloc(sizeof(d));
mydoublecopy(xdp,dp,6);  // assume prototype present
myucharcopy(xucp,ucp,6*sizeof(double));

where

void mydoublecopy(double *t, double *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

void myucharcopy(unsigned char *t, unsigned char *s, size_t n) {
while( n-- ) { *t++ = *s++; } }

Note that the expression "d" is defined to
mean &d[0], in a value context (which is what
we have here). That is, the address of an array
of three doubles.

'mydoublecopy' and 'myucharcopy' both cause
undefined behaviour because they try to access
beyond the bounds of this array.

The reasoning seems wrong when you go down to 1D arrays. With double
d[2][3]; d is &d[0] and you are saying that this pointer can't be used
to access beyond that first element of d. But if I declare double
x[2]; I most certainly can use x (== &x[0]) to access beyond that
first element.

I think a case can be made that d[0][5] is undefined, but the case
must rest on something that is particular to 2D arrays, or an exception
must be made for 1D ones.

Now, if you had written:
double *dp = (double *) &d;
unsigned char *ucp = (unsigned char *) &d;

then there would be no undefined behaviour,
because d is an object containing six doubles
(as opposed to d[0] which is an object
containing three doubles).

People habitually pass a 1D array, x, to a function by writing x
rather than &x.
 
Keith Thompson

On Sep 1, 2:03 am, James Tursa wrote:



Yes there is. Because you can treat any ptr to object as an array of
unsigned char, to observe the object's representation.

Slight correction: You can treat any *object* as an array of unsigned
char. Remember: arrays are not pointers, and pointers are not arrays.
 
Keith Thompson

pete said:
James said:
But the only way I can make sense of everything else that everyone is
trying to tell me is that there are special rules for unsigned char
that prevent bounds checking for this type of code that don't apply to
other types.

That's how the string functions work.

N869
7.21 String handling <string.h>
7.21.1 String function conventions
[#1] The header <string.h> declares one type and several
functions, and defines one macro useful for manipulating
arrays of character type and other objects treated as arrays
of character type.

The latter refers to the mem* functions (memcpy, memmove, memcmp,
memchr, memset). Though they're described in the "String handling"
section and declared in <string.h>, I wouldn't call them string
functions.

(The type is size_t, and the macro is NULL; both are also declared in
<stddef.h> and other headers.)
 
vippstar

Slight correction: You can treat any *object* as an array of unsigned
char. Remember: arrays are not pointers, and pointers are not arrays.

I don't think your correction is valid.
You're saying you can treat any object as an array of unsigned char.

int i = 123;
(unsigned char *)i[0] wouldn't be valid.

I said you can treat any pointer to object as an array of unsigned
char:

int i = 123;
*(unsigned char*)&i is valid.

Perhaps we are saying the same thing but you misunderstood me and I
misunderstood you?
 
vippstar

Slight correction: You can treat any *object* as an array of unsigned
char. Remember: arrays are not pointers, and pointers are not arrays.

I don't think your correction is valid.
You're saying you can treat any object as an array of unsigned char.

int i = 123;
(unsigned char *)i[0] wouldn't be valid.
correction:
*(unsigned char *)i
 
James Kuyper

James said:
The key point is the pointer conversion. At the point where that
conversion occurs, the compiler knows that (double*)array == array[0].
It's undefined if any number greater than 1 is added to that pointer
value, and also if that pointer is dereferenced after adding 1 to it.

Trying to understand your answer as it relates to the original post. I
don't see how the original function gets an address 2 beyond the end,
or 1 beyond the end and attempts to dereference it, as you seem to be
saying. Can you point this out? Did I misunderstand you?

Quite possibly. The key point you need to understand is what array the
pointer points at. It's important to understand that, given the
following declaration:

double array[2][1];

"array" is not an array of "double". The element type for "array" is
"double[1]". On the other hand, array[0] is itself an array; the element
type for that array is "double".

The rules governing the behavior of pointer arithmetic are described by
6.5.6p8 in terms of an array whose element type is the type that the
pointer points at; they make no sense when interpreted in terms of an
array with any other element type.

The standard does NOT clearly state where it is that (double*)array
points. I will assume what everyone "knows", which is that it points at
the same location in memory as the original pointer.

There is only one array with an element type of "double" that starts at
that location. It isn't "array", it's "array[0]". Therefore, the rules
concerning pointer arithmetic are described relative to array[0]. Since
array[0] has a length of 1, the behavior is undefined if any integer
other than 0 or 1 is added to it, and it is not legal to dereference it
after 1 has been added to it; the same must also be true of (double*)array.
 
Keith Thompson

Slight correction: You can treat any *object* as an array of unsigned
char. Remember: arrays are not pointers, and pointers are not arrays.

I don't think your correction is valid.
You're saying you can treat any object as an array of unsigned char.

int i = 123;
(unsigned char *)i[0] wouldn't be valid.

No, it wouldn't.

First, [] binds more tightly than the cast operator, so your
expression is equivalent to (unsigned char *)(i[0]), which doesn't
make much sense. Assuming you meant ((unsigned char*)i)[0], that's
*not* treating the object i as an array of unsigned char. It's
converting the value of i to a pointer value, and then applying the []
operator to that pointer value.
I said you can treat any pointer to object as an array of unsigned
char:

int i = 123;
*(unsigned char*)&i is valid.

Yes, that's valid. It's treating the object i as an array of unsigned
char.
Perhaps we are saying the same thing but you misunderstood me and I
misunderstood you?

We may be trying to say the same thing. I think you're just saying it
incorrectly.

You can't treat a pointer as an array (except that you can treat a
pointer object as an array of unsigned char, but that has nothing to
do with the fact that it's a pointer object). Arrays and pointers are
two entirely different things (though you need to use pointers to
access array elements).

Here's a concrete example:

#include <stdio.h>
int main(void)
{
    int obj = 123;
    unsigned char *ptr = (unsigned char*)&obj;
    int i;
    for (i = 0; i < sizeof obj; i ++) {
        printf("ptr[%d] = 0x%x\n", i, ptr[i]);
    }
    return 0;
}

This treats the object obj as if it were an array of unsigned char.
 
Keith Thompson

Slight correction: You can treat any *object* as an array of unsigned
char. Remember: arrays are not pointers, and pointers are not arrays.

I don't think your correction is valid.
You're saying you can treat any object as an array of unsigned char.

int i = 123;
(unsigned char *)i[0] wouldn't be valid.
correction:
*(unsigned char *)i

That wouldn't be valid either -- and it's also not treating the object
as an array of unsigned char. It's still converting the value of i to
a pointer.
 
vippstar

I don't think your correction is valid.
You're saying you can treat any object as an array of unsigned char.
int i = 123;
(unsigned char *)i[0] wouldn't be valid.

No, it wouldn't.

First, [] binds more tightly than the cast operator, so your
expression is equivalent to (unsigned char *)(i[0]), which doesn't
make much sense. Assuming you meant ((unsigned char*)i)[0], that's
*not* treating the object i as an array of unsigned char. It's
converting the value of i to a pointer value, and then applying the []
operator to that pointer value.

Yes, that is what I meant, I corrected it in a follow-up.
Yes, that's valid. It's treating the object i as an array of unsigned
char.


We may be trying to say the same thing. I think you're just saying it
incorrectly.

Indeed. :)

Here's a concrete example:

#include <stdio.h>
int main(void)
{
    int obj = 123;
    unsigned char *ptr = (unsigned char*)&obj;
    int i;
    for (i = 0; i < sizeof obj; i ++) {
        printf("ptr[%d] = 0x%x\n", i, ptr[i]);
    }
    return 0;
}


To be more concrete, I suggest size_t i; instead of int i; (with
INT_MAX < sizeof (int) your example is not concrete anymore, but
that's not likely)

Also, even if i is changed to size_t, if sizeof (int) == SIZE_MAX,
that would be an infinite loop...

....
size_t i;
for(printf("ptr = 0x%x\n", ptr[i = 0]); i && i < sizeof obj; i++)
printf("ptr = 0x%x\n", ptr[i]);


Anyway, thanks for clearing this up. I hope I'll remember this so I
won't use incorrect terminology in the future.
 
