multidimensional arrays and cute tricks

C

Charles Banas

i've got an interesting peice of code i'm maintaining, and i'd like to
get some opinions and comments on it, hopefully so i can gain some
sort of insight as to why this works.

at the top of the function (which was translated from Fortran code),
among other heinous and numerous declarations, is this bit:

static float bbuff[TBUF_SIZE][60][60];
static int bkey[TBUF_SIZE];
static int buse[TBUF_SIZE];
static int bmsg[TBUF_SIZE];
static int bptr;
int ii, jj;
float buff[60][60];

register char *r1buff;
register char *r2buff;

the purpose of bbuff is to make available multiple terrain buffers so
that it doesn't have to repeatedly read the terrain data file (which
is small and insignificant, so it's not that much of a bottleneck).
performance aside, here's a snippet from a section that returns one of
the full buffers in lieu of reading the original data file:

if (stat&TBUF_USE)
for (i = 0;i < TBUF_SIZE;i++)
{
if ((*key == bkey) && buse)
{
r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[0][0]),
sizeof(float)*3600); */

/* transpose index buffer */
for (ii = 0; ii < 60 ; ii++)
for (jj = 0; jj < 60; jj++)
sbuff[jj][ii] = buff[ii][jj];
if (fd != NULL)
fclose(fd);
return (bmsg);
}
}

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher? i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?
 
C

Chris Torek

static float bbuff[TBUF_SIZE][60][60];
float buff[60][60];
register char *r1buff;
register char *r2buff;

r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[0][0]),
sizeof(float)*3600); */

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher?[/QUOTE]

Let me answer this in code order, rather than question order.

First, the loop version runs j from 0 up to (but not including)
3600, and 3600 is 60*60. The two pointers here have type "char *",
and the loop is indeed similar to the commented-out memcpy --
but they are also quite different. The "for" loop copies 3600
bytes (C bytes, "sizeof char" being 1 even if CHAR_BIT > 8).

If the memcpy() call were to be uncommented, it would copy
3600 * sizeof(float) bytes.

These are the same if and only if sizeof(float) is 1. On most
implementations, including all those supported by GCC, sizeof(float)
is at least 4.

As for whether the loop is "kosher", the strict ANSI/ISO C answer
is apparently "no". After looping over the first 60*sizeof(float)
bytes in the arrays named buff[0] and bbuff[0], the next accesses
are to elements in adjacent but different arrays, buff[1] and
bbuff[1]. Some over in comp.std.c say this is allowed to
fail with an "out of array bounds" type of exception. Somewhat
more realistically, plain char might be signed char, which might
have problems copying data -- on ones' complement systems like
the Univac, all-1s bit patterns (-0) might be turned into all-0s
(+0), ruining the "float" value. This part is easily fixed, by
using "unsigned char *" instead of plain "char *".

Certainly the memcpy() call, and most likely even just use of
"unsigned char *", will avoid any sort of "out of array bounds"
exception that might be allowed. (I remain somewhat skeptical that
they even *are* allowed.) On most practical machines, too, even
plain "char *" copies will "work right".
i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

The fact that C does not *have* multidimensional arrays is important.
Instead, C has arrays whose elements are in turn arrays. Here
bbuff is an object whose type is "array TBUF_SIZE of <something>",
and the <something> is in turn another array type: "array 60 of
<something2>". Here <something2> is yet another array type: "array
60 of float", and now it stops, because "float" is a fundamental
C type. Arrays in C (which are by definition "1 dimensional" even
if the element-type is another array) are just contiguously-arranged
elements: a[0] is immediately followed in memory by a[1], which is
immediately followed by a[2], and so on. (This is why so many
"struct"s have padding built into them at the end: in order to go
into an array, a[1] *must* come right after a[0], so if a[0] is a
struct, and that struct would need padding between adjacent array
elements, the padding must be part of the struct type. The array
is not allowed to introduce the padding!)

Now, if a[0] is immediately followed by a[1], and if a[0] is itself
an array of (say) size 2, then a[0][0] is immediately followed by
a[0][1], which -- logically -- must be immediately followed by
a[1][0]. In other words, the memory-picture:

+-----------------------+
a: | a[0][0] | a[0][1] |
| a[1][0] | a[1][1] |
+-----------------------+

can always be "linearized":

+-----------------------------------------------+
a: | a[0][0] | a[0][1] | a[1][0] | a[1][1] |
+-----------------------------------------------+

In the above code, copying 3600 / sizeof(float) -- probably 3600/4
or 900 -- elements starting from bbuff[0][0] will copy bbuff[0][0]
through bbuff[14][59] into buff[0][0] through buff[14][59]
respectively (assuming the 900 number is correct; and note that
900 is 15 * 60). Apparently this suffices for whatever tasks are
being run now.
 
B

Barry Schwarz

i've got an interesting peice of code i'm maintaining, and i'd like to
get some opinions and comments on it, hopefully so i can gain some
sort of insight as to why this works.

at the top of the function (which was translated from Fortran code),
among other heinous and numerous declarations, is this bit:

static float bbuff[TBUF_SIZE][60][60];
static int bkey[TBUF_SIZE];
static int buse[TBUF_SIZE];
static int bmsg[TBUF_SIZE];
static int bptr;
int ii, jj;
float buff[60][60];

register char *r1buff;
register char *r2buff;

the purpose of bbuff is to make available multiple terrain buffers so
that it doesn't have to repeatedly read the terrain data file (which
is small and insignificant, so it's not that much of a bottleneck).
performance aside, here's a snippet from a section that returns one of
the full buffers in lieu of reading the original data file:

if (stat&TBUF_USE)
for (i = 0;i < TBUF_SIZE;i++)
{
if ((*key == bkey) && buse)
{
r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[0][0]),
sizeof(float)*3600); */


Since r1buff and r2buff are char*, the loop only copies 3600 bytes,
not 3600 floats. If you want to copy 3600 floats, delete the loop and
uncomment the call to memcpy.

Since memcpy takes void*, the casts are completely superfluous.
/* transpose index buffer */
for (ii = 0; ii < 60 ; ii++)
for (jj = 0; jj < 60; jj++)
sbuff[jj][ii] = buff[ii][jj];
if (fd != NULL)
fclose(fd);
return (bmsg);
}
}

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught


This comment makes no sense. The for loop is exactly the same as the
memcpy except it is off by a factor of sizeof(float).
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher? i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

Actually, that particular for loop is almost exactly how many books
attempt to explain memcpy (usually as some kind of exercise). Please
voice a specific concern that we can address. Why do you think it
should not work?


<<Remove the del for email>>
 
C

Charles Banas

Barry Schwarz said:
Since r1buff and r2buff are char*, the loop only copies 3600 bytes,
not 3600 floats. If you want to copy 3600 floats, delete the loop and
uncomment the call to memcpy.
i'm wary of changing the code because of the function it performs.
Since memcpy takes void*, the casts are completely superfluous.
not surprising, becuase i don't think the programmer who "fixed" it
knew what he was doing. i won't name him specifically, but he works
at the FCC.
/* transpose index buffer */
for (ii = 0; ii < 60 ; ii++)
for (jj = 0; jj < 60; jj++)
sbuff[jj][ii] = buff[ii][jj];
if (fd != NULL)
fclose(fd);
return (bmsg);
}
}

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught


This comment makes no sense.


i'll rephrase then: i think the memcpy was dumped because C arrays
aren't always stored linearly.
The for loop is exactly the same as the
memcpy except it is off by a factor of sizeof(float).
which is something i missed when i looked through it.
Actually, that particular for loop is almost exactly how many books
attempt to explain memcpy (usually as some kind of exercise). Please
voice a specific concern that we can address. Why do you think it
should not work?
it's a multidimensional array. so in theory, it's not copying the
data verbatim (as it would have in the original Fortran), but rather
is copying the pointers in addition to the data. but i may be
misunderstanding the layout in memory.

are the pointers stored in an array in memory immediately before the
actual data? if so, that would make sense. but that still relies
heavily on the data being linearly stored in one location. something
that surprised me and left me quite unsettled.
 
C

Charles Banas

Chris Torek said:
static float bbuff[TBUF_SIZE][60][60];
float buff[60][60];
register char *r1buff;
register char *r2buff;

r1buff = (char *) & (buff[0][0]);
r2buff = (char *) & (bbuff[0][0]);

for (j = 0;j < 3600;j++, r1buff++, r2buff++)
*r1buff = *r2buff;

/* memcpy((char *)&(buff[0][0]),
(char *)&(bbuff[0][0]),
sizeof(float)*3600); */

now you can see that the memcpy was originally tried, but later
dumped, likely because of how C arrays work. but the for loop caught
my attention: they're basically just copying bewteen two
de-referenced pointers. is this kosher?


Let me answer this in code order, rather than question order.

First, the loop version runs j from 0 up to (but not including)
3600, and 3600 is 60*60. The two pointers here have type "char *",
and the loop is indeed similar to the commented-out memcpy --
but they are also quite different. The "for" loop copies 3600
bytes (C bytes, "sizeof char" being 1 even if CHAR_BIT > 8).

If the memcpy() call were to be uncommented, it would copy
3600 * sizeof(float) bytes.
[/QUOTE]
i missed that, but i see it now.
These are the same if and only if sizeof(float) is 1. On most
implementations, including all those supported by GCC, sizeof(float)
is at least 4.

As for whether the loop is "kosher", the strict ANSI/ISO C answer
is apparently "no". After looping over the first 60*sizeof(float)
bytes in the arrays named buff[0] and bbuff[0], the next accesses
are to elements in adjacent but different arrays, buff[1] and
bbuff[1]. Some over in comp.std.c say this is allowed to
fail with an "out of array bounds" type of exception. Somewhat
more realistically, plain char might be signed char, which might
have problems copying data -- on ones' complement systems like
the Univac, all-1s bit patterns (-0) might be turned into all-0s
(+0), ruining the "float" value. This part is easily fixed, by
using "unsigned char *" instead of plain "char *".

i hadn't noticed that either. but strangely, the code seems to work
fine, returning expected values.

it may surprise you, but i'm instructed not to change this code -
because it's the code we were given by the FCC for the purpose of
calculating height above average terrain, and this is dealing with
60x60 blocks of actual terrain data.
Certainly the memcpy() call, and most likely even just use of
"unsigned char *", will avoid any sort of "out of array bounds"
exception that might be allowed. (I remain somewhat skeptical that
they even *are* allowed.) On most practical machines, too, even
plain "char *" copies will "work right".
which is why i was concerned. it "works right" now, but my concern is
whether the multidimensional array is actually consistently stored in
memory in proper order in every case. it's not something i would want
to depend on. but it seems to have worked on a VAX, a Sun
SparcStation, and on our Windows machines.
i know it works with both
Visual C++ 6 and GCC 3.3, but to me, it doesn't look like it should
work. can someone explain to me why the buffer copies correctly and
successfully despite the rather strange-looking method?

The fact that C does not *have* multidimensional arrays is important.
Instead, C has arrays whose elements are in turn arrays. Here
bbuff is an object whose type is "array TBUF_SIZE of <something>",
and the <something> is in turn another array type: "array 60 of
<something2>". Here <something2> is yet another array type: "array
60 of float", and now it stops, because "float" is a fundamental
C type. Arrays in C (which are by definition "1 dimensional" even
if the element-type is another array) are just contiguously-arranged
elements: a[0] is immediately followed in memory by a[1], which is
immediately followed by a[2], and so on. (This is why so many
"struct"s have padding built into them at the end: in order to go
into an array, a[1] *must* come right after a[0], so if a[0] is a
struct, and that struct would need padding between adjacent array
elements, the padding must be part of the struct type. The array
is not allowed to introduce the padding!)
aha! i didn't know this was defined behavior. it's not behavior i
would have depended on, at least.
Now, if a[0] is immediately followed by a[1], and if a[0] is itself
an array of (say) size 2, then a[0][0] is immediately followed by
a[0][1], which -- logically -- must be immediately followed by
a[1][0]. In other words, the memory-picture:

+-----------------------+
a: | a[0][0] | a[0][1] |
| a[1][0] | a[1][1] |
+-----------------------+

can always be "linearized":

+-----------------------------------------------+
a: | a[0][0] | a[0][1] | a[1][0] | a[1][1] |
+-----------------------------------------------+

In the above code, copying 3600 / sizeof(float) -- probably 3600/4
or 900 -- elements starting from bbuff[0][0] will copy bbuff[0][0]
through bbuff[14][59] into buff[0][0] through buff[14][59]
respectively (assuming the 900 number is correct; and note that
900 is 15 * 60). Apparently this suffices for whatever tasks are
being run now.


apparently, but again, it's not something i would have depended on.
there are cases where an array of arrays may not be linearly addressed
in memory, but in fact may be out of order.

but i am quite surprised i didn't notice the casts to (char *). this
code should in fact be copying all 3600 elements, and not just 900. i
don't know why it was changed, and it "works" now, so i don't really
have authority to "fix" it.

but thank you for clarifying some things i misunderstood.
 
C

Chris Torek

i'll rephrase then: i think the memcpy was dumped because C arrays
aren't always stored linearly.

C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.
it's a multidimensional array. so in theory, it's not copying the
data verbatim (as it would have in the original Fortran), but rather
is copying the pointers in addition to the data. but i may be
misunderstanding the layout in memory.

are the pointers stored in an array in memory immediately before the
actual data?

There are no pointers in memory here, only the array. ("There is
no Dana, only Zuul". :) ) When you mention the name of an array
in a value context, the C compiler must *construct* a pointer value.
See <http://web.torek.net/torek/c/pa.html>.
 
C

Charles Banas

Chris Torek said:
C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.
let me see if i understand correctly. for:

int array[2][60];
int *parray = array;

then &parray[60] == &array[1][0] in all cases?
There are no pointers in memory here, only the array. ("There is
no Dana, only Zuul". :) ) When you mention the name of an array
in a value context, the C compiler must *construct* a pointer value.
See <http://web.torek.net/torek/c/pa.html>.

that makes sense for arrays allocated on the stack, but honestly, i
didn't know this was the case.

thanks!

-- Charles Banas
 
B

Barry Schwarz

Chris Torek said:
C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.
let me see if i understand correctly. for:

int array[2][60];
int *parray = array;

This is a syntax error. parray is an int*. In this context, array
evaluates to a pointer to the first element with type pointer to
element. The first element of array is array[0] which has type
pointer to array of 60 int. In order for this assignment to work, you
would need
int (*parray)[60] = array;
then &parray[60] == &array[1][0] in all cases?

Once your premise is wrong, your conclusions fall apart.
that makes sense for arrays allocated on the stack, but honestly, i
didn't know this was the case.

thanks!

-- Charles Banas



<<Remove the del for email>>
 
G

glen herrmannsfeldt

Charles said:

In the sense given here, they are always stored linearly.
C arrays *are* always linear. C imposes a "locally flat" memory
structure upon all systems that implement C. There is no guarantee
that one array object's address space be comparable in any way with
some other array object's, but each object lives within some
(perhaps small-as-possible and widely-scattered) "flat" region.
(snip)

let me see if i understand correctly. for:

int array[2][60];
int *parray = array;

then &parray[60] == &array[1][0] in all cases?

I believe you have to say it in the form

int *parray=(int*)array;

That is sort of saying "do it anyway, even if I shouldn't".

If you say array[1], which is the same as &array[1][0], the
compiler will create the proper pointer value, but there are
(normally) not any pointers stored in memory.

I was told that some Fortran compilers on the PDP-11 would
index arrays through pointers, instead of calculating the
offset using multiplication. This was transparent to the
user, the array itself must be allocated contiguously, as
it must in C. I presume C compilers could create such a hidden
pointer array to speed up the calculation (on machines with
a slow multiply), but it isn't normally done, and you shouldn't
be able to tell if it were.
that makes sense for arrays allocated on the stack, but honestly, i
didn't know this was the case.

Well, there is another way to do this problem, which is for
the program itself to allocate pointers. In that case, you
would have something declared

float ***a;

Which could still be referenced as a[j][k].

Personally, I disagree with Chris that C doesn't have multidimensional
arrays, though he is likely right. They may call them by a
different name, but they have the properties that multidimensional
arrays should have.

-- glen
 
D

Dave Thompson

Chris Torek <[email protected]> wrote in message news:<[email protected]>...
C arrays *are* always linear. <snip>
let me see if i understand correctly. for:

int array[2][60];
int *parray = array;

This is a syntax error. parray is an int*. In this context, array

Technically it's a constraint violation not a syntax error. But it is
wrong for the reason you state.
evaluates to a pointer to the first element with type pointer to
element. The first element of array is array[0] which has type
pointer to array of 60 int. In order for this assignment to work, you
would need
int (*parray)[60] = array;
Or, more likely, int *parray = array[0] /* or equivalent and arguably
clearer &array[0][0] */.
then &parray[60] == &array[1][0] in all cases?
I believe this is officially guaranteed by the new wording of 6.5.9p6
with footnote 91, and in practice was always true due to the
(required) memory model of contiguous (and unpadded) arrays.
Technically it isn't guaranteed you can _access_ array[1][0] by
parray[60] etc., but I don't believe there is or ever will be an
implementation that doesn't except debugging and _maybe_JVM kludges,
partly because I believe it _is) guaranteed to access the (same)
representation using any of (uchar*)array[1]
((uchar*)array)+60*intsize
(uchar*)parray+60*intsize
et seq

Stack or not doesn't matter; it makes sense for the C definition of
array (and pointer, particularly to element) across the board.

- David.Thompson1 at worldnet.att.net
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,014
Latest member
BiancaFix3

Latest Threads

Top