Variable size arrays and pointers

C

Carl-Olof Almbladh

Already in the 1st edition of the "White book",
Kerigham and Ritchie states the "C is a general purpose
language". However, without what is usually called
"assumed size arrays" and built-in complex, C is not
entirely suitable for numerical work. In particular,
in order to have functions which can manipulate
matrices whose dimensions are not determined at compile
time one needs pointers of the form, say,
double (*a)[n]
where n is not a compile time constant but determined
when control passes the pointer declaration. The C89
standard requires that n is a compile time constant.

After the declaration one then fetches matrix elements
with syntax
a[j].

In the absence of this feature, the typical
C/C++ solution is to use double pointers

double **b,

an intermediate pointer array,

and in this way we can again access matrix elements
with syntax
b[j].

From performance point of view, the latter solution is
a disaster, since it requires 2 defererencing steps
(just think of matrix multiply where we need to loop
over the "slow" index too).

GCC has supported variable-size pointers (and arrays)
for many years. It has been supported in fortran
at least from fortran66.

My question is simply:
Are variable-size pointers allowed in C99, which
would make the language fully suitable also for
numerical work.

Carl-Olof Almbladh
 
?

=?ISO-8859-1?Q?=22Nils_O=2E_Sel=E5sdal=22?=

My question is simply:
Are variable-size pointers allowed in C99, which
would make the language fully suitable also for
numerical work.
No, but variable sized arrays are.
 
D

Dan Pop

In said:
My question is simply:
Are variable-size pointers allowed in C99, which
would make the language fully suitable also for
numerical work.

I've no idea what you mean by "variable-size pointers", but variable
length arrays and pointers to variable length arrays are supported by C99.

And, as the rest of your post makes perfectly clear, this is exactly what
you need.

However, they come too late to make much of a difference: the people
heavily involved in numerical analysis didn't wait for C to acquire this
feature and adopted other languages that already provided decent support
for *efficient* matrix processing.

Dan
 
M

Method Man

After the declaration one then fetches matrix elements
with syntax
a[j].

In the absence of this feature, the typical
C/C++ solution is to use double pointers

double **b,

an intermediate pointer array,

and in this way we can again access matrix elements
with syntax
b[j].

From performance point of view, the latter solution is
a disaster, since it requires 2 defererencing steps
(just think of matrix multiply where we need to loop
over the "slow" index too).


It's not always true that two dereference steps are required in your latter
case. Consider the following method for allocating a dynamic MxN array:

int ** arr = (int**) malloc(sizeof(int*) * M);
arr[0] = (int*) malloc(sizeof(int) * M * N);
for (i = 1; i < M; i++)
arr = arr[0] + i*M;

Now, all elements are contiguous and you can use arr[j] syntax to access
any array element with only one dereference (eg. arr[i*M + j]).
 
S

S.Tobias

From performance point of view, the latter solution is
a disaster, since it requires 2 defererencing steps
(just think of matrix multiply where we need to loop
over the "slow" index too).

You have measured the difference, haven't you?

My simple tests show that for unoptimized compilation with gcc the
difference between accessing an array-of-double through a "single" or
"double dereference" was about 1% of total program run. I wouldn't call
that a "disaster", rather a measurement error.

Using variable size array, the program ran 20% slower.

I just can't imagine how an additional dereference could make a difference
beside other operations that must be there, such as multiplication.

With -O3 optimization (gcc 3.3.4) the results were somewhat strange.
The tests with "double dereference" ran up to 20% faster. Perhaps that
might be a result of better loop translation (2 smaller loops) than in the
"single dereference" (1 large loop). I haven't time to investigate this.

Tests with variable size array ran about the same as "double dereference".

What are your results?
 
O

Old Wolf

Method Man said:
From performance point of view, the latter solution is
a disaster, since it requires 2 defererencing steps
(just think of matrix multiply where we need to loop
over the "slow" index too).

It's not always true that two dereference steps are required in your latter
case. Consider the following method for allocating a dynamic MxN array:

int ** arr = (int**) malloc(sizeof(int*) * M);
arr[0] = (int*) malloc(sizeof(int) * M * N);

What are those casts for?
for (i = 1; i < M; i++)
arr = arr[0] + i*M;

Now, all elements are contiguous and you can use arr[j] syntax to access
any array element with only one dereference (eg. arr[i*M + j]).


There are still two dereferences in the syntax: arr[j].
By definition this is: *( *(arr + i) + j).

You have saved on construction costs but not saved on access costs.

arr[i*M + j] would give an int pointer (probably an out of bounds one).
You could write:
int *a = arr[0];
and then have accesses with a single subsequent dereference:
a[i*M + j] ...
but this is a far cry from arr[j].
 
C

Carl-Olof Almbladh

From my original question about variable size arrays
and pointers I have got a clear answer form Dan Pop that they are
included in C99.

There has also been some discussion about the actual permformance
penalties of using double pointers
double **a;
and intermediate pointer arrays
instead of variable-size pointers and arrays
double b[n][m];
double (*a)[m] = b; /* n, m not compile time constants */
for matrix manipulations. (The declaration of a of course
involves no memory allocation but just informs the compliler
how to interpret a[j];)

My own experience for a variety of platforms (Sun, Cray, HP ..)
is that it is platform dependent but not at all negligible in most cases.
(In the absence of pointers (*a)[n] one has to work with 1D
arrays and do the indexing by hand.)
I agree with Dan Pop that the extension comes 20 years
too late - most people doing numerical work have not
had time to wait but moved to fortran where "assumed-size arrays"
have been supported for some 40 years.

Carl-Olof Almbladh
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,764
Messages
2,569,566
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top