Two-indexes initialized array

pozzugno · Apr 19, 2012

#define NUM_JHON 1
#define NUM_RICHARD 2
#define NUM_ERIC 3
#define MYLIST NUM_JHON, RICHARD, ERIC

int addressbook[] = { MYLIST };

In the above code, I can change the MYLIST #define and
I'll automatically have a correctly sized addressbook[]
array.

Now I want to declare 3 addressbooks:

#define MYLIST1 NUM_JHON, RICHARD, ERIC
#define MYLIST2 NUM_JHON, RICHARD, ERIC
#define MYLIST3 NUM_JHON, RICHARD, ERIC

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated
based on the initialization.
Why can't the second be calculated at compile time like the
first index? I understand MYLIST1, MYLIST2 and MYLIST3 could be
of different lengths, but the compiler could assume the maximum.

James Kuyper · Apr 19, 2012

#define NUM_JHON 1

Should that be NUM_JOHN?

#define NUM_RICHARD 2
#define NUM_ERIC 3
#define MYLIST NUM_JHON, RICHARD, ERIC

int addressbook[] = { MYLIST };

In the above code, I can change the MYLIST #define and
I'll automatically have a correctly sized addressbook[]
array.

Now I want to declare 3 addressbooks:

#define MYLIST1 NUM_JHON, RICHARD, ERIC
#define MYLIST2 NUM_JHON, RICHARD, ERIC
#define MYLIST3 NUM_JHON, RICHARD, ERIC

Why do you need three identical macros with different names? Why not use
MYLIST1 three times?

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated
based on the initialization.
Why can't the second be calculated at compile time like the
first index? I understand MYLIST1, MYLIST2 and MYLIST3 could be
of different lengths, but the compiler could assume the maximum.

There's no inherent reason why a compiler couldn't determine all of
the dimensions of a multi-dimensional array based upon the size of the
initializer list - but that's not how the C language is defined. Only
the leading dimension of the array is inferred that way; all other
dimensions must be explicitly specified. I've no idea why that decision
was made that way. The C Rationale provides no insight on this issue.
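
To make the rule concrete, here is a minimal sketch (the names `book`, `COLS` and `book_rows` are made up for illustration and do not come from the original post). The inner bound is written out, the outer bound is left for the compiler to count, and rows shorter than the inner bound are zero-filled.

```c
#include <stddef.h>

enum { COLS = 4 };              /* hypothetical inner bound; must be explicit */

/* Outer bound omitted: the compiler counts the three brace-enclosed rows. */
const int book[][COLS] = {
    { 1, 2 },                   /* the remaining two elements become 0 */
    { 3, 4, 5 },                /* the remaining element becomes 0 */
    { 6, 7, 8, 9 },
};

/* Number of rows the compiler inferred from the initializer. */
size_t book_rows(void)
{
    return sizeof book / sizeof book[0];
}
```

Changing the declarator to `book[][]` is what triggers the error discussed in this thread, because the element type of the outer array would then be incomplete.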

Ben Pfaff · Apr 19, 2012

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated
based on the initialization.

Yes. It's just the way the C standard is written, to allow the
first [] in a sequence of them to lack a size, but not later ones.

Stefan Ram · Apr 19, 2012

#define MYLIST1 NUM_JHON, RICHARD, ERIC
#define MYLIST2 NUM_JHON, RICHARD, ERIC
#define MYLIST3 NUM_JHON, RICHARD, ERIC
int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

Something like the following might work, but it was not thoroughly tested:

#define MYLIST1 NUM_JHON, RICHARD, ERIC
#define MYLIST2 NUM_JHON, RICHARD, ERIC
#define MYLIST3 NUM_JHON, RICHARD, ERIC

int count1[] ={ MYLIST1 };
int count2[] ={ MYLIST2 };
int count3[] ={ MYLIST3 };

/* The sum below selects the largest of the three sizes in bytes; exactly one
   term is nonzero in every case. Dividing by the element size turns the
   byte count into an element count. */
int addressbook[]
[ ( ( sizeof count1 > sizeof count2 )*( sizeof count1 > sizeof count3 )*( sizeof count1 )+
( sizeof count2 > sizeof count1 )*( sizeof count2 > sizeof count3 )*( sizeof count2 )+
( sizeof count3 > sizeof count1 )*( sizeof count3 > sizeof count2 )*( sizeof count3 )+
( sizeof count1 == sizeof count2 )*( sizeof count1 > sizeof count3 )*( sizeof count1 )+
( sizeof count1 == sizeof count3 )*( sizeof count1 > sizeof count2 )*( sizeof count1 )+
( sizeof count2 == sizeof count3 )*( sizeof count2 > sizeof count1 )*( sizeof count2 )+
( sizeof count1 == sizeof count3 )*( sizeof count2 == sizeof count3 )*( sizeof count1 ) )
/ sizeof count1[0] ]
= { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };
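
The same idea can probably be written more compactly with the conditional operator, which is allowed in integer constant expressions. The following is only a sketch: `COUNT_OF`, `MAX_OF` and `ROW_LEN` are hypothetical names, and the list contents are placeholder values chosen so that the three lists have different lengths.

```c
#include <stddef.h>

/* Placeholder lists of different lengths (values invented for the example). */
#define MYLIST1 1, 2, 3
#define MYLIST2 4, 5
#define MYLIST3 6, 7, 8, 9

/* Hypothetical helpers: element count of an array, larger of two values. */
#define COUNT_OF(a)  (sizeof (a) / sizeof (a)[0])
#define MAX_OF(x, y) ((x) > (y) ? (x) : (y))

/* Throwaway arrays whose only job is to let sizeof count each list. */
const int count1[] = { MYLIST1 };
const int count2[] = { MYLIST2 };
const int count3[] = { MYLIST3 };

/* Longest list, in elements, as a compile-time constant. */
enum {
    ROW_LEN = MAX_OF(COUNT_OF(count1),
                     MAX_OF(COUNT_OF(count2), COUNT_OF(count3)))
};

int addressbook[][ROW_LEN] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };
```

The cost is the same as in the version above: the three counting arrays still occupy storage unless the toolchain discards them.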

BartC · Apr 19, 2012

James Kuyper said:
On 04/19/2012 11:03 AM, (e-mail address removed) wrote:

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated

There's no inherent reason why a compiler couldn't determine all of
the dimensions of a multi-dimensional array based upon the size of the
initializer list - but that's not how the C language is defined. Only
the leading dimension of the array is inferred that way; all other
dimensions must be explicitly specified. I've no idea why that decision
was made that way. The C Rationale provides no insight on this issue.

The first dimension can only have one value. The others could have
different lengths in each row:

{ {10,20}, {30,40,50}, {60,70,80,90}};

Here the first dimension is 3; the second could be 2, 3 or 4 (or perhaps N).
It gets confusing as to what was the intended maximum row length, and what
rows were intentionally partly initialised or not.

You can make the same argument for the number of dimensions; that could also
be inferred from the initialiser list instead of needing [][]; but there
would be even more scope for the compiler to get it wrong.

James Kuyper · Apr 19, 2012

James Kuyper said:

On 04/19/2012 11:03 AM, (e-mail address removed) wrote:

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated

There's no inherent reason why a compiler couldn't determine all of
the dimensions of a multi-dimensional array based upon the size of the
initializer list - but that's not how the C language is defined. Only
the leading dimension of the array is inferred that way; all other
dimensions must be explicitly specified. I've no idea why that decision
was made that way. The C Rationale provides no insight on this issue.

The first dimension can only have one value. The others could have
different lengths in each row:

{ {10,20}, {30,40,50}, {60,70,80,90}};

There's a single unique minimum value for N (which is 3), and another
for M (which is 4), such that "int addressbook[N][M]" could be
initialized by such an initializer. Why shouldn't N and M both default
to those minimums if left unspecified?

Here the first dimension is 3; the second could be 2, 3 or 4 (or perhaps N).

It seems to me that N>=3, M>=4 is required; I don't see how you can
identify either 2 or 3 as acceptable values for M. The C standard
specifies that N's value can be inferred to be 3; I see no reason why
M's value could not be inferred to be 4; the C standard just chose not
to do so.

It gets confusing as to what was the intended maximum row length, and what
rows were intentionally partly initialised or not.

Why should the possibility that M was intended to be 6 prevent M from
defaulting to 4? How is that different from the possibility that N was
intended to be 4?

BartC · Apr 19, 2012

James Kuyper said:

On 04/19/2012 11:03 AM, (e-mail address removed) wrote:

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated

There's no inherent reason why a compiler couldn't determine all of
the dimensions of a multi-dimensional array based upon the size of the
initializer list - but that's not how the C language is defined. Only
the leading dimension of the array is inferred that way; all other
dimensions must be explicitly specified. I've no idea why that decision
was made that way. The C Rationale provides no insight on this issue.

The first dimension can only have one value. The others could have
different lengths in each row:

{ {10,20}, {30,40,50}, {60,70,80,90}};

There's a single unique minimum value for N (which is 3), and another
for M (which is 4), such that "int addressbook[N][M]" could be
initialized by such an initializer. Why shouldn't N and M both default
to those minimums if left unspecified?

Sure, it's possible. It's just a more undisciplined way of doing things. The
second M dimension can be significant, yet you'd be relying on N (possibly
hundreds) of different rows of arbitrary length to supply that information.
A "},{" gets accidentally deleted, and now M is double what it should be. If
only one row has the maximum M length, suppose that row gets deleted? What
if no row happens to be the maximum M width that is needed? How about if one
row has an extra element by mistake?

There's more chance for things going wrong in a way that is harder to
detect. With a one-dimension vector, the problems aren't so bad.

It seems to me that N>=3, M>=4 is required; I don't see how you can
identify either 2 or 3 as acceptable values for M. The C standard
specifies that N's value can be inferred to be 3; I see no reason why
M's value could not be inferred to be 4; the C standard just chose not
to do so.

If M is supposed to be 2 or 3, then rows of length 3 or 4 would be errors;
supplying M=2 or 3 would disclose those errors.

Why should the possibility that M was intended to be 6 prevent M from
defaulting to 4? How is that different from the possibility that N was
intended to be 4?

There are a few situations where you just want the smallest rectangular
matrix that accommodates the data supplied, and the actual dimensions are
not important. But in general I'd want to be more certain of the width of
such a matrix than relying on an absence of typos.

However, while it's obviously feasible to leave out M, I can't tell you
exactly why C didn't allow this as an option.
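
One way to get the certainty about row width described above, while still building the rows from macros, might be a C11 `_Static_assert` on each list's length. This is a sketch under assumptions: it needs a C11 compiler, and `WIDTH`, `COUNT_OF`, `ROW1`, `ROW2`, `probe1`, `probe2` and `matrix` are all invented names with placeholder data.

```c
#include <stddef.h>

#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])   /* hypothetical helper */

enum { WIDTH = 3 };             /* the intended row length, stated once */

#define ROW1 10, 20, 30         /* placeholder data */
#define ROW2 40, 50, 60

/* Unsized probe arrays: sizeof reports how many values each list holds. */
const int probe1[] = { ROW1 };
const int probe2[] = { ROW2 };

/* Compilation stops here if a list gains or loses an element. */
_Static_assert(COUNT_OF(probe1) == WIDTH, "ROW1 has the wrong length");
_Static_assert(COUNT_OF(probe2) == WIDTH, "ROW2 has the wrong length");

const int matrix[][WIDTH] = { { ROW1 }, { ROW2 } };
```

With the assertions in place, a row that is too short is reported at compile time instead of being silently zero-padded.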

Richard Damon · Apr 21, 2012

There's no inherent reason why a compiler couldn't determine all of
the dimensions of a multi-dimensional array based upon the size of the
initializer list - but that's not how the C language is defined. Only
the leading dimension of the array is inferred that way; all other
dimensions must be explicitly specified. I've no idea why that decision
was made that way. The C Rationale provides no insight on this issue.

I believe that, at least initially, a number of the restrictions in the
language were put in place to make it easier to write the compiler, and
to minimize the "back tracking" that the compiler needs to do. The
processing is divided into phases, and in general, each phase can make a
single pass through the output of the previous phase to generate its
output.

To handle something like

int i[][]= { {1}, {2, 3}};

after parsing the {1} the compiler doesn't know how many 0's (if any) it
needs to emit to finish the initialization of that row. By requiring
that all but the first dimension be provided, it doesn't run into that
problem. To handle this case would require the compiler to keep ALL the
initializers for the array until it got to the end of the object,
updating it as it saw longer rows, then it could emit them at the end.
This requires significantly more "trace-back" storage than elsewhere in
the language.
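
The extra work described above can be sketched as two passes over ragged input. This is ordinary C that mimics what a compiler would have to do, not code from any real compiler; `struct ragged_row`, `widest_row` and `emit_padded` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ragged input: the values of one row plus their count. */
struct ragged_row {
    const int *values;
    size_t     count;
};

/* Pass 1: the row width is only known after every row has been seen. */
size_t widest_row(const struct ragged_row *rows, size_t nrows)
{
    size_t width = 0;
    for (size_t r = 0; r < nrows; r++)
        if (rows[r].count > width)
            width = rows[r].count;
    return width;
}

/* Pass 2: emit the rows into a flat buffer, zero-padding the short ones.
   The caller must provide room for nrows * width ints in `out`. */
void emit_padded(const struct ragged_row *rows, size_t nrows,
                 size_t width, int *out)
{
    memset(out, 0, nrows * width * sizeof *out);
    for (size_t r = 0; r < nrows; r++)
        memcpy(out + r * width, rows[r].values,
               rows[r].count * sizeof *out);
}
```

Every row has to stay available until the first pass finishes, which is the "trace-back" storage mentioned above.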

Kaz Kylheku · Apr 21, 2012

I believe that, at least initially, a number of the restrictions in the
language were put in place to make it easier to write the compiler, and
to minimize the "back tracking" that the compiler needs to do. The
processing is divided into phases, and in general, each phase can make a
single pass through the output of the previous phase to generate its
output.

To handle something like

int i[][]= { {1}, {2, 3}};

Hi Richard,

To handle something like this is not necessary. You cannot have
two unspecified dimensions in an array definition or declaration.

Only the outermost dimension can be left unspecified (to be determined
by the number of top-level constituents which occur in the initializer).

after parsing the {1} the compiler doesn't know how many 0's (if any) it
needs to emit to finish the initialization of that row.

So you see it must know that because in reality, the declaration has
to be something like this:

int i[][INTEGRAL_CONSTANT]= { {1}, {2, 3}};

So the number of zeros (or perhaps more pertinently, the offset of every
value within the implied memory space) is known.
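
A small sketch of that offset calculation, with `INNER` standing in for INTEGRAL_CONSTANT (the names `grid`, `flat_offset` and `measured_offset` are made up for the example):

```c
#include <stddef.h>

enum { INNER = 3 };             /* stands in for INTEGRAL_CONSTANT */

int grid[][INNER] = { { 1 }, { 2, 3 } };

/* Position of grid[row][col], counted in ints from the start of the array.
   It can be computed as soon as INNER is known, before later rows are seen. */
size_t flat_offset(size_t row, size_t col)
{
    return row * INNER + col;
}

/* The same position, measured from the actual object layout. */
size_t measured_offset(size_t row, size_t col)
{
    const char *base = (const char *)grid;
    const char *elem = (const char *)&grid[row][col];
    return (size_t)(elem - base) / sizeof(int);
}
```

Here `{1}` occupies offsets 0 to 2, so the 2 in the second row lands at offset 3 no matter how many rows follow.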

To handle this case would require the compiler to keep ALL the
initializers for the array until it got to the end of the object,

But anyway more than a decade before C systems, Lisp systems were parsing cruft
like ((1) (2 3)) and keeping it all in memory to do multiple passes of
whatever on it.

BartC · Apr 21, 2012

To handle something like

int i[][]= { {1}, {2, 3}};

after parsing the {1} the compiler doesn't know how many 0's (if any) it
needs to emit to finish the initialization of that row. By requiring
that all but the first dimension be provided, it doesn't run into that
problem. To handle this case would require the compiler to keep ALL the
initializers for the array until it got to the end of the object,
updating it as it saw longer rows, then it could emit them at the end.
This requires significantly more "trace-back" storage than elsewhere in
the language.

If that ever was an issue, perhaps with the very first compilers that might
have been single-pass and running in limited memory, it's not a limitation
now. C has other reasons for not allowing it. Or perhaps there just isn't
enough demand for the feature.

Richard Damon · Apr 21, 2012

I believe that, at least initially, a number of the restrictions in the
language were put in place to make it easier to write the compiler, and
to minimize the "back tracking" that the compiler needs to do. The
processing is divided into phases, and in general, each phase can make a
single pass through the output of the previous phase to generate its
output.

To handle something like

int i[][]= { {1}, {2, 3}};

Hi Richard,

To handle something like this is not necessary. You cannot have
two unspecified dimensions in an array definition or declaration.

Only the outermost dimension can be left unspecified (to be determined
by the number of top-level constituents which occur in the initializer).

That is based on the current C rules, the OP was asking WHY this
restriction, and I was showing what the cost would be to have allowed
other dimensions to be left unspecified and determined by parsing the
initializer.

after parsing the {1} the compiler doesn't know how many 0's (if any) it
needs to emit to finish the initialization of that row.

So you see it must know that because in reality, the declaration has
to be something like this:

int i[][INTEGRAL_CONSTANT]= { {1}, {2, 3}};

So the number of zeros (or perhaps more pertinently, the offset of every
value within the implied memory space) is known.

To handle this case would require the compiler to keep ALL the
initializers for the array until it got to the end of the object,

But anyway more than a decade before C systems, Lisp systems were parsing cruft
like ((1) (2 3)) and keeping it all in memory to do multiple passes of
whatever on it.

But Lisp stored that as a tree, not an array. Storing that structure as
you go in a tree is trivial, but at the cost of space.

Ark · Apr 22, 2012

#define NUM_JHON 1
#define NUM_RICHARD 2
#define NUM_ERIC 3
#define MYLIST NUM_JHON, RICHARD, ERIC

int addressbook[] = { MYLIST };

In the above code, I can change the MYLIST #define and
I'll automatically have a correctly sized addressbook[]
array.

Now I want to declare 3 addressbooks:

#define MYLIST1 NUM_JHON, RICHARD, ERIC
#define MYLIST2 NUM_JHON, RICHARD, ERIC
#define MYLIST3 NUM_JHON, RICHARD, ERIC

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated
based on the initialization.
Why can't the second be calculated at compile time like the
first index? I understand MYLIST1, MYLIST2 and MYLIST3 could be
of different lengths, but the compiler could assume the maximum.

As others commented, that's how the language is defined. To work within
it, I'd suggest a "jagged array" technique, i.e. a 1-dimensional array
of pointers to 1-dimensional arrays, like
int *addressbook[] = {as in your text};
This requires support of compound literals (C99) and perhaps explicit
casts of inner {}'s.
This is bad enough if you want a const array, because compound literals may
be RAM-based even though the emitted code is correct.
A better plan in that case may be
const int MYLIST1[] = {NUM_JHON, RICHARD, ERIC};
....
const int * const addressbook[][] = {MYLIST1, MYLIST2, MYLIST3};
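
A fuller sketch of the named-array plan, assuming each row should also carry its own length so that nothing needs padding. Apart from `addressbook`, every name here (`list1` to `list3`, `struct book_row`, `COUNT_OF`, `addressbook_rows`) is invented for the example, and the values are placeholders.

```c
#include <stddef.h>

#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])   /* hypothetical helper */

/* Placeholder rows of different lengths (values invented for the example). */
static const int list1[] = { 1, 2, 3 };
static const int list2[] = { 4 };
static const int list3[] = { 5, 6 };

/* Each entry pairs a row with its own length. */
struct book_row {
    const int *items;
    size_t     count;
};

/* Only one [] is needed: this is a one-dimensional array of row records. */
const struct book_row addressbook[] = {
    { list1, COUNT_OF(list1) },
    { list2, COUNT_OF(list2) },
    { list3, COUNT_OF(list3) },
};

/* Number of rows, inferred from the initializer as usual. */
size_t addressbook_rows(void)
{
    return COUNT_OF(addressbook);
}
```

Because every object is const with static storage duration, an implementation that supports ROM placement is free to put all of it there.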

David Thompson · May 13, 2012

Nit: I think you mean NUM_RICHARD and NUM_ERIC (the MYLIST macros say RICHARD and ERIC).

int addressbook[][] = { { MYLIST1 }, { MYLIST2 }, { MYLIST3 } };

The compiler gives me an error. It seems the second index
must be defined and the first will be automatically calculated
based on the initialization.

Nit: The second *size* aka bound aka dimension. An *index* aka
subscript is from 0 up to the size/bound (exclusive) and identifies
one of the elements in the array as opposed to the whole array.

Why can't the second be calculated at compile time like the
first index? I understand MYLIST1, MYLIST2 and MYLIST3 could be
of different lengths, but the compiler could assume the maximum.

As others commented, that's how the language is defined. To work within
it, I'd suggest a "jagged array" technique, i.e. a 1-dimensional array
of pointers to 1-dimensional arrays, like
int *addressbook[] = {as in your text};
This requires support of compound literals (C99) and perhaps explicit
casts of inner {}'s.

Compound literals look somewhat like casts but aren't.

This is bad enough if you want a const array, because compound literals may
be RAM-based even though the emitted code is correct.
A better plan in that case may be
const int MYLIST1[] = {NUM_JHON, RICHARD, ERIC};
...
const int * const addressbook[][] = {MYLIST1, MYLIST2, MYLIST3};

Nit: only one [] there not two.

A compound literal can be qualified const:
const int * const addressbook [] = {
(const int []) { MYLIST1 },
(const int []) { MYLIST2 },
(const int []) { MYLIST3 } };

The standard doesn't (and can't) require any particular memory layout,
but if an implementation can put (usually only static) objects in ROM,
it should and likely will do so for static const compound literals.
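
For completeness, a self-contained sketch of the const compound-literal version with the macro names spelled consistently. Since a pointer to a row does not carry the row's length, this sketch ends each list with a sentinel; `END_OF_LIST` and `list_length` are hypothetical additions, and the sketch assumes 0 never appears as a real entry.

```c
#include <stddef.h>

#define NUM_JOHN    1
#define NUM_RICHARD 2
#define NUM_ERIC    3
#define END_OF_LIST 0           /* hypothetical sentinel; assumes 0 is never a real entry */

/* Lists of different lengths, to show that rows need not match. */
#define MYLIST1 NUM_JOHN, NUM_RICHARD, NUM_ERIC
#define MYLIST2 NUM_ERIC
#define MYLIST3 NUM_RICHARD, NUM_JOHN

/* At file scope, compound literals have static storage duration (C99). */
const int * const addressbook[] = {
    (const int []){ MYLIST1, END_OF_LIST },
    (const int []){ MYLIST2, END_OF_LIST },
    (const int []){ MYLIST3, END_OF_LIST },
};

/* Count the entries of one list, up to but not including the sentinel. */
size_t list_length(const int *list)
{
    size_t n = 0;
    while (list[n] != END_OF_LIST)
        n++;
    return n;
}
```

The number of lists is still available as `sizeof addressbook / sizeof addressbook[0]`, because the outer bound is inferred from the initializer.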
