array indexing anecdote

James Kuyper · Mar 1, 2014

IMO, if something treats something as something else, which allows the
something else to become the something, understanding is at fault.

Thus understanding -> "understanding".

Could you identify the second "something" and the "something else" that
you're referring to? C treats array[index] and index[array] as two
different but equivalent ways of expressing the same concept. It doesn't
treat "index" as if it were "array", which would indicate a lack of
understanding - it simply doesn't care which one comes first. This is a
natural consequence of C's definition of how the subscript operator is
interpreted, which is, in turn, a natural consequence of C's heavy
obsession with pointers.

Helmut Tessarek · Mar 1, 2014

Nice target language you have!

As mentioned before, this was not my example. Also the explanation was not
mine (which I also mentioned in an update to one of my posts).

Anyway, I found this example and explanation in a book whilst browsing through
it at a friend's place. It was 'The C Book, second edition by Mike Banahan,
Declan Brady and Mark Doran', published by Addison Wesley in 1991.

As I mentioned in my first post, I forgot about it, which implied that I knew
about it at some point and did not need an explanation.

I posted it as an anecdote (hence the subject) and/or amusement to newbies who
read the posts in this newsgroup. That's all.

--
Helmut K. C. Tessarek

/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/

Keith Thompson · Mar 1, 2014

Eric Sosman said:
My C compiler translates programs with arrays just fine,
exactly as it is supposed to do this according to C.

I didn't say the compiler didn't compile the program. It does, which is the
example.

As far as the compiler is concerned, an expression like x[n] is translated
into *(x+n) and use made of the fact that an array name is converted into a
pointer to the array's first element whenever the name occurs in an
expression. That's why, amongst other things, array elements count from zero:
if x is an array name, then in an expression, x is equivalent to &x[0], i.e. a
pointer to the first element of the array. So, since *(&x[0]) uses the pointer
to get to x[0], *(&x[0] + 5) is the same as *(x + 5) which is the same as
x[5]. A curiosity springs out of all this. If x[5] is translated into *(x +
5), and the expression x + 5 gives the same result as 5 + x (it does), then
5[x] should give the identical result to x[5]!

Yes, yes, we know this. We've known it for years and years.
Thirty-six years, to be precise: You'll find it on page 94 of the
"The C Programming Language" by Brian Kernighan and Dennis Ritchie,
published in 1978.

Well, almost. That page says that a[i] is by definition equivalent to
(*a+i), but it doesn't explicitly follow that to the conclusion that
a[i] is equivalent to i[a].

Page 210 does say that:

Therefore, despite its asymmetric appearance, subscripting is a
commutative operation.

though no examples are given.

But it needn't have been defined that way. The "+" operator
*could* have been defined so it can take a pointer as its left
operand and an integer as its right operand, but not vice versa.
The result would have been a consistent and usable language whose
only difference from C is that a few obscure expressions would be
invalid (and could trivially be modified to be valid).

[...]

Stefan Ram · Mar 1, 2014

Keith Thompson said:
Well, almost. That page says that a[i] is by definition equivalent to
(*a+i),

We the author's of that page can't get the parentheses right
in such a simple case, I would not bother to continue reading.

Stefan Ram · Mar 1, 2014

Supersedes: <[email protected]>
["We the author's"->"When the authors"]

Keith Thompson said:
Well, almost. That page says that a[i] is by definition equivalent to
(*a+i),

When the authors of that page can't get the parentheses right
in such a simple case, I would not bother to continue reading.

Kaz Kylheku · Mar 1, 2014

Ok, I give up. Tell me a better word then.

Intermediate English generation:

"Doesn't understand array indexing to be something different from
^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
displaced dereference"

Optimization pass:

"Doesn't distinguish array indexing as from displaced dereference"
^^^^^^^^^^^ ^^^^

Also, did you notice the quotes around the word? They can be interpreted as:
sort of <word-in-quotes> or for the lack of a better word.

http://en.wikipedia.org/wiki/Scare_quotes

glen herrmannsfeldt · Mar 1, 2014

(snip, someone wrote)

As far as the compiler is concerned, an expression like x[n] is
translated into *(x+n) and use made of the fact that an array
name is converted into a pointer to the array's first element
whenever the name occurs in an expression.

(snip)

Well, almost. That page says that a[i] is by definition equivalent to
(*a+i), but it doesn't explicitly follow that to the conclusion that
a[i] is equivalent to i[a].

Page 210 does say that:

Therefore, despite its asymmetric appearance, subscripting is a
commutative operation.

though no examples are given.

But it needn't have been defined that way. The "+" operator
*could* have been defined so it can take a pointer as its left
operand and an integer as its right operand, but not vice versa.
The result would have been a consistent and usable language whose
only difference from C is that a few obscure expressions would be
invalid (and could trivially be modified to be valid).

A non-commutative + reminds me of the non-commutative multiply
in some Cray processors. But yes, a non-commutative pointer
addition could have been done. It also reminds me of the
non-commutative + string concatenation operator in Java.
(But I still think they should have used a different operator
for the operation.)

To continue what they could have done, note that C allows
pointer-integer but not integer-pointer. Maybe that one seems
obvious, but note that the OS/360 assembler knows how to do it,
(that is, absolute-relocatable) and also to add two pointers
(relocatable+relocatable). (Conveniently, the assembler doesn't
scale by the size of an object. That would really complicate
allowing those combinations in C.)

But OK, commutative pointer+integer addition allows C to be C.
That is, to be special, and not just any other language.

-- glen

Jorgen Grahn · Mar 1, 2014

Jorgen Grahn said:
Jorgen Grahn said:

int ar[ARSZ], i;
for(i = 0; i < ARSZ; i++){
ar[i] = i;
i[ar]++;
...

After over 2 decades of programming in C, I totally forgot about it.

Because it's not something you encounter in the wild. People pull all
kinds of crazy stunts, but for some reason not this one.

Yes, this one too. See, for example, David Korn's entry in the 1987
International Obfuscated C Code Contest (ioccc.org):

main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}

(It depends on the compiler to predefine the macro "unix" to 1.
The output is "unix", but for reasons having nothing to do with
the spelling of the macro name.)

In terms of that metaphor, the IOCCC is a zoo, not "in the wild".

Yes. I would have explicitly excluded the IOCCC, but I didn't remember
its abbreviation and was too lazy to look it up ...

/Jorgen

Ben Bacarisse · Mar 1, 2014

glen herrmannsfeldt said:
Is C the first language that only allows for array indexing
starting at zero?

Both BCPL and B have/had zero-based indexing.

<snip>

glen herrmannsfeldt · Mar 1, 2014

Both BCPL and B have/had zero-based indexing.

OK, but those are ancestors of C.

Any that are not directly related?

PL/I was the first I knew that allowed one to select the
lower bound, and later Pascal and Fortran 77 also did.

Given that mathematics like to start indexing at 1, though,
forcing 0 is breaking from tradition.

-- glen

Kaz Kylheku · Mar 1, 2014

(snip)

As far as the compiler is concerned, an expression like x[n] is
translated into *(x+n) and use made of the fact that an array name
is converted into a pointer to the array's first element whenever
the name occurs in an expression. That's why, amongst other
things, array elements count from zero:

Hmm. If you converted x[n] into *(x+n-1) then arrays would count
from 1, like some other languages. Those who like arrays from zero
could always write *(x+n) instead...

Is C the first language that only allows for array indexing
starting at zero?

Lisp vectors and n-dimensional arrays go from zero. (There are also displaced
arrays that refer to other arrays).

How far back does that zero-based indexing go? At least as far back as Lisp 1.5, 1962.
The manual describes support for arrays that doesn't resemble the modern ones:

http://www.softwarepreservation.org/projects/LISP/book/LISP 1.5 Programmers Manual.pdf

"Indices range from 0 to n-1." (P. 27 "The Array Feature")

I do not see any such thing in the Lisp 1 manual (1960).

Stefan Ram · Mar 1, 2014

Ben Bacarisse said:
Both BCPL and B have/had zero-based indexing.

In any machine language, the address of the first component
of an array is the address of the array (plus zero).

BartC · Mar 1, 2014

Hmm. If you converted x[n] into *(x+n-1) then arrays would count
from 1, like some other languages. Those who like arrays from zero
could always write *(x+n) instead...

Well, writing x[n-1] is a bit like counting from 1 too! Except it's
obviously counting from 0.

You can't say a language is 1-based unless you can port a 1-based algorithm
to it without messing with the indexing or the bounds.

(Not counting assembler programs where
the user computes the indexing.)

Those use the natural zero-base of arrays and offsets.

Given that mathematics like to start indexing at 1, though,
forcing 0 is breaking from tradition.

I don't know why this 0- and 1-based business is such a big deal. In all the
languages I've ever created, I've generally allowed both, but the default
base was usually 1. Both are useful.

(Actually I usually allow any lower bound, but anything other than 0 or 1 is
rare.)

But if the choice has to be only 0 or only 1, then 0 is a better bet
(because, with an extra element allocated, you can just ignore the 0th
element and index from 1).

Kaz Kylheku · Mar 1, 2014

OK, but those are ancestors of C.

Any that are not directly related?

PL/I was the first I knew that allowed one to select the
lower bound, and later Pascal and Fortran 77 also did.

Given that mathematics like to start indexing at 1, though,
forcing 0 is breaking from tradition.

Mathematics does not "like" to start indexing at 1.

There is no such "tradition".

It depends on the situation.

For instance, I see plenty of both zero and one based indexing here:

http://en.wikipedia.org/wiki/Series_(mathematics)

Jorgen Grahn · Mar 1, 2014

That's why I said in my original post that I came across this again after a
long, long time and that I have forgotten about it.

It was not meant to be the reason for an endless discussion.

Hey, this is comp.lang.c -- /anything/ can cause an endless
discussion ...

FWIW, I was mildly amused by your first posting. I had
forgotten about that little C curiosity.

/Jorgen

glen herrmannsfeldt · Mar 1, 2014

In any machine language, the address of the first component
of an array is the address of the array (plus zero).

As I previously noted, it would have been possible to define
a[b] as *(a+b-1), in which case arrays would be origin 1.

The machine doesn't care, in most cases, if you give the right
origin for the array. That is, the address constant doesn't have
to be the address of the first element, though C programmers might
disagree. (At least for the hardware that I know about.)

The array descriptors used by IBM PL/I compilers store the
address of array element with all subscripts zero, even if it
isn't inside the array. Once you do that, you can easily find
any array element.

The VAX/VMS array descriptor includes both virtual and physical
origin, allowing for either PL/I or Fortran array argument passing.

-- glen

Stefan Ram · Mar 1, 2014

glen herrmannsfeldt said:
As I previously noted, it would have been possible to define
a[b] as *(a+b-1), in which case arrays would be origin 1.

int a_[ 10 ], *a = a_ - 1;

Now, you can use a[ 1 ] up to a[ 10 ], but a[ 0 ] would be
an error. Disclaimer: The evaluation of »a_ - 1« has
undefined behavior.

jononanon · Mar 1, 2014

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

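int main(void)
{
    int arr[] = {1, 2, 3, 4};
    int *p0 = arr;
    int *p3 = p0 + 3;

    /* Pointer differences have type ptrdiff_t, so print them with %td. */
    printf("%td\n", p3 - p0); // 3
    printf("%td\n", &*p3 - &*p0); // 3
    /* Arithmetic on void * is a GNU extension, not standard C. */
    printf("%td\n", (void *) p3 - (void *) p0); // 12 if sizeof(int) == 4
    printf("%td\n", (char *) p3 - (char *) p0); // 12 if sizeof(int) == 4

    return EXIT_SUCCESS;
}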

In other words: a pointer is *NOT* just an address value onto which an integer is added in an UNSCALED manner.

Rather the integer is scaled
(int *)p + 3 <<-->> (int *)(p + 3*sizeof(int))

jononanon · Mar 1, 2014

Rather the integer is scaled

(int *)p + 3 <<-->> (int *)(p + 3*sizeof(int))

Ah sorry, rather like this:
(int *)p + 3 <<-->> (int *)((char*)p + 3*sizeof(int))

jononanon · Mar 1, 2014

(int *)p + 3 <<-->> (int *)((char*)p + 3*sizeof(int))

In other words: plus is not plus... but depends on the types involved in plus.
