Arrays Vs Pointers

peter

Hi all

I am currently exploring the world of pointers and have encountered some
inconsistent information regarding the best way to reference an
array: array[1] OR *(array +1).

One book says that you should not use the syntax array[1] for
performance reasons. Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand C pointers. So
what is the truth, oh wizards? Does it make a difference?

Thanks Peter
 
Stefan Ram

peter said:
I am currently exploring the world of pointers and have encounter some
inconsistent information regarding the best way to reference an
array: array[1] OR *(array +1).

»*« is the opposite of a »reference«, it is a /de/reference.
»[]« is an abbreviation, so it should be preferred when shorter.
One book says that you should not use the syntax array[1] because
of performance reason.

I do not believe this.
Another book says that the syntax array[1] is only used by
FORTRAN programmers who do not understand c pointers.

I do not believe this.
 
James Kuyper

Hi all

I am currently exploring the world of pointers and have encounter some
inconsistent information regarding the best way to reference an
array: array[1] OR *(array +1).

One book says that you should not use the syntax array[1] because
of performance reason. Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand c pointers. So
what is the true oh wizards? Does it make a difference?

Burn both books - they're both telling you nonsense. If you can't afford
to do that, then sell them to somebody, but realize that you may be
ruining the career of anyone who makes the mistake of learning C from
those books.

By definition, the behavior of array[1] is EXACTLY equivalent to
*(array+1); any compiler which generates different code for the
different expressions is not very well designed. array[1] takes one less
character to type, and that character is a shifted one. Far more
important is that, IMO, it's usually easier to read and understand. The
other syntax occurs in only a few cases in the body of code I'm
currently responsible for, and only in code written by someone else.

However, if you run into a context where *(array+1) makes your code
easier to read and understand, don't hesitate to use it.
 
Stephen Sprunk

I am currently exploring the world of pointers and have encounter some
inconsistent information regarding the best way to reference an
array: array[1] OR *(array +1).

What is "best" depends on what you're measuring, and you don't specify.

IMHO, the most important thing in nearly all cases is clarity of code,
and array notation is usually better for that, though you may find
exceptions now and then.
One book says that you should not use the syntax array[1] because
of performance reason. Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand c pointers. So
what is the true oh wizards? Does it make a difference?

It sounds like you need better books.

With modern compilers, array notation should result in code that is _at
least_ as good as pointer notation. It can result in _better_ code, in
some cases, if it allows the compiler to make more aggressive
optimizations due to not having to worry about aliasing.

S
 
Keith Thompson

peter said:
I am currently exploring the world of pointers and have encounter some
inconsistent information regarding the best way to reference an
array: array[1] OR *(array +1).

array[1] is *by definition* equivalent to *(array+1).
One book says that you should not use the syntax array[1] because
of performance reason. Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand c pointers. So
what is the true oh wizards? Does it make a difference?

Both books either are quite old, or were written by an author who
doesn't really know what he's talking about. Use whichever form
more clearly expresses what you're doing, and let the compiler
worry about generating decent code.

Once upon a time, the advice might have made some sense. But these
days, compilers are likely to generate equally good code for either
form (as long as you specify an optimization option like "-O3").
 
Stefan Ram

Keith Thompson said:
array[1] is *by definition* equivalent to *(array+1).

#include <stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); }
 
Paul N

I am currently exploring the world of pointers and have encounter some
inconsistent information regarding the best way to reference an
array: array[1]   OR    *(array +1).

You should use whichever is clearest. If the thing really is an array,
then the clearest way will usually be array[1].
One book says that you should not use the syntax array[1] because
of performance reason.

As others have said, this seems highly unlikely for the example you
have given above. What the book may be recommending is that you use a
pointer to step through the array. For example, replacing:

int x;
for (x = 0; array[x]; x++) dostuff(array[x]);

with

type *p;
for (p = array; *p; p++) dostuff(*p);

The latter might be slightly quicker, but it's not likely to make too
much difference.
Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand c pointers.

Possibly what the book is trying to say is that, if you want your
array to store ten values, a proper C programmer will do:

type array[10];

and will use the values from array[0] to array[9]. Someone who is more
used to another language might do:

type array[11];

and use the values from array[1] to array[10]. This wastes the space
used for array[0], and means you have to remember to allocate an array
one bigger than what you actually want.

Hope this helps.
Paul.
 
Ben Bacarisse

Keith Thompson said:
array[1] is *by definition* equivalent to *(array+1).

#include <stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); }

*sigh* Must "equivalent" mean "can be lexically replaced by"? Has
c.l.c become a place where "a+b is equivalent to b+a" must be
challenged?
 
Ben Bacarisse

Stephen Sprunk said:
On 07-Feb-12 15:28, peter wrote:
One book says that you should not use the syntax array[1] because
of performance reason. Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand c pointers. So
what is the true oh wizards? Does it make a difference?

It sounds like you need better books.

I agree, but I am saddened by the number of daft remarks attributed to
books without a citation.

Peter: please name these books. The world needs to know who is saying
such things.

<snip>
 
Shao Miller

Hi all

I am currently exploring the world of pointers and have encounter some
inconsistent information regarding the best way to reference an
array: array[1] OR *(array +1).

The expression:

E1[E2]

is defined to be semantically identical to:

(*((E1) + (E2)))

So your example 'array[1]' is identical to '(*((array) + (1)))'. After
removing some of those brackets, we get '*(array + 1)', which is just
the alternative you are concerned about.
One book says that you should not use the syntax array[1] because
of performance reason. Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand c pointers. So
what is the true oh wizards? Does it make a difference?

Which books are those?

Did either of them mention doing:

1[array]

? It's just as valid, and actually comes in handy in at least one
particular case:

#include <stdlib.h> /* malloc, free */

#define CountOf(array_) (sizeof (array_) / sizeof *(array_))

int (* ptr_to_array)[42];

ptr_to_array = malloc(sizeof *ptr_to_array);
if (ptr_to_array) {
    size_t i;

    /* Assign '13' to each element */
    for (i = 0; i < CountOf(*ptr_to_array); ++i) {
        i[*ptr_to_array] = 13;
        /*
         * Versus the following, which is slightly
         * more typing, but probably clearer
         */
        (*ptr_to_array)[i] = 13;
    }
    free(ptr_to_array);
}
 
Stefan Ram

Ben Bacarisse said:
Must "equivalent" mean "can be lexically replaced by"?

No, it's ambiguous. It might mean this or something else.

When James writes:

|array[1] takes one less character to type

we are indeed comparing expressions, not their values.
 
Shao Miller

Keith Thompson said:
array[1] is *by definition* equivalent to *(array+1).

#include<stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); return 0;
}

I had to, though it was 100% correct and 0% helpful. Sorry about that. :)
 
Ike Naar

Keith Thompson said:
array[1] is *by definition* equivalent to *(array+1).

#include <stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); }

You forgot something. Try

printf( "%d\n", ( array[1] ) [ 1 ]);
printf( "%d\n", ( *(array+1) ) [ 1 ]); }
 
James Kuyper

Ben Bacarisse said:
Must "equivalent" mean "can be lexically replaced by"?

No, it's ambiguous. It might mean this or something else.

When James writes:

|array[1] takes one less character to type

we are indeed comparing expressions, not their values.

Correct - in the context that you brought up, it saves three characters,
all shifted, and not just one.
 
Keith Thompson

Keith Thompson said:
array[1] is *by definition* equivalent to *(array+1).

#include <stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); }

Ok, quoting the standard (C99 6.5.2.1p2):

The definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2))).

It was slightly sloppy of me to omit the parentheses.

(BTW, I find your compressed code layout difficult to read.)
 
Keith Thompson

Shao Miller said:
Keith Thompson said:
array[1] is *by definition* equivalent to *(array+1).

#include<stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); return 0;
}

I had to, though it was 100% correct and 0% helpful. Sorry about that. :)

C does not require a "return 0;" at the end of main(), as of the current
standard or the one preceding it.
 
Keith Thompson

Ike Naar said:
Keith Thompson said:
array[1] is *by definition* equivalent to *(array+1).

#include <stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); }

You forgot something. Try

printf( "%d\n", ( array[1] ) [ 1 ]);
printf( "%d\n", ( *(array+1) ) [ 1 ]); }

I don't believe he "forgot" anything. (In fact, I did.)
 
Shao Miller

Shao Miller said:
array[1] is *by definition* equivalent to *(array+1).

#include<stdio.h> // printf

int main( void )
{ int b[] ={ 234, 567 }; int * array[ 2 ]={ 0, b };
printf( "%d\n", array[1] [ 1 ]);
printf( "%d\n", *(array+1) [ 1 ]); return 0;
}

I had to, though it was 100% correct and 0% helpful. Sorry about that. :)

C does not require a "return 0;" at the end of main(), as of the current
standard or the one preceding it.

I hope you're joking. Mine was a joke. :) No "requirement" was
claimed. I hoped to add something valuable, just as Mr. Stefan Ram did
for you (other than from Ben's perspective, that is).
 
John Bode

Hi all

I am currently exploring the world of pointers and have encounter some
inconsistent information regarding the best way to reference an
array: array[1]   OR    *(array +1).

One book says that you should not use the syntax array[1] because
of performance reason. Another book says that the syntax array[1] is
only used by FORTRAN programmers who do not understand c pointers. So
what is the true oh wizards? Does it make a difference?

Thanks Peter

Out of curiosity, which books are these?

FWIW, subscript notation *may* result in a few more instructions at
the assembly level than pointer notation. Here are some examples
compiled with gcc -g -Wa,aldh:

First, indexing with a constant expression:

14:init.c **** y = x[5];
210 .LM5:
211 0025 8B45DC movl -36(%ebp), %eax
212 0028 8945C4 movl %eax, -60(%ebp)
15:init.c **** y = *(x + 5);
214 .LM6:
215 002b 8B45DC movl -36(%ebp), %eax
216 002e 8945C4 movl %eax, -60(%ebp)

No difference; the same number of instructions are generated.

Indexing with an auto variable:

17:init.c **** y = x[z];
218 .LM7:
219 0031 8B45C0 movl -64(%ebp), %eax
220 0034 89C0 movl %eax, %eax
221 0036 8D148500 leal 0(,%eax,4), %edx
221 000000
222 003d 8D45C8 leal -56(%ebp), %eax
223 0040 89C0 movl %eax, %eax
224 0042 8B0402 movl (%edx,%eax), %eax
225 0045 8945C4 movl %eax, -60(%ebp)
18:init.c **** y = *(x + z);
227 .LM8:
228 0048 8B45C0 movl -64(%ebp), %eax
229 004b 8B4485C8 movl -56(%ebp,%eax,4), %eax
230 004f 8945C4 movl %eax, -60(%ebp)

Hmp. Subscripting with a variable results in more instructions
compared to the manual dereference, at least under these
circumstances (gcc compiler, debugging turned on, no optimization).

This is one specific case, using one specific compiler with one
specific set of compiler settings. If I turn on optimization, those
differences may disappear. Or they may not. It may turn out
that using subscript notation *really is* less efficient (or at least
requires more instructions) than manually adding offsets and
dereferencing in most circumstances.

And unless you're failing to meet a *hard* space or performance
requirement AND array accesses are the absolute last place you can
squeeze out those last few bytes and/or cycles *AND* you *know* that
pointer notation will result in fewer/faster instructions, use
subscript notation (x[i]). It's easier to read (especially for
multidimensional array accesses) and it more clearly conveys intent.

When you're thinking about this kind of micro-optimization, you have
to ask yourself the following questions:

1. How many times am I executing this operation? Do I do it once
over the lifetime of the program (in which case any gains are down in
the noise), or do I repeat the operation hundreds or thousands (or
more) times (in which case the gains are measurable)?

2. Have I *measured* the performance difference between the two
versions? Under a variety of conditions?

3. Are these differences consistent across different compilers?
Would leaving the code alone and simply switching to a different/
better compiler buy me the performance I need?

4. What is the tradeoff in terms of readability/maintainability?
Would I rather debug code that reads

x = y[i][j++][++k];

or

x = *(*(*(y + i) + j++) + ++k);

Both hurt to look at, but IMO the first hurts less. YMMV.
 
Philip Lantz

John said:
I am currently exploring the world of pointers and have encounter
some inconsistent information regarding the best way to reference
an array: array[1]   OR    *(array +1).

One book says that you should not use the syntax array[1] because
of performance reason.

FWIW, subscript notation *may* result in a few more instructions at
the assembly level than pointer notation. Here are some examples
compiled with gcc -g -Wa,aldh:
[examples snipped]

Hmp. Subscripting with a variable results in more instructions
compared to the manual dereference, at least under these
circumstances (gcc compiler, debugging turned on, no optimization).

Looking at the code generated with no optimization is pointless--if you
care about performance, you need to turn on optimizations; it's as
simple as that. GCC generates really horrible code with optimizations
off, as exemplified by the *two* "mov eax, eax" instructions in your
sample code.
 
