Which if faster?

G

Ganesh

Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?
 
M

Michael

Which of the following is faster and why?
for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?


Two answers:
1) For any reasonably modern optimizing compiler, they should be the
same speed.
2) This kind of micro-optimization will almost never matter; better to
spend time on fixing slow algorithms, handling slow IO, etc. Whatever
a profiler tells you is the part of the code that is *actually* taking
time.

Michael
 
S

Salt_Peter

Ganesh said:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?


Probably because the first implies a cast.

Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}
 
F

Frederick Gotham

Ganesh posted:
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....



On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.
 
H

Howard

Salt_Peter said:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?


Probably because the first implies a cast.

Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}


Huh? The "int" type is the "native" integral type on any machine. Why
would there be a cast using "int i" to index a dynamically allocated array
(or any array for that matter)? The question wasn't about vectors, but
about raw pointers and arrays. I'd think that a "conversion" from size_t to
int would be needed if using your code and indexing the array with it. But
I see no cast implied or required anywhere in the OP's code.


The "usual" assumption is that the latter method (using pointers and
incrementing) is faster, on the assumption that indexing requires
multiplying the index by the size of an element, and adding that as an
offset to the array start. The assumption is false. That's one way to
implement indexing, but compilers are not required to implement it that way.

Indeed, a decent optimizing compiler is likely to take advantage of whatever
native machine code would do the job best, whether that's by using code more
like the pointer version, or by pre-loading some registers and calling a
built-in machine instruction which handles loops with one simple command.
Or any other solution they see fit to use.

Compiler makers undoubtedly profile their solutions. You should do the same
if you want to know whether you need to change your code, and whether a
proposed solution is a good one for your particular platform.

-Howard
 
R

Rolf Magnus

Ganesh said:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why?


I have seen that happen on some compiler. Depending on the optimization
capabilities of the compiler and on the type that a points to, it might
happen that i gets multiplied with that type's size on each iteration for
the first solution, while the latter does an addition instead. Since on
many architectures, an addition is quicker than a multiplication, it can
happen that the second one is faster.
Or is that dependent on architecture or compiler?

It is.
 
H

Howard

Frederick Gotham said:
Ganesh posted:
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....



On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);


Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement a
loop, regardless of what the original code looks like. The best way to know
what's fastest is to test it on your platform. And that's _still_ no
guarantee that you've actually hit upon the absolute fastest method for your
compiler and platform.

-Howard
 
F

Frederick Gotham

Howard posted:
Can you back those statements up with documentation?

There are many forms of optimizations a compiler might use to implement
a loop, regardless of what the original code looks like. The best way
to know what's fastest is to test it on your platform. And that's
_still_ no guarantee that you've actually hit upon the absolute fastest
method for your compiler and platform.


A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.lang.c/browse_frm/thread/33ac11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4ca430be
 
V

Victor Bazarov

Frederick said:
Ganesh posted:
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....



On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.


They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.

V
 
F

Frederick Gotham

Victor Bazarov posted:
On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.


Sorry, I don't follow. . .

(Also, the code assumes that len is non-zero)
 
V

Victor Bazarov

Frederick said:
Victor Bazarov posted:
On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer
and an offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);

When writing portable code, I opt for the former.

They are not semantially equivalent. You need an 'if' in front of
the 'do' to get them there.


Sorry, I don't follow. . .

(Also, the code assumes that len is non-zero)

Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

is not semantically equivalent to

int i = 0;
do
blah;
while (++i < len);

, that's all.

V
 
H

Howard

Frederick Gotham said:
Howard posted:

Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about. Here are your statements that I was
challenging:
On most systems, the following is the fastest:

int *p = arr;
int const *const pover = arr+len;

do *p++ = ...
while (pover != p);

On systems which have a CPU instruction which takes both a pointer and an
offset, the following is fastest:

size_t i = 0;

do arr[i++] = ...
while (len != i);
A few of us tested them over on comp.lang.c. Here's the relevant thread:

http://groups.google.ie/group/comp.lang.c/browse_frm/thread/33ac11a76eb728ad/
2e09898d4ca430be?lnk=st&q=&rnum=6&hl=en#2e09898d4ca430be

I see a couple tests, but nothing which makes your statements always true,
even for the specific code you gave (let alone for the more general case the
OP asked about). How do you know that my compiler won't be able to unroll a
loop? Or that it won't load up a couple registers and call a single machine
instruction that performs the loop? (I've used those many times when
writing assembly code to handle simple operations on arrays.)

I think that you can really only say that, on your machine, using your
version of your compiler, your specific tests showed some specific results.
It's not valid to extend that to: "on any machine which has such-and-such a
feature, the following code is always fastest". There's simply no way to
know that for sure.

And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that someone
couldn't write a compiler to implement it faster on one or more of those
machines. Or make a machine which will do it faster using some other
implementation.

-Howard
 
H

Howard

Assumption is the mother of all f***-ups.

for (int i = 0; i < len; +i)
blah;

Typing mistakes are certainly a close second. :)

That should obviously be "++i", not "+i".

-Howard
 
C

Clark S. Cox III

Salt_Peter said:
Ganesh said:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?


Probably because the first implies a cast.


It implies no such thing.
Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

I'd consider any compiler that produces significantly different code broken.
 
V

Victor Bazarov

Howard said:
Typing mistakes are certainly a close second. :)

That should obviously be "++i", not "+i".

That's all my keyboard's fault! And since I'm using an MS Natural
thing, it's all the fault of MS!
 
V

Victor Bazarov

Clark said:
Salt_Peter said:
Ganesh said:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and
of size N. I have read somewhere that the latter is faster. But
could not understand why? Or is that dependent on architecture or
compiler?


Probably because the first implies a cast.


It implies no such thing.
Ironically, you'ld probably get "faster" with:
for( size_t i = 0; i < N; ++i)
{
// your choice
}

I'd consider any compiler that produces significantly different code
broken.


Just to throw some fire into the oil, the former case is much easier
parallelizable than then latter.

V
 
G

Ganesh

Ganesh said:
Hi,

This is a question that pertains to pointers in general (C or C++).
Which of the following is faster and why?

for (int i = 0; i < N; i++)
= ... a...

(or)

for (int i = 0; i < N; i++)
= ...*a++....


'a' is a dynamically allocated array of any valid scalar type and of
size N. I have read somewhere that the latter is faster. But could not
understand why? Or is that dependent on architecture or compiler?



In my architecture (Intel Pentium IV 3 GHz, 1MG Cache ) I got the
following times for 10 runs on a reasonably huge array size for huge
number of iterations.

Array Access (seconds): 5 5 5 4 5 5 5 5 5 5
Pointer Access: 4 4 3 4 4 4 4 4 4
3

On the average, pointer access is faster, but array access also equals
the pointer access sometimes. Not sure why this happens. The assembly
code is different for each case. For array access, it shows
..
addl (%esi, %eax,4), %ebx
incl %eax
....
,
while for pointer access it generates
....
addl %eax, %ebx,
addl $4, %eax
....
 
B

BobR

Victor Bazarov wrote in message ...
That's all my keyboard's fault! And since I'm using an MS Natural
thing, it's all the fault of MS!

Nice save, Victor!!
Does *that* keyboard have a BSOD feature? Crash daily?
 
F

Frederick Gotham

Howard posted:
Why did you clip what I was responding to? Now nobody reading this will
know what the heck I was talking about.


Of course they will, by the miracle of threads! You do have threads... don't
you? Or do you still have candles for lightbulbs?

And even if you somehow demonstrated that it's true on all existing
compilers on all existing machines, that would be no guarantee that
someone couldn't write a compiler to implement it faster on one or more
of those machines. Or make a machine which will do it faster using some
other implementation.


Indeed you're right. When writing portable code, I make an educated guess as
to what strategy will produce high efficiency across the board. Looking at
the tests done all today's computers, I opt for the *p++ solution.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,780
Messages
2,569,611
Members
45,276
Latest member
Sawatmakal

Latest Threads

Top