A newbie's code

C

Clark S. Cox III

Frederick said:
Flash Gordon posted:

What is aliasing? I've never heard of it. (Not being sarcastic)
http://en.wikipedia.org/wiki/Aliasing_(computing)



Indeed, both snippets achieve the same objective -- one is inherently more
efficient though.

This is simply not true; given the following code:
//BEGIN CODE
#include <stdio.h>
#include <time.h>

void foo1(int arr[], size_t len)
{
for(size_t i = 0; i != len; ++i)
{
arr += 77;
}
}

void foo2(int arr[], size_t len)
{
int *p = arr;
int const *const pover = arr + len;

do
{
*p++ += 77;
}
while(p != pover);
}

int main()
{
int arr[1000000];

clock_t start = clock();
for(int i=0; i!= 100; ++i)
{
foo1(arr, sizeof arr / sizeof *arr);
}

clock_t middle = clock();
for(int i=0; i!= 100; ++i)
{
foo2(arr, sizeof arr / sizeof *arr);
}

clock_t end = clock();

printf("array indexing: %f seconds\n", (middle - start) * 1.0 /
CLOCKS_PER_SEC);
printf("pointer arith : %f seconds\n", (end - middle) * 1.0 /
CLOCKS_PER_SEC);

return 0;
}
//END CODE

When I compile and run this on my Intel MacBook (with GCC 4.0.1), I get
the following output from 3 successive runs:

array indexing: 0.240000 seconds
pointer arith : 0.240000 seconds

array indexing: 0.240000 seconds
pointer arith : 0.240000 seconds

array indexing: 0.240000 seconds
pointer arith : 0.230000 seconds

However, when I do the same on my PowerMac G5, I get:

array indexing: 0.340000 seconds
pointer arith : 0.360000 seconds

array indexing: 0.340000 seconds
pointer arith : 0.350000 secondss

array indexing: 0.340000 seconds
pointer arith : 0.350000 seconds

So, there you have it, on one platform, the difference seems to be just
noise, and on another the array indexing method is actially faster.
 
E

ena8t8si

Frederick said:
size_t my_strlen_array( const char *s ){
unsigned r = 0;
while (s[r]) r++;
return r;
}
size_t my_strlen_pointer( const char *s ){
unsigned r = 0;
while (*s++) r++;
return r;
}

size_t my_strlen_pointer_2( const char *const s_0 ){
const char *s = s_0;
while (*s) s++;
return s-s_0;
}

My measurements show my_strlen_array is faster than
either of the pointer versions. YMMV, of course.


How could that possibly be? I would have thought the fastest was:

#include <cassert>

size_t StrLength(char const *const start)
{
int dummy = (assert(start), 0);

char const *p = start;

while(*p++);

return p - start - 1;
}

In fact I had tried something like that along
with my other examples -

size_t my_strlen_pointer_3( const char *const s_0 ){
const char *s = s_0;
while (*s++);
return s-1-s_0;
}

It ran slower than all the other versions, so I
didn't put it in.

How can that be? It be!
 
F

Frederick Gotham

Clark S. Cox III posted:
So, there you have it, on one platform, the difference seems to be just
noise, and on another the array indexing method is actially faster.


On my system (Intel Pentium 3, 500 MHz), pointers are 61% faster.

61% is far from negligible.

Try for yourself:

#include <stddef.h>
#include <stdio.h>
#include <time.h>

void Fun1(unsigned *const p, size_t const len)
{
size_t i;

for(i = 0; i != len; ++i)
{
p += 77;
}
}

void Fun2(unsigned *p, size_t const len)
{
unsigned const *const pover = p + len;

do *p++ += 77;
while(pover != p);
}

int main(void)
{
clock_t time_start, time_elapsed1=0, time_elapsed2=0;
size_t i;
unsigned arr[99999] = {0};

puts("Please wait, this shouldn't take longer than 20 seconds...\n");

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun1(arr, sizeof arr / sizeof*arr);
}
time_elapsed1 += clock() - time_start;

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun2(arr, sizeof arr / sizeof*arr);
}
time_elapsed2 += clock() - time_start;

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun1(arr, sizeof arr / sizeof*arr);
}
time_elapsed1 += clock() - time_start;

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun2(arr, sizeof arr / sizeof*arr);
}
time_elapsed2 += clock() - time_start;


printf(" Array Subscripting: %f seconds.\n",
(double)time_elapsed1 / CLOCKS_PER_SEC);
printf("Pointer Incrementation: %f seconds.\n",
(double)time_elapsed2 / CLOCKS_PER_SEC);

if(time_elapsed1 > time_elapsed2)
{
printf("\nPointers are %f%% faster.\n\n",
(double)time_elapsed1/time_elapsed2*100-100);
}
else
{
printf("\nArrays are %f%% faster.\n\n",
(double)time_elapsed2/time_elapsed1*100-100);
}

return 0;
}
 
F

Frederick Gotham

Bill Pursell posted:
yes, but how do you **set** the pointer?


Well to be honest, I don't write much C code (mostly C++). Up until very
recently, I always wrote:

p = 0;

, but lately I've taken a liking to "nullptr":

#define nullptr ((void*)0)

p = nullptr;

If I were writing C, I'd probably use NULL instead.

Anyhow, I always check for null pointers with:

if (!p) ...
 
I

Ian Collins

Frederick said:
Clark S. Cox III posted:





On my system (Intel Pentium 3, 500 MHz), pointers are 61% faster.

61% is far from negligible.

Try for yourself:
<snipped code>

I did, the difference varied between arrays 4% faster and pointers 11%
faster, depending on target and optimisations.

With a little help form OpenMP and an extra CPU, arrays were 37% faster.
 
F

Frederick Gotham

Ian Collins posted:
<snipped code>

I did, the difference varied between arrays 4% faster and pointers 11%
faster, depending on target and optimisations.

With a little help form OpenMP and an extra CPU, arrays were 37% faster.


I don't know what OpenMP is, plus I've never worked with a computer which
had more than one CPU.

Whatever way you've achieved the 37% faster arrays, I wonder could that
power be channelled to speed up the pointers... ?

I'm still curious as to how the array subscripting could possibly be
faster. It seems to be that it'd be faster to:

(1) Dereference a pointer
(2) Increment the pointer

rather than:

(1) Add an integer to a pointer value
(2) Dereference the pointer
(3) Increment the integer
 
C

Clark S. Cox III

Frederick said:
Ian Collins posted:



I don't know what OpenMP is, plus I've never worked with a computer which
had more than one CPU.

Whatever way you've achieved the 37% faster arrays, I wonder could that
power be channelled to speed up the pointers... ?

I'm still curious as to how the array subscripting could possibly be
faster. It seems to be that it'd be faster to:

(1) Dereference a pointer
(2) Increment the pointer

rather than:

(1) Add an integer to a pointer value
(2) Dereference the pointer
(3) Increment the integer

Some CPU's have single instructions for "load the memory at the location
described by a pointer and offset", on such platforms it's:

(1) Load the indexed memory
(2) Increment the index

My point being that you cannot say that either form is inherently more
efficient than the other. With that in mind, it is usually better to
write readable code first, and then to only optimize once you find a
bottleneck.

Compiler optimizers are much better than you seem to be believe. :)
 
I

Ian Collins

Frederick said:
Ian Collins posted:



I don't know what OpenMP is, plus I've never worked with a computer which
had more than one CPU.
http://www.openmp.org

It's the standard way of adding parallel processing to C, C++ and
Fortran code. The single core has had its day on the desktop.
Whatever way you've achieved the 37% faster arrays, I wonder could that
power be channelled to speed up the pointers... ?
I doubt it, OpenMP works with arrays (or at let that's all I used it for).
I'm still curious as to how the array subscripting could possibly be
faster. It seems to be that it'd be faster to:

(1) Dereference a pointer
(2) Increment the pointer

rather than:

(1) Add an integer to a pointer value
(2) Dereference the pointer
(3) Increment the integer
Mainly because the compiler finds it easier to do loop unrolling with
arrays. In the case of Sun cc, the array loop was optimised with 5
consecutive assignments.

Either way, it's not wise to make generalisations on performance,
there's inevitably an exception!
 
F

Frederick Gotham

posted:
It ran slower than all the other versions, so I
didn't put it in.

How can that be? It be!


Try this... it prints the following on my system:

Strlen1: 9.703000 seconds.
Strlen2: 6.870000 seconds.
Strlen3: 9.704000 seconds.
Strlen4: 9.584000 seconds.
strlen: 0.010000 seconds.

I would have thought Strlen3 would be the fastest (except for "strlen" of
course). I'm quite suprised however that strlen is so much faster...

#include <stddef.h>
#include <stdio.h>
#include <time.h>
#include <string.h>

char const str[] =
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz";

typedef struct Stopwatch {
clock_t start_time;
} Stopwatch;

void StopwatchReset(Stopwatch *const p)
{
p->start_time = clock();
}

clock_t StopwatchDuration(Stopwatch *const p)
{
/* Returns time elapsed since
stopwatch was last reset. */

clock_t const now = clock();

return now - p->start_time;
}

size_t Strlen1(char const *const p)
{
size_t i = 0;
while(p) ++i;
return i;
}

size_t Strlen2(char const *p)
{
size_t i = 0;
while(*p++) ++i;
return i;
}

size_t Strlen3(char const *const parg)
{
char const *p = parg;
while (*p++);
return parg - p - 1;
}

size_t Strlen4(char const *const parg)
{
char const *p = parg;
while (*p) ++p;
return parg - p;
}

int main(void)
{
Stopwatch watch;
clock_t dur1,dur2,dur3,dur4,dur5;
size_t i,j=0;

puts("Please wait, this shouldn't take longer than 30 seconds...\n");

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen1(str);
dur1 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen2(str);
dur2 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen3(str);
dur3 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen4(str);
dur4 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
strlen(str);
dur5 = StopwatchDuration(&watch);

printf("Strlen1: %f seconds.\n",
(double)dur1 / CLOCKS_PER_SEC);
printf("Strlen2: %f seconds.\n",
(double)dur2 / CLOCKS_PER_SEC);
printf("Strlen3: %f seconds.\n",
(double)dur3 / CLOCKS_PER_SEC);
printf("Strlen4: %f seconds.\n",
(double)dur4 / CLOCKS_PER_SEC);
printf(" strlen: %f seconds.\n",
(double)dur5 / CLOCKS_PER_SEC);

return 0;
}
 
I

Ian Collins

Frederick said:
posted:





Try this... it prints the following on my system:

Strlen1: 9.703000 seconds.
Strlen2: 6.870000 seconds.
Strlen3: 9.704000 seconds.
Strlen4: 9.584000 seconds.
strlen: 0.010000 seconds.
Again, it depends:

32 bit:

Strlen1: 0.590000 seconds.
Strlen2: 0.900000 seconds.
Strlen3: 0.890000 seconds.
Strlen4: 0.890000 seconds.
strlen: 0.230000 seconds.

64 bit:

Strlen1: 0.590000 seconds.
Strlen2: 0.600000 seconds.
Strlen3: 0.600000 seconds.
Strlen4: 0.600000 seconds.
strlen: 0.120000 seconds.

Same CPU and compiler optimisations.
I would have thought Strlen3 would be the fastest (except for "strlen" of
course). I'm quite suprised however that strlen is so much faster...
Some CPUs are slower doing indirect access (*p).

strlen doesn't have to scan character by character.
 
S

Sjouke Burry

Frederick said:
posted:

It ran slower than all the other versions, so I
didn't put it in.

How can that be? It be!



Try this... it prints the following on my system:

Strlen1: 9.703000 seconds.
Strlen2: 6.870000 seconds.
Strlen3: 9.704000 seconds.
Strlen4: 9.584000 seconds.
strlen: 0.010000 seconds.

I would have thought Strlen3 would be the fastest (except for "strlen" of
course). I'm quite suprised however that strlen is so much faster...

#include <stddef.h>
#include <stdio.h>
#include <time.h>
#include <string.h>

char const str[] =
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz";

typedef struct Stopwatch {
clock_t start_time;
} Stopwatch;

void StopwatchReset(Stopwatch *const p)
{
p->start_time = clock();
}

clock_t StopwatchDuration(Stopwatch *const p)
{
/* Returns time elapsed since
stopwatch was last reset. */

clock_t const now = clock();

return now - p->start_time;
}

size_t Strlen1(char const *const p)
{
size_t i = 0;
while(p) ++i;
return i;
}

size_t Strlen2(char const *p)
{
size_t i = 0;
while(*p++) ++i;
return i;
}

size_t Strlen3(char const *const parg)
{
char const *p = parg;
while (*p++);
return parg - p - 1;
}

size_t Strlen4(char const *const parg)
{
char const *p = parg;
while (*p) ++p;
return parg - p;
}

int main(void)
{
Stopwatch watch;
clock_t dur1,dur2,dur3,dur4,dur5;
size_t i,j=0;

puts("Please wait, this shouldn't take longer than 30 seconds...\n");

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen1(str);
dur1 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen2(str);
dur2 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen3(str);
dur3 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen4(str);
dur4 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
strlen(str);
dur5 = StopwatchDuration(&watch);

printf("Strlen1: %f seconds.\n",
(double)dur1 / CLOCKS_PER_SEC);
printf("Strlen2: %f seconds.\n",
(double)dur2 / CLOCKS_PER_SEC);
printf("Strlen3: %f seconds.\n",
(double)dur3 / CLOCKS_PER_SEC);
printf("Strlen4: %f seconds.\n",
(double)dur4 / CLOCKS_PER_SEC);
printf(" strlen: %f seconds.\n",
(double)dur5 / CLOCKS_PER_SEC);

return 0;
}

Might the compiler have optimized the true system routine
out of the loop,because nothing is midified in the loop?
(And then throw out the empty for loop?)
 
F

Frederick Gotham

Sjouke Burry posted:
Might the compiler have optimized the true system routine
out of the loop,because nothing is midified in the loop?
(And then throw out the empty for loop?)


I had considered that. I also considered how it's nice to snip quotes.
 
C

CBFalconer

Frederick said:
Sjouke Burry posted:


I had considered that. I also considered how it's nice to snip quotes.

To find out all you have to do is examine the object code.
 
R

Richard Bos

Frederick Gotham said:
Richard Bos posted:


I was implying that it would be a compile-time programmer error if either
pointer were null, not a runtime error.

That fails for at least two reasons. One, assert() doesn't work that
way. Two, most null pointer errors cannot be caught at compile time.

Richard
 
F

Frederick Gotham

Richard Bos posted:
That fails for at least two reasons. One, assert() doesn't work that
way. Two, most null pointer errors cannot be caught at compile time.


I worded that poorly. I'll try again.

Let's take a simple function:

void Func(int *const p)
{
assert(p);

*p = 5;
}

In the finished program, it shouldn't make a difference whether NDEBUG is
defined or not, because Func should never be called with a null pointer.

If it IS called with a null pointer, it's not a runtime error, but rather an
error that the programmer made in writing the function which calls Func. (My
use of "compile-time" was misleading.)
 
C

Chris Dollin

Frederick said:
Let's take a simple function:

void Func(int *const p)
{
assert(p);

*p = 5;
}

In the finished program, it shouldn't make a difference whether NDEBUG is
defined or not, because Func should never be called with a null pointer.

If it IS called with a null pointer, it's not a runtime error,

It's certainly a run-time error: it's an error, and it occurs at run-time.
but rather an error that the programmer made in writing the function
which calls Func.

The programmer made a mistake in writing the caller (or perhaps the caller's
caller, etc), and that mistake resulted in the error of passing null to
`Func`.
 
F

Frederick Gotham

Chris Dollin posted:
The programmer made a mistake in writing the caller (or perhaps the
caller's caller, etc), and that mistake resulted in the error of passing
null to `Func`.


Yes.

If this were acceptable, I'd have "Func" throw an exception.

However, if I want to downright condemn it, I place an assert in "Func".
 
P

pete

Flash said:
You might, but I and many others would not. Array subscripting is
perfectly clear in my opinion.

It is, but I like pointers and
I like to rewrite other people's code.

char *str_squeeze_f(char *s1, const char *s2)
{
char *const p1 = s1;
const char *const p2 = s2;
char *p3 = s1;

while (*s1 != '\0') {
s2 = p2;
while (*s2 != '\0' && *s1 != *s2) {
++s2;
}
if (*s2 == '\0') {
*p3++ = *s1;
}
++s1;
}
*p3 = '\0';
return p1;
}
 
E

ena8t8si

Frederick said:
posted:
It ran slower than all the other versions, so I
didn't put it in.

How can that be? It be!


Try this... it prints the following on my system:
[...]

I hope you realize that my courtesy in posting
the previous results doesn't carry either an
obligation or an interest to run random benchmarks.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,432
Messages
2,571,681
Members
48,796
Latest member
Greg L.

Latest Threads

Top