Improvements in a loop : ++i better than i++ ?

J

joel.foutelet

Hello everybody

This code is intended to compute the number of partitions of a number.
see this link : http://en.wikipedia.org/wiki/Integer_partition


void partition(int n)
{
// total decrit la somme des nombres du tableau indice 0..ptcur
int total;

// prcur nombre d'elements utilises dans le tableau tab
int ptcur;

// nombre de partitions
int nb ;

// compteur de boucle pour l'initialisation
int i;

// tableau decrivant une solution
int *tab = malloc(n * sizeof(int));

if (tab == NULL)
{
fprintf(stderr, "Pb alloc memoire\n");
return;
}

// initialisation : la premiere partition n'est composee que de 1
for(i = 0; i < n; i++)
tab = 1;

nb = 1;
total = n;
ptcur = n-1;

// la boucle infernale
while (tab[0] != n)
{
if (total > n)
{
// on revient un cran en arrière
// et on augmente la valeur pointee
// car il n'y a pas de solution avec cette valeur
// a ce rang
total -= tab[ptcur];
--ptcur;
++tab[ptcur];
++total;
}
else
if (total == n)
{
// on a trouve une solution
// edit(tab, ptcur);
++nb;
// on revient un cran en arrière
// et on augmente la valeur pointee
// car on a epuise toutes les solutions avec cette valeur
// a ce rang
total -= tab[ptcur];
--ptcur;
++tab[ptcur];
++total;
}
else
{
// on avance d'un cran
++ptcur;
// un nombre du tableau est toujours superieur
// ou egal a ses precedents
tab[ptcur] = tab[ptcur-1];
total += tab[ptcur];
}
}
printf("Nombre de solutions %d\n", nb);
free(tab);
}

The loop while is executed à very big number of times (for n = 100
much more than 190,569,292 times which is the number of partitions of
100).

Is it possible to improve this code (not the algorithm) ?
e-g : is ++total faster than total++ in the absolute ?
is it better to compute r = total - n and test r > 0 and r == 0 or
test total > n and total == n ?
I can't test really on my system (Vista) cause of the multiprocessing,
sometimes it gaves me 11s or 13s for the same code and number 100.

Thank you for the hints.
 
R

Richard Tobin

e-g : is ++total faster than total++ in the absolute ?

No, it depends completely on the compiler and in casaes where they
don't make any difference to the meaning, a reasonable compiler
will generate the same code for both.
is it better to compute r = total - n and test r > 0 and r == 0 or
test total > n and total == n ?

Again, it depends on the compiler.
I can't test really on my system (Vista) cause of the multiprocessing,
sometimes it gaves me 11s or 13s for the same code and number 100.

If you can't reliably tell the difference, why do you care?

-- Richard
 
C

Chris McDonald

If you can't reliably tell the difference, why do you care?


Because the OP may wish to understand why they are the same, or different,
what may be causing the observed differences, and to anticipate if the
single- .vs. multi-processing could be having any influence.

All good reasons to ask the question.

______________________________________________________________________________
Dr Chris McDonald E: (e-mail address removed)
Computer Science & Software Engineering W: http://www.csse.uwa.edu.au/~chris
The University of Western Australia, M002 T: +618 6488 2533
Crawley, Western Australia, 6009 F: +618 6488 1089
 
K

Kenny McCormack

[email protected] (Richard Tobin) said:
Because the OP may wish to understand why they are the same, or different,
what may be causing the observed differences, and to anticipate if the
single- .vs. multi-processing could be having any influence.

All good reasons to ask the question.

Not here, it isn't.

That sort of speculation (i.e., actually wanting to learn/know something
for the sake of learning/knowing) is explicitly frowned upon.

You'll get used to this if you spend any time here.
 
A

Army1987

Hello everybody

This code is intended to compute the number of partitions of a number.
see this link : http://en.wikipedia.org/wiki/Integer_partition


void partition(int n)
{ [snip]
printf("Nombre de solutions %d\n", nb);
Wouldn't it make more sense to have the function returning the
number, and its caller print it?
free(tab);
}

The loop while is executed à very big number of times (for n = 100
much more than 190,569,292 times which is the number of partitions of
100).

Is it possible to improve this code (not the algorithm) ?
e-g : is ++total faster than total++ in the absolute ?
Since you were using them as statements, they are supposed to do
the same thing, and I'd be really surprised if they generated
different machine code.
is it better to compute r = total - n and test r > 0 and r == 0 or
test total > n and total == n ?
I don't think so, but try it (see below).
I can't test really on my system (Vista) cause of the multiprocessing,
Look up for the function clock() in <time.h>. It only returns the
time spent by the processor for your program. You can try it
as
#include <stdio.h>
#include <time.h>
int main(void)
{
clock_t t0, t1;
t0 = clock();
partition(100);
t1 = clock();
printf("%g\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
return 0;
}
 
M

Mark Bluemel

I can't test really on my system (Vista) cause of the multiprocessing,
sometimes it gaves me 11s or 13s for the same code and number 100.

You could do with finding out how to access process accounting
information, so that you could, if the platform provided the mechanisms,
measure cpu usage rather than elapse time.

How to do that, and indeed whether it is possible, is not a topic for
this group - you should ask on a group related to the platform, not the
C language.

However, as Richard has already pointed out, it's unlikely you'll be
able to do much tuning along the lines you've asked about... If this is
a reasonably efficient algorithm for the task (I am not qualified to
comment on whether that is the case), then the code is so simple that a
moderately competent compiler will be able to optimise it at least as
well as you could.
 
D

Duncan Muirhead

Hello everybody

This code is intended to compute the number of partitions of a number.
see this link : http://en.wikipedia.org/wiki/Integer_partition
<snip>
I played with this a bit and on my system (Linux PC, gcc 3.4.6) the code
below (reformatted, uncommented to save space in this post) was about 10%
faster. Of course, your mileage may vary. The difference between no
optimisation and -O4 was about a factor of 3. One thing that puzzled me
was that the total>n case occurred about 43% of the time, the total==n
case about 7% and total < n 50%. But it looks as if your original ordering
of the cases ran fastest.

int partition2(int n)
{
int total = n;
int nb = 1;
int i;
int *tab = malloc(n * sizeof(int));
int* ptab = tab + n - 1;
if (tab == NULL)
{ fprintf(stderr, "Pb alloc memoire\n");
return -1;
}
for(i = 0; i < n; i++) tab = 1;
while (tab[0] != n)
{ if (total > n)
{ total -= *ptab--;
*ptab += 1;
total += 1;
}
else if (total == n)
{ ++nb;
total -= *ptab--;
*ptab += 1;
total += 1;
}
else
{ ptab += 1;
*ptab = ptab[-1];
total += *ptab;
}
}
free(tab);
return nb;
}
 
A

Army1987

You could do with finding out how to access process accounting
information, so that you could, if the platform provided the mechanisms,
measure cpu usage rather than elapse time.

How to do that, and indeed whether it is possible, is not a topic for
this group - you should ask on a group related to the platform, not the
C language.
clock() declared in <time.h>. It is standard C.
 
R

Richard Tobin

If you can't reliably tell the difference, why do you care?
[/QUOTE]
Because the OP may wish to understand why they are the same, or different,
what may be causing the observed differences, and to anticipate if the
single- .vs. multi-processing could be having any influence.

If you read the rest of the article, it's pretty clear that he just wants
to micro-optimise to make it as fast as possible. If the effect of the
micro-optimisations is swamped by the variability of his system, they
aren't worth doing.

-- Richard
 
R

Richard Tobin

Just to know (even if it appears no to be a good idea ! "That sort of
speculation (i.e., actually wanting to learn/know something for the
sake of learning/knowing) is explicitly frowned upon. ")

Take no notice of that comment, it was posted by a troll.
I have always heard that that the improvement of rapidity between ++i
and i++ is very very negligible in relation to improvement of the
algorithm.

Typically, it's not negligible, it's non-existent. There's no reason
for compilers to distinguish them when they appear as a complete
statement, because they mean the same thing. If you want to verify it,
most compilers have a way to show you the assembler code they generate.
I work with Visual Studio 2005. I have tested between the two methods
(++i and i++) and it appears that there is a small difference of speed
(about 0.09 s for much more 1,844,349,560 of loops) for 120 in favor
of i++

I'm not sure what you mean by "for 120".

If you saved 0.09s for 1.8 billion loops, that's a 20-billionth of a
second per loop, which is much less than the cycle time of your
processor, so it's unlikely to be due to a difference in the
generated code.

-- Richard
 
J

joel.foutelet

If you can't reliably tell the difference, why do you care?

-- Richard

Just to know (even if it appears no to be a good idea ! "That sort of
speculation (i.e., actually wanting to learn/know something for the
sake of learning/knowing) is explicitly frowned upon. ")

I have always heard that that the improvement of rapidity between ++i
and i++ is very very negligible in relation to improvement of the
algorithm.
Here the algorithm is very short, the calculs are essentially
incrementations and the loop is very very long so I think that here
the difference may be seen.
I work with Visual Studio 2005. I have tested between the two methods
(++i and i++) and it appears that there is a small difference of speed
(about 0.09 s for much more 1,844,349,560 of loops) for 120 in favor
of i++, but as I said, on Vista it is not convincing and I wanted to
have a opinion of comptetent persons.

So I thank you all for your answers.

Joël

PS : The code of Duncan Muirhead is about 3% slower on my system.
 
B

Ben Bacarisse

Hello everybody

This code is intended to compute the number of partitions of a number.
see this link : http://en.wikipedia.org/wiki/Integer_partition


void partition(int n)
{
printf("Nombre de solutions %d\n", nb);
free(tab);
}

The loop while is executed à very big number of times (for n = 100
much more than 190,569,292 times which is the number of partitions of
100).

As always, the best speed-ups come from algorithms. If you rewrite
the function as a recursive function of two parameters (see the
original wikipedia article) and memoize this function[1], you can
calculate partition counts super fast. The expense is extra memory,
of course.
 
D

Duncan Muirhead

Just to know (even if it appears no to be a good idea ! "That sort of
speculation (i.e., actually wanting to learn/know something for the
sake of learning/knowing) is explicitly frowned upon. ")

I have always heard that that the improvement of rapidity between ++i
and i++ is very very negligible in relation to improvement of the
algorithm.
Here the algorithm is very short, the calculs are essentially
incrementations and the loop is very very long so I think that here
the difference may be seen.
I work with Visual Studio 2005. I have tested between the two methods
(++i and i++) and it appears that there is a small difference of speed
(about 0.09 s for much more 1,844,349,560 of loops) for 120 in favor
of i++, but as I said, on Vista it is not convincing and I wanted to
have a opinion of comptetent persons.

So I thank you all for your answers.

Joël

PS : The code of Duncan Muirhead is about 3% slower on my system.

Indeed, mileage does vary. A minor point is that in fact only about 7% of
the loops result in nb being incremented, so the number of loops is around
14 times the number of partitions.
 
R

Richard

Just to know (even if it appears no to be a good idea ! "That sort of
speculation (i.e., actually wanting to learn/know something for the
sake of learning/knowing) is explicitly frowned upon. ")

I have always heard that that the improvement of rapidity between ++i
and i++ is very very negligible in relation to improvement of the
algorithm.
Here the algorithm is very short, the calculs are essentially
incrementations and the loop is very very long so I think that here
the difference may be seen.
I work with Visual Studio 2005. I have tested between the two methods
(++i and i++) and it appears that there is a small difference of speed
(about 0.09 s for much more 1,844,349,560 of loops) for 120 in favor
of i++, but as I said, on Vista it is not convincing and I wanted to
have a opinion of comptetent persons.

These numbers really are meaningless. A surge in room temperature
because of a potent fart could have caused the CPU to hiccup.

There is no difference here.
So I thank you all for your answers.

Joël

PS : The code of Duncan Muirhead is about 3% slower on my system.

--
 
C

Clever Monkey

Hello everybody

This code is intended to compute the number of partitions of a number.
see this link : http://en.wikipedia.org/wiki/Integer_partition
[...]
The loop while is executed à very big number of times (for n = 100
much more than 190,569,292 times which is the number of partitions of
100).

Is it possible to improve this code (not the algorithm) ?
e-g : is ++total faster than total++ in the absolute ?
is it better to compute r = total - n and test r > 0 and r == 0 or
test total > n and total == n ?
I can't test really on my system (Vista) cause of the multiprocessing,
sometimes it gaves me 11s or 13s for the same code and number 100.
If you really want to see what this specific compiler does on this
specific platform, you will need to profile the code. Measuring
wallclock time tells you nearly nothing in this case.
 
K

Keith Thompson

Just to know (even if it appears no to be a good idea ! "That sort of
speculation (i.e., actually wanting to learn/know something for the
sake of learning/knowing) is explicitly frowned upon. ")

The quoted text ("That sort of speculation ...") is a lie.

[...]
I work with Visual Studio 2005. I have tested between the two methods
(++i and i++) and it appears that there is a small difference of speed
(about 0.09 s for much more 1,844,349,560 of loops) for 120 in favor
of i++, but as I said, on Vista it is not convincing and I wanted to
have a opinion of comptetent persons.
[...]

Your compiler probably has an option to generate an assembly listing.
Compare the listings for two versions of your program, one with 'i++'
and one with '++i'. I predict that (a) you'll see exactly the same
instructions in both cases, and (b) you'll see the same kind of timing
variation when you test the *same* program repeatedly. (I could be
wrong, of course.)
 
M

Mark McIntyre

On Wed, 08 Aug 2007 05:51:36 -0700, in comp.lang.c ,
I have always heard that that the improvement of rapidity between ++i
and i++ is very very negligible in relation to improvement of the
algorithm.

The myth about ++i vs i++ dates back to when computers had much
simpler instruction sets and needed an extra operation to
post-increment. Unless you have an unusual use of your loop variable,
you will see zero difference.
Here the algorithm is very short, the calculs are essentially
incrementations and the loop is very very long so I think that here
the difference may be seen.

I doubt it.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
J

joel.foutelet

I have modifyed the code in the way suggested by Duncan Muirhead
(first total < n then total > n and the third case) and I win about
6,5s when I compute the partitions of 120.
I have also generate an assembly listing and I observed that total++
and ++total have exactly the same instructions in both cases as
predict by Keith Thompson. I think it closes the thread.

Once again, I would like to thank you all for the time passed and your
answers.
 
K

Keith Thompson

Mark McIntyre said:
On Wed, 08 Aug 2007 05:51:36 -0700, in comp.lang.c ,


The myth about ++i vs i++ dates back to when computers had much
simpler instruction sets and needed an extra operation to
post-increment. Unless you have an unusual use of your loop variable,
you will see zero difference.
[...]

I think the myth has survived in part because of C++. The prefix and
suffix '++' operators can have significantly different performance, at
least in some cases, when they're overloaded for user-defined types;
likewise for '--'. This doesn't apply to C, of course.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,770
Messages
2,569,584
Members
45,075
Latest member
MakersCBDBloodSupport

Latest Threads

Top