How to increase a C/C++ program's speed?

RAYYILDIZ · Mar 1, 2005

I know C is the fastest programming language. However, by using some
bitwise operations you can make your program faster.
For instance, let's talk about a swap function. For swapping integers we
generally use this:

void swap(int* a,int* b){
int c;
c=*a;
*b=*a;
*b=c;
}

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:

void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}

Yeah, this function swaps the integers, too. But it is faster than the
first one. Also, it does not need a local variable.

Victor Bazarov · Mar 1, 2005

RAYYILDIZ said:
I know C is the fastest programming language.

You know incorrectly. But never mind...

However, by using some
bitwise operations you can make your program faster.

Whatever that means...

For instance, let's talk about a swap function. For swapping integers we
generally use this:

void swap(int* a,int* b){
int c;
c=*a;
*b=*a;
*b=c;

Actually, you have a serious error there. It should be

int c = *a;
*a = *b;
*b = c;

}

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:

void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;

That's a bad idea, generally.

}

Yeah, this function swaps the integers, too. But it is faster than the
first one. Also, it does not need a local variable.

It's not faster, because it doesn't even work in this hypothetical case:

int a = 42;
swap(&a, &a);

Unless it's proven that your original swap (with my corrections) is too
slow and affects your overall program too much, there is no need for
that kind of micro-optimization. Remember, "Premature optimization is
the root of all evil".
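
The self-swap case above can be checked with a short sketch. The names swap_tmp, swap_xor and self_swap_result are hypothetical, chosen only for this illustration; swap_tmp is the corrected temporary-based version and swap_xor is the XOR version as posted.

```cpp
#include <cassert>

// Temporary-based swap, with the assignment order corrected.
void swap_tmp(int* a, int* b) {
    assert(a && b);
    int c = *a;
    *a = *b;
    *b = c;
}

// XOR-based swap, as proposed in the original post.
void swap_xor(int* a, int* b) {
    assert(a && b);
    *a ^= *b;  // when a == b, this already clears the one shared object
    *b ^= *a;
    *a ^= *b;
}

// Hypothetical helper: swaps an object with itself through swap_fn
// and returns the value the object holds afterwards.
int self_swap_result(void (*swap_fn)(int*, int*), int value) {
    int a = value;
    swap_fn(&a, &a);
    return a;
}
```

Calling self_swap_result(swap_tmp, 42) leaves the value at 42, while self_swap_result(swap_xor, 42) yields 0, because x ^ x is 0 for any x and both pointers refer to the same object.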

V

Phillip Jordan · Mar 1, 2005

RAYYILDIZ said:
I know C is the fastest programming language.

I won't bite this flamebait, but:

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:
void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}

Yeah, this function swaps the integers, too. But it is faster than the
first one.

On any CPU I've ever worked with, it won't be. It might work better for
extremely old architectures or some embedded systems that I've never
worked with, though.

Also, it does not need a local variable.

You are mistaken. Most architectures can't operate on two memory
operands at once, so the data must be loaded into registers. In this
case, the CPU registers can essentially be regarded as temporary variables.

Unless I'm missing something, the code produced will most likely consist
of 6 + 2 (for loading the pointers a and b into registers) memory reads,
3 xor instructions, and 3 memory writes, unless the compiler has
knowledge about the function parameters. (i.e. whether a can equal b or not)

The usual swapping approach with a temporary variable will
easily be optimised by the compiler to load the data at a and b into
registers, and write them back into the reversed memory locations. The
temporary variable will most likely never even be put into memory. (2 +
2 reads, 2 writes) Even if it is put on the stack, for example because
you've disabled compiler optimisations, that only adds one read and
write each, that is still considerably less than the XOR method.

I'm probably off by a few instructions here or there, depending on the
CPU architecture, and how memory offsets are calculated, but I think my
point still stands.

The technique was used on very old register- and memory-starved systems,
as far as I know. It's no longer useful today.

~phil

Clark S. Cox III · Mar 1, 2005

I know C is the fastest programming language.

Says who?

However, by using some
bitwise operations you can make your program faster.
For instance, let's talk about a swap function. For swapping integers we
generally use this:

void swap(int* a,int* b){
int c;
c=*a;
*b=*a;
*b=c;
}

First, you've written that incorrectly. I hope you meant:

void swap(int*a,int*b)
{
int c = *a;
*a = *b;
*b = c;
}

Second, we already have std::swap in the Standard Library:

#include <algorithm>

....
{
int i = 25;
int j = 12;
std::swap(i,j);
//Now i == 12 and j == 25
}
....

std::swap has the advantage that it will work with any assignable type.
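
As a rough illustration of that point, the sketch below uses the same std::swap call on an int, a std::string and a small struct. Point and std_swap_demo are hypothetical names introduced only for this example. The XOR version could not be written for the last two cases at all, since ^= is not defined for std::string or double.

```cpp
#include <algorithm>  // std::swap in C++98/03
#include <string>
#include <utility>    // std::swap since C++11

// Hypothetical aggregate, used only to show that a user-defined
// assignable type needs no extra code.
struct Point {
    double x;
    double y;
};

// Swaps an int pair, a std::string pair and a Point pair with the same
// std::swap call and reports whether every pair ended up exchanged.
bool std_swap_demo() {
    int i = 25, j = 12;
    std::swap(i, j);

    std::string s = "left", t = "right";
    std::swap(s, t);

    Point p = {1.0, 2.0}, q = {3.0, 4.0};
    std::swap(p, q);

    return i == 12 && j == 25 &&
           s == "right" && t == "left" &&
           p.x == 3.0 && p.y == 4.0 && q.x == 1.0 && q.y == 2.0;
}
```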

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:

void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}

This is a very bad idea, for several reasons:
1) It's premature optimization.
2) It may actually be slower than the more obvious swapping algorithm
on some platforms
3) It is less readable
4) It is not always correct:

int i = 25;
int j = 25;
swap(&i,&j);

Yeah, this function swaps the integers, too. But it is faster than the
first one.

Says who?

Also, it does not need a local variable.

Who cares?

Andrew Koenig · Mar 1, 2005

I know C is the fastest programming language. However, by using some
bitwise operations you can make your program faster.
For instance, let's talk about a swap function. For swapping integers we
generally use this:

void swap(int* a,int* b){
int c;
c=*a;
*b=*a;
*b=c;
}

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:

void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}

Yeah, this function swaps the integers, too. But it is faster than the
first one. Also, it does not need a local variable.

Have you actually measured it? On my machine, there's no significant
difference in execution time. Moreover, calling std::swap is significantly
faster than either version.

Using exclusive-or for swapping is a cute trick, but rarely useful in
practice.

Chris Jefferson · Mar 1, 2005

RAYYILDIZ said:
I know C is the fastest programming language. However, by using some
bitwise operations you can make your program faster.
For instance, let's talk about a swap function. For swapping integers we
generally use this:

void swap(int* a,int* b){
int c;
c=*a;
*b=*a;
*b=c;
}

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:

void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}

NNNNNNNNOOOOOOOOOO!!!!!!!!!!!

The XORing version, on every C++ compiler I've ever used, is SLOWER once
you enable even basic optimisation.

SSSSLLLOOOWWWEEERR!!

(sorry, but this comes up so often). Any modern compiler can easily
remove unused variables, or just keep them in a register. They aren't
stupid!

Let's see what the average compiler will do (I checked g++ 3.3 at
optimisation -O1):

For your first swap function, any compiler at any optimisation level
will produce the code (pseudo-assembler):

read *a into register 1
read *b into register 2
write register 1 into *b
write register 2 into *a

Your code will produce:

read *a into register 1
read *b into register 2
register 1 = register 1 XOR register 2
register 2 = register 2 XOR register 1
register 1 = register 1 XOR register 2
put register 1 into *a
put register 2 into *b

As you can see, your code is clearly taking longer.

Chris

assaarpa · Mar 1, 2005

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:

void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}

Yeah, this function swaps the integers, too. But it is faster than the
first one. Also, it does not need a local variable.

Depends on your compiler, processor architecture, how the Turing-complete
(?) system executing the code is configured, and other factors someone might
kindly contribute. It's a neat trick, but the possible speedup in a typical
case doesn't come from this tiny fragment of code compiling to something
faster than what the other function would produce. It comes rather from
the fact that it might require one less register (assuming this
Turing-complete computing system has registers or some level of hierarchy as
far as accessing the variables is concerned speedwise), possibly reducing or
completely avoiding spilling (assuming that your computer and/or
microarchitecture and/or/maybe Turing-complete computing system and the
compiler implementation are related to the concept of spilling in any shape,
form, method, way or fashion).

Furthermore (insert previous disclaimers en masse here for security reasons),
your compiler might implement these functions without link-time code
generation, in which case the implementation uses some form or equivalent
of call/return instructions, or a close resemblance thereof, making the
issue of avoiding spilling a moot one from any practical point of view in
terms of performance or size of the compiled code. <- this paragraph makes
some rather bold assumptions about the state of the system whose possible
performance differences you are asking about.

However, you should not concern yourself with this level of optimization
very much, as it is highly platform- and compiler-dependent. The fastest way
to do something is not to do it at all: if you can avoid computing
something, don't compute it. It may be faster overall to do a relatively
slow operation only a few times rather than an optimized operation many
times. Use std::swap, and if it becomes apparent that it is too slow and an
actual bottleneck in your application, you might find that no matter how
fast a swap you have, it won't help either. At that point you will be
optimizing something that actually matters.

I'm not saying it's not all good and beneficial, even sexy, to have the
world's fastest swap. It's just that either of these two can be faster than
the other one.

Heinz Ozwirk · Mar 1, 2005

RAYYILDIZ said:
I know C is the fastest programming language. However, by using some
bitwise operations you can make your program faster.
For instance, let's talk about a swap function. For swapping integers we
generally use this:

void swap(int* a,int* b){
int c;
c=*a;
*b=*a;
*b=c;
}

We use a local variable and swap the variables with each other, which may
be expensive. We can write the swap function like this instead:

void swap(int*a ,int *b){
*a ^= *b;
*b ^= *a;
*a ^= *b;
}

Yeah, this function swaps the integers, too. But it is faster than the
first one. Also, it does not need a local variable.

There is at least one compiler which proves you wrong. On my machine, the simple solution using a temporary takes about 19 seconds for 1000 million swaps. The obfuscated version takes about 26 seconds for the same number of swaps.
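
For anyone who wants to repeat this kind of measurement, here is a minimal timing sketch. It is not the program used for the figures above: time_swaps, swap_tmp and swap_xor are hypothetical names, it assumes a C++11 compiler for std::chrono, and the numbers it reports will depend heavily on compiler, flags and CPU.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Temporary-based swap.
void swap_tmp(int* a, int* b) {
    int c = *a;
    *a = *b;
    *b = c;
}

// XOR-based swap; only valid when a and b point to different objects.
void swap_xor(int* a, int* b) {
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

// Performs `count` swaps of neighbouring elements of `data` through
// swap_fn and returns the elapsed wall-clock time in milliseconds.
// Neighbours are always distinct objects, so the XOR version is safe here.
template <typename SwapFn>
double time_swaps(SwapFn swap_fn, std::vector<int>& data, long count) {
    assert(data.size() >= 2);
    const std::size_t pairs = data.size() - 1;
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < count; ++i) {
        const std::size_t k = static_cast<std::size_t>(i) % pairs;
        swap_fn(&data[k], &data[k + 1]);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A run might call time_swaps(swap_tmp, data, 1000000000L) and then time_swaps(swap_xor, data, 1000000000L) on the same std::vector<int> and compare the two results. Repeating each measurement several times, and at more than one optimisation level, gives a more trustworthy picture than a single run.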

Heinz

dumitru · Mar 2, 2005

Hey, folks... I saw in Doom3 SDK code that Mr. Carmack uses this swap
trick....

So it's safe.

alef · Mar 2, 2005

On 2005-03-01, RAYYILDIZ said:
I know C is the fastest programming language. However, by using some
bitwise operations you can make your program faster.
For instance, let's talk about a swap function. For swapping integers we
generally use this:

Does your i386 compiler toolchain know the XCHG instruction?

Chris Jefferson · Mar 2, 2005

dumitru said:
Hey, folks... I saw in Doom3 SDK code that Mr. Carmack uses this swap
trick....

So it's safe.

Did he use it in C, or in inline assembler?

One thing I think many people miss is that using this swap trick in
assembler can be a useful thing to do. If you have values in two
registers and need to swap them around (perhaps because one of the
registers is a special one which can be used for some special operation)
then doing the "XORing swap trick" lets you swap the two registers
without using a third.

There is, however, no reason to use this swap trick unless you are playing
some special register tricks, so in plain C++ code with even basic
optimisation it just isn't useful.

Chris

Karl Heinz Buchegger · Mar 2, 2005

Chris said:
Did he use it in C, or in inline assembler?

One thing I think many people miss is that using this swap trick in
assembler can be a useful thing to do. If you have values in two
registers and need to swap them around (perhaps because one of the
registers is a special one which can be used for some special operation)
then doing the "XORing swap trick" lets you swap the two registers
without using a third.

Reminds me of a 'trick' we used on IBM/360 back at university.
It was faster (and required less opcode memory) to XOR a register
with itself than to set it to 0.

There is, however, no reason to use this swap trick unless you are playing
some special register tricks, so in plain C++ code with even basic
optimisation it just isn't useful.

Exactly.

Michael Bishop · Mar 23, 2005

Clark said:
On 2005-03-01 09:53:05 -0500, "RAYYILDIZ" said:

This is a very bad idea, for several reasons:
1) It's premature optimization.
agree

2) It may actually be slower than the more obvious swapping algorithm on
some platforms
agree

3) It is less readable
agree

4) It is not always correct:

int i = 25;
int j = 25;
swap(&i,&j);

huh? explain please.

i = 25;
j = 25; ( i = 25, j = 25 )
i ^= j; ( i = 0, j = 25 )
j ^= i; ( i = 0, j = 25 )
i ^= j; ( i = 25, j = 25 )

I'm not saying you should use it, I'm just saying that it works.

-michael

Victor Bazarov · Mar 23, 2005

Michael Bishop said:
huh? explain please.

It would have a problem with swap(&i, &i), not with two different
lvalues. Not sure whether Clark *meant* that when he posted his
objection.
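
A small sketch of that distinction, assuming one still wanted to keep the XOR version: an address check up front makes the same-object case harmless. The name swap_xor_guarded is hypothetical.

```cpp
#include <cassert>

// XOR swap with a guard for the aliasing case. Equal *values* in two
// different objects never needed the guard; only equal *addresses* do.
void swap_xor_guarded(int* a, int* b) {
    assert(a && b);
    if (a == b) {
        return;  // same object: nothing to swap, and XOR would zero it
    }
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
```

With the guard, swapping two distinct ints that both hold 25 leaves both at 25, and swap_xor_guarded(&i, &i) leaves i unchanged. The extra comparison and branch are one more reason the plain temporary version, or std::swap, is usually the simpler choice.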

V

Malte Starostik · Mar 23, 2005

Karl said:
Reminds me of a 'trick' we used on IBM/360 back at university.
It was faster (and required less opcode memory) to XOR a register
with itself than to set it to 0.

That's the standard way most if not all x86 compilers set registers to
0, regardless of optimisation level. It's even common to xor a register
with itself and then increment or decrement it to set it to 1 or -1/max,
which still takes fewer opcode bytes than loading an immediate
value. It's also commonly used in buffer overflow exploits to inject
code via null-terminated strings, as it doesn't contain '\0'. And
all of this is off-topic here, so a quick leap back: in general you'd
better trust the compiler to optimise such trivial things these days.
With today's CPUs and all the pipelining, it isn't all that trivial to
write faster assembly than what a compiler generates. And often
it's even worse to try to give "useful" hints to the compiler, like the
(otherwise really cute) XOR swap.

Cheers,
Malte
