low-level performance tuning: x= a ^ b ^ x;

Timo Nentwig

Hi!

Why exactly is x = a ^ b ^ x; (on my system, much) faster than

if (x == a) x= b;
else x= a;

?
 
Patricia Shanahan

Timo said:
Hi!

Why exactly is x = a ^ b ^ x; (on my system, much) faster than

if (x == a) x= b;
else x= a;

?

Note that the two pieces of code do different things, unless
x matches one of a or b.

On pipelined processors, conditional jumps can be very
expensive, especially if there is no consistent pattern to
the directions taken. Simple integer instructions are cheap,
especially if the loop uses only a few variables that can be
kept in registers.

Before actually using something like this, make sure you have
tested the difference in a realistic context, either in your
actual application or surrounded by similar code, with a
loop structure similar to your application's kernel. The
surrounding context affects both branch prediction and the
cost of variable access.

Patricia
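
To make the point about the two snippets doing different things concrete, here is a minimal Java sketch. The class and method names (XorToggle, toggleBranch, toggleXor) are made up for illustration and are not from the thread. Both versions agree when x is a or b; for any other x the branching version yields a, while the XOR version yields a ^ b ^ x.

```java
final class XorToggle {
    // Branching version from the question: a becomes b, anything else becomes a.
    static int toggleBranch(int a, int b, int x) {
        if (x == a) return b;
        return a;
    }

    // XOR version: a ^ b ^ a == b and a ^ b ^ b == a, so it swaps a and b
    // without a conditional jump. For other inputs it does NOT return a.
    static int toggleXor(int a, int b, int x) {
        return a ^ b ^ x;
    }

    public static void main(String[] args) {
        int a = 3, b = 7;
        System.out.println(toggleBranch(a, b, a) + " " + toggleXor(a, b, a)); // 7 7
        System.out.println(toggleBranch(a, b, b) + " " + toggleXor(a, b, b)); // 3 3
        System.out.println(toggleBranch(a, b, 5) + " " + toggleXor(a, b, 5)); // 3 1
    }
}
```

Whether the XOR form is actually faster depends on the JIT and the surrounding loop, so it is worth timing both inside the real kernel, as suggested above.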
 
Timo Nentwig

Patricia said:
Note that the two pieces of code do different things, unless
x matches one of a or b.

On pipelined processors, conditional jumps can be very
expensive, especially if there is no consistent pattern to
the directions taken. Simple integer instructions are cheap,
especially if the loop uses only a few variables that can be
kept in registers.

Hmm. Do you see a way to speed up this (clamping):

if( ox >= width ) ox = width - 1;
else if( ox < 0 ) ox = 0;

and

l = (dx + dy) >> 1;
if( l > 0xff ) l = 0xff;
else if( l < -0xff ) l = -0xff;
 
Patricia Shanahan

Timo said:
Hmm. Do you see a way to speed up this (clamping):

if( ox >= width ) ox = width - 1;
else if( ox < 0 ) ox = 0;

Try getting rid of the "else", and see if it helps. Some
processors have fast ways of conditionally skipping a single
operation, but have to invalidate the pipeline on a
mispredicted branch around more than one operation.

and

l = (dx + dy) >> 1;
if( l > 0xff ) l = 0xff;
else if( l < -0xff ) l = -0xff;

Same comment about the "else".

How about some context? Presumably, these things happen in
VERY frequently executed loops, or you wouldn't care about
them, and the objective is to improve the performance of the
whole loop, not just the snippet. What else goes on in each
iteration? What goes on around the loop? Where do the operands
come from?

Incidentally, I'm assuming that each piece of code gets
compiled to machine language for execution. If the bytecode
is being interpreted on each iteration, you need a better JVM.

Patricia
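
A minimal Java sketch of the "drop the else" suggestion, next to a Math.min/Math.max formulation that some JITs may compile to conditional moves (not guaranteed, so measure). The class and method names (Clamp, clampTwoIfs, clampMinMax, clampLevel) are hypothetical. The sketch assumes width >= 1; for width <= 0 the two-if form no longer matches the original if/else-if.

```java
final class Clamp {
    // The original clamp with the "else" removed: two independent ifs,
    // each guarding a single assignment.
    static int clampTwoIfs(int ox, int width) {
        if (ox >= width) ox = width - 1;
        if (ox < 0) ox = 0;
        return ox;
    }

    // The same range [0, width - 1] expressed with library calls.
    static int clampMinMax(int ox, int width) {
        return Math.max(0, Math.min(ox, width - 1));
    }

    // The second snippet from the thread, also without the "else":
    // average dx and dy, then clamp to [-0xff, 0xff].
    static int clampLevel(int dx, int dy) {
        int l = (dx + dy) >> 1;
        if (l > 0xff) l = 0xff;
        if (l < -0xff) l = -0xff;
        return l;
    }

    public static void main(String[] args) {
        System.out.println(clampTwoIfs(-5, 10));   // 0
        System.out.println(clampMinMax(12, 10));   // 9
        System.out.println(clampLevel(300, 300));  // 255
    }
}
```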
 
Timo Nentwig

Patricia said:
Try getting rid of the "else", and see if it helps. Some

I might be able to get rid of the else but I still don't see how to get rid
of the if.
 
Timo Nentwig

// HACK
if( (ox & 0x7fffffff) >= (width & 0x7fffffff) ) ox = ox < 0 ? 0 : width - 1;

But I'm not sure whether I have won much :)
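
One caveat with the masked comparison: ox & 0x7fffffff maps Integer.MIN_VALUE to 0, so some very negative values of ox pass the test unclamped. A related sketch, assuming Java 8 or later and width >= 1, uses a true unsigned comparison instead. The class name UnsignedClamp is made up for illustration. On older JVMs, (ox ^ Integer.MIN_VALUE) >= (width ^ Integer.MIN_VALUE) expresses the same unsigned test.

```java
final class UnsignedClamp {
    static int clamp(int ox, int width) {
        // Viewed as unsigned, every negative int is larger than every
        // non-negative int, so a single comparison catches both
        // ox < 0 and ox >= width. In-range values take no further branch.
        if (Integer.compareUnsigned(ox, width) >= 0) {
            ox = ox < 0 ? 0 : width - 1;
        }
        return ox;
    }

    public static void main(String[] args) {
        System.out.println(clamp(-1, 10));                // 0
        System.out.println(clamp(Integer.MIN_VALUE, 10)); // 0
        System.out.println(clamp(10, 10));                // 9
        System.out.println(clamp(3, 10));                 // 3
    }
}
```

This still contains an if, but the common in-range case costs one well-predicted test instead of two. Whether that wins anything is, again, something to measure in the real loop.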
 
