Does 'if' have performance overhead?


lali.b97

Somewhere in a tutorial I read that the if statement has performance
overhead, because code within an if statement cannot take advantage of
the microprocessor's pipelining and because the compiler cannot
aggressively optimize that code.
From that day onwards I have been trying to avoid if statements within
my functions as much as possible, and also to keep the code within if
blocks to a minimum.



However, I am a bit skeptical about this.


I need some guidance. Performance is always the key issue for me when
it comes to writing programs.

Please guide.
 

Kai-Uwe Bux

Somewhere in a tutorial I read that the if statement has performance
overhead, because code within an if statement cannot take advantage of
the microprocessor's pipelining and because the compiler cannot
aggressively optimize that code.

Somewhat true, I hear.

It's not the if-statement per se. It's the fact that control flow of the
program branches.

From that day onwards I have been trying to avoid if statements within
my functions as much as possible, and also to keep the code within if
blocks to a minimum.

That is a Bad Idea(tm).

Optimizations avoiding if-statements are usually non-obvious and change the
nature of your code fundamentally. To see that, just try to rewrite

unsigned int max ( unsigned int lhs, unsigned int rhs );

without branch statements. Generally, avoiding branch statements can be
considered code obfuscation. Code obfuscation for the sake of performance
gains that have not proven necessary by profiling is a form of premature
optimization.

However, I am a bit skeptical about this.

What do your measurements tell you?

I need some guidance. Performance is always the key issue for me when
it comes to writing programs.

It should not be. Programmer time is much more expensive than CPU time. The
rational choice is to optimize for code beauty, extensibility, ease of use,
and maintainability.


Best

Kai-Uwe Bux
 

Jim Langston

Somewhere in a tutorial I read that the if statement has performance
overhead, because code within an if statement cannot take advantage of
the microprocessor's pipelining and because the compiler cannot
aggressively optimize that code.

From that day onwards I have been trying to avoid if statements within
my functions as much as possible, and also to keep the code within if
blocks to a minimum.

However, I am a bit skeptical about this.

I need some guidance. Performance is always the key issue for me when
it comes to writing programs.

Please guide.

You should not prematurely optimize. That is, don't attempt to optimize
code until you find what is actually taking the time.

if statements are extremely common in code in all languages. They essentially
come down to a jump in the CPU based on some condition (jump if not zero,
jump if greater than zero, etc.). Now, I believe you are talking about
prefetching instructions: the CPU won't know which set of instructions
to prefetch, set A or set B, if there is a jump.

A lot of times the compiler itself may optimize the code, and a lot of times
the CPUs are smart enough to either figure it out or prefetch both branches.
In other words, I would not worry about if statements taking up too much
time, although I wouldn't throw in if statements for no reason either.
 

Guest

Somewhere in a tutorial I read that the if statement has performance
overhead, because code within an if statement cannot take advantage of
the microprocessor's pipelining and because the compiler cannot
aggressively optimize that code.

This is totally off topic:

That is only partially true on a modern PC processor (as opposed to
embedded processors, which I have little knowledge about), since they all
have pretty good branch prediction these days. This is best demonstrated
by a simple loop:

for (int i = 0; i < 10; ++i)
{
// do stuff
}
// do other stuff

The processor recognises a loop when it sees one, and it will assume
that you will perform the iterations, so the loop can be optimised very
well. The problem comes on the last iteration: since the processor
wrongly assumes that you will iterate again, you get a small
performance hit when it discovers that you do not.

Similarly the processor can optimise if statements and other control
statements. Even better, they can learn, so if you have an if statement
and you time after time go to the else clause the processor will
remember this and will start executing the else clause when reaching the
if statement before the comparison is complete. Again, should it happen
that the assumption is wrong you get a performance hit.

Notice though that this performance hit is less noticeable on modern
hardware than it was on a P4, since the processors of today are not as
deeply pipelined.
From that day onwards I have been trying to avoid if statements within
my functions as much as possible, and also to keep the code within if
blocks to a minimum.

The best way to optimise an if statement is to write the code that is
most likely to be executed in the if clause and the least likely in the
else clause, since that will save the processor a jump in most cases.
However, I am a bit skeptical about this.

Rightly you should be; you should be sceptical of any optimisation that
is not at the algorithmic level.
I need some guidance. Performance is always the key issue for me when
it comes to writing programs.

Select the best algorithms and data structures for the task and then use
a good profiler.
 

lali.b97


Thank you very much for your response.

lali
 

Stephan Rose

Somewhere in a tutorial I read that the if statement has performance
overhead, because code within an if statement cannot take advantage of
the microprocessor's pipelining and because the compiler cannot
aggressively optimize that code.

From that day onwards I have been trying to avoid if statements within
my functions as much as possible, and also to keep the code within if
blocks to a minimum.



However, I am a bit skeptical about this.


I need some guidance. Performance is always the key issue for me when it
comes to writing programs.

Please guide.

I wouldn't worry about it. If I wanted to avoid if statements in my code
I'd be in serious trouble. Only worry about optimizing what actually
needs optimizing.

To give you an example, I am working on a 2D CAD application for
electronics design. At the lowest level, I have a class called Scalar
that allows me to perform arithmetic with any 2 values, regardless of
unit type (inch, millimeter, etc.) with one another.

As it is the most essential and lowest-level class of them all,
I have squeezed every optimization I can think of into it,
because even a single instruction saved in this class can translate into
hundreds of thousands or more instructions during a complex operation later
on. Here, speed matters more to me than code clarity.

Maybe I should sell it to NASA so that they can stop crashing things into
planets because they can't get their units straight. =)

Now, my higher-level functions, such as the code that takes 2
object outlines composed of line and curve segments and calculates the
distance or intersection between the two outlines, are somewhat optimized,
but I don't overly worry about squeezing every last bit out of them. In a
worst-case scenario this code might be called a hundred or so times
in one shot. Clearly written code at the expense of speed is more
important here, as it involves some complex operations.

So what I'm trying to get at is: unless the code is really speed-critical
and you absolutely will benefit from every tiniest bit of optimization,
don't worry about it. Rather, worry that your code is clearly written in a
way that you can still understand when you come back to it 6 months
later. That'll benefit you far more. =)

--
Stephan
2003 Yamaha R6

The reason there is no day I find myself remembering you
is that there is never a moment I have forgotten you
 

Juha Nieminen

Somewhere in a tutorial I read that the if statement has performance
overhead, because code within an if statement cannot take advantage of
the microprocessor's pipelining and because the compiler cannot
aggressively optimize that code.

From that day onwards I have been trying to avoid if statements within
my functions as much as possible, and also to keep the code within if
blocks to a minimum.

You are making the classic mistake: believing something you read and
blindly starting to do it that way without actually *testing* it in
practice.

It may well be that your avoidance of the if clause is in fact
producing slower code. However, since you haven't tested both
possibilities in your programs, you can't know.

Anyway, in the vast majority of cases such a small potential
optimization doesn't matter at all. Usually less than 1% of a program
which performs heavy calculations would require such low-level
optimization (if even that much).
 

Michael Bell

In message <[email protected]>
Kai-Uwe Bux said:

It takes more time? I'm not sure how you would measure that! Surely
not by sitting in front of the screen with a stop-watch?

The only workable thing I can think of is to put a line before the
block under test to read the internal clock, a second line at the end
of the block, and the difference is the time taken. How exactly would
you do such a thing?

Michael Bell

--
 
 

Stephan Rose

In message <[email protected]>


It takes more time? I'm not sure how you would measure that! Surely not
by sitting in front of the screen with a stop-watch?

The only workable thing I can think of is to put a line before the block
under test to read the internal clock, a second line at the end of the
block, and the difference is the time taken. How exactly would you do
such a thing?

That is a little platform dependent really.

In my case, I have a class called CpuTimer that works both under Windows
and Linux that can do high precision timing.

Then when I want to profile something, I will usually isolate the one
single function I want to profile, write a small test case along with
test data. The amount of data I generate depends on the complexity of the
function, that can range anywhere from a few hundred data items to a few
million.

Then I measure the time it took to process the complete data set, which
then divided by the number of data items in the set gives me the average
execution time per function call.

It's not 100% precise, of course, as too many uncontrollable factors can
affect execution speed, such as OS background tasks. However, it is good
enough to tell whether a change I have made has made things better or
worse, which is ultimately the only thing I am really concerned about.

--
Stephan
2003 Yamaha R6

The reason there is no day I find myself remembering you
is that there is never a moment I have forgotten you
 

James Kanze

That is a little platform dependent really.

The standard provides a function, clock(), expressly for this.
Regretfully, the implementation in Windows is poor enough to be
useless.
In my case, I have a class called CpuTimer that works both under Windows
and Linux that can do high precision timing.

The precision comes from repeating the operation millions of
times. I generally don't consider my measurements significant
unless I've repeated enough for the actual execution time to be
around five minutes.
Then when I want to profile something, I will usually isolate the one
single function I want to profile, write a small test case along with
test data. The amount of data I generate depends on the complexity of the
function, that can range anywhere from a few hundred data items to a few
million.

You also have to worry about ensuring that the optimizer doesn't
realize that your function has no real impact on the final
output, and suppresses it entirely.
Then I measure the time it took to process the complete data set, which
then divided by the number of data items in the set gives me the average
execution time per function call.
It's not 100% precise, of course, as too many uncontrollable factors can
affect execution speed, such as OS background tasks. However, it is good
enough to tell whether a change I have made has made things better or
worse, which is ultimately the only thing I am really concerned about.

Another thing you probably want to do is execute the function
once before starting the timing, to ensure that it is paged in,
and in cache, if it fits.
 

Stephan Rose

The standard provides a function, clock(), expressly for this.
Regretfully, the implementation in Windows is poor enough to be useless.


The precision comes from repeating the operation millions of times. I
generally don't consider my measurements significant unless I've
repeated enough for the actual execution time to be around five minutes.


You also have to worry about ensuring that the optimizer doesn't realize
that your function has no real impact on the final output, and
suppresses it entirely.

Yep, that's why I generate data items to process ahead of time to feed to
the function. For one, it makes it more realistic, as in reality I likely
wouldn't be calling the function over and over again with the same data,
and I've yet to see the optimizer suppress it when doing that.
Another thing you probably want to do is execute the function once
before starting the timing, to ensure that it is paged in, and in cache,
if it fits.

Hmmm never thought of that, not a bad idea.

--
Stephan
2003 Yamaha R6

The reason there is no day I find myself remembering you
is that there is never a moment I have forgotten you
 

James Kanze

[...]
Yep, that's why I generate data items to process ahead of time to feed to
the function. For one, it makes it more realistic, as in reality I likely
wouldn't be calling the function over and over again with the same data,
and I've yet to see the optimizer suppress it when doing that.

Better yet, read the data from a separate file (before starting
the timings, of course).

In practice, how far you have to go depends on how good the
compiler is. To date (and there's absolutely no guarantee that
this will hold in the future), I've found it sufficient 1) to
make the function virtual (eliminating all possibilities of
inlining, etc.) and 2) to ensure that it writes something to a
member variable, something which depends on everything in the
function.

Since calling a virtual function isn't free, I first run a loop
timing the loop with an empty function, then run it with the
target function, subtracting the time for the empty function.
This has the additional advantage that the compiler cannot
decide that 99% of the virtual calls are to the same function,
and optimize that one function inline.

But as I said, there's no guarantee. It works for now, at least
with g++ (4.1.0) and Sun CC (5.8), but I expect that some time
in the future, I'll have to get even trickier.
 
The answer is ...

Hi,
I know this topic died more than 6 months ago, but I arrived at this thread from Google as the first search result, which means other people are also interested in this question and its answer. For those who are, I performed a small test in VS.NET 2005 C# (which, as far as I know, produces executables that are SLOWER than those created in C++), with the following results.


-----
! Tip: The bottom line of this long message can be found at the bottom line.
-----

I created a small form with a button, a label and a checkbox. Once the button is clicked, the following segment of code is run:

Code:
		private void doCount()
		{
			label1.Text = "";
			int blah = -1;
			DateTime dteStart = DateTime.Now;
			for (int intCount = 0; intCount < 1000000; intCount++)
			{
				/*
				if (checkBox1.Checked)
				{
					blah = -2;
				}
				*/
			}
			// Elapsed time in ticks (1 tick = 100 ns)
			label1.Text = (DateTime.Now - dteStart).Ticks.ToString();
		}
I ran this application, hit the button several times, and saw that this took either 312,500 or 468,750 ticks to run (a delta of 156,250 ticks between the values).

I then changed the loop to iterate 10 million times instead of 1 million and ran the same test. This time the loop took either 3,750,000 or 3,906,250 ticks to run (again with a delta of 156,250 ticks between values; this appears to be a standard delta with my CPU).

I then changed the code again by uncommenting the IF part.
This took much longer to complete than before: 13,593,750 ticks (3.6 times the 3.75 million ticks).

I then changed the code again by replacing the IF part to this code:

Code:
				if (!checkBox1.Checked)
				{ }
				else
				{
					blah = -2;
				}
This took either 13,125,000 or 13,281,250 ticks to complete (again with the standard 156,250-tick delta between the two results), so we gained about 3.5% performance compared with the previous IF part.

Now for the bottom line.

Between the test conducted with the "less optimized" IF block of code (13,593,750 ticks) and the test conducted without the IF block of code (3,750,000 ticks) - the difference is 9,843,750 ticks.

Divide this by 10,000,000 times (the number of iterations) - we get 0.984375 ticks per IF clause performed.

Taking into consideration that 1 tick equals 1/10,000,000 of a second, that means a standard IF clause takes an old Pentium 4 processor 0.0000000984375 seconds (under 100 nanoseconds) to process. So, to answer the original post-author's question: no, adding IFs to your program will not reduce performance in any human terms.
 
