break statement in a for loop

A

a.mil

I am programming for code-speed, not for ansi or other nice-guy stuff
and I encountered the following problem:

When I have a for loop like this:

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;
}

I want to break out of the loop -fast- after a==1. When I put a break
after it like in

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;
break;
}

the former fast loop turns into a horribly (factor 100 or so...) slow
one, so better to use no break at all in my opinion.

Does anyone know exactly -why- the break statement makes the loop so
slow??
Isn't it just an extra "jmp" in assembler equivalent code? I compared
both assembled outputs with and without the break statement, but they
really look quite different.. ?! I optimize with -O3 using gcc.
 
K

klaushuotari

I am programming for code-speed, not for ansi or other nice-guy stuff
and I encountered the following problem:

When I have a for loop like this:

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;

}

I want to break out of the loop -fast- after a==1. When I put a break
after it like in

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;
break;

}

the former fast loop turns into a horribly (factor 100 or so...) slow
one, so better to use no break at all in my opinion.

Does anyone know exactly -why- the break statement makes the loop so
slow??
Isn't it just an extra "jmp" in assembler equivalent code? I compared
both assembled outputs with and without the break statement, but they
really look quite different.. ?! I optimize with -O3 using gcc.

Break statement should not inherently make branching that much slower,
while in practice it usually does.

I guess it's implementation dependent, for example in x86 it's not a
big deal.
Just make concise and clear code, and it will turn out exactly nice in
machine code level. If the compiler is intelligent enough, it should
catch this situation and make appropriate code for it. No worries.
 
R

Richard Heathfield

(e-mail address removed) said:
I am programming for code-speed, not for ansi or other nice-guy stuff
and I encountered the following problem:

When I have a for loop like this:

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;
}

This should be really fast, since you get a divide by zero error on the
very first time through the loop.

It is easier to make a correct program fast than to make a fast program
correct. Start off, then, by seeking correctness.
 
M

Malcolm McLean

I am programming for code-speed, not for ansi or other nice-guy stuff
and I encountered the following problem:

When I have a for loop like this:

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;
}

I want to break out of the loop -fast- after a==1. When I put a break
after it like in

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;
break;
}

the former fast loop turns into a horribly (factor 100 or so...) slow
one, so better to use no break at all in my opinion.

Does anyone know exactly -why- the break statement makes the loop so
slow??
Isn't it just an extra "jmp" in assembler equivalent code? I compared
both assembled outputs with and without the break statement, but they
really look quite different.. ?! I optimize with -O3 using gcc.
I think there must be some mistake elsewhere.
If the loop execution has changed by a factor of 100 then either you are not
measuring the same thing, or the compiler has optimised the loop away in
one case but not the other. An example would be if you used the value of i
later in the function but ignored a. In one case the compiler can optimise
to i = 100, in the other case it probably isn't clever enough to do it
without running through the loop.
Someone will probably point out, correctly, that a conforming compiler can
compile the break to something horribly slow. The standard makes no
guarantees on efficiency. Whilst this might account for a factor of two or
three, I cannot see how it would account for 100.
 
C

christian.bau

I am programming for code-speed, not for ansi or other nice-guy stuff

I don't know _what_ you are programming for, but you flunked your
maths test _and_ your programming test, didn't you? (And that's me
being a nice guy, you shouldn't meet me in non-ansi mode).
When I have a for loop like this:

b=b0;
for (a=0,i=0;i<100;i++,b--) {
if (b%i) continue;
a=1;
}

As others mentioned, undefined behavior (most likely a crash)
immediately when you calculate b % 0 in the first iteration. If you
started the loop with i = 1, then a would be set to 1 immediately,
making the whole loop rather pointless.

Now assuming that you intended to do something sensible, like starting
the loop with i = 2: Note that (b % i) != 0 is exactly the same as ((b
+ i) % i) != 0. Inside the loop, i is increased and b decreased in
each iteration, whereas b + i remains constant. So it seems that you
are just checking whether a certain number has a factor less than 100.
You can do that significantly faster by checking divisibility by prime
numbers up to 100; in practice a hardcoded

a = (b % 2 && b % 3 && b % 5 && b % 7 && b % 11 ... && b % 97) ? 0 :
1;

would be faster by a factor ten again on many implementations.
 
A

a.mil

Ok, this was a bad fast example, I admit.
Therefore below another example that shows the point.


int main()
{
unsigned long i,j,k,l,n;

for (l=0;l<10000000;l++) {
j=0;
k=10;
n=231;

for (i=1;n>3;j+=8,k+=j,n-=2) {
if (k%n) continue;
i=0;
break;
}
}
return 0;
}


Run this without the break statement and with the break statement and
look at execution time. On my 1.5GHz P4 it takes 1.3s and 13.9s
respectively (!). My store on the factor 100 incorporates a lot more
code.
 
R

Richard Heathfield

(e-mail address removed) said:
Ok, this was a bad fast example, I admit.
Therefore below another example that shows the point.


int main()
{
unsigned long i,j,k,l,n;

for (l=0;l<10000000;l++) {
j=0;
k=10;
n=231;

for (i=1;n>3;j+=8,k+=j,n-=2) {
if (k%n) continue;
i=0;
break;
}
}
return 0;
}


Run this without the break statement and with the break statement and
look at execution time. On my 1.5GHz P4 it takes 1.3s and 13.9s
respectively (!). My store on the factor 100 incorporates a lot more
code.

The same result can be achieved much more quickly:

int main(void)
{
return 0;
}
 
A

a.mil

It's not my objective to show you the full code; it's about the issue
of execution speed and "break".


Richard Heathfield schreef:
 
R

Richard Heathfield

(e-mail address removed) said:
It's not my objective to show you the full code; it's about the issue
of execution speed and "break".

But break isn't the problem. The problem is a poorly-constructed
algorithm, which involves you performing many unnecessary calculations.
 
G

Guest

Ok, this was a bad fast example, I admit.
Therefore below another example that shows the point.


int main()
{
unsigned long i,j,k,l,n;

for (l=0;l<10000000;l++) {
j=0;
k=10;
n=231;

for (i=1;n>3;j+=8,k+=j,n-=2) {
if (k%n) continue;
i=0;
break;
}
}
return 0;
}


Run this without the break statement and with the break statement and
look at execution time. On my 1.5GHz P4 it takes 1.3s and 13.9s
respectively (!). My store on the factor 100 incorporates a lot more
code.

On my system, without the "break", it takes 0.001 seconds (according
to "time"). That's because without the break, the compiler realises
your program does nothing and your for loop is useless, so it removes
it altogether. How about an actual demonstration (a program that does
something useful, such as printing the result of your calculation,
whatever it is you're trying to calculate)?
 
M

Malcolm McLean

It's not my objective to show you the full code; it's about the issue
of execution speed and "break".
It is perfectly reasonable to post a compileable snippet showing the
problem, but here we don't have enough to work on. It is hard to optimise a
function that doesn't actully do anything.
 
R

Richard Tobin

Run this without the break statement and with the break statement and
look at execution time. On my 1.5GHz P4 it takes 1.3s and 13.9s
respectively (!).

I get a similar result with GCC on a PPC Mac. It appears to just be
an optimisation effect: without the break the compiler can determine
enough about the inner for loop to optimise it away completely; with
the break it can't.

It's easy to construct artificial examples of this kind, but they are
rare in real programs, because if the compiler can optimise away a
loop, then the programmer probably won't write it in the first place.

-- Richard
 
C

christian.bau

It's not my objective to show you the full code; it's about the issue
of execution speed and "break".

Your first example was pure nonsense invoking undefined behavior at
the first step.

Your second example was a completely useless piece of code, where huge
portions of the code could be optimised away. All that you have been
testing is the capability of the compiler to detect nonsense code and
remove it. It seems your break statement affected the compilers
capability of detecting nonsense code. This might be of importance to
you, it is of no importance to those 99.9999 percent of programmers
who actually write more sensible code in the first place.
 
M

Malcolm McLean

christian.bau said:
Your first example was pure nonsense invoking undefined behavior at
the first step.

Your second example was a completely useless piece of code, where huge
portions of the code could be optimised away. All that you have been
testing is the capability of the compiler to detect nonsense code and
remove it. It seems your break statement affected the compilers
capability of detecting nonsense code. This might be of importance to
you, it is of no importance to those 99.9999 percent of programmers
who actually write more sensible code in the first place.
That's not quite true

I would much rather write

y = x/factorial(6);

with the factorial function doing what you would expect than

y = x/FACT6;

or y = x/720;

if I could trust the compiler to optimise away the loop.
 
K

Keith Thompson

Malcolm McLean said:
That's not quite true

I would much rather write

y = x/factorial(6);

with the factorial function doing what you would expect than

y = x/FACT6;

or y = x/720;

if I could trust the compiler to optimise away the loop.

That's very different. In the case of the factorial() function, the
compiler can *optimize* the loop (possibly to a single literal value);
it can't *optimize away* the loop, because you use the result.

In the original code being discussed, the loop does not produce any
results. An optimizing compiler can eliminate it entirely.
 
A

a.mil

christian.bau schreef:
Your first example was pure nonsense invoking undefined behavior at
the first step.

Your second example was a completely useless piece of code, where huge
portions of the code could be optimised away. All that you have been
testing is the capability of the compiler to detect nonsense code and
remove it. It seems your break statement affected the compilers
capability of detecting nonsense code. This might be of importance to
you, it is of no importance to those 99.9999 percent of programmers
who actually write more sensible code in the first place.

Boy, oh boy... I thought there would be some intelligent people here
who would
know what an inserted break in such a loop actually would result in in
assemled
code. That was the whole objective of the question.

I'm not interested in answers
trying to tell me that an example I gave to clarify the -structural-
problem is not
a sensible program or is bad code or that I do not follow holy
standards. It's not
about the code, the question is about what an inserted break in any
loop (that is
not optimized away of course in the real program, I'm not that stupid,
but I assumed
some smartness of the reader here) actually inserts in the assembled
code.
Actually, to just give you an example, not even all code in K&R is
sensible code, but allas.

Since I will not and cannot display to you the full code I'll find out
myself what
is happening. Anyway, thanks for the fish and the (few) sensible
answers.
 
F

Flash Gordon

(e-mail address removed) wrote, On 29/04/07 22:38:

Boy, oh boy... I thought there would be some intelligent people here

There are plenty of intelligent people.
who would
know what an inserted break in such a loop actually would result in in
assemled
code. That was the whole objective of the question.

That all depends on the processor, the compiler, the options given to
the compiler, and often the surrounding code.
I'm not interested in answers
trying to tell me that an example I gave to clarify the -structural-
problem is not
a sensible program or is bad code or that I do not follow holy
standards. It's not

OK, so you don't care if your code is good, bad, or fundamentally
broken. In that case it should not matter what the compiler does.
about the code, the question is about what an inserted break in any
loop (that is
not optimized away of course in the real program, I'm not that stupid,
but I assumed
some smartness of the reader here) actually inserts in the assembled
code.

Well, since under some conditions the entire loop will be optimised away
if *obviously* depends on the *actual* code and the *real* situation.
Actually, to just give you an example, not even all code in K&R is
sensible code, but allas.

It makes a lot more sense than the code you have shown.
Since I will not and cannot display to you the full code I'll find out
myself what
is happening. Anyway, thanks for the fish and the (few) sensible
answers.

If you refuse to give a sensible example people cannot sensibly guess
what a compiler will do, and in any case the impact will depend on the
compiler, the options it is given, and the processor. For example, on
some processors it will break the pipeline, other processors do not even
*have* a pipeline.
 
O

Old Wolf

Boy, oh boy... I thought there would be some intelligent people here
who would know what an inserted break in such a loop actually would
result in in assemled code. That was the whole objective of the question.

This newsgroup does not deal with "assemled code"...
I do not follow holy standards.

.... nor with non-standard programs.
I'm not interested in answers trying to tell me that an example I gave
to clarify the -structural- problem is not a sensible program

Surely you agree that code which divides by zero is not a good example
for anything?

Anyway, it's not clear to me exactly what you are asking, but if you
are
asking about the speeds of varying algorithms, then comp.programming
would be a good newsgroup to ask on.
about the code, the question is about what an inserted break in any
loop actually inserts in the assembled code.

That would depend on your compiler. Some C environments don't even
have any "assembled code". You should post this question on a
newsgroup or forum relevant to your specific compiler.
Run this without the break statement and with the break statement
and look at execution time. On my 1.5GHz P4 it takes 1.3s and
13.9s respectively (!).

Different algorithms take different lengths of time? Who'da thunk it!

It's pretty obvious that your compiler is performing some sort of
optimization, and when you change the code you get a different
optimization. Without any further information it's impossible to
provide a more accurate 'answer'. You should post on a newsgroup
relevant to your specific compiler.
 
J

J. J. Farrell

christian.bau schreef:




Boy, oh boy... I thought there would be some intelligent people here
who would
know what an inserted break in such a loop actually would result in in
assemled
code. That was the whole objective of the question.

Then you have very strange thought processes, or a complete
misunderstanding of how C compilers work. It obviously depends
entirely on at least the complete compilation unit, the compiler, the
target processor, and the options given to the compiler. If your code
isn't correct C, it could also depend on the phase of the moon or
anything else. The only thing which can tell you is the compiler, and
the smallest change to anything in the list above could change its
behaviour entirely.
I'm not interested in answers
trying to tell me that an example I gave to clarify the -structural-
problem is not
a sensible program or is bad code or that I do not follow holy
standards. It's not
about the code, the question is about what an inserted break in any
loop (that is
not optimized away of course in the real program, I'm not that stupid,
but I assumed
some smartness of the reader here) actually inserts in the assembled
code.

How the Hell could anyone here know, without knowledge of the entire
compilation unit and detailed knowledge of your particular compiler's
algorithms under all the conditions you're providing? It can do
anything it likes as long as the code behaves in the way that C
defines it should behave, and if the code is not valid C then all bets
are off.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,906
Latest member
SkinfixSkintag

Latest Threads

Top