Greater-than operator vs. equal-to operator: which one is more efficient?

Vivek Mandava

Hello All,

Which one is the more expensive operation?
'>' or '==' ?

I know this won't matter much. However, if I'm executing this
operation a million times, I would prefer the better choice.

My gut feeling is that '>' should be more efficient. However, when
translated to assembly, both generate the same "cmpl" instruction
(gcc -S <prog.c>). So my question is: how is the "cmpl"
instruction executed by the CPU?

Ex: (1) if (n == 0) vs if (n > 0)
(2) if (len == 12) vs if (len < 13)

Thanks in advance.

Regards,
Vivek

web: http://students.iiit.net/~vivek/
 
Martin Dickopp

Which one is the more expensive operation?
'>' or '==' ?

The C standard doesn't define the performance of these operations.
I know this won't matter much. However, if I'm executing this
operation a million times, I would prefer the better choice.

My gut feeling is that '>' should be more efficient. However, when
translated to assembly, both generate the same "cmpl" instruction
(gcc -S <prog.c>). So my question is: how is the "cmpl"
instruction executed by the CPU?

This is not a C question, but a question about assembler/machine code
on a specific platform. It is off-topic here; please ask in a newsgroup
dedicated to programming the CPU in question.

Martin
 
Jeff

Vivek Mandava said:
Hello All,

Which one is the more expensive operation?
'>' or '==' ?

I know this won't matter much. However, if I'm executing this
operation a million times, I would prefer the better choice.

My gut feeling is that '>' should be more efficient. However, when
translated to assembly, both generate the same "cmpl" instruction
(gcc -S <prog.c>). So my question is: how is the "cmpl"
instruction executed by the CPU?

Why don't you ask in your platform-specific newsgroup?
 
Nils Petter Vaskinn

Hello All,

Which one is the more expensive operation?

That would depend on both platform and compiler, I think.

#include <limits.h>

int main(void)
{
    int i, j;

    i = INT_MAX;
    j = INT_MAX;

    print_system_time();   /* stand-in for your timing routine */

    while (1) {
        if (i == 0) {
            break;
        } else {
            --i;
        }
    }

    print_system_time();

    while (1) {
        if (j < 1) {
            break;
        } else {
            --j;
        }
    }

    print_system_time();

    return 0;
}

hth
 
Tom Zych

Vivek said:
Which one is the more expensive operation?
'>' or '==' ?

That's highly system-dependent and therefore OT here. That said, I'd
be surprised if there were any difference. It should use some kind
of "compare" instruction either way.

Why not put a tight loop of 10^7 of them in a function and run your
profiler on it?
 
John Bode

Hello All,

Which one is the more expensive operation?
'>' or '==' ?

That's entirely a function of the target instruction set, not the C
language.
I know this won't matter much. However, if I'm executing this
operation a million times, I would prefer the better choice.

The only way to know for sure is to write code using each operator and
gather performance statistics on that code, recognizing that your
results are only valid for that particular architecture. FWIW,
they're equally expensive on all the platforms I'm familiar with (they
all generate similar instructions). I'd honestly be surprised if
there was a platform where one was significantly more expensive than
the other.
My gut feeling is that '>' should be more efficient. However, when
translated to assembly, both generate the same "cmpl" instruction
(gcc -S <prog.c>). So my question is: how is the "cmpl"
instruction executed by the CPU?

Assuming you're on x86:
http://www.penguin.cz/~literakl/intel/intel.html
 
E. Robert Tisdale

Vivek said:
Which one is the more expensive operation?

'>' or '==' ?

I know this won't matter much.
However, if I'm executing this operation million times,
I would prefer the better choice.

My gut feeling is that '>' should be efficient.
However, when translated to assembly code
all it generates is "cmpl" instruction (gcc -S <prog.c>).
So, my question here is,
"How is the 'cmpl' instruction executed by the cpu?"

Ex: (1) if (n == 0) vs if (n > 0)
(2) if (len == 12) vs if (len < 13)

The cmpl instruction is a subtraction which does *not* save a result.
It just sets some status register flags.
The branch instructions (je, jne, jl, jle, jg, jge, etc.)
test these status bits and branch accordingly.
None of them are inherently faster than the others.
Your CPU almost certainly includes branch prediction hardware
which it can use to prefetch instructions for the most likely branch.
Your compiler almost certainly includes optimizations
to assist branch prediction.

It is almost always a bad idea to attempt to outsmart
your optimizing compiler.
You will probably only succeed in frustrating these optimizations.
Write your code in the most straightforward and readable way
that you can and let your optimizing compiler figure out
how to generate instructions that yield the highest performance
and efficiency. If performance is not satisfactory,
use a profiler to locate the code that produces the bottleneck
and concentrate on optimizing that code first.
 
Christian Bau

Hello All,

Which one is the more expensive operation?
'>' or '==' ?

I know this won't matter much. However, if I'm executing this
operation a million times, I would prefer the better choice.

There are three things you can do:

1. Don't worry. Unless you get a measurable benefit from it, try to
write code that is readable and easy to maintain instead of trying to
make it fast. This will mean that you finish writing your software
earlier.

2. Check the assembler code. Check the manufacturers' web sites for
manuals; absolutely everything you need is available; most will send you
a CD with the manuals for free.

3. Measure. Write two versions of the code and measure their speed.

Before you start doing anything: Make sure your code is correct. Once it
is correct, use a profiler which will tell you which parts of your
program spend the most time.
 
pete

John said:
I'd honestly be surprised if
there was a platform where one was significantly more expensive than
the other.

I've seen old 16-bit systems (CHAR_BIT==8, sizeof(int)==2)
where equality of long int
was determined by a bitwise XOR of the high words (2 bytes == word),
followed by a zero-flag check, and then, if zero,
by an XOR of the low words and another zero-flag check.
So it was possible to bail out before entirely evaluating
either operand.
A long was compared for equality against zero by ORing
the high word with the low word and then checking the zero flag.

Equality vs. inequality
seems to me to be a simpler concept than greater than,
so when the equality or inequality operator
does what I want, that's what I use.
 
Glen Herrmannsfeldt

(snip regarding the performance of == and >)
Equality vs. inequality
seems to me to be a simpler concept than greater than,
so when the equality or inequality operator
does what I want, that's what I use.

But consider that, with two's complement arithmetic, < 0 can be evaluated by
checking only one bit, whereas == 0 must test them all.

-- glen
 
E. Robert Tisdale

Glen said:
But consider that, with two's complement arithmetic,
< 0 can be evaluated by checking only one bit,
whereas == 0 must test them all.

But this is done in hardware and *not* in software.
The typical implementation does both in parallel.
cmpl sets the "negative" bit in the status register
if the sign bit is set and it sets the "zero" bit
in the status register if all the bits are zero.
Branch instructions test one or both
of these two status register bits and update
the instruction pointer (program counter) accordingly.
 
Glen Herrmannsfeldt

E. Robert Tisdale said:
But this is done in hardware and *not* in software.
The typical implementation does both in parallel.
cmpl sets the "negative" bit in the status register
if the sign bit is set and it sets the "zero" bit
in the status register if all the bits are zero.
Branch instructions test one or both
of these two status register bits and update
the instruction pointer (program counter) accordingly.

The hardware may supply a bit test instruction, which could test the sign
bit of a multibyte value. Though on modern machines the whole word would
go into the cache, anyway. It would be hard to guess whether the bit test
or compare instructions would be faster.

-- glen
 
Glen Herrmannsfeldt

pete said:
That's true with any integer arithmetic, actually.

It depends on how you treat negative zero. If the sign bit is set, the value
could be negative zero on a sign-magnitude or ones' complement machine, and so
should not be treated as < 0.

-- glen
 
Christian Bau

Glen Herrmannsfeldt said:
The hardware may supply a bit test instruction, which could test the sign
bit of a multibyte value. Though on modern machines the whole word would
go into the cache, anyway. It would be hard to guess whether the bit test
or compare instructions would be faster.

Wouldn't any half-decent compiler know which is the best way to
implement comparisons with constants and choose the best method anyway?
It is quite possible that

if (x >= 0)...
if (x > -1)...
if ((((unsigned long) x) & 0x80000000) == 0)...

produce identical code (on machines where the non-portable third line
produces the same result as the first two).
 
ArWeGod

Christian Bau said:
Wouldn't any half decent compiler know which is the best way to
implement comparisons with constants and choose the best method anyway?
It is quite possible that

if (x >= 0)...
if (x > -1)...
if ((((unsigned long) x) & 0x80000000) == 0)...

produce identical code (on machines where the non-portable third line
produces the same result as the first two).

Quoting from someone named Christian Bau:
"There are three things you can do:
1. Don't worry. 2. Check the assembler code. 3. Measure.
Before you start doing anything: Make sure your code is correct. Once it is
correct, use a profiler which will tell you which parts of your program
spend the most time."

Guys, we are talking about 3000MHz computers here. Do 2, 5 or 10 machine
code instructions matter at that speed?
 
Jirka Klaue

ArWeGod wrote:
....
Guys, we are talking about 3000MHz computers here. Do 2, 5 or 10 machine
code instructions matter at that speed?

In an inner loop? Yes, they will lead to 2, 5 or 10 years of computation time,
roughly.

Jirka
 
Richard Bos

ArWeGod said:
Quoting from someone named Christian Bau:
"There are three things you can do:
1. Don't worry. 2. Check the assembler code. 3. Measure.
Before you start doing anything: Make sure your code is correct. Once it is
correct, use a profiler which will tell you which parts of your program
spend the most time."

Guys, we are talking about 3000MHz computers here.

Says you.
Do 2, 5 or 10 machine code instructions matter at that speed?

They do, if they're repeated a million times.

The _real_ problem with this kind of micro-optimisation, from a
comp.lang.c POV, is that there's nothing portable about it. As far as
the Standard is concerned, speed cannot be regulated. However, sometimes
you _do_ want to speed up a certain part of a program. In those cases,
Christian's last suggestion is the only real solution: measure which
code is faster, and don't expect your optimisations to work anywhere
else.

Richard
 
Glen Herrmannsfeldt

Christian Bau said:
(snip)


Wouldn't any half decent compiler know which is the best way to
implement comparisons with constants and choose the best method anyway?
It is quite possible that

if (x >= 0)...
if (x > -1)...
if ((((unsigned long) x) & 0x80000000) == 0)...

produce identical code (on machines where the non-portable third line
produces the same result as the first two).

That is a good question. One might hope so, but I don't really know. The
original question asked which was faster, so the question is really what
a compiler will generate.

But consider this: Pretty much everyone codes their for() loops with <= or
< test for increasing loops, and >= or > for decreasing loops.

for(i=0;i!=10;i++)

should work just as well as i < 10, unless i accidentally becomes greater
than 10.

If a compiler determined that i didn't change in the loop, it could generate
code doing an inequality test instead, if that was faster.

-- glen
 
Christian Bau

ArWeGod said:
Quoting from someone named Christian Bau:
"There are three things you can do:
1. Don't worry. 2. Check the assembler code. 3. Measure.
Before you start doing anything: Make sure your code is correct. Once it is
correct, use a profiler which will tell you which parts of your program
spend the most time."

Guys, we are talking about 3000MHz computers here. Do 2, 5 or 10 machine
code instructions matter at that speed?

Machine instructions don't matter, memory bandwidth does :)

But the point of my post above was that any trivial micro-optimisations
that a programmer might do could quite easily be done by any decent
compiler, so we are back to (1) don't worry.
 
