[...]> If you want faster code, then use assembler....that's about as fast as
you can get without upgrading hardware.
[...]
That's not *necessarily* true, and it's likely to be false if you're
not an experienced and/or talented assembly language programmer.
A good optimizing compiler isn't going to be as smart as a good
assembly language programmer, but it is more patient and persistent.
Optimization is often about tradeoffs; one technique might be better
or worse than another depending on some seemingly minor detail. To
pick a hypothetical example out of the air, algorithm X might be best
if an array size happens to be a power of 2, but algorithm Y might be
better than X if it isn't. A good optimizing compiler can make this
decision every time you compile your program; changing that single
array declaration and recompiling might result in substantially
different generated code for a 1% performance improvement. And there
can be many such decision points in your program. If you're manually
coding the whole thing in assembly language, you'll probably have to
make each such decision at design time.
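A tiny C sketch of that kind of decision point (SIZE and reduce are invented names for illustration):

```c
/* Hypothetical illustration of the power-of-2 decision: with SIZE a
   power of two, a good compiler rewrites the modulo as a cheap bitwise
   AND (x & 1023); change SIZE to, say, 1000 and recompile, and it must
   emit a genuine division instead.  The compiler re-makes this choice
   on every build; hand-written assembly would bake in one answer. */
#define SIZE 1024u

unsigned reduce(unsigned x)
{
    return x % SIZE;    /* becomes x & (SIZE - 1) when SIZE is 2^k */
}
```

Inspecting the generated code (e.g. with -S) for both values of SIZE shows the two different instruction sequences the compiler picks.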
And of course you completely lose any semblance of portability.
Disclaimer: I worked on optimizing compilers in the distant past, but
I've done very little assembly language programming.
--
Keith Thompson (The_Other_Keith) (e-mail address removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
I would like to address the original claim:
[...]> If you want faster code, then use assembler....that's about as fast as
you can get without upgrading hardware.
from a slightly different stance.
Of course, that's mostly nonsense. Assembly gurus can write patches
of assembly that are faster than equivalent patches of C. However,
that is a terrible way to try to make things faster, and it should be
used only as a last resort (ONLY after everything else fails).
The first step is to determine what the slow bit actually is. This
is done by profiling. Only after this information is nailed down
should any attempt at a speedup be launched.
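The measurement itself can be as simple as timing the suspect routine with the standard clock() call (work and cpu_seconds are invented names here; a real profiler such as gprof or perf samples the whole program):

```c
#include <time.h>

/* work() stands in for the code under suspicion. */
long work(long n)
{
    long sum = 0;
    for (long i = 0; i < n; i++)
        sum += i % 7;
    return sum;
}

/* Time one call's CPU usage; crude, but enough to confirm or refute
   a guess about where the time goes before any tuning begins. */
double cpu_seconds(long (*fn)(long), long n, long *result)
{
    clock_t start = clock();
    *result = fn(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Typical use: `long r; double t = cpu_seconds(work, 10000000L, &r);` and compare the timings of the candidate hot spots.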
After finding the slow spots, the sensible way to improve the speed is
to improve the algorithm. For instance, for equality searches,
changing from a skip list to a hash table will take average-case
performance from O(log n) to O(1). After researching all known
algorithmic improvements, choose the most promising one and try it.
If there are no known algorithmic improvements, then try to come up
with one of your own.
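A minimal sketch of that kind of algorithmic win, as a toy open-addressing hash table for unsigned keys (all names and sizes are invented; it assumes nonzero keys, no deletion, and a table that never fills):

```c
/* Average-case O(1) equality lookup, versus O(log n) in an ordered
   structure or O(n) for a linear scan.  Toy version: fixed size,
   nonzero keys only (0 is the empty-slot sentinel), no deletion. */
#define TABLE_SIZE 1024u            /* power of two, so we can mask */
#define EMPTY      0u

static unsigned table[TABLE_SIZE];

void insert(unsigned key)
{
    unsigned i = key & (TABLE_SIZE - 1);
    while (table[i] != EMPTY && table[i] != key)
        i = (i + 1) & (TABLE_SIZE - 1);   /* linear probing */
    table[i] = key;
}

int contains(unsigned key)
{
    unsigned i = key & (TABLE_SIZE - 1);
    while (table[i] != EMPTY) {
        if (table[i] == key)
            return 1;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return 0;
}
```

A production table would also resize and handle deletions, but even this sketch shows why a lookup no longer depends on how many keys are stored.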
If we have identified the problem and no known algorithmic
improvements exist, and we are unable to create any, then it is time
to look for linear speedups. These can be created by a number of
things that include:
1. Cache conscious algorithms
2. Profile guided optimization
3. Assembly language for the slow bits
4. Hardware improvements
{probably some others that I can't think of right now}
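Item 1 above can be illustrated without any exotic tricks: the two loops below do identical O(n^2) work, but one walks memory sequentially and the other strides across rows, which costs far more cache misses on real hardware (N and the function names are invented for the sketch):

```c
/* Same arithmetic, same result, different memory access pattern.
   C arrays are row-major, so the first loop is cache friendly and
   the second is not; on a large N the timing difference is large. */
#define N 512
static double a[N][N];

void fill(double v)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = v;
}

double sum_row_major(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];          /* sequential access */
    return s;
}

double sum_col_major(void)
{
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];          /* strided access */
    return s;
}
```

Timing both with the profiling harness from earlier in a real program makes the cache effect visible directly.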
All of these linear improvements suffer from a terrible defect: they
do not scale with the problem. That is to say, we can make the
program run (perhaps) 4x faster, but when the input data set grows
enough to consume that savings we are back to square one. And if the
fundamental algorithm in question is not O(1), we will rapidly lose
ground as the problem expands. That is why an algorithmic
improvement is the very best answer to a speed problem. This is
especially true because the problem ALWAYS scales up: over time, more
and more data accumulates, and that enlarged data set is fed back
into the programs.
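The arithmetic behind that point can be sketched with a trivial cost model (the function names are invented): for an O(n^2) algorithm, a constant 4x speedup is exactly cancelled by merely doubling the input, since (2n)^2 / 4 = n^2.

```c
/* Toy cost model for the scaling argument: quadratic work, then the
   same work with a constant-factor speedup k applied. */
double quadratic_cost(double n)
{
    return n * n;
}

double sped_up_cost(double n, double k)
{
    return quadratic_cost(n) / k;
}
```

Here quadratic_cost(1000) equals sped_up_cost(2000, 4): the hand-won 4x is gone as soon as the data doubles.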
Problems with Assembly:
1. It is tedious (look at how many lines of assembly you will need
to match even a few lines of a high-level language).
2. It is not portable. Even within the same chip family, problems
develop over time (imagine a program relying on an old assembly
routine which returns an answer in AX instead of EAX, for example).
The semantics of inline assembly also change from compiler to
compiler.
3. You have to be really good at it to be able to outsmart a current
optimizing C compiler.
4. There are not as many good assembly programmers as there are good
C programmers. For that reason it may be harder to maintain
(depending on the resources of the organization receiving the
solution, of course).
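Point 2 about inline assembly is easy to demonstrate: the fragment below uses GCC's extended asm syntax for a one-instruction x86-64 addition, which MSVC (with its entirely different __asm blocks) will not accept at all (add_asm is an invented name; the preprocessor guard and portable fallback are part of the sketch):

```c
/* GCC extended inline assembly: "0" ties the first input to the
   output register, and addl adds the second input into it.  Only the
   guarded branch is compiler- and architecture-specific; everything
   else is plain C. */
int add_asm(int a, int b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    int result;
    __asm__("addl %2, %0" : "=r"(result) : "0"(a), "r"(b));
    return result;
#else
    return a + b;   /* portable fallback */
#endif
}
```

One instruction already needs a guard and a fallback; a routine of any size multiplies that maintenance cost accordingly.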
IMO-YMMV