fermineutron said:
A while back I tried to calculate factorials of large numbers using
arrays in C. The array-encoded integer arithmetic that I wrote in C
was very slow; it would take almost a second to multiply 2 array-
encoded integers. Recently I looked at large-precision libraries for C,
in particular GMP, but I was unable to get it to run with my compiler;
apparently some header files were not found. I was curious what the
best way is to interface C and asm, so that I could write a similar
library in asm and use it in C.
Any ideas?
About the only realistic way you're going to get x86 assembly to outperform
highly optimized C is to get a true x86 assembly expert, like Terje
Mathisen, to code it for you.
Most current C compilers are extremely efficient in generating assembly.
I've gone to great lengths to outcode C compilers for x86 in a couple
special situations. The best I could do was almost match the C compiler...
The problem for you is that the C optimizer can take full advantage of
extremely complicated situations and special instructions to generate the
best code. In fact, some C optimizers generate hundreds of trial
combinations. Most people just can't handle such complexity or convoluted
situations.
Are there ways to make your C code faster? Yes.
1) buy a new computer; a 2 GHz AMD runs at four times the clock rate of a
500 MHz AMD x86 CPU, and newer cores usually do more work per clock as well
2) find a better algorithm; a brute-force factorization that takes a second
or two can be many times slower than an elliptic curve factorization
3) switch from, say GCC, to a compiler which is known for more efficient
code, say OpenWatcom or Digital Mars
4) completely unroll any loops; this reduces branching, which can be
expensive whenever the CPU mispredicts the branch
5) completely unroll any loops, occasionally the loop size can be reduced by
one, depending on how the loop was coded
6) precompute as many operations as possible; lookup tables are often
faster than recomputation, provided the table stays in cache
7) make an attempt to reduce the number of variables used in the
calculations
8) replace multiplications and divisions with additions, subtractions, and
bit shifts
9) don't attempt to access integer data smaller than the largest native
integer type of the CPU (32 bits for a 32-bit CPU, 64-bit for...)
10) play around with a decent number of compiler optimizations; usually a
small number of them will offer the most improvement
11) although C compilers are very good with optimization, they aren't
perfect. Forcing the use of a register can improve the code's speed
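To make items 2 and 9 concrete for the array-encoded factorial in the question, here is a minimal sketch in plain C. It is only an assumption about how such code might be laid out, not the original poster's code and not how GMP does it. Each array element holds nine decimal digits (base 10^9) in a 32-bit word instead of one digit, and the product is carried through a 64-bit intermediate. The names bignum_mul_small, bignum_factorial, bignum_to_string, LIMB_BASE and MAX_LIMBS are hypothetical, made up for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define LIMB_BASE 1000000000u  /* assumption: 9 decimal digits per array element */
#define MAX_LIMBS 64           /* hypothetical cap: at most 576 decimal digits */

/* Multiply the little-endian number limbs[0..n-1] by a small factor in place.
   Returns the new element count, or 0 if the result needs more than MAX_LIMBS. */
static size_t bignum_mul_small(uint32_t *limbs, size_t n, uint32_t factor)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* limb < 10^9 and factor < 2^32, so this cannot overflow 64 bits */
        uint64_t t = (uint64_t)limbs[i] * factor + carry;
        limbs[i] = (uint32_t)(t % LIMB_BASE);
        carry = t / LIMB_BASE;
    }
    while (carry != 0) {
        if (n == MAX_LIMBS)
            return 0;
        limbs[n++] = (uint32_t)(carry % LIMB_BASE);
        carry /= LIMB_BASE;
    }
    return n;
}

/* Compute k! into limbs; returns the element count, or 0 on overflow. */
static size_t bignum_factorial(uint32_t *limbs, uint32_t k)
{
    size_t n = 1;
    limbs[0] = 1;
    for (uint32_t f = 2; f <= k && n != 0; f++)
        n = bignum_mul_small(limbs, n, f);
    return n;
}

/* Write the number as decimal text; out must hold at least 9*n + 1 chars. */
static void bignum_to_string(const uint32_t *limbs, size_t n,
                             char *out, size_t out_size)
{
    assert(n >= 1 && out_size >= 9 * n + 1);
    /* The most significant element is printed without leading zeros ... */
    int len = snprintf(out, out_size, "%" PRIu32, limbs[n - 1]);
    /* ... and every lower element is zero-padded to exactly 9 digits. */
    for (size_t i = n - 1; i-- > 0; )
        len += snprintf(out + len, out_size - (size_t)len,
                        "%09" PRIu32, limbs[i]);
}
```

Calling bignum_factorial(limbs, 25) on a uint32_t limbs[MAX_LIMBS] array and then bignum_to_string gives 15511210043330985984000000. Compared with one decimal digit per element, the inner loop runs about nine times fewer iterations per multiply, which is the kind of algorithmic gain that usually matters more than hand-written assembly.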
Rod Pemberton