pete said:
I put fast test code in a loop that runs many times,
time the loops with the test code,
time the loops without the test code,
and then divide the difference in time,
by the number of loop cycles.
This is a good approach, but requires some care to avoid
contamination by compiler optimizations. For example,
    t0 = clock();
    for (i = 0; i < 1000000; ++i) {
    #if INCLUDE_THE_CODE
        /* code being timed goes here */
    #endif
    }
    t1 = clock();
.... may not work very well when INCLUDE_THE_CODE isn't defined,
because the compiler may optimize the whole loop away and replace
it with `i = 1000000;' -- or even with nothing at all, if `i'
isn't used subsequently.
A method that's worked for me is to use function pointers:
    void no_op(void) {}
    void for_real(void) {
        /* code being timed goes here */
    }
    int main(int argc, char **argv) {
        void (*func)(void) = (argc > 1) ? for_real : no_op;
        ...
        t0 = clock();
        for (i = 0; i < 1000000; ++i)
            func();
        t1 = clock();
        ...
.... so the compiler cannot predict which function the loop will
call. For an even greater comfort level, put the two called
functions in a different module and compile it separately from
the main program.
Someday, perhaps, compilers and linkers will become smart
enough to make even this technique fail (if anyone knows of an
implementation this smart, I'm sure we'd like to hear about it).
For the moment, though, it seems fairly robust.
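
For anyone who wants to try it, here is a minimal sketch that combines
the function-pointer loop with the subtract-and-divide step from the
quoted method. The helper names time_calls and per_call_cost, the sink
variable, and the body of for_real are my own placeholders for
illustration, not part of the code above, and I'm assuming clock() has
fine enough resolution for the loop count you choose.

```c
#include <time.h>

/* Hypothetical stand-in for the code being timed; the volatile
   store keeps the compiler from discarding the work. */
static volatile unsigned long sink;

void no_op(void) {}

void for_real(void) {
    sink = sink + 1;
}

/* Call func n times and return the elapsed processor time in seconds. */
double time_calls(void (*func)(void), long n) {
    clock_t t0, t1;
    long i;

    t0 = clock();
    for (i = 0; i < n; ++i)
        func();
    t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

/* Estimated cost of one call to work, net of loop and call overhead:
   (time with the code - time without it) / number of loop cycles. */
double per_call_cost(void (*work)(void), void (*baseline)(void), long n) {
    double with_code = time_calls(work, n);
    double without_code = time_calls(baseline, n);
    return (with_code - without_code) / (double)n;
}
```

A caller would still pick the pointers at run time (from argc, say, as
in main above) and pass a loop count large enough to swamp the
granularity of clock(). If time_calls sits in the same source file as
its caller, an inlining compiler might see through the pointer, which
is why compiling the called functions separately is the safer setup.
With a noisy clock the result can even come out slightly negative for
very cheap code, so it is probably worth repeating the measurement.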