When benchmarking one of my own functions against the equivalent runtime
function, I found that with /Ox on, the compiler optimized the call away so
that my function wasn't called *at all* after the first iteration.
<pseudocode>
/* QPC(x) wraps QueryPerformanceCounter; start, mid, end are __int64 tick counts */
#ifdef USE_MY_FUNC
#define testfunc(x) myfunc(x)
#else
#define testfunc(x) runtimefunc(x)
#endif

QPC(start);
for (long i = 1; i < BIGNUMBER; i++)    /* timed loop: function under test */
    dummyvar = testfunc(argv[1]);
QPC(mid);
for (long i = 1; i < BIGNUMBER; i++)    /* control loop: runtime function */
    dummyvar = runtimefunc(argv[1]);
QPC(end);

printf("operation took %I64u\n", mid - start);
printf("control took %I64u\n", end - mid);
</pseudocode>
I think what is happening is that with /Ox on, the compiler has done away
with the first (BIGNUMBER - 1) iterations entirely, knowing that the only
thing that changes is 'dummyvar', which gets overwritten every time anyway.
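For illustration, here is roughly what I suspect the optimizer reduces the
timed loop to under the as-if rule, given that every iteration computes the
same value and only the last store to dummyvar is observable (this is my
guess at the transformation, not something I've verified in the disassembly):
<pseudocode>
/* the first loop, after optimization: BIGNUMBER - 1 iterations collapse
   to a single call -- or to nothing at all if dummyvar is never read */
dummyvar = testfunc(argv[1]);
</pseudocode>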
My dilemma is this:
I don't want to force it to "use" the value on every iteration by, say,
printing it to the screen, since that would heavily dilute the measurement:
most of each iteration's time would be spent printing rather than executing
the function being tested.
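The cheapest "use" I can think of would be folding every result into an
accumulator and printing it once, outside the timed region (an untested
sketch; 'sink' is a name I've made up, and I'm assuming testfunc's result
converts to an integer):
<pseudocode>
unsigned __int64 sink = 0;              /* one add per iteration -- far cheaper than printf */
QPC(start);
for (long i = 1; i < BIGNUMBER; i++)
    sink += (unsigned __int64)testfunc(argv[1]);  /* every call's result is now live */
QPC(mid);
printf("sink = %I64u\n", sink);         /* single print, after timing stops */
</pseudocode>
Though I suppose a sufficiently clever compiler could still hoist the call
out of the loop if it can prove the result never changes, since the argument
is the same on every iteration.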
I also don't want to compile with debug settings, as that makes the compiler
completely dumb and could create the illusion that my own function beats the
standard implementation when, compiled in release mode, it doesn't.
So, my question is: which compiler settings are best, so that the compiler
isn't completely dumb in terms of optimization, but isn't an absolute
complete smartass either? (I.e. as close as possible to release mode, to
give a fair test, but with the one or more settings that let it eliminate
calls outright switched off.)
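One candidate I've come across but not tried is MSVC's #pragma optimize,
which can exempt just the timing harness from optimization while everything
else, including the functions under test, still builds with /Ox
(time_testfunc is just an illustrative name):
<pseudocode>
/* harness compiled without optimization; myfunc/runtimefunc are defined
   outside the pragma, so they still get full /Ox treatment */
#pragma optimize("", off)
unsigned __int64 time_testfunc(char *arg)
{
    unsigned __int64 start, mid;
    QPC(start);
    for (long i = 1; i < BIGNUMBER; i++)
        dummyvar = testfunc(arg);       /* dummyvar as in the pseudocode above */
    QPC(mid);
    return mid - start;
}
#pragma optimize("", on)
</pseudocode>
But presumably the loop bookkeeping is then unoptimized too, adding a
constant overhead to every iteration, so I'm not sure it gives the fair
test I'm after.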
In other words, make the calls that *I'm* telling it to make, but do them as
fast as possible: I'm telling it what I want it to actually *do*, not just
what I want the end result to be.
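The closest I can get to expressing that is (again an untested sketch,
assuming testfunc's result converts to an integer) routing the argument
through a volatile pointer and the result into a volatile variable, so the
compiler has to reload the input and store the output, and therefore make
the call, on every iteration:
<pseudocode>
char * volatile src = argv[1];          /* volatile read: argument reloaded each pass,
                                           so the call can't be hoisted out of the loop */
volatile unsigned __int64 sink;         /* volatile write: each result must be stored */

QPC(start);
for (long i = 1; i < BIGNUMBER; i++)
    sink = (unsigned __int64)testfunc(src);
QPC(mid);
</pseudocode>
That should cost only a load and a store per iteration, which ought to be
negligible next to any function worth timing.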
Hope this makes sense
Thanks