[ ... ]
I should have written:
return !( c < '0' || c > '9' );
[ ... ]
I figured that my form would be more efficient.
My form consists of:
Less than
Greater than
Or
Invert
Your form consists of:
Greater than or equal
Less than or equal
And
That would make sense if 'greater than or equal' was
implemented as two separate tests (and likewise 'less
than or equal').
That's not generally true though -- I can hardly think of
a processor that can't combine each of those into a
single test.
But if we were going for massive optimization, we'd be better off with:
inline bool NotDecimalDigit( char const c )
{
    return c < '0' || c > '9';
}
inline bool IsDecimalDigit( char const c )
{
    return !NotDecimalDigit(c);
}
I can hardly think of a processor for which I'd consider
this a "massive optimization". If I really had to
implement my own versions of these, I'd consider
something like this:
inline bool IsNotDecimalDigit(char ch) {
    return static_cast<unsigned>(ch-'0') > 9;
}
inline bool IsDecimalDigit(char ch) {
    return static_cast<unsigned>(ch-'0') <= 9;
}
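In case the cast looks like magic, here is the same trick once more with the reasoning in comments. This is only a sketch: the name IsDecimalDigitUnsigned is made up for illustration, and the two "neighbour" checks assume an ASCII-like character set where '/' and ':' sit immediately on either side of the digits.

```cpp
// Hypothetical name; same technique as the two functions above.
constexpr bool IsDecimalDigitUnsigned(char ch) {
    // ch - '0' is computed as int. For anything below '0' the result is
    // negative, and converting a negative int to unsigned wraps around
    // (modulo 2^N) to a very large value. So a single unsigned "<= 9"
    // rejects both "too small" and "too large" inputs at once.
    return static_cast<unsigned>(ch - '0') <= 9u;
}

// The language guarantees '0'..'9' are contiguous, so these two always hold.
static_assert(IsDecimalDigitUnsigned('0'), "lower bound is a digit");
static_assert(IsDecimalDigitUnsigned('9'), "upper bound is a digit");
// Assumes ASCII: '/' is just below '0' and ':' is just above '9'.
static_assert(!IsDecimalDigitUnsigned('/'), "just below the range");
static_assert(!IsDecimalDigitUnsigned(':'), "just above the range");
```

The point of folding two comparisons into one is that the compiler no longer needs a branch per comparison to get the answer.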
To give a concrete comparison, here's what Visual C++
produces for the three versions in question:
?IsDecimalDigitRolf@@YI_ND@Z PROC NEAR
cmp cl, 48
jl SHORT $L315
cmp cl, 57
jg SHORT $L315
mov eax, 1
ret 0
$L315:
xor eax, eax
ret 0
?IsDecimalDigitRolf@@YI_ND@Z ENDP
?IsDecimalDigitTomas@@YI_ND@Z PROC NEAR
cmp cl, 48
jl SHORT $L307
cmp cl, 57
jg SHORT $L307
xor eax, eax
xor ecx, ecx
test al, al
sete cl
mov al, cl
ret 0
$L307:
mov eax, 1
xor ecx, ecx
test al, al
sete cl
mov al, cl
ret 0
?IsDecimalDigitTomas@@YI_ND@Z ENDP
?IsDecimalDigitJerry@@YI_ND@Z PROC NEAR
movsx eax, cl
sub eax, 48
mov ecx, 9
cmp ecx, eax
sbb eax, eax
add eax, 1
ret 0
?IsDecimalDigitJerry@@YI_ND@Z ENDP
I doubt anybody needs to read Intel assembly language to
guess that your attempted optimization seems to have
backfired. Rolf's code produces output that's
considerably shorter and simpler.
What may be a lot less obvious is that even though your
code is longer than Rolf's, the real difference in speed
will usually be pretty minimal. In both cases, you have a
couple of conditional branches that will often consume
the bulk of the time. In fact, either one of these, if
mispredicted, might easily consume 20 or more clock cycles
on a modern CPU, and in a bad case, you might hit that
penalty twice. Perhaps worse, the speed will often vary
over a range of 3:1 or more depending on the input data.
My code avoids conditional execution entirely, so it's
not only short, but consistently fast (in fact, probably
always faster than even the best case for either of the
others).
This really isn't meant as a "You suck; I rule" kind of
post either. Rather, it's intended to point out that it
can be _really_ tricky to do micro-optimization like this
at all well. Unless you know quite a lot about your
compiler and your target CPU, it's entirely possible for
an attempted optimization to backfire, sometimes quite
badly.
Just for an obvious example, while my code works well for
a relatively typical target, on a processor that didn't
use two's complement integers, it would almost certainly
be truly terrible -- probably quite a lot bigger and
slower than either your code or Rolf's.
Then use each form when it's most appropriate.
If you're after "massive optimization", I have my doubts
that either is likely to ever be "most appropriate".
The standard library isdigit implementation may well be
better than the ones I've given though. The standard
library will often use a table-driven approach, giving
code vaguely like this:
bool isdigit(int ch) {
    return (type_table[ch+1] & _Digit) != 0;
}
On older processors that ran about the same speed as
memory, this was often a big win. On current processors,
that's a lot less dependable. If the table is loaded
entirely into the cache, this will typically execute very
quickly. OTOH, if the data for the table has to be loaded
from main memory very often, this may easily be the
slowest of all.
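For anybody who wants to see the table-driven idea as something that compiles, here's a minimal sketch. None of these names (kDigitBit, MakeTypeTable, IsDigitTable) come from any real library, and I'm assuming EOF is -1, which is common but not guaranteed. A real implementation would also carry bits for alpha, space, punctuation and so on, and would usually account for the locale.

```cpp
#include <array>
#include <climits>
#include <cstddef>

// Hypothetical classification bit; a real table would have several of these.
enum : unsigned char { kDigitBit = 0x01 };

// One slot for EOF (assumed to be -1) plus one per unsigned char value.
typedef std::array<unsigned char, UCHAR_MAX + 2> TypeTable;

inline TypeTable MakeTypeTable() {
    TypeTable table{};  // every entry starts out as 0
    for (int c = '0'; c <= '9'; ++c) {
        // The +1 offset leaves slot 0 free for EOF.
        table[static_cast<std::size_t>(c + 1)] = kDigitBit;
    }
    return table;
}

// ch must be -1 or a value representable as unsigned char,
// the same precondition the standard isdigit places on its argument.
inline bool IsDigitTable(int ch) {
    static const TypeTable table = MakeTypeTable();
    return (table[static_cast<std::size_t>(ch + 1)] & kDigitBit) != 0;
}
```

The lookup itself is one load, one mask and one compare, which is why it does so well when the table is already in cache and so badly when it isn't.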
As an aside, applying 'const' at the top level as you've
done above is basically pointless -- since the char is
being passed by value, there's no way this function could
possibly modify the original, whether const qualified or
not.
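If you'd like the compiler to confirm that, a short check along these lines should do. The two declarations are hypothetical and exist only for the comparison; they are never defined or called.

```cpp
#include <type_traits>

// Hypothetical declarations, for illustration only.
bool TakesPlain(char c);
bool TakesConst(char const c);

// Top-level const on a by-value parameter is dropped from the function type,
// so both declarations have the type bool(char).
static_assert(std::is_same<decltype(TakesPlain), decltype(TakesConst)>::value,
              "top-level const is not part of the function type");
```

The only thing the qualifier changes is inside the definition, where it stops the body from assigning to its own local copy. Callers can't tell the difference.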