In said:
I have an unsigned char array. I want to determine if each char's ascii
value is less than 127. Is there a faster way than looping through the
characters and checking if each is less than 127?
Yes, if the array is properly aligned and properly sized to be aliased
with an array of unsigned int (things that you can control):
    if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
        unsigned acc = 0;
        unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
        while (p < q) acc |= *p++;
        if ((acc & 0x80808080) == 0) allascii = 1;
        else allascii = 0;
    }
    else {
        /* check char by char */
    }
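For anyone who wants to try this on a buffer whose alignment and size are not under their control, here is a minimal sketch of the same OR-and-mask idea. The function name all_ascii and its (buf, len) interface are made up for illustration; they are not part of the code above. It copies each word out with memcpy, which sidesteps the alignment and aliasing questions, and checks any leftover tail bytes one at a time. It assumes an unsigned with exactly 32 value bits and no padding bits, and otherwise falls back to the char-by-char loop.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every byte of buf[0..len-1] is
   below 128, and 0 otherwise. */
int all_ascii(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    unsigned acc = 0;

#if UINT_MAX == 4294967295u && CHAR_BIT == 8
    /* Word-at-a-time pass: OR all full words together, test once. */
    for (; i + sizeof(unsigned) <= len; i += sizeof(unsigned)) {
        unsigned w;
        memcpy(&w, buf + i, sizeof w);  /* no alignment requirement */
        acc |= w;
    }
    acc &= 0x80808080u;                 /* keep only each byte's top bit */
#endif

    /* Tail bytes (or the whole buffer if the fast path is disabled). */
    for (; i < len; i++)
        acc |= buf[i] & 0x80u;

    return acc == 0;
}
```

Note that the mask tests for bytes below 128, i.e. the full 7-bit ASCII range including 127. Whether memcpy costs anything compared with the pointer cast depends on the compiler; many compilers turn a fixed-size memcpy like this into a plain load, but that is worth measuring rather than assuming.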
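One could argue that the standard does not guarantee how the value bits
of an unsigned int are distributed among its bytes, so the mask 0x80808080
might not select the top bit of each byte. This is the kind of risk that
you can reasonably take. However, if you really want, you can explicitly
check: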
    allascii = -1;
    if (UINT_MAX == 4294967295 && CHAR_BIT == 8 && sizeof(unsigned) == 4) {
        unsigned n = 0x80808080;
        unsigned char *p = (unsigned char *)&n;
        unsigned test = p[0] | p[1] | p[2] | p[3];
        if (test == 0x80) {
            unsigned acc = 0;
            unsigned *p = (unsigned *)array, *q = p + sizeof array / sizeof *p;
            while (p < q) acc |= *p++;
            if ((acc & 0x80808080) == 0) allascii = 1;
            else allascii = 0;
        }
    }
    if (allascii < 0) {
        /* check char by char */
    }
There is another way of coding the loop, to avoid scanning the whole
array in case of an early non-ascii character:
    allascii = 1;
    while (p < q)
        if ((*p++ & 0x80808080) != 0) {
            allascii = 0;
            break;
        }
But now the body of the loop may execute more slowly, because it tests
and branches on every word instead of just ORing. So it's hard to say
which version to prefer without knowing what your typical data looks
like. If most arrays contain only ASCII characters, the original version
is probably better.
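If neither extreme fits your data, one possible middle ground (not something measured here, just a sketch) is to OR a fixed number of words together and test once per chunk, so the inner loop stays branch-free while a non-ASCII byte still stops the scan within one chunk. The name all_ascii_chunked and the chunk size CHUNK_WORDS are invented for illustration, and the same assumptions apply: 32 value bits in an unsigned, no padding bits.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_WORDS 16  /* hypothetical tuning knob: words per test */

/* Hypothetical helper: returns 1 if every byte of buf[0..len-1] is
   below 128, and 0 otherwise; gives up after the first bad chunk. */
int all_ascii_chunked(const unsigned char *buf, size_t len)
{
    size_t i = 0;

#if UINT_MAX == 4294967295u && CHAR_BIT == 8
    const size_t chunk = CHUNK_WORDS * sizeof(unsigned);

    while (len - i >= chunk) {
        unsigned acc = 0;
        size_t k;

        /* Branch-free inner loop over one chunk. */
        for (k = 0; k < CHUNK_WORDS; k++) {
            unsigned w;
            memcpy(&w, buf + i + k * sizeof w, sizeof w);
            acc |= w;
        }
        if (acc & 0x80808080u)
            return 0;           /* early exit, one test per chunk */
        i += chunk;
    }
#endif

    /* Leftover bytes that do not fill a whole chunk. */
    for (; i < len; i++)
        if (buf[i] & 0x80u)
            return 0;

    return 1;
}
```

The right chunk size, and whether this beats either loop above, can only be settled by timing it on your own data.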
Dan