CBFalconer said:
Bien. This I can discuss. The point in that code is to process
the char. sequence incoming. In the process it is necessary (once)
to detect error/eof by picking out the special EOF character.
Absent that, it is necessary to have unsigned char values to
satisfy the requirements of such routines as isdigit(), etc. It
also avoids the evils of mixing unsigned and signed values in
comparisons, for the range check.
Let's discuss it then.
I'll quote your proposed code again, to document this debate:
The cleanest way is to input chars until you have something that
will not fit. For example:
unsigned int rdvalue(FILE *f) {
   unsigned int ch, value, err;

   value = err = 0;
   while (EOF != (ch = getc(f))) {
You are indeed mixing signed and unsigned int in this comparison: EOF is a
negative int, ch is an unsigned int, so EOF is converted to a huge unsigned
value before the test. That is exactly the practice you qualify as evil,
and I share that opinion.
      /* ch contains an int, which is the value of the char */
      if (!isdigit(ch)) break; /* Only interested in digits */
You are passing an unsigned int to isdigit. isdigit is supposed to receive
an int whose value is representable as an unsigned char or equals EOF. I'm
not sure the language of 7.4p1 is enough to make isdigit invoke undefined
behaviour when passed an unsigned int, but consider that converting -1U
back to an int is implementation-defined and does not necessarily produce
the value -1. It is not an actual problem with your code as written, but it
is a well hidden bear trap for the next reader, who might consider
simplifying said code by combining the test for EOF with the test for
isdigit. With ch defined as unsigned int, ``while (isdigit(ch = getc(f)))''
is not well defined.
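To make the trap concrete, here is a small sketch (the function name
eof_as_unsigned is mine, purely illustrative; it assumes EOF is -1, as it is
on essentially every implementation) of the value such a combined test would
hand to isdigit at end of file:

```c
#include <limits.h>
#include <stdio.h>

/* Returns the value an unsigned int variable holds after being
   assigned EOF -- the value a combined isdigit/EOF test would pass
   to isdigit at end of file. Assumes EOF is -1. */
unsigned int eof_as_unsigned(void)
{
    unsigned int ch = (unsigned int)EOF;   /* -1 converted to unsigned */
    return ch;                             /* UINT_MAX, not -1         */
}
```

So isdigit would be asked about the "character" 4294967295 on a 32-bit
unsigned int, a value no ctype function is required to accept.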
Consider for example the following classic (BSD-style) implementation of
isdigit, where the table reserves a slot just below index 0 for EOF:

#define EOF (-1)
extern unsigned char __ctype_flags[UCHAR_MAX + 2];
#define __C_DIGIT 1
#define isdigit(c) ((__ctype_flags + 1)[(c)] & __C_DIGIT)

With c an int holding EOF, the lookup reads __ctype_flags[0], as intended.
With c an unsigned int holding (unsigned)EOF, the pointer offset is
UINT_MAX: on platforms where pointers and size_t are wider than unsigned
int (both windows and linux 64 bit), the lookup lands some 4 GB past the
table, and invoking isdigit with (unsigned)EOF invokes undefined behaviour.
It is therefore not advisable to store fgetc(fp) in an unsigned int before
testing for EOF.
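A minimal sketch of the conventional idiom instead (the function name
skip_digits is mine, purely illustrative): keep ch a plain int, test for EOF
first, and cast to unsigned char only at the isdigit call.

```c
#include <ctype.h>
#include <stdio.h>

/* Sketch: consume leading digits from a stream using the conventional
   idiom. ch stays an int, so the EOF test mixes no signedness, and the
   cast to unsigned char keeps the isdigit argument in its valid range. */
void skip_digits(FILE *f)
{
    int ch;

    while ((ch = getc(f)) != EOF && isdigit((unsigned char)ch))
        ;                      /* consume the digit */
    if (ch != EOF)
        ungetc(ch, f);         /* push back the first non-digit */
}
```

Here the combined test is perfectly well defined, because isdigit only ever
sees EOF itself or a value in the unsigned char range.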
      ch = ch - '0'; /* form digit value */
      if (((UINT_MAX - ch) / 10) > value) {
This test is incorrect, as was explained in another thread: the comparison
is reversed. Overflow occurs when value > (UINT_MAX - ch) / 10, so as
written the code takes the overflow branch precisely when there is still
room, and silently wraps around when there is not.
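For reference, a sketch of the guard the right way round (would_overflow is
a hypothetical helper name of mine): 10 * value + digit exceeds UINT_MAX
exactly when value > (UINT_MAX - digit) / 10.

```c
#include <limits.h>

/* Returns nonzero if 10 * value + digit would exceed UINT_MAX --
   the pre-multiplication guard the quoted code needed. */
int would_overflow(unsigned int value, unsigned int digit)
{
    return value > (UINT_MAX - digit) / 10;
}
```

Note that the boundary case value == (UINT_MAX - digit) / 10 is still safe,
since integer division rounds toward zero.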
         /* overflow detected, decide what to do */
         ch = ch + '0'; /* restore char value */
         break;
      }
      value = 10 * value + ch;
   }
   ungetc(ch, f); /* keep exit char for the user */
Same problem as above with the value EOF: converting it back to the int
expected by ungetc() is implementation-defined and need not reproduce EOF.
It would be better not to call ungetc() with the value EOF at all, for
example by invoking it inside the loop, before the break statements.
   return err;
} /* untested - you check it out */
Untested indeed: note in particular that the function returns err, which is
never set to anything but 0, where value was presumably intended.
Thus I maintain the single use in the EOF check is optimum in
clarity. Note that it totally avoids casts.
I obviously disagree: the code is not optimal in clarity, it is not
strictly conforming, and it avoids casts only by relying on implicit
conversions whose results are implementation-defined...
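To close the argument, here is a corrected sketch along the same lines. It
is my rewrite, not CBFalconer's code: ch is a plain int, isdigit gets an
unsigned char, the overflow test is the right way round (saturating at
UINT_MAX, one choice among several), EOF is never pushed back, and value is
actually returned.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Sketch of a corrected rdvalue: reads an unsigned decimal number,
   stops at the first non-digit (pushed back for the caller) or at
   EOF, and saturates at UINT_MAX on overflow. */
unsigned int rdvalue(FILE *f)
{
    unsigned int value = 0;
    int ch;                              /* int, so the EOF test is clean */

    while ((ch = getc(f)) != EOF) {
        if (!isdigit((unsigned char)ch)) /* cast keeps isdigit in range   */
            break;
        unsigned int digit = (unsigned int)(ch - '0');
        if (value > (UINT_MAX - digit) / 10) {
            value = UINT_MAX;            /* overflow: saturate            */
            break;
        }
        value = 10 * value + digit;
    }
    if (ch != EOF)
        ungetc(ch, f);                   /* never push back EOF           */
    return value;
}
```

Not a single cast to hide a conversion problem: the one cast present feeds
isdigit a value it is actually specified to accept.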