Bryan Olson
Yesterday I embarrassed myself on sci.crypt with some incorrect C
code and corresponding claims about the language. My source was
Harbison and Steele (H&S), /C, A Reference Manual/ and I thought
I was on strong ground.
The first issue, legal inputs for toupper() and other character-
handling functions, I wrote up and sent to the appropriate
address for errata items. Dr. Harbison subsequently
acknowledged that it seems to be a bug. My explanation is at:
http://groups-beta.google.com/group/sci.crypt/msg/87461a4bc7c3246d
or Usenet Message ID:
<[email protected]>.
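For anyone who has not followed that thread, the short version as I now understand it: the standard requires the argument of toupper() and its relatives to be either EOF or a value representable as unsigned char, so a plain char that happens to be negative must be converted first. A minimal sketch of the usual idiom follows; the helper name upcase_in_place is my own, made up for illustration.

```c
#include <ctype.h>

/* Hypothetical helper (name invented for this example): uppercase a
   NUL-terminated string in place. */
void upcase_in_place(char *s)
{
    for (; *s != '\0'; s++) {
        /* Convert to unsigned char before the call so that a negative
           char value is never passed to toupper(). */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

Writing toupper(*s) without the cast is the version that goes wrong when char is signed and the string holds bytes above 127.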
Now I'm looking at the description of fgetc() in H&S. I have the
third and fifth editions, which both say:
    It reads the next character from the stream and returns it
    as a value of type int. [page 374 in the fifth edition]
When H&S describes ungetc() in the same section, the text notes
that the argument is "(converted to unsigned char)". There's
no such note about fgetc().
My reading of H&S led me to believe that on implementations
where the char type is signed, fgetc() on a binary file could
return a negative value that is a legal character (not EOF).
The ANSI/ISO standard defines fgetc(), saying, in part:
    [...] the fgetc function obtains that character as an
    unsigned char converted to an int [...]
According to the standard, fgetc() returns all character values
as non-negative integers.
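To make the difference concrete, here is a small sketch that writes one byte to a temporary binary file and reads it back with fgetc(). The helper name read_back_byte and its status-code convention are my own assumptions for the example, not anything taken from H&S or the standard.

```c
#include <stdio.h>

/* Hypothetical helper (name and interface invented for this example):
   write one byte to a temporary binary file, read it back with
   fgetc(), and store fgetc()'s return value in *result.
   Returns 0 on success, -1 if the file could not be set up. */
int read_back_byte(unsigned char byte, int *result)
{
    FILE *fp = tmpfile();   /* temporary file, opened in "wb+" mode */

    if (fp == NULL)
        return -1;
    if (fputc(byte, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);

    /* Per the standard's wording, the character is obtained as an
       unsigned char converted to int, so a byte that was read
       successfully is never reported as a negative value. */
    *result = fgetc(fp);
    fclose(fp);
    return 0;
}
```

If I have this right, read_back_byte(0xFF, &c) should leave c equal to 255 whether or not plain char is signed, and so distinct from EOF on ordinary implementations where int is wider than char.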
I gather that many participants here both use H&S and know the standard
well. A few questions: Am I right that the fgetc() description
in H&S is incorrect, or am I missing something? Are there other
obvious problems with H&S relating to this char-might-be-signed
business?
Finally, I expect technical books to have some errors, but I was
surprised when sci.crypt participants quickly identified code I
had written based on H&S to have undefined behavior. I had
passed a char value to toupper(). The mistake in H&S dates from
at least 1991. How could it have gone unnoticed in such a
widely-used reference?