transformation wisdom

mdh

Could I get some help understanding a concept related to
exercise 4-9 in K&R II? The question concerns the properties of EOF
and the conversion from char to int. On page 43 of K&R
(last paragraph) it says: "There is one subtle point about the
conversion of characters to integers. The language does not specify
whether variables of type char are signed or unsigned quantities......"
It then goes on to explain how different machines might convert a char
to a positive or negative integer. But then it says (p. 44, 1st paragraph): "The
definition of C guarantees that any character in the machine's standard
printing character set will never be negative, so these characters
will always be positive quantities in expressions. But arbitrary bit
patterns stored in character variables may appear to be negative on
some machines, yet positive on others."
I am clearly missing something. The answer to the exercise simply
stored the pushed-back characters in an array of type int rather
than type char, but even though I see this, the above
explanation has left me more confused than enlightened!
Thanks in advance.
 
Richard Heathfield

It's even more confusing than that, because all byte input in C is done
"as if" by repeated calls to fgetc (getc and getchar behave the same way).
Now, getc will only ever return a negative value on end-of-file or a read
error (when, of course, it returns EOF). The rest of the time, it:

a) captures the character
b) represents it as unsigned char
c) converts the unsigned char representation to an int
d) returns the int

Because of step b), even if your implementation has signed chars by default
and even if your input has something bizarre in it (say, the UK currency
symbol) you will still get a non-negative value for it. It's only when you
shove it back into a char that it will, if appropriate, revert to being
negative.
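
Steps b) to d) and the "revert to being negative" effect can be sketched roughly as follows. The function names are made up for illustration, and the byte value 0xA3 used in the note below assumes an ISO 8859-1 style encoding of the UK currency symbol; neither comes from K&R or the standard library.

```c
/* Steps b) to d): view the byte as unsigned char, then widen it to int. */
int as_getc_returns(unsigned char byte)
{
    return byte;    /* always in the range 0..UCHAR_MAX, never negative */
}

/* Store the int in a plain char and read it back in an expression. */
int after_storing_in_char(int ch)
{
    char stored = (char)ch;   /* implementation-defined if ch > CHAR_MAX */
    return stored;            /* may be negative when plain char is signed */
}

/* Going back through unsigned char recovers the value getc reported. */
int recovered(int ch)
{
    char stored = (char)ch;
    return (unsigned char)stored;
}
```

With 8-bit chars, as_getc_returns(0xA3) is 163 everywhere, while after_storing_in_char(163) typically comes back as -93 where plain char is signed and as 163 where it is unsigned.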

Actually, it generally works out okay, just doing what comes naturally.

In a getc (or getchar) loop, just use int.

    int ch;
    while((ch = getchar()) != EOF)
    {
        putchar(ch);
    }

If you're filtering, just use ch:

    int ch;
    while((ch = getchar()) != EOF)
    {
        if(isalpha(ch))
        {
            putchar(ch);
        }
    }

If you're populating a string, use a char array:

    int ch;
    char buf[N] = {0};
    int n = 0;
    while(n < N - 1 && (ch = getchar()) != EOF)
    {
        if(isalpha(ch))
        {
            buf[n++] = ch;
        }
    }

In other words, just do normal stuff, and the chances are high that it will
all work exactly as you expect.
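
One place where a little care is needed is when characters come back out of a char array and go into the <ctype.h> functions. Those functions expect a value representable as unsigned char (or EOF), so a possibly negative plain char is usually cast first. A minimal sketch, with count_alpha being a hypothetical helper name rather than anything from K&R:

```c
#include <ctype.h>
#include <stddef.h>

/* Count the alphabetic characters in a string held in plain chars. */
size_t count_alpha(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        /* Cast so that a negative char value is never passed to isalpha. */
        if (isalpha((unsigned char)*s))
            count++;
    }
    return count;
}
```

The loops above do not need the cast, because there ch comes straight from getchar and is already either EOF or a non-negative value.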
 
Eric Sosman

It's the magic word "standard" that may be the cause of your
confusion. The C language defines a set of characters that must
be available at run-time: the upper- and lower-case letters, the
decimal digits, various punctuation marks, some special things
like '\n' and '\0', and so on. These are the "standard" characters,
and all of them must have non-negative codes.

But each implementation may also support additional characters
over and above those required by the language definition. Accented
letters like Àéîôü, special symbols like ¶¥$©, letters outside the
English repertoire like ßΣÆ, and perhaps many others. These are
"extended" characters, and their codes may be positive or negative;
the language definition doesn't specify.
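
A small sketch of how one might see this on a given implementation. The helper names are invented for illustration, and the byte 0xE9 in the note below is just an arbitrary non-standard code chosen on the assumption of 8-bit chars.

```c
#include <limits.h>

/* Nonzero when plain char behaves as a signed type here. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The value a char takes in an expression; for an extended
   character it may be negative when plain char is signed. */
int char_in_expression(char c)
{
    return c;
}

/* The same bit pattern viewed as unsigned char: 0..UCHAR_MAX. */
int char_as_unsigned(char c)
{
    return (unsigned char)c;
}
```

For a standard character such as 'A', char_in_expression gives a positive value on every implementation. For '\xE9' the result depends on plain_char_is_signed(), while char_as_unsigned('\xE9') is 0xE9 either way.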

The upshot is that even though the standard characters will
always be non-negative, any arbitrary character code (that might
be either a standard character or an extended character) could be
of either sign.
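
That is presumably why the answer to exercise 4-9 keeps its pushed-back characters in an int array: a char slot cannot reliably tell EOF apart from an extended character. What follows is only a sketch modelled on K&R's getch/ungetch, not the solution mdh was looking at; the BUFSIZE value is an assumption.

```c
#include <stdio.h>

#define BUFSIZE 100           /* assumed capacity of the push-back buffer */

static int buf[BUFSIZE];      /* int, not char, so EOF survives storage */
static int bufp = 0;          /* next free position in buf */

/* Get a (possibly pushed-back) character. */
int getch(void)
{
    return (bufp > 0) ? buf[--bufp] : getchar();
}

/* Push a character, or EOF, back on input. */
void ungetch(int c)
{
    if (bufp >= BUFSIZE)
        printf("ungetch: too many characters\n");
    else
        buf[bufp++] = c;
}
```

With a char buffer, a pushed-back EOF could read back as an ordinary character code where char is unsigned, and a byte such as 0xFF could read back looking like EOF where char is signed.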
 
mdh

Eric said:
The upshot is that even though the standard characters will
always be non-negative, any arbitrary character code (that might
be either a standard character or an extended character) could be
of either sign.


thanks...makes sense.
 
