When using <ctype.h>, why should I declare a char as an int?

Zhang Yuan

T don not understand the difference between
int c;
and
char c;
char is declared as signed int. But when I use some function in <ctype.h>
such as
isprintf(c);
not
isprintf((unsigned char)c);
I didn't find any difference.
Can you give some examples that show the difference?
Thank you.
Zhang Yuan
 
Ben Bacarisse

Zhang Yuan said:
T don not understand the difference between
int c;
and
char c;
char is declared as signed int.

I think you mean that, on your system, char is signed. char is not
"declared" as anything.

Did you make a mistake with the negative? "T don not" is obviously a
typing error and I usually don't comment on such things, but if you are
really saying that you do not understand the difference between these
two declarations, then I'd have to write a much longer reply!
But when I use some function in <ctype.h>
such as
isprintf(c);
not
isprintf((unsigned char)c);
I didn't find any difference.

The function is "isprint" -- no "f" at the end.
Can you give some examples that show the difference?

The isxxx functions are only defined if the argument is "representable
as an unsigned char" or if it is "equal the value of the macro EOF".
(These are quotes from the C standard). The fact that the result is
often undefined makes it hard to see what's going on simply by writing a
program, because an implementation is permitted to do whatever it likes
in those cases. For example, an implementation is permitted to treat
isprint(c) exactly like isprint((unsigned char)c), so you may never see
any difference at all.

But to be more practical for a moment, try this:

int c = -130;
puts(isprint( c) ? "T" : "F");
puts(isprint((unsigned char)c) ? "T" : "F");

Unless EOF == -130, the first call is undefined by the C standard --
anything could happen -- but since C implementations are usually quite
good at avoiding serious problems due to bad calls to these functions,
you may simply get a visible difference.
 
James Kuyper

T don not understand the difference between
int c;
and
char c;
char is declared as signed int. ...

You're using the words wrong, so I'm not quite sure what you mean. char
is a built-in type; it isn't declared anywhere. The above code declares
"c", not "char".

char is an integer type, and it can be signed. If it is signed, it could
in principle have exactly the same representation as signed int, though
this would be rather unusual, and would require CHAR_BIT>=16. However,
even if it had the same representation, it would be a distinct type from
signed int, a distinction that matters for the purpose of some
compatibility rules.
But when I use some function in <ctype.h>
such as
isprintf(c);

I presume you mean isprint() rather than isprintf()?
not
isprintf((unsigned char)c);
I didn't find any difference.
Can you give some examples that show the difference?


Did you check whether CHAR_MIN is actually negative on your machine? If
not, you'll never see a difference. The only way that there could be a
difference is if c is negative. Did you test with negative values? If
there is a difference, it might occur only in some locale other than the
"C" locale - did you test with other locales?

The key point is that the behavior of isprint(c) is undefined when c has
any negative value other than EOF. Undefined behavior includes the
possibility that the behavior of isprint(c) is exactly the same as
isprint((unsigned char)c). Therefore, while the behavior could be
different, it's not possible to write a program which is guaranteed to
show any difference.

Because the behavior is undefined, it could in principle be arbitrarily
bad. In principle, it could erase arbitrary files from your hard disk.
In practice, such things are unlikely; you're unlikely to see anything
worse than a segfault, and not necessarily even that. However, do you
really want to bother finding out what the actual behavior is? The
proper response to learning that a given construct has undefined
behavior is to stop using it; you shouldn't invest too much time trying
to find out what the actual behavior of that construct is.

Exception: behavior undefined by the standard may be defined by some
other document. If the behavior is defined by some other document, and
you need to get that behavior, there's nothing wrong with writing such
code and relying upon the guarantees made by the other document.
However, you must keep in mind that your code will be safe only on
systems where that other document's definition of the behavior applies.
 
Zhang Yuan

Did you make a mistake with the negative? "T don not" is obviously a
typing error and I usually don't comment on such things, but if you are
really saying that you do not understand the difference between these
two declarations, then I'd have to write a much longer reply!

First, thank you for forgiving my silly mistake of not checking my spelling carefully. I should respect you, all my kind teachers in this group, in any case.
I will pay attention to this.

The function is "isprint" -- no "f" at the end.


The isxxx functions are only defined if the argument is "representable
as an unsigned char" or if it is "equal the value of the macro EOF".
(These are quotes from the C standard). The fact that the result is
often undefined makes it hard to see what's going on simply by writing a
program, because an implementation is permitted to do whatever it likes
in those cases. For example, an implementation is permitted to treat
isprint(c) exactly like isprint((unsigned char)c), so you may never see
any difference at all.

I ran the program below:
But to be more practical for a moment. Try this:

int c = -130;
puts(isprint( c) ? "T" : "F");
puts(isprint((unsigned char)c) ? "T" : "F");

output:
T
T

I don't understand it well.
Are the answers the same?
Thank you.
Please forgive me.
 
Zhang Yuan

Thank you.
I'm new to C and lack experience.
I will remember your advice; it gives me general principles to study.
Thank you.
 
James Kuyper

output:
T
T

I don't understand it well.
Are the answers the same?

They could be. If you're willing to take the risk of undefined behavior,
try the following for a more comprehensive approach:

for(int c=CHAR_MIN; 1; c++)
{
printf("%d: %c %c\n" isprint(c), isprint((unsigned char)c));
if(c==CHAR_MAX)
break;
}

The unusual handling of the loop condition is intended to cope with the
extremely unlikely possibility that CHAR_MAX == INT_MAX on your system.
If that possibility does come up, the printout is going to be VERY long.

There's still no guarantee that you'll see any differences (or even that
this program will run at all) but if there are any, this code will
probably demonstrate them.
 
Ben Bacarisse

Zhang Yuan said:
First, thank you for forgiving my silly mistake of not checking my
spelling carefully. I should respect you, all my kind teachers in
this group, in any case. I will pay attention to this.

There is no need to apologise (my own typing/spelling is dreadful) but
there is a need to explain! My point was "what did you mean?" and I
still don't know. I.e. did you mean that you do understand the
difference between int c; and char c; or that you don't? It really
makes a difference.
I ran the program below:


output:
T
T

I don't understand it well.
Are the answers the same?

OK, on your system they behave the same. I said that they might. The
first call is undefined -- that means that the C language does not say
anything about what will happen. Behaving exactly the same as the other
call is one possibility. Behaving differently only on Tuesdays is
another.

The expressions isprint(c) and isprint((unsigned char)c) are different
but they won't always produce different values.

I suspect that you've seen somewhere that you should write

isprint((unsigned char)c)

and not

isprint(c)

and you wonder why. If this is the case, can you link to where this
advice came from? I ask because the details matter: sometimes that
advice is right and sometimes it is wrong. The exact wording and the
context of the code really matter.

<snip>
 
Ike Naar

for(int c=CHAR_MIN; 1; c++)
{
printf("%d: %c %c\n" isprint(c), isprint((unsigned char)c));

That line looks garbled. Did you mean something like

printf("%d: %d %d\n", c, isprint(c), isprint((unsigned char)c));

?
 
Joe Pfeiffer

Zhang Yuan said:
T don not understand the difference between
int c;
and
char c;
char is declared as signed int. But when I use some function in
<ctype.h>

C has several types for representing integers, which are all closely
related but have subtle differences in size and whether they're signed
or unsigned.

"int" is a signed integer in the "natural" word size of the machine.
For most machines, that means a 32 bit integer (though I've used
machines in the past that used a 16 bit int).

"char" is an integer that's the "right size" to represent character
data. For most machines, that means eight bits. It is allowed to be
either signed or unsigned, depending on the implementation.
such as
isprintf(c);
not
isprintf((unsigned char)c);

I'm not familiar with the isprintf() function, but the general answer
would be that while a "char" may be either signed or unsigned, this
function specifically wants an unsigned char.

On a machine with an eight bit char, representing negatives in 2's
complement (which means just about everything, and in particular
including Intel) a signed char can represent values from -128 through
127, while an unsigned char can represent values from 0 through 255.

So, if you passed a -1 to the function, it would "think" it was 255.
 
Zhang Yuan

There is no need to apologise (my own typing/spelling is dreadful) but
there is a need to explain! My point was "what did you mean?" and I
still don't know. I.e. did you mean that you do understand the
difference between int c; and char c; or that you don't? It really
makes a difference.

I know a little about the differences between
char, unsigned char, signed char, and int.

I think, though I may be wrong, that char is the same as
unsigned char in some cases.



I suspect that you've seen somewhere that you should write

isprint((unsigned char)c)

and not

isprint(c)

and you wonder why.

The Standard C Library
P. J. Plauger

Page 27, paragraph 6
Few programmers know to write isprint((unsigned char) c), a much
safer form. Of course, you can use the type cast safely only where you are
certain that the argument value EOF cannot occur.

Thank you.
Zhang Yuan
 
James Kuyper

I'm not familiar with the isprintf() function, but the general answer
would be that while a "char" may be either signed or unsigned, this
function specifically wants an unsigned char.

If the function specifically wants an unsigned char, it should be
declared with a prototype that says so, in which case the conversion
would occur implicitly, without need for a cast.

isprintf() is almost certainly a typo for isprint(). isprint() wants an
int, not an unsigned char, but you should cast a char argument of
isprint() to unsigned char because, for functions declared in <ctype.h>,
"In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the macro
EOF. If the argument has any other value, the behavior is undefined."
Without the cast, c could contain negative values other than EOF, that
would be preserved by the implicit conversion to int.
 
James Kuyper

On 06/13/2012 09:35 AM, Zhang Yuan wrote:
....
I think, though I may be wrong, that char is the same as
unsigned char in some cases.

char must have the same representation as either unsigned char or signed
char, but which one it's the same as might be different on different
compilers, or even on the same compiler with different options (gcc, for
example, has options which allow you to control this).
 
Ben Bacarisse

Zhang Yuan said:
I know a little about the differences between
char, unsigned char, signed char, and int.

I think, though I may be wrong, that char is the same as
unsigned char in some cases.

This has been answered so I'll leave it.
The Standard C Library
P. J. Plauger

An impeccable source (though it's years since I read it).
Page 27, paragraph 6
Few programmers know to write isprint((unsigned char) c), a much
safer form. Of course, you can use the type cast safely only where you are
certain that the argument value EOF cannot occur.

This is good advice. But it really helps to understand why one might
write this. I would not do it here, for example:

/* read a digit, if there is one */
int c = getchar();
while (c != EOF && !isdigit(c)) c = getchar();

The reason: getchar returns an int with the value already correctly
converted. The only possible negative return from getchar is EOF.
Adding the cast in this situation just complicates the code.

And, obviously, it's not needed in situations like this:

unsigned char *skip_spaces(unsigned char *cp)
{
    while (isspace(*cp)) cp += 1;
    return cp;
}

<snip>
 
