When using <ctype.h>, why should I declare a char as an int?

Discussion in 'C Programming' started by Zhang Yuan, Jun 13, 2012.

  1. Zhang Yuan

    Zhang Yuan Guest

    T don not understand the difference between
    int c;
    and
    char c;
    char is declared as signed int. But when I use a function from <ctype.h>
    such as
    isprintf(c);
    not
    isprintf((unsigned char)c);
    I didn't find any difference.
    Can you give some examples that show the difference?
    Thank you.
    zhang yuan.
     
    Zhang Yuan, Jun 13, 2012
    #1

  2. Zhang Yuan <> writes:

    > T don not understand the difference between
    > int c;
    > and
    > char c;
    > char is declared as signed int.


    I think you mean that, on your system, char is signed. char is not
    "declared" as anything.

    Did you make a mistake with the negative? "T don not" is obviously a
    typing error and I usually don't comment on such things, but if you are
    really saying that you do not understand the difference between these
    two declarations, then I'd have to write a much longer reply!

    >But when i use some function in <ctype.h>
    > such as
    > isprintf(c);
    > not
    > isprintf((unsigned char)c);
    > I did't find some difference.


    The function is "isprint" -- no "f" at the end.

    > Can you put up some examples,that tell the difference?


    The isxxx functions are only defined if the argument is "representable
    as an unsigned char" or if it is "equal the value of the macro EOF".
    (These are quotes from the C standard). The fact that the result is
    often undefined makes it hard to see what's going on simply by writing a
    program, because an implementation is permitted to do whatever it likes
    in those cases. For example, an implementation is permitted to treat
    isprint(c) exactly like isprint((unsigned char)c) so you may never see
    any difference at all.

    But to be more practical for a moment. Try this:

    int c = -130;
    puts(isprint( c) ? "T" : "F");
    puts(isprint((unsigned char)c) ? "T" : "F");

    Unless EOF == -130, the first call is undefined by the C standard --
    anything could happen -- but since C implementations are usually quite
    good at avoiding serious problems due to bad calls to these functions,
    you may simply get a visible difference.

    > Thank you.
    > zhang yuan.


    --
    Ben.
     
    Ben Bacarisse, Jun 13, 2012
    #2

  3. James Kuyper

    James Kuyper Guest

    On 06/13/2012 06:26 AM, Zhang Yuan wrote:
    > T don not understand the difference between
    > int c;
    > and
    > char c;
    > char is declared as signed int. ...


    You're using the words wrong, so I'm not quite sure what you mean. char
    is a built in type, it isn't declared anywhere. The above code declares
    "c", not "char".

    char is an integer type, and it can be signed. If it is signed, it could
    in principle have exactly the same representation as signed int, though
    this would be rather unusual, and would require CHAR_BIT>=16. However,
    even if it had the same representation, it would be a distinct type from
    signed int, a distinction that matters for the purpose of some
    compatibility rules.

    > But when i use some function in <ctype.h>
    > such as
    > isprintf(c);


    I presume you mean isprint() rather than isprintf()?

    > not
    > isprintf((unsigned char)c);
    > I did't find some difference.
    > Can you put up some examples,that tell the difference?



    Did you check whether CHAR_MIN is actually negative on your machine? If
    not, you'll never see a difference. The only way that there could be a
    difference is if c is negative. Did you test with negative values? If
    there is a difference, it might occur only in some locale other than the
    "C" locale - did you test with other locales?

    The key point is that the behavior of isprint(c) is undefined when c has
    any negative value other than EOF. Undefined behavior includes the
    possibility that the behavior of isprint(c) is exactly the same as
    isprint((unsigned char)c). Therefore, while the behavior could be
    different, it's not possible to write a program which is guaranteed to
    show any difference.

    Because the behavior is undefined, it could in principle be arbitrarily
    bad. In principle, it could erase arbitrary files from your hard disk.
    In practice, such things are unlikely; you're unlikely to see anything
    worse than a segfault, and not necessarily even that. However, do you
    really want to bother finding out what the actual behavior is? The
    proper response to learning that a given construct has undefined
    behavior is to stop using it; you shouldn't invest too much time trying
    to find out what the actual behavior of that construct is.

    Exception: behavior undefined by the standard may be defined by some
    other document. If the behavior is defined by some other document, and
    you need to get that behavior, there's nothing wrong with writing such
    code and relying upon the guarantees made by the other document.
    However, you must keep in mind that your code will be safe only on
    systems where that other document's definition of the behavior applies.
    --
    James Kuyper
     
    James Kuyper, Jun 13, 2012
    #3
  4. Zhang Yuan

    Zhang Yuan Guest

    On Wednesday, June 13, 2012 7:16:58 PM UTC+8, Ben Bacarisse wrote:

    >
    > Did you make a mistake with the negative? "T don not" is obviously a
    > typing error and I usually don't comment on such things, but if you are
    > really saying that you do not understand the difference between these
    > two declarations, then I'd have to write a much longer reply!
    >


    First, thank you for forgiving my silly mistake of not checking my spelling carefully. I should respect you, all my kind teachers in this group, in any case.
    I will pay attention to this.


    > >But when i use some function in <ctype.h>
    > > such as
    > > isprintf(c);
    > > not
    > > isprintf((unsigned char)c);
    > > I did't find some difference.

    >
    > The function is "isprint" -- no "f" at the end.
    >
    > > Can you put up some examples,that tell the difference?

    >
    > The isxxx functions are only defined if the argument is "representable
    > as an unsigned char" or if it is "equal the value of the macro EOF".
    > (These are quotes from the C standard). The fact that the result is
    > often undefined makes it hard to see what's going on simply by writing a
    > program, because an implementation is permitted to do whatever it likes
    > in those cases. For example, an implementation is permitted treat
    > isprint(c) exactly like isprint((unsigned char)c) so you may never see
    > any difference at all.


    I ran the program below:

    >
    > But to be more practical for a moment. Try this:
    >
    > int c = -130;
    > puts(isprint( c) ? "T" : "F");
    > puts(isprint((unsigned char)c) ? "T" : "F");
    >


    output:
    T
    T

    I don't understand this well.
    Are the answers the same?
    Thank you.
    Please forgive me.




    > Unless EOF == -130, the first call is undefined by the C standard --
    > anything could happen -- but since C implementations are usually quite
    > good at avoiding serious problems due to bad calls to these functions,
    > you may simply get a visible difference.
    >
    > > Thank you.
    > > zhang yuan.

    >
    > --
    > Ben.
     
    Zhang Yuan, Jun 13, 2012
    #4
  5. Zhang Yuan

    Zhang Yuan Guest

    On Wednesday, June 13, 2012 7:37:51 PM UTC+8, James Kuyper wrote:
    > On 06/13/2012 06:26 AM, Zhang Yuan wrote:
    > > T don not understand the difference between
    > > int c;
    > > and
    > > char c;
    > > char is declared as signed int. ...

    >
    > You're using the words wrong, so I'm not quite sure what you mean. char
    > is a built in type, it isn't declared anywhere. The above code declares
    > "c", not "char".
    >
    > char is an integer type, and it can be signed. If it is signed, it could
    > in principle have exactly the same representation as signed int, though
    > this would be rather unusual, and would require CHAR_BIT>=16. However,
    > even if it had the same representation, it would be a distinct type from
    > signed int, a distinction that matters for the purpose of some
    > compatibility rules.
    >
    > > But when i use some function in <ctype.h>
    > > such as
    > > isprintf(c);

    >
    > I presume you mean isprint() rather than isprintf()?
    >
    > > not
    > > isprintf((unsigned char)c);
    > > I did't find some difference.
    > > Can you put up some examples,that tell the difference?

    >
    >
    > Did you check whether CHAR_MIN is actually negative on your machine? If
    > not, you'll never see a difference. The only way that there could be a
    > difference is if c is negative. Did you test with negative values? If
    > there is a difference, it might occur only in some locale other than the
    > "C" locale - did you test with other locales?
    >
    > The key point is that the behavior of isprint(c) is undefined when c has
    > any negative value other than EOF. Undefined behavior includes the
    > possibility that the behavior of isprint(c) is exactly the same as
    > isprint((unsigned char)c). Therefore, while the behavior could be
    > different, it's not possible to write a program which is guaranteed to
    > show any difference.
    >
    > Because the behavior is undefined, it could in principle be arbitrarily
    > bad. In principle, it could erase arbitrary files from your hard disk.
    > In practice, such things are unlikely; you're unlikely to see anything
    > worse than a segfault, and not necessarily even that. However, do you
    > really want to bother finding out what the actual behavior is? The
    > proper response to learning that a given construct has undefined
    > behavior is to stop using it; you shouldn't invest too much time trying
    > to find out what the actual behavior of that construct is.
    >
    > Exception: behavior undefined by the standard may be defined by some
    > other document. If the behavior is defined by some other document, and
    > you need to get that behavior, there's nothing wrong with writing such
    > code and relying upon the guarantees made by the other document.
    > However, you must keep in mind that your code will be safe only on
    > systems where that other document's definition of the behavior applies.
    > --
    > James Kuyper


    Thank you.
    I'm new to C and lack experience.
    I will remember your advice and the general principles it offers for my study.
    Thank you.
     
    Zhang Yuan, Jun 13, 2012
    #5
  6. James Kuyper

    James Kuyper Guest

    On 06/13/2012 07:42 AM, Zhang Yuan wrote:
    > On Wednesday, June 13, 2012 7:16:58 PM UTC+8, Ben Bacarisse wrote:

    ....
    > I had run this program below:
    >
    >>
    >> But to be more practical for a moment. Try this:
    >>
    >> int c = -130;
    >> puts(isprint( c) ? "T" : "F");
    >> puts(isprint((unsigned char)c) ? "T" : "F");
    >>

    >
    > output:
    > T
    > T
    >
    > I don't understand it well.
    > The answer are same?


    They could be. If you're willing to take the risk of undefined behavior,
    try the following for a more comprehensive approach:

    for(int c=CHAR_MIN; 1; c++)
    {
    printf("%d: %c %c\n" isprint(c), isprint((unsigned char)c));
    if(c==CHAR_MAX)
    break;
    }

    The unusual handling of the loop condition is intended to cope with the
    extremely unlikely possibility that CHAR_MAX == INT_MAX on your system.
    If that possibility does come up, the printout is going to be VERY long.

    There's still no guarantee that you'll see any differences (or even that
    this program will run at all) but if there are any, this code will
    probably demonstrate them.
    --
    James Kuyper
     
    James Kuyper, Jun 13, 2012
    #6
  7. Zhang Yuan <> writes:

    > On Wednesday, June 13, 2012 7:16:58 PM UTC+8, Ben Bacarisse wrote:
    >
    >>
    >> Did you make a mistake with the negative? "T don not" is obviously a
    >> typing error and I usually don't comment on such things, but if you are
    >> really saying that you do not understand the difference between these
    >> two declarations, then I'd have to write a much longer reply!
    >>

    >
    > First,thank you for forgiving me for my silly fault that i did not
    > check spelling carefully.I should respect you,all my kind teachers in
    > this group,in any cases. I will pay attention to these.


    There is no need to apologise (my own typing/spelling is dreadful) but
    there is a need to explain! My point was "what did you mean?" and I
    still don't know. I.e. did you mean that you do understand the
    difference between int c; and char c; or that you don't? It really
    makes a difference.

    >> >But when i use some function in <ctype.h>
    >> > such as
    >> > isprintf(c);
    >> > not
    >> > isprintf((unsigned char)c);
    >> > I did't find some difference.

    >>
    >> The function is "isprint" -- no "f" at the end.
    >>
    >> > Can you put up some examples,that tell the difference?

    >>
    >> The isxxx functions are only defined if the argument is "representable
    >> as an unsigned char" or if it is "equal the value of the macro EOF".
    >> (These are quotes from the C standard). The fact that the result is
    >> often undefined makes it hard to see what's going on simply by writing a
    >> program, because an implementation is permitted to do whatever it likes
    >> in those cases. For example, an implementation is permitted treat
    >> isprint(c) exactly like isprint((unsigned char)c) so you may never see
    >> any difference at all.

    >
    > I had run this program below:
    >
    >> But to be more practical for a moment. Try this:
    >>
    >> int c = -130;
    >> puts(isprint( c) ? "T" : "F");
    >> puts(isprint((unsigned char)c) ? "T" : "F");

    >
    > output:
    > T
    > T
    >
    > I don't understand it well.
    > The answer are same?


    OK, on your system they behave the same. I said that they might. The
    first call is undefined -- that means that the C language does not say
    anything about what will happen. Behaving exactly the same as the other
    call is one possibility. Behaving differently only on Tuesdays is
    another.

    The expressions isprint(c) and isprint((unsigned char)c) are different
    but they won't always produce different values.

    I suspect that you've seen somewhere that you should write

    isprint((unsigned char)c)

    and not

    isprint(c)

    and you wonder why. If this is the case, can you link to where this
    advice came from? The reason is that the details matter: sometimes that
    advice is right and sometimes it is wrong. The exact wording and the
    context of the code really matter.

    <snip>
    --
    Ben.
     
    Ben Bacarisse, Jun 13, 2012
    #7
  8. Ike Naar

    Ike Naar Guest

    On 2012-06-13, James Kuyper <> wrote:
    > for(int c=CHAR_MIN; 1; c++)
    > {
    > printf("%d: %c %c\n" isprint(c), isprint((unsigned char)c));


    That line looks garbled. Did you mean something like

    printf("%d: %d %d\n", c, isprint(c), isprint((unsigned char)c));

    ?

    > if(c==CHAR_MAX)
    > break;
    > }
     
    Ike Naar, Jun 13, 2012
    #8
  9. Joe Pfeiffer

    Joe Pfeiffer Guest

    Zhang Yuan <> writes:

    > T don not understand the difference between
    > int c;
    > and
    > char c;
    > char is declared as signed int.But when i use some function in
    > <ctype.h>


    C has several types for representing integers, which are all closely
    related but have subtle differences in size and whether they're signed
    or unsigned.

    "int" is a signed integer in the "natural" word size of the machine.
    For most machines, that means a 32 bit integer (though I've used
    machines in the past that used a 16 bit int).

    "char" is an integer that's the "right size" to represent character
    data. For most machines, that means eight bits. It is allowed to be
    either signed or unsigned, depending on the implementation.

    > such as
    > isprintf(c);
    > not
    > isprintf((unsigned char)c);


    I'm not familiar with the isprintf() function, but the general answer
    would be that while a "char" may be either signed or unsigned, this
    function specifically wants an unsigned char.

    On a machine with an eight bit char, representing negatives in 2's
    complement (which means just about everything, and in particular
    including Intel) a signed char can represent values from -128 through
    127, while an unsigned char can represent values from 0 through 255.

    So, if you passed a -1 to the function, it would "think" it was 255.

    > I did't find some difference.
    > Can you put up some examples,that tell the difference?
    > Thank you.
    > zhang yuan.
     
    Joe Pfeiffer, Jun 13, 2012
    #9
  10. Zhang Yuan

    Zhang Yuan Guest

    On Wednesday, June 13, 2012 8:41:44 PM UTC+8, Ben Bacarisse wrote:

    > There is no need to apologise (my own typing/spelling is dreadful) but
    > there is a need to explain! My point was "what did you mean?" and I
    > still don't know. I.e. did you mean that you do understand the
    > difference between int c; and char c; or that you don't? It really
    > makes a difference.


    I know a little about the differences between
    char, unsigned char, signed char and int.

    I think, though I may be wrong, that char is the same as
    unsigned char in some cases.




    > I suspect that you've seen somewhere that you should write
    >
    > isprint((unsigned char)c)
    >
    > and not
    >
    > isprint(c)
    >
    > and you wonder why.


    The Standard C Library,
    by P.J. Plauger

    Page 27 paragraph 6
    Few programmers know to write isprint ( (unsigned char) c) , a much
    safer form. Of course, you can use the type cast safely only where you are
    certain that the argument value EOF cannot occur.

    thank you.
    zhang yuan


    > <snip>
    > --
    > Ben.
     
    Zhang Yuan, Jun 13, 2012
    #10
  11. James Kuyper

    James Kuyper Guest

    On 06/13/2012 08:59 AM, Ike Naar wrote:
    > On 2012-06-13, James Kuyper <> wrote:
    >> for(int c=CHAR_MIN; 1; c++)
    >> {
    >> printf("%d: %c %c\n" isprint(c), isprint((unsigned char)c));

    >
    > That line looks garbled. Did you mean something like
    >
    > printf("%d: %d %d\n", c, isprint(c), isprint((unsigned char)c));
    >
    > ?


    Yes. Sorry!
    --
    James Kuyper
     
    James Kuyper, Jun 13, 2012
    #11
  12. James Kuyper

    James Kuyper Guest

    On 06/13/2012 09:27 AM, Joe Pfeiffer wrote:
    > Zhang Yuan <> writes:
    >
    >> T don not understand the difference between
    >> int c;
    >> and
    >> char c;
    >> char is declared as signed int.But when i use some function in
    >> <ctype.h>

    ....
    >> such as
    >> isprintf(c);
    >> not
    >> isprintf((unsigned char)c);

    >
    > I'm not familiar with the isprintf() function, but the general answer
    > would be that while a "char" may be either signed or unsigned, this
    > function specifically wants an unsigned char.


    If the function specifically wants an unsigned char, it should be
    declared with a prototype that says so, in which case the conversion
    would occur implicitly, without need for a cast.

    isprintf() is almost certainly a typo for isprint(). isprint() wants an
    int, not an unsigned char, but you should cast a char argument of
    isprint() to unsigned char because, for functions declared in <ctype.h>,
    "In all cases the argument is an int, the value of which shall be
    representable as an unsigned char or shall equal the value of the macro
    EOF. If the argument has any other value, the behavior is undefined."
    Without the cast, c could contain negative values other than EOF, that
    would be preserved by the implicit conversion to int.
    --
    James Kuyper
     
    James Kuyper, Jun 13, 2012
    #12
  13. James Kuyper

    James Kuyper Guest

    On 06/13/2012 09:35 AM, Zhang Yuan wrote:
    ....
    > I think,maybe not right,that char are equal with
    > unsigned char in some occasions.


    char must have the same representation as either unsigned char or signed
    char, but which one it's the same as might be different on different
    compilers, or even on the same compiler with different options (gcc, for
    example, has options which allow you to control this).
    --
    James Kuyper
     
    James Kuyper, Jun 13, 2012
    #13
  14. Zhang Yuan <> writes:

    > On Wednesday, June 13, 2012 8:41:44 PM UTC+8, Ben Bacarisse wrote:
    >
    >> There is no need to apologise (my own typing/spelling is dreadful) but
    >> there is a need to explain! My point was "what did you mean?" and I
    >> still don't know. I.e. did you mean that you do understand the
    >> difference between int c; and char c; or that you don't? It really
    >> makes a difference.

    >
    > I know a little about the difference
    > char,unsigned char,signed char ,int
    >
    > I think,maybe not right,that char are equal with
    > unsigned char in some occasions.


    This has been answered so I'll leave it.

    >> I suspect that you've seen somewhere that you should write
    >>
    >> isprint((unsigned char)c)
    >>
    >> and not
    >>
    >> isprint(c)
    >>
    >> and you wonder why.

    >
    > The Standard C Library,
    > by P.J. Plauger


    An impeccable source (though it's years since I read it).

    > Page 27 paragraph 6
    > Few programmers know to write isprint ( (unsigned char) c) , a much
    > safer form. Of course, you can use the type cast safely only where you are
    > certain that the argument value EOF cannot occur.


    This is good advice. But it really helps to understand why one might
    write this. I would not do it here, for example:

    /* read a digit, if there is one */
    int c = getchar();
    while (c != EOF && !isdigit(c)) c = getchar();

    The reason: getchar returns an int with the value already correctly
    converted. The only possible negative return from getchar is EOF.
    Adding the cast in this situation just complicates the code.

    And, obviously, it's not needed in situations like this:

    unsigned char *skip_spaces(unsigned char *cp)
    {
        while (isspace(*cp)) cp += 1;
        return cp;
    }

    <snip>
    --
    Ben.
     
    Ben Bacarisse, Jun 13, 2012
    #14
