How should ctype.h's functions be used?

Discussion in 'C Programming' started by Sarloc, Oct 20, 2005.

  1. Sarloc

    Sarloc Guest

    I'm a user of a Brazilian programming forum and I've been reading a
    discussion that never seems to end on how ctype.h's functions should be
    used. One of the guys keeps saying that values passed to any of the
    functions should be cast to unsigned char and the other guy keeps
    saying that there is no need for that. Both of them seem to know what
    they are talking about and I am confused. So I thought to myself: hey,
    maybe I should ask other people and see what they think. That is the
    purpose of this topic. Thank you for your time.
     
    Sarloc, Oct 20, 2005
    #1

  2. Sarloc

    Ben Pfaff Guest

    "Sarloc" <> writes:

    > I'm a user of a Brazilian programming forum and I've been reading a
    > discussion that never seems to end on how ctype.h's functions should be
    > used. One of the guys keeps saying that values passed to any of the
    > functions should be cast to unsigned char and the other guy keeps
    > saying that there is no need for that.


    With the to*() and is*() functions, you should be careful to cast
    char arguments to unsigned char before calling them. Type `char'
    may be signed or unsigned, depending on your compiler or its
    configuration. If char is signed, then some characters have
    negative values; however, the arguments to is*() and to*()
    functions must be nonnegative (or EOF). Casting to unsigned char
    fixes this problem by forcing the character to the corresponding
    positive value.
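
    As a minimal sketch of that advice, assuming a hypothetical helper
    named upcase() (not from this thread), the cast goes on the char
    value right at the call:

```c
#include <ctype.h>

/* Hypothetical helper: uppercase a string in place.
   The cast keeps every argument to toupper() in the range
   0..UCHAR_MAX, even where plain char is signed and *s is negative. */
void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}
```

    Without the cast, a negative char would be promoted to a negative
    int, which is outside what toupper() accepts.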
    --
    "The lusers I know are so clueless, that if they were dipped in clue
    musk and dropped in the middle of pack of horny clues, on clue prom
    night during clue happy hour, they still couldn't get a clue."
    --Michael Girdwood, in the monastery
     
    Ben Pfaff, Oct 20, 2005
    #2

  3. "Sarloc" <> writes:
    > I'm a user of a Brazilian programming forum and I've been reading a
    > discussion that never seems to end on how ctype.h's functions should be
    > used. One of the guys keeps saying that values passed to any of the
    > functions should be cast to unsigned char and the other guy keeps
    > saying that there is no need for that. Both of them seem to know what
    > they are talking about and I am confused. So I thought to myself: hey,
    > maybe I should ask other people and see what they think. That is the
    > purpose of this topic. Thank you for your time.


    You need to cast the arguments to unsigned char.

    C99 7.4p1 says:

    The header <ctype.h> declares several functions useful for
    classifying and mapping characters. In all cases the argument is
    an int, the value of which shall be representable as an unsigned
    char or shall equal the value of the macro EOF. If the argument
    has any other value, the behavior is undefined.

    If c is an object of type char, and plain char happens to be signed in
    your implementation, and the value of c happens to be a negative value
    other than EOF, then isalpha(c), for example, will invoke undefined
    behavior (after the value of c is promoted to int).

    You're likely to get away with it if you happen not to use negative
    char values, or if plain char is unsigned, or if the implementation
    accommodates negative values (I've seen some that do).
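
    The rule quoted from 7.4p1 can be written down as a small check.
    This is only a sketch; in_ctype_domain() is a hypothetical name, not
    a library function:

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical check: nonzero if v is a value that 7.4p1 permits
   as an argument to the <ctype.h> functions. */
int in_ctype_domain(int v)
{
    return v == EOF || (v >= 0 && v <= UCHAR_MAX);
}
```

    On an implementation where plain char is signed and 8 bits wide, a
    char holding the byte 0xE9 typically promotes to -23, so
    in_ctype_domain(c) is 0 while in_ctype_domain((unsigned char)c) is 1.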

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Oct 20, 2005
    #3
  4. Sarloc

    Eric Sosman Guest

    Ben Pfaff wrote:

    > "Sarloc" <> writes:
    >
    >
    >>I'm a user of a Brazilian programming forum and I've been reading a
    >>discussion that never seems to end on how ctype.h's functions should be
    >>used. One of the guys keeps saying that values passed to any of the
    >>functions should be cast to unsigned char and the other guy keeps
    >>saying that there is no need for that.

    >
    >
    > With the to*() and is*() functions, you should be careful to cast
    > char arguments to unsigned char before calling them. Type `char'
    > may be signed or unsigned, depending on your compiler or its
    > configuration. If char is signed, then some characters have
    > negative values; however, the arguments to is*() and to*()
    > functions must be nonnegative (or EOF). Casting to unsigned char
    > fixes this problem by forcing the character to the corresponding
    > positive value.


    There's a subtlety here that's worth pointing out: Ben
    says (correctly) that you must cast a _char_ argument to
    unsigned char when using a <ctype.h> function. However, if
    the argument is an int obtained from getc() or something of
    the sort, you must _not_ cast it.

    The reason for casting a char argument to unsigned char
    is, as Ben explains, to handle C's built-in uncertainty as
    to whether char is signed or unsigned. The reason for _not_
    casting the int returned by getchar() is that this int already
    has the proper non-negative value for a "legitimate" character
    or has the negative value EOF. That is, getchar() can return
    more distinct values than a char can represent (the extra value
    being EOF), and if you coerce EOF to an unsigned char you'll
    lose the ability to distinguish it from a legitimate character.

    char *p = "Überwald";
    if (isupper(*p)) /* wrong */
    if (isupper((unsigned char)*p)) /* right */

    int ch = getchar();
    if (isupper(ch)) /* right */
    if (isupper((unsigned char)ch)) /* wrong */
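
    To see why the last line is wrong, here is a rough sketch (the
    function name eof_survives_cast() is made up for illustration) of
    what the cast does to EOF:

```c
#include <stdio.h>

/* Returns 1 if EOF still compares equal to EOF after being cast to
   unsigned char. EOF is negative and the cast result is not
   (assuming UCHAR_MAX <= INT_MAX), so this is expected to return 0. */
int eof_survives_cast(void)
{
    int ch = EOF;
    return (unsigned char)ch == EOF;
}
```

    Where EOF is -1 and unsigned char has 8 bits (common, but not
    guaranteed), the cast turns EOF into 255, the same value as a
    legitimate 0xFF byte read from the stream.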

    --
    Eric Sosman
     
    Eric Sosman, Oct 22, 2005
    #4
  5. Sarloc

    Ben Pfaff Guest

    Eric Sosman <> writes:

    > There's a subtlety here that's worth pointing out: Ben
    > says (correctly) that you must cast a _char_ argument to
    > unsigned char when using a <ctype.h> function. However, if
    > the argument is an int obtained from getc() or something of
    > the sort, you must _not_ cast it.


    Good point. I'll add that to my boilerplate text.
    --
    int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
    \n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
    );while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
    );}return 0;}
     
    Ben Pfaff, Oct 22, 2005
    #5
  6. On Wed, 19 Oct 2005 23:53:37 GMT, Keith Thompson <>
    wrote:

    > "Sarloc" <> writes:
    > > I'm a user of a Brazilian programming forum and I've been reading a
    > > discussion that never seems to end on [... whether to cast for ctype.h]


    > You need to cast the arguments to unsigned char.
    >
    > C99 7.4p1 says: <snip>


    Not always. You must ensure the value is in the range of unsigned
    char, or EOF (which last is rarely useful). Casting to unsigned char
    is one simple and effective way of doing this, but not the only one.
    For example if you use an int variable that was last assigned the
    return from getchar(), it is guaranteed to be in range without
    casting, except on the rare systems where UCHAR_MAX > INT_MAX, where
    you are unlikely to be able to use stdio.h anyway.
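
    A sketch of that case, assuming a hypothetical count_upper() helper
    and an ordinary system where UCHAR_MAX <= INT_MAX:

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: count uppercase letters on a stream.
   ch stays an int, so it holds either an unsigned char value or EOF;
   no cast is needed, and a cast would hide EOF. */
long count_upper(FILE *in)
{
    long n = 0;
    int ch;

    while ((ch = getc(in)) != EOF)
        if (isupper(ch))
            n++;
    return n;
}
```

    Storing the result of getc() in a char instead of an int would
    bring the original problem back.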

    - David.Thompson1 at worldnet.att.net
     
    Dave Thompson, Oct 24, 2005
    #6
