Ioannis Vranos
ispunct() returns true for all symbols? (like <>/@^&#@ etc).
Lew Pitcher said:
Caveat: You cross-posted this question to newsgroups that cover two different
computer languages. You may get two different answers, depending on which
language is described.
ispunct() returns true for all symbols? (like <>/@^&#@ etc).
From my manpage that shipped with gcc, ispunct() returns true for any
nonblank character that isn't a letter or a number. gcc says this
subroutine is conformant with ANSI-C.
What, exactly, is considered a letter can vary by locale, but in the C
locale any member of [A-Za-z] is considered alphabetic.
Barry Schwarz said:There are a minimum of 256 possible values for a char.
Blank is only 1. If we stick to the English alphabet, there are 52 letters
and ten digits, leaving at least 193 values for which your man page says
ispunct returns true. Unfortunately, the C99 standard says it must be a
printing character, which eliminates a significant number of these 193.
Ioannis Vranos said:We must note here that (plain) char may be either of type signed char or
unsigned char, and if it is signed char the negative values are useless
here.
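The point about plain char being signed can be sketched in C. Passing a negative value other than EOF to ispunct() is undefined behaviour, so the usual approach is to convert to unsigned char first. The names is_punct_char and count_punct_in are hypothetical helpers made up for this illustration, not anything from the thread or the standard library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical wrapper: accepts any plain char, signed or not.
   The cast to unsigned char keeps the argument inside the range
   that the <ctype.h> functions are defined for. */
int is_punct_char(char ch)
{
    return ispunct((unsigned char)ch) != 0;
}

/* Hypothetical helper: counts the punctuation characters in a string. */
size_t count_punct_in(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (is_punct_char(*s)) {
            ++n;
        }
    }
    return n;
}
```

With these helpers, count_punct_in("a<b>c") counts the two characters '<' and '>', and a byte such as (char)0xE9 can be passed to is_punct_char() without relying on whether plain char is signed.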
Ioannis Vranos said:Yes, I know; however, I guessed that C99 ispunct() behaviour does not differ
from C++98 (and C90).
Barry Schwarz said:There are a minimum of 256 possible values for a char. Blank is only
1.
If we stick to the English alphabet, there are 52 letters and ten
digits, leaving at least 193 values for which your man page says ispunct
returns true. Unfortunately, the C99 standard says it must be a
printing character, which eliminates a significant number of these 193.
I see three possibilities:
You misquoted the man page.
The man page is less specific than it should be and therefore
misleading.
The man page is incorrect regarding compliance and therefore
misleading.
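The gap between the man page wording and the standard can be checked directly. The following is only a sketch in C, assuming the program runs in the default "C" locale it starts in; count_punct and all_punct are hypothetical helper names introduced for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of unsigned char values for which
   ispunct() is true in the current locale. */
int count_punct(void)
{
    int count = 0;

    for (int c = 0; c <= UCHAR_MAX; ++c) {
        if (ispunct(c)) {
            ++count;
        }
    }
    return count;
}

/* Hypothetical helper: returns 1 if every character of s is
   punctuation according to ispunct(), otherwise 0. */
int all_punct(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (!ispunct((unsigned char)*s)) {
            return 0;
        }
    }
    return 1;
}
```

Calling all_punct("<>/@^&#@") answers the original question for the symbols listed there, and comparing count_punct() against the 193 figure shows how many values the printing-character requirement excludes on a given implementation. Control characters such as '\t' and '\n' are not printing characters, so ispunct() rejects them even though they are neither letters nor digits.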
What, exactly, is considered a letter can vary by locale, but in the C
locale any member of [A-Za-z] is considered alphabetic.
In any locale, a letter is any character for which isalpha returns
true. While your regular expression is correct (because it does not
depend on representation), it may lead someone to believe that if 'A'
<= mychar <= 'Z' then mychar is a letter. On my system, there are
characters between 'I' and 'J' and between 'R' and 'S' that are not
letters.
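The difference between a range comparison and isalpha() can be measured rather than argued. This is a rough sketch in C; in_upper_range, is_letter and count_range_non_letters are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* The tempting but representation-dependent test. */
int in_upper_range(int c)
{
    return c >= 'A' && c <= 'Z';
}

/* Hypothetical wrapper around isalpha(); the argument must be EOF
   or representable as unsigned char. */
int is_letter(int c)
{
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    return isalpha(c) != 0;
}

/* Counts the values that fall between 'A' and 'Z' in the execution
   character set but are not letters in the current locale. */
int count_range_non_letters(void)
{
    int n = 0;

    for (int c = 0; c <= UCHAR_MAX; ++c) {
        if (in_upper_range(c) && !is_letter(c)) {
            ++n;
        }
    }
    return n;
}
```

On an ASCII system in the "C" locale, count_range_non_letters() is expected to return 0 because the upper-case letters are contiguous. On a system with gaps between 'I' and 'J' or between 'R' and 'S', as described above, it would return a non-zero count, which is why isalpha() is the portable test.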
Barry Schwarz said:[...]What, exactly, is considered a letter can vary by locale, but in the C
locale any member of [A-Za-z] is considered alphabetic.
In any locale, a letter is any character for which isalpha returns
true. While your regular expression is correct (because it does not
depend on representation), it may lead someone to believe that if 'A'
<= mychar <= 'Z' then mychar is a letter. On my system, there are
characters between 'I' and 'J' and between 'R' and 'S' that are not
letters.
What someone believes, based on a misinterpretation of a regex, can't be
helped. The regex is well defined and doesn't include those other
characters you refer to. Anyway, regexes aren't C, not yet C++, and
were used only as a shorthand.