Robert said:
Malcolm McLean wrote:
[can isspace(0) be nonzero, in any locale?]
You are safe. The end of string marker must be zero, and isspace() must
return false for it.
Can you cite chapter and verse for the second "must"?
From 9899:1999 §7.4.1.10:
"The isspace function tests for any character that is a standard white-
space character or is one of a locale-specific set of characters for
which isalnum is false."
And from §5.2.1:
"Each set is further divided into a basic character set, whose contents
are given by this subclause, and a set of zero or more locale-specific
members (which are not members of the basic character set) called
extended characters."
...
"A byte with all bits set to 0, called the null character, shall exist
in the basic execution character set; it is used to terminate a
character string."
So the null character is a member of the basic character set and
therefore cannot be a locale-specific character. Since isspace()
returns true only for standard white-space characters or
locale-specific characters, of which the null character is neither, it
must return false when called with the null character.
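For anyone who wants to check what their own implementation does, here
is a minimal sketch. The helper names null_is_space_in() and
check_c_locale() are made up for this post, not part of any library.
It only tells you what one implementation does in one locale; it proves
nothing about what the Standard requires.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if isspace() accepts the null
   character in the named locale, 0 if it does not, and -1 if the
   locale could not be selected. Note that it changes LC_CTYPE for
   the whole program. */
int null_is_space_in(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return isspace('\0') != 0;
}

/* In the "C" locale isspace() is true only for the standard
   white-space characters, so the null character must be rejected. */
void check_c_locale(void)
{
    assert(null_is_space_in("C") == 0);
}
```

Passing "" instead of "C" selects the environment's native locale, which
is where a surprising answer would show up if one exists at all.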
... but "locale-specific set of characters" is not the
same thing as "set of locale-specific characters." In the first case
the adjectival phrase "locale-specific" applies to "set," while in the
second it applies to "characters." "Republican delegates to the
Congress" are not "delegates to the Republican Congress."
Army1987 may have spotted a defect.
Well, I hope no such locale actually exists. If it did, given the
description of strtol():
"First, they decompose the input string into three parts: an
initial, possibly empty, sequence of white-space characters (as
specified by the isspace function), a subject sequence resembling
an integer represented in some radix determined by the value of
base, and a final string of one or more unrecognized characters,
including the terminating null character of the input string.
Then, they attempt to convert the subject sequence to an integer,
and return the result."
I would expect that in such a locale, strtol(" ", &endptr, 10)
would be likely to behave strangely, and
char p[2][4] = { " ", "100" }; strtol(p[0], &endptr, 10)
would almost surely behave strangely. I was surprised that the
Standard does not (apparently) forbid that. The reason I asked
is:
I am writing a replacement for strtoul() which doesn't accept
negative numbers (i.e. treats '-' as an invalid character).
#include <ctype.h>   /* isspace */
#include <stdlib.h>  /* strtoul, NULL */

unsigned long int ustrtoul(const char *nptr, char **endptr,
                           int base)
{
    const char *cursor;

    /* Skip leading white space. */
    for (cursor = nptr; isspace((unsigned char)*cursor); cursor++)
        continue;
    if (*cursor == '-') {    /* Now *cursor is the first nonWS */
        if (endptr != NULL)
            *endptr = (char *)nptr; /* endptr should be const char**, */
                                    /* but is char** to mimic strtoul() */
        return 0;
    } else
        return strtoul(nptr, endptr, base);
}
If there were a locale with isspace('\0') != 0, the loop above would
run past the terminating null and cause UB, but possibly even the
"real" strtoul() would.
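One way to take the wrapper's own loop out of the question is to test
for the terminator explicitly. The following is only a sketch; the
names ustrtoul_guarded() and demo_minus_handling() are made up for
illustration. It does nothing about the library strtoul(), which still
skips white space by its own rules.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical variant of ustrtoul(): the scan stops explicitly at the
   terminating null, so this loop cannot run past the end of the string
   even if isspace('\0') were nonzero in some locale. */
unsigned long ustrtoul_guarded(const char *nptr, char **endptr, int base)
{
    const char *cursor = nptr;

    while (*cursor != '\0' && isspace((unsigned char)*cursor))
        cursor++;

    if (*cursor == '-') {
        if (endptr != NULL)
            *endptr = (char *)nptr;   /* report "no conversion" */
        return 0;
    }
    /* The library function still does its own white-space skipping. */
    return strtoul(nptr, endptr, base);
}

/* Contrast with plain strtoul(), which accepts a leading minus sign
   and returns the negated value converted to unsigned long. */
void demo_minus_handling(void)
{
    char *end;
    const char *neg = "  -5";

    assert(strtoul(neg, &end, 10) == ULONG_MAX - 4);
    assert(end == neg + 4);           /* whole string consumed */

    assert(ustrtoul_guarded(neg, &end, 10) == 0);
    assert(end == neg);               /* nothing consumed */

    assert(ustrtoul_guarded(" 42", &end, 10) == 42UL);
}
```

The extra comparison costs almost nothing and makes the wrapper
independent of how the isspace('\0') question is eventually settled.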