Comment on trim string function please

B

Ben Bacarisse

Joe Wright said:
Mine's prettier..

#define trim rtrim /* trim() is synonomous with rtrim() */

/* Remove trailing whitespace */

char * rtrim(char *str) {
char *s, *p; int c;
s = p = str;
while ((c = *s++)) if (c > ' ') p = s;
^^^^^^^^^
Yuck! Was that put in to trip the OP up?
 
V

vippstar

On Jul 20, 12:01 pm, (e-mail address removed) wrote:

Imagine that isspace is implemented as this in an ASCII system

char _Table[257] = { EOF, 0, 0, 0, .... };
char *_Ptable = _Table + 1;
int isspace(int c) { return _Ptable[c]; }
#define isspace(c) _Ptable[c]

I forgot to add this line,
#define EOF -1
(if this message appears with way too many newlines, it's google
groups doing it...)

<snip>
 
B

Ben Bacarisse

I also think it is. It doesn't write such a test on an extra line ``if
(!s || !*s) return s;'', and doesn't have multiple exit points.

It does fewer tests because it tests for fewer conditions! Joe
Wright's code does not test for a NULL which you seem to want. Also,
beware of the c > ' ' test. In short, it is wrong to use it. Test
for (in this case) !isspace((unsigned)c) instead.

Since he uses a new variable to store the char, I'd avoid the cast by
doing this:

char * rtrim(char *str) {
char *s, *p;
unsigned char c;
s = p = str;
while ((c = *s++)) if (!isspace(c)) p = s;
*p = '\0';
return str;
}
 
V

vippstar

It does fewer tests because it tests for fewer conditions! Joe
Wright's code does not test for a NULL which you seem to want. Also,
beware of the c > ' ' test. In short, it is wrong to use it. Test
for (in this case) !isspace((unsigned)c) instead.

That's wrong. It needs (unsigned char).
 
C

CBFalconer

(e-mail address removed) wrote:

Imagine that isspace is implemented as this in an ASCII system

char _Table[257] = { EOF, 0, 0, 0, .... };
char *_Ptable = _Table + 1;
int isspace(int c) { return _Ptable[c]; }
#define isspace(c) _Ptable[c]

I forgot to add this line,
#define EOF -1

Don't do that. EOF is system defined in stdio.h, and need not be
-1. If it is something else you have generated a big foulup.
 
B

Ben Bacarisse

Joe Wright said:
I wrote the code assuming ASCII. (c > ' ') is a normal character while
(c <= ' ') is whitespace.

So I assumed, but it is still likely to trip someone up, even on an
"ASCII" machine. I put "ASCII" in quotes because almost no one uses
plain ASCII any more. If there are characters > 127 on a machine with
signed char, it goes wrong. It also goes wrong if you don't want to
trim those characters < ' ' that isspace declares are not spaces.

Just use isspace and !isspace.
 
B

Ben Bacarisse

CBFalconer said:
(e-mail address removed) wrote:

Imagine that isspace is implemented as this in an ASCII system

char _Table[257] = { EOF, 0, 0, 0, .... };
char *_Ptable = _Table + 1;
int isspace(int c) { return _Ptable[c]; }
#define isspace(c) _Ptable[c]

I forgot to add this line,
#define EOF -1

Don't do that. EOF is system defined in stdio.h, and need not be
-1. If it is something else you have generated a big foulup.

He was describing an implementation. That is almost impossible
without showing things that should not be done in ordinary code.

There is a mistake in it, though.
char _Table[257] = { EOF, 0, 0, 0, .... };

should start with 0 not EOF and he could have avoided the need to show a
value for EOF by writing:

char _Table[] = { 0, 0, 0, 0, .... };
char *_Ptable = _Table - EOF;
 
V

vippstar

CBFalconer said:
(e-mail address removed) wrote:
<snip>
Imagine that isspace is implemented as this in an ASCII system
char _Table[257] = { EOF, 0, 0, 0, .... };
char *_Ptable = _Table + 1;
int isspace(int c) { return _Ptable[c]; }
#define isspace(c) _Ptable[c]
I forgot to add this line,
#define EOF -1
Don't do that. EOF is system defined in stdio.h, and need not be
-1. If it is something else you have generated a big foulup.

He was describing an implementation. That is almost impossible
without showing things that should not be done in ordinary code.

That's correct.
There is a mistake in it, though.

Whoops, the function returns EOF if EOF is passed. It should return 0,
you are right.
char _Table[257] = { EOF, 0, 0, 0, .... };

should start with 0 not EOF and he could have avoided the need to show a
value for EOF by writing:

char _Table[] = { 0, 0, 0, 0, .... };
char *_Ptable = _Table - EOF;

The value of EOF is still shown, because of _Table's size.
Here's what could work in C99,

char _Table[-EOF + 256] = { [0] = 0, [-EOF] = 0, [-EOF + 1] =
1, .... };
char *_Ptable = _Table - EOF;
 
B

Ben Bacarisse

On Jul 20, 5:30 pm, Ben Bacarisse <[email protected]> wrote:
should start with 0 not EOF and he could have avoided the need to show a
value for EOF by writing:

char _Table[] = { 0, 0, 0, 0, .... };
char *_Ptable = _Table - EOF;

The value of EOF is still shown, because of _Table's size.

That's why I left out the size in my example! I doubt anyone would
implement it this way, but there is no need to show the value of EOF --
the table just need an initial -EOF entries in it.
 
L

lawrence.jones

The N1124 May 2005 draft shows that isspace() accepts an int. How is
an negative value an out-of-range argument?

7.4p1:

The header <ctype.h> declares several functions useful for
classifying and mapping characters. In all cases the argument
is an int, the value of which shall be representable as an
unsigned char or shall equal the value of the macro EOF. If the
argument has any other value, the behavior is undefined.

A negative value is not representable as an unsigned char.
 
V

vippstar

(e-mail address removed) writes:

That's why I left out the size in my example! I doubt anyone would
implement it this way, but there is no need to show the value of EOF --
**the table just need an initial -EOF entries in it.**

That's why I said that the value of EOF is shown, indirectly in the
code. Changing EOF to a lesser value would break things, whereas with
my code it does not break things. (as long as EOF is not 256 - 32767
or less)
 
B

Barry Schwarz

void TrimCString(char *str)
{
size_t i = 0;
while(isspace(str))


Bug: If `str' contains negative-valued characters, this use
may violate the "contract" of isspace() by passing an out-of-range
argument.


The N1124 May 2005 draft shows that isspace() accepts an int. How is
an negative value an out-of-range argument?


N1124 is slightly out of date and you should find N1256 (I think this
is the latest draft available) on the web. However, even 1124 says in
paragraph 7.4:

"In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the
macro EOF. If the argument has any other value, the behavior is
undefined."

Obviously, the only negative value allowed is the value of the macro
EOF.
Use `while (isspace( (unsigned char)str ))'. (This
is one of those occasions where a cast is almost always required
and almost always omitted, as opposed to the more usual situation
where a cast is almost always inserted and almost always wrong.)


The C-FAQ 13.6 <http://c-faq.com/lib/strtok.html> shows two example
calls on isspace() with argument type of char without any cast.


Human fallibility?
I still don't understand why a cast is needed! stupid me. Can someone
please inspire me?

On some systems, plain char is a signed type, equivalent to but
distinct from signed char. The cast insures the code will not invoke
undefined behavior on those systems.


Remove del for email
 
L

lovecreatesbea...

On some systems, plain char is a signed type, equivalent to but
distinct from signed char. The cast insures the code will not invoke
undefined behavior on those systems.

Thank you.

The parameter type of isspace() is declared as int because of the same
reason of the return type of getchar(). They both need to hold chars
plus EOF. Do I understand it correctly?

If ``str'' contains EOF, isspace((unsigned char)str) becomes
isspace((unsigned char)EOF). EOF may be -1 or may not. Does the
expression evaluates true or false?
 
C

CBFalconer

Barry Schwarz said:
On some systems, plain char is a signed type, equivalent to but
distinct from signed char. The cast insures the code will not
invoke undefined behavior on those systems.

The parameter type of isspace() is declared as int because of the
same reason of the return type of getchar(). They both need to
hold chars plus EOF. Do I understand it correctly?

If ``str'' contains EOF, isspace((unsigned char)str) becomes
isspace((unsigned char)EOF). EOF may be -1 or may not. Does the
expression evaluates true or false?


'str' is an array of char. It cannot contain an EOF, because and
EOF is a negative value, and for these routines chars are
considered unsigned. That is why you receive chars (from getc, for
example) in an int.
 
B

Ben Bacarisse

Thank you.

The parameter type of isspace() is declared as int because of the same
reason of the return type of getchar(). They both need to hold chars
plus EOF. Do I understand it correctly?

That is probably why we are in this situation. It would have been
reasonable (to my mind) to permit only char values (thus forcing the
programmer to exclude EOF first).
If ``str'' contains EOF, isspace((unsigned char)str) becomes
isspace((unsigned char)EOF). EOF may be -1 or may not. Does the
expression evaluates true or false?


No, EOF can not appear in a string. The UB is simply because the
standard says it is. The only valid things you can pass to isspace
are EOF (some negative number) and integers equal to 0, 1, 2, ... all
the way up to UCHAR_MAX. On a system where char is signed some
characters in a string can have negative values and the standard says
you can't pass these to isspace and friends.

The standard *could* have allowed signed char values, but that would
have created problems with existing code that does, for example, this:

while (isspace(getchar()));

Language standards will only very rarely decide something that makes
existing code invalid.
 
L

lovecreatesbea...

Thank (e-mail address removed), Barry and Ben.

On some systems, plain char is a signed type, equivalent to but
distinct from signed char. The cast insures the code will not invoke
undefined behavior on those systems.
If ``str'' contains EOF, isspace((unsigned char)str) becomes
isspace((unsigned char)EOF). EOF may be -1 or may not. Does the
expression evaluates true or false?


(For isspace() accepts int, there's no reason that ``str'' should be
type of array of char.)

Or I can ask it in this way. EOF is a legal candidate argument of
isspace(). Is isspace((unsigned char)EOF) a good practice? (As
mentioned in an other post.)
 
L

lovecreatesbea...

(e-mail address removed) said:

No, it isn't. It's meaningless, and in (good) practice you terminate your
read-loop on EOF anyway:

while((ch = getc(fp)) != EOF)
{
if(isspace(ch)) /* here, ch cannot have the value EOF */
{

Thank you.

If the input doesn't come from stdin. We do need this equivalent,
right?

if (i == EOF)
/*
* (x) STOP
* isspace() can't handle this.
*/
;

isspc = isspace((unsigned char)i);

What a design of isspace()!
 
L

lovecreatesbea...

Thank you.

If the input doesn't come from stdin. We do need this equivalent,
right?

        if (i == EOF)
                /*
                 * (x) STOP
                 * isspace() can't handle this.

Sorry. I meant that the immediately followed call to isspace(unsigned
char(i)).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top