Trimming whitespaces

john_g83

I have a bit of C code that is meant to take a string (that may or may not
have spaces before or after the string), e.g. " stuff ", and trim off
the whitespace before and after.
Code:

char *trim (char *str, char ch)
{
    char *first, *last;
    int count;

    /* Move first to the first character that isn't the same as ch */
    for (first = str; *first == ch; first++);
    /* Move last to the null character. That's the only way to know 100% we
     * are removing items from the end of the string */
    for (last = first; *last != '\0'; last++);
    /* OK, now we backtrack until we find a character that isn't the same as
     * ch */
    for (last--; *last == ch; last--);

    if (first != str)
    {
        for (count = 0; count < last - first + 1; count++)
            str[count] = *(first + count);
        str[count] = '\0';
    }
    else
    {
        str[last - first] = '\0';
    }

    return str;
}

The problem is that it always removes the last letter of str as well,
e.g. " stuff " -> "stuf". Any ideas why this is happening?
Cheers
John
 
pete

have a bit of C code that is meant to take a string (that may or may not
have spaces before or after the string) i.e. " stuff ", and trims off
the whitespace before and after.
Code:

char *trim (char *str, char ch)
{
char *first, *last;
int count;

/* Move first to the first character that isn't the same as ch */
for (first = str; *first == ch; first++);

That line could overrun the array when ch == '\0'.
/* Ok now we backtrack until we find a character that
isn't the same as ch */
for (last--; *last == ch; last--);

When a string consists of a null terminated array of
characters which are all equal to ch, then what happens?
 
pete

pete said:
That line could overrun the array when ch == '\0'.


When a string consists of a null terminated array of
characters which are all equal to ch, then what happens?

#include <string.h>

char *trim(char *str, char ch)
{
    char *const p = str;

    while (*str != '\0' && *str == ch) {
        ++str;
    }
    memmove(p, str, 1 + strlen(str));
    str = p + strlen(p);
    while (str != p && *--str == ch) {
        *str = '\0';
    }
    return p;
}
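
A small self-check for the memmove-based version above, covering the edge cases raised in this thread (an empty string and a string made up entirely of ch). This is only a sketch: the function body is copied from the post, while check_trim is a hypothetical name added for illustration.

```c
#include <assert.h>
#include <string.h>

/* Same logic as the memmove-based trim above: the kept text is shifted
   to the start of the buffer, so the returned pointer equals str. */
char *trim(char *str, char ch)
{
    char *const p = str;

    while (*str != '\0' && *str == ch) {
        ++str;
    }
    memmove(p, str, 1 + strlen(str));
    str = p + strlen(p);
    while (str != p && *--str == ch) {
        *str = '\0';
    }
    return p;
}

/* Hypothetical self-check, not part of the original post. */
void check_trim(void)
{
    char both[]  = " stuff ";
    char empty[] = "";
    char all[]   = "aaaaaaa";

    assert(strcmp(trim(both, ' '), "stuff") == 0);
    assert(strcmp(trim(empty, ' '), "") == 0);
    assert(strcmp(trim(all, 'a'), "") == 0);
    assert(trim(both, ' ') == both);   /* same pointer comes back */
}
```

Because the result always starts at the original address, a caller can still pass the same pointer to free() if the buffer came from malloc().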
 
Rajan

John,
Are you passing a string literal as the argument to the trim function?
For example, are you calling trim(" stuff ", ' ') in main()?
If so, you can't change anything through str[], because a string
literal may be stored in a read-only section, and modifying it is
undefined behaviour.
The argument you pass to trim has to be either the address of an array
or a pointer to allocated memory.
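
To illustrate the difference between a writable array and a string literal, here is a minimal sketch. rtrim_in_place is a hypothetical stand-in for any function that modifies its argument in place (it only cuts trailing ch characters, to keep the example short).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an in-place trim: removes trailing ch only. */
char *rtrim_in_place(char *s, char ch)
{
    char *end = s + strlen(s);

    while (end > s && end[-1] == ch) {
        end--;
    }
    *end = '\0';
    return s;
}

void check_writable_buffer(void)
{
    /* An array initialised from a literal is a writable copy. */
    char buf[] = " stuff ";

    /* By contrast, `char *lit = " stuff ";` points at the literal itself;
       passing lit to rtrim_in_place() would be undefined behaviour. */

    assert(strcmp(rtrim_in_place(buf, ' '), " stuff") == 0);
}
```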
 
Rajan

John,
This code will work fine:

char *trim (char *str, char ch)
{
char *first, *last;

for (first = str; *first == ch; first++);
str = first;
for (last = str; *last != ch; last++) ;
*last = '\0';
return str;
}
 
Al Bowers

Rajan said:
John,
This is a code which will work fine:-

char *trim (char *str, char ch)
{
char *first, *last;

for (first = str; *first == ch; first++);
str = first;
for (last = str; *last != ch; last++) ;
*last = '\0';
return str;
}

It would not work if str is an empty string, i.e. "".
Also, it will fail if all the characters in str have the value ch, e.g.
char buf[32] = "aaaaaaa";
trim(buf,'a');
 
Ben Bacarisse

Rajan said:
John,
This is a code which will work fine:-

char *trim (char *str, char ch)
{
char *first, *last;

for (first = str; *first == ch; first++);
str = first;
for (last = str; *last != ch; last++) ;
*last = '\0';
return str;
}

It would not work if str is an empty string, i.e. "".
Also, it will fail if all the characters in str have the value ch, e.g.
char buf[32] = "aaaaaaa";
trim(buf,'a');

... and it finds the first ch after the first non-ch character, not the
start of the consecutive run of ch at the end of str, as the original code
clearly intended.

Because this function returns a pointer other than the one it was passed,
malloc'ed storage can't be freed without holding on to the original
pointer somewhere else, which makes it complicated to use in some
situations.
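
A minimal sketch of the bookkeeping this implies for the caller. trim_ptr is a hypothetical pointer-advancing trim written for this example (it is not any poster's code); the point is only that free() must receive the pointer malloc() returned, not the trimmed one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical trim that skips leading ch by advancing the pointer,
   so the result may differ from the pointer passed in. */
char *trim_ptr(char *s, char ch)
{
    char *end;

    while (*s != '\0' && *s == ch) s++;
    end = s + strlen(s);
    while (end > s && end[-1] == ch) end--;
    *end = '\0';
    return s;
}

/* Returns 1 on success, 0 if the allocation failed. */
int check_keep_original(void)
{
    char *mem = malloc(16);
    char *text;

    if (mem == NULL) return 0;
    strcpy(mem, "  stuff  ");
    text = trim_ptr(mem, ' ');          /* text == mem + 2, not mem */
    assert(text != mem);
    assert(strcmp(text, "stuff") == 0);
    free(mem);                          /* free the original, never text */
    return 1;
}
```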
 
Al Bowers

Ben said:
Rajan said:
John,
This is a code which will work fine:-

char *trim (char *str, char ch)
{
char *first, *last;

for (first = str; *first == ch; first++);
str = first;
for (last = str; *last != ch; last++) ;
*last = '\0';
return str;
}

It would not work if str is an empty string, i.e. "".
Also, it will fail if all the characters in str have the value ch, e.g.
char buf[32] = "aaaaaaa";
trim(buf,'a');


... and it finds the first ch after the first non-ch character, not the
start of the consecutive run of ch at the end of str, as the original code
clearly intended.

Because this function returns a pointer other than the one it was passed,
malloc'ed storage can't be freed without holding on to the original
pointer somewhere else, which makes it complicated to use in some
situations.
Somewhat similar to the "complicated" use of the function realloc.
Example:
buf = realloc(buf, size);
instead of
char *tmp = realloc(buf, size);

More troublesome to me is that the function trim as defined above must
have a synopsis saying not to use it if str is an
empty string, or if str consists entirely of ch characters.
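
For readers unfamiliar with the realloc comparison: assigning the result straight back to buf loses the only pointer to the old block if realloc fails. A sketch of the tmp-based pattern follows; grow_buffer is a hypothetical helper name, and new_size is assumed to be greater than zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grows *buf to new_size bytes (new_size > 0 assumed).
   Returns 1 on success; on failure returns 0 and leaves *buf valid. */
int grow_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);

    if (tmp == NULL) {
        return 0;       /* *buf is untouched and can still be freed */
    }
    *buf = tmp;
    return 1;
}

void check_grow_buffer(void)
{
    char *buf = malloc(8);

    assert(buf != NULL);
    strcpy(buf, "stuff");
    if (grow_buffer(&buf, 64)) {
        strcat(buf, " and more");
        assert(strcmp(buf, "stuff and more") == 0);
    }
    free(buf);
}
```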
 
john_bode

(e-mail address removed) wrote:

[snip code]
the problem is that it always removes the last letter of str as well.
i.e.
" stuff " -> "stuf" any ideas why this is happening.
Cheers
John

Huh. I'm not getting that result based on the same test data (" stuff ").
Are you sure that's the code you're actually running?
 
CBFalconer

have a bit of C code that is meant to take a string (that may or
may not have spaces before or after the string) i.e. " stuff ",
and trims off the whitespace before and after.
Code:

char *trim (char *str, char ch)

Untested:

char *trim(char *s, char ch)
{
char *p;

if (s && *s && ch) { /* avoid evil cases */
while (ch == *s) s++; /* trims leading. */
p = s; /* must be advanced over entry */
while (*p) p++; /* find end of string */
p-- /* last char in string */
while ((p > s) && (ch == *p)) p--;
*p = '\0';
}
return s; /* ok in evil cases */
}
 
SM Ryan

(e-mail address removed) wrote:
# have a bit of C code that is meant to take a string (that may or may not
# have spaces before or after the string) i.e. " stuff ", and trims off
# the whitespace before and after.
# Code:
#
# char *trim (char *str, char ch)

# the problem is that it always removes the last letter of str as well.

Do your increments inside the loop so they only happen if the loop
predicate is true.

while (*str==ch) str++;
char *last = str+strlen(str)-1;
while (last>=str && *last==ch) *last-- = 0;
 
Chris Torek

Untested:

Indeed. It contains one syntax error, and one other error. :)
char *trim(char *s, char ch)
{
char *p;

if (s && *s && ch) { /* avoid evil cases */
while (ch == *s) s++; /* trims leading. */
p = s; /* must be advanced over entry */
while (*p) p++; /* find end of string */

So far, this is OK, although I would replace that last line with:

p += strlen(p);

and then simplify the two lines to:

p = s + strlen(s);

Note that we now have *p=='\0'. (Also, there is no need to test
*s -- if *s=='\0', the code will be a no-op, once we fix it. I
also think that a low-level function like this is OK if it crashes
when passed a NULL pointer, or behaves badly when ch=='\0', but
this is more a matter of taste.)
p-- /* last char in string */
while ((p > s) && (ch == *p)) p--;

Both bugs are in these two lines. Suppose strlen(p) was 0, so that
we have p==s initially. (This can happen if the entire string is
just ch characters.) Then "p--;" (after fixing the missing semicolon)
leaves p equal to s-1. The code depends on p >= s, because the
next line is:
*p = '\0';

This will remove one extra character; and if s was unchanged from
when trim() was first called, p will point outside the buffer to
be trimmed, smashing some unrelated data.

The shortest fix is to replace the two buggy lines with:

while (p > s && p[-1] == ch) p--;

The first test (p > s) ensures that the second is allowed, and the
second test (p[-1] == ch) detects when the last "to-be-retained"
character is one that should be discarded after all.
}
return s; /* ok in evil cases */
}
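
Putting the pieces of the walkthrough above together, the single-character version might look like the sketch below. It is assembled here from the quoted code plus the suggested replacement line, so treat it as an illustration rather than something posted verbatim; the redundant *s test has been dropped as the commentary suggests.

```c
#include <assert.h>
#include <string.h>

char *trim(char *s, char ch)
{
    char *p;

    if (s && ch) {                          /* avoid evil cases */
        while (ch == *s) s++;               /* trims leading */
        p = s + strlen(s);                  /* p points at the '\0' */
        while (p > s && p[-1] == ch) p--;   /* back up over trailing ch */
        *p = '\0';
    }
    return s;                               /* ok in evil cases */
}

/* Hypothetical self-check covering the cases discussed in the thread. */
void check_trim(void)
{
    char both[]  = " stuff ";
    char all[]   = "aaaaaaa";
    char empty[] = "";

    assert(strcmp(trim(both, ' '), "stuff") == 0);
    assert(strcmp(trim(all, 'a'), "") == 0);
    assert(strcmp(trim(empty, ' '), "") == 0);
}
```

Note that the returned pointer can differ from the one passed in, so the earlier remark about keeping the original pointer for free() still applies.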

Note that if the character(s)-to-be-trimmed were passed as a string
(allowing trimming of, e.g., " \t\n", which might be appropriate
for a buffer obtained via fgets()), we could write the above as:

#include <string.h>

char *trim(char *s, const char *remove) {
    char *p;

    s += strspn(s, remove);   /* advance over leading unwanteds */
    p = s + strlen(s);
    while (p > s && strchr(remove, p[-1]) != NULL)
        p--;                  /* back up over trailing unwanteds */
    *p = '\0';                /* overwrite first trailing unwanted,
                                 or replace '\0' with '\0' */
    return s;
}
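
A short usage sketch for the string-based version above, using a buffer shaped like a line that fgets() might return. The function body is copied from the post; check_trim_set and the sample text are made up for illustration.

```c
#include <assert.h>
#include <string.h>

char *trim(char *s, const char *remove)
{
    char *p;

    s += strspn(s, remove);             /* advance over leading unwanteds */
    p = s + strlen(s);
    while (p > s && strchr(remove, p[-1]) != NULL)
        p--;                            /* back up over trailing unwanteds */
    *p = '\0';
    return s;
}

/* Hypothetical self-check, not part of the original post. */
void check_trim_set(void)
{
    char line[]  = "  \thello world \n";
    char blank[] = " \t\n";

    assert(strcmp(trim(line, " \t\n"), "hello world") == 0);
    assert(strcmp(trim(blank, " \t\n"), "") == 0);
}
```

Interior whitespace ("hello world") is left alone, since only the leading and trailing runs are examined.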
 
Nils Weller

So far, this is OK, although I would replace that last line with:

p += strlen(p);

and then simplify the two lines to:

p = s + strlen(s);

I like using strchr() for this purpose (though strlen() may potentially
be implemented slightly faster):

p = strchr(s, 0);

(Interestingly, many people seem to be unaware of the fact that strchr()
considers the terminating null character to be part of the string, which
is why I have seen many buggy strchr() implementations, so one could
argue that the strlen() version is safer and thus superior, after all
:))
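
A minimal check of that property: because the terminating null character counts as part of the string, searching for it with strchr() gives the same pointer as s + strlen(s). str_end is a hypothetical helper name used only for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: pointer to the terminating '\0' of s. */
char *str_end(char *s)
{
    return strchr(s, '\0');
}

void check_str_end(void)
{
    char buf[]   = "stuff";
    char empty[] = "";

    assert(str_end(buf) == buf + strlen(buf));
    assert(*str_end(buf) == '\0');
    assert(str_end(empty) == empty);
}
```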
 
Rajan

Hi Al,
Thanks for your thoughts on this.
I thought that this trim function was meant to print a string with the
white space trimmed, taking str as " aaaa" or something of that sort
and ch as a space, which is why I wrote this piece of code. In
any case, the *last = '\0' would still eat up one char of the string
if, say, I have "aaaa".
So a string without white space would be printed as it is, except
that it would lose one char. That is my mistake: *last = '\0' is
executed without a condition.
 
