Trimming whitespaces

Discussion in 'C Programming' started by john_g83@hotmail.com, Jun 23, 2005.

  1. Guest

    I have a bit of C code that is meant to take a string (that may or may
    not have spaces before or after it), e.g. " stuff ", and trim off
    the whitespace before and after.
    Code:

    char *trim (char *str, char ch)
    {
        char *first, *last;
        int count;

        /* Move first to the first character that isn't the same as ch */
        for (first = str; *first == ch; first++);
        /* Move last to the null character. That's the only way to know 100%
         * we are removing items from the end of the string */
        for (last = first; *last != '\0'; last++);
        /* Ok now we backtrack until we find a character that isn't the same
         * as ch */
        for (last--; *last == ch; last--);

        if (first != str)
        {
            for (count = 0; count < last - first + 1; count++)
                str[count] = *(first + count);
            str[count] = '\0';
        }
        else
        {
            str[last - first] = '\0';
        }

        return str;
    }

    The problem is that it always removes the last letter of str as well,
    e.g. " stuff " -> "stuf". Any ideas why this is happening?
    Cheers
    John
     
    , Jun 23, 2005
    #1

  2. pete Guest

    wrote:
    >
    > have a bit of C code that is meant to take a string
    > (that may or may not
    > have spaces before or after the string) i.e. " stuff ", and trims off
    > the whitespace before and after.
    > Code:
    >
    > char *trim (char *str, char ch)
    > {
    > char *first, *last;
    > int count;
    >
    > /* Move first to the first character that isn't the same as ch */
    > for (first = str; *first == ch; first++);


    That line could overrun an array when (ch == '\0').

    > /* Ok now we backtrack until we find a character that
    > isn't the same as ch */
    > for (last--; *last == ch; last--);


    When a string consists of a null terminated array of
    characters which are all equal to ch, then what happens?

    --
    pete
     
    pete, Jun 23, 2005
    #2

  3. pete Guest

    pete wrote:
    >
    > wrote:
    > >
    > > have a bit of C code that is meant to take a string
    > > (that may or may not
    > > have spaces before or after the string)
    > > i.e. " stuff ", and trims off
    > > the whitespace before and after.
    > > Code:
    > >
    > > char *trim (char *str, char ch)
    > > {
    > > char *first, *last;
    > > int count;
    > >
    > > /* Move first to the first character that isn't the same as ch */
    > > for (first = str; *first == ch; first++);

    >
    > That line could overrun an array when (ch == '\0').
    >
    > > /* Ok now we backtrack until we find a character that
    > > isn't the same as ch */
    > > for (last--; *last == ch; last--);

    >
    > When a string consists of a null terminated array of
    > characters which are all equal to ch, then what happens?


    #include <string.h>

    char *trim(char *str, char ch)
    {
        char *const p = str;

        while (*str != '\0' && *str == ch) {
            ++str;
        }
        memmove(p, str, 1 + strlen(str));
        str = p + strlen(p);
        while (str != p && *--str == ch) {
            *str = '\0';
        }
        return p;
    }

    --
    pete
     
    pete, Jun 23, 2005
    #3
  4. Rajan Guest

    John,
    Are you passing a string literal as the argument to the trim function?
    For example, are you calling trim(" stuff ", ' ') from main()?
    If so, you can't make any changes through str[], because modifying a
    string literal is undefined behaviour and the literal is typically
    stored in read-only memory.
    The argument you pass to trim has to be either an array or a pointer
    to allocated storage.
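
A minimal sketch of the difference between a writable array and a string literal. The trim here follows the in-place version posted earlier in the thread and is repeated only so the example compiles on its own; array_argument_demo is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* In-place trim in the style posted earlier in the thread, repeated here
 * only so that this sketch compiles on its own. */
char *trim(char *str, char ch)
{
    char *const p = str;

    while (*str != '\0' && *str == ch)
        ++str;
    memmove(p, str, 1 + strlen(str));
    str = p + strlen(p);
    while (str != p && *--str == ch)
        *str = '\0';
    return p;
}

/* Hypothetical demo: trims a writable array and returns the new length. */
size_t array_argument_demo(void)
{
    char buf[] = " stuff ";   /* array initialised from a literal: writable */

    /* By contrast, `char *lit = " stuff "; trim(lit, ' ');` would write
     * into the literal itself, which is undefined behaviour. */
    trim(buf, ' ');
    assert(strcmp(buf, "stuff") == 0);
    return strlen(buf);
}
```

Storage from malloc works the same way as the array, as long as the string was copied into it first.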
     
    Rajan, Jun 23, 2005
    #4
  5. Rajan Guest

    John,
    This is a code which will work fine:-

    char *trim (char *str, char ch)
    {
        char *first, *last;

        for (first = str; *first == ch; first++);
        str = first;
        for (last = str; *last != ch; last++) ;
        *last = '\0';
        return str;
    }
     
    Rajan, Jun 23, 2005
    #5
  6. Al Bowers Guest

    Rajan wrote:
    > John,
    > This is a code which will work fine:-
    >
    > char *trim (char *str, char ch)
    > {
    > char *first, *last;
    >
    > for (first = str; *first == ch; first++);
    > str = first;
    > for (last = str; *last != ch; last++) ;
    > *last = '\0';
    > return str;
    > }
    >


    It would not work if str is an empty string, i.e. "".
    Also, it will fail if all the characters in str have the value ch, e.g.
    char buf[32] = "aaaaaaa";
    trim(buf,'a');

    --
    Al Bowers
    Tampa, Fl USA
    mailto: (remove the x to send email)
    http://www.geocities.com/abowers822/
     
    Al Bowers, Jun 23, 2005
    #6
  7. Ben Bacarisse Guest

    On Thu, 23 Jun 2005 09:24:53 -0400, Al Bowers wrote:

    > Rajan wrote:
    >> John,
    >> This is a code which will work fine:-
    >>
    >> char *trim (char *str, char ch)
    >> {
    >> char *first, *last;
    >>
    >> for (first = str; *first == ch; first++);
    >> str = first;
    >> for (last = str; *last != ch; last++) ;
    >> *last = '\0';
    >> return str;
    >> }
    >>

    >
    > It would not work if str is an empty string, i.e. "".
    > Also, it will fail if all the characters in str have the value ch, e.g.
    > char buf[32] = "aaaaaaa";
    > trim(buf,'a');


    ... and it finds the first "ch" after the first "non-ch", not the first
    of a consecutive run of "ch" at the end of str, as the original code
    clearly intended.

    Because this function returns a pointer other than the one it was
    passed, storage obtained from malloc can't be freed without holding on
    to the original pointer somewhere else, which makes the function
    complicated to use in some situations.
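
A minimal sketch of that situation, assuming a trim that returns an advanced pointer rather than moving the text. The names trim_advance and malloc_demo are hypothetical; the point is only that the caller has to keep the pointer malloc returned and free that one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pointer-advancing trim (hypothetical name): the result points into the
 * middle of the caller's buffer whenever leading ch characters exist. */
char *trim_advance(char *s, char ch)
{
    char *end;

    while (*s != '\0' && *s == ch)
        s++;
    end = s + strlen(s);
    while (end > s && end[-1] == ch)
        end--;
    *end = '\0';
    return s;
}

/* Returns 1 if the trimmed text is as expected, -1 if malloc failed. */
int malloc_demo(void)
{
    char *block = malloc(16);           /* the pointer that must be freed */
    char *text;
    int ok;

    if (block == NULL)
        return -1;
    strcpy(block, "  stuff  ");
    text = trim_advance(block, ' ');
    assert(text == block + 2);          /* no longer the malloc'ed pointer */
    ok = strcmp(text, "stuff") == 0;
    free(block);                        /* free the original, never text */
    return ok;
}
```

A version that shifts the text to the front of the buffer avoids the second pointer, at the cost of a copy.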

    --
    Ben.
     
    Ben Bacarisse, Jun 23, 2005
    #7
  8. Al Bowers Guest

    Ben Bacarisse wrote:

    > On Thu, 23 Jun 2005 09:24:53 -0400, Al Bowers wrote:
    >
    >
    >>Rajan wrote:
    >>
    >>>John,
    >>>This is a code which will work fine:-
    >>>
    >>>char *trim (char *str, char ch)
    >>>{
    >>>char *first, *last;
    >>>
    >>>for (first = str; *first == ch; first++);
    >>>str = first;
    >>>for (last = str; *last != ch; last++) ;
    >>>*last = '\0';
    >>>return str;
    >>>}
    >>>

    >>
    >>It would not work if str is an empty string, i.e. "".
    >>Also, it will fail if all the characters in str have the value ch, e.g.
    >>char buf[32] = "aaaaaaa";
    >>trim(buf,'a');

    >
    >
    > ... and it finds the first "ch" after first "non-ch" not the first of a
    > consecutive run of "ch" at the end of str as - the original code clearly
    > intended.
    >
    > Because this function returns a pointer other than the one it was passed
    > if the storage is malloced it can't be freed without holding onto the
    > original pointer somewhere else making it complicated to use in some
    > situations.
    >

    Somewhat similar to the "complicated" use of the function realloc.
    Example:
    buf = realloc(buf, size);
    instead of
    char *tmp = realloc(buf, size);
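
For readers unfamiliar with the realloc idiom being compared: when realloc fails it returns NULL and leaves the old block allocated, so assigning the result straight back to buf would lose the only pointer to it. A minimal sketch of the temporary-pointer form, where grow_buffer is a hypothetical helper and size is assumed to be non-zero:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize *buf to size bytes (size must be non-zero).
 * Returns 0 on success; on failure returns -1 and leaves *buf untouched. */
int grow_buffer(char **buf, size_t size)
{
    char *tmp;

    assert(buf != NULL && size > 0);
    tmp = realloc(*buf, size);
    if (tmp == NULL)
        return -1;          /* *buf still points at the old, valid block */
    *buf = tmp;             /* the block may have moved */
    return 0;
}
```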

    More troublesome to me is that the function trim, as defined above,
    must have a synopsis saying not to use it if str is an empty string
    or if str consists entirely of ch characters.


    --
    Al Bowers
    Tampa, Fl USA
    mailto: (remove the x to send email)
    http://www.geocities.com/abowers822/
     
    Al Bowers, Jun 23, 2005
    #8
  9. Guest

    wrote:

    [snip code]

    > the problem is that it always removes the last letter of str as well.
    > i.e.
    > " stuff " -> "stuf" any ideas why this is happening.
    > Cheers
    > John


    Huh. I'm not getting that result based on the same test data (" stuff
    "). You sure that's the code you're actually running?
     
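One input that does lose a letter with the posted code is a string with no leading ch, such as "stuff ": the else branch writes the terminator at str[last - first], which is the last kept character rather than the position after it. A minimal sketch of a corrected version, assuming ch is never '\0'. The name trim_fixed is hypothetical, and the backward scan is also rewritten with last[-1] so that an all-ch string cannot step before the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical corrected version of the posted trim; assumes ch != '\0'. */
char *trim_fixed(char *str, char ch)
{
    char *first, *last;
    size_t len, i;

    assert(str != NULL);
    for (first = str; *first == ch; first++)
        ;                           /* skip leading ch */
    for (last = first; *last != '\0'; last++)
        ;                           /* find the terminator */
    while (last > first && last[-1] == ch)
        last--;                     /* last ends one past the kept text */

    len = (size_t)(last - first);
    if (first != str) {
        for (i = 0; i < len; i++)
            str[i] = first[i];      /* shift kept text to the front */
    }
    str[len] = '\0';                /* terminate after the last kept char */
    return str;
}

/* Checks the inputs discussed in the thread. */
void trim_fixed_demo(void)
{
    char no_leading[] = "stuff ";
    char both[] = " stuff ";
    char all_ch[] = "   ";

    assert(strcmp(trim_fixed(no_leading, ' '), "stuff") == 0);
    assert(strcmp(trim_fixed(both, ' '), "stuff") == 0);
    assert(strcmp(trim_fixed(all_ch, ' '), "") == 0);
}
```
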
    , Jun 23, 2005
    #9
  10. CBFalconer Guest

    wrote:
    >
    > have a bit of C code that is meant to take a string (that may or
    > may not have spaces before or after the string) i.e. " stuff ",
    > and trims off the whitespace before and after.
    > Code:
    >
    > char *trim (char *str, char ch)


    Untested:

    char *trim(char *s, char ch)
    {
        char *p;

        if (s && *s && ch) {       /* avoid evil cases */
            while (ch == *s) s++;  /* trims leading. */
            p = s;                 /* must be advanced over entry */
            while (*p) p++;        /* find end of string */
            p--                    /* last char in string */
            while ((p > s) && (ch == *p)) p--;
            *p = '\0';
        }
        return s;                  /* ok in evil cases */
    }

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
     
    CBFalconer, Jun 23, 2005
    #10
  11. SM Ryan Guest

    wrote:
    # have a bit of C code that is meant to take a string (that may or may not
    # have spaces before or after the string) i.e. " stuff ", and trims off
    # the whitespace before and after.
    # Code:
    #
    # char *trim (char *str, char ch)

    # the problem is that it always removes the last letter of str as well.

    Do your increments inside the loop so they only happen if the loop
    predicate is true.

    while (*str==ch) str++;
    char *last = str+strlen(str)-1;
    while (last>=str && *last==ch) *last-- = 0;


    --
    SM Ryan http://www.rawbw.com/~wyrmwif/
    JUSTICE!
    Justice is dead.
     
    SM Ryan, Jun 23, 2005
    #11
  12. Chris Torek Guest

    CBFalconer wrote:
    >Untested:


    Indeed. It contains one syntax error, and one other error. :)

    > char *trim(char *s, char ch)
    > {
    > char *p;
    >
    > if (s && *s && ch) { /* avoid evil cases */
    > while (ch == *s) s++; /* trims leading. */
    > p = s; /* must be advanced over entry */
    > while (*p) p++; /* find end of string */


    So far, this is OK, although I would replace that last line with:

    p += strlen(p);

    and then simplify the two lines to:

    p = s + strlen(s);

    Note that we now have *p=='\0'. (Also, there is no need to test
    *s -- if *s=='\0', the code will be a no-op, once we fix it. I
    also think that a low-level function like this is OK if it crashes
    when passed a NULL pointer, or behaves badly when ch=='\0', but
    this is more a matter of taste.)

    > p-- /* last char in string */
    > while ((p > s) && (ch == *p)) p--;


    Both bugs are in these two lines. Suppose strlen(p) was 0, so that
    we have p==s initially. (This can happen if the entire string is
    just ch characters.) Then "p--;" (after fixing the missing semicolon)
    leaves p equal to s-1. The code depends on p >= s, because the
    next line is:

    > *p = '\0';


    This will remove one extra character; and if s was unchanged from
    when trim() was first called, p will point outside the buffer to
    be trimmed, smashing some unrelated data.

    The shortest fix is to replace the two buggy lines with:

    while (p > s && p[-1] == ch) p--;

    The first test (p > s) ensures that the second is allowed, and the
    second test (p[-1] == ch) detects when the last "to-be-retained"
    character is one that should be discarded after all.

    > }
    > return s; /* ok in evil cases */
    > }
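
Putting the pieces together, a minimal sketch of the single-character version with the p[-1] loop and the strlen simplification applied. The name trim_char is hypothetical, and the s and ch guards are kept from the quoted code as a matter of taste.

```c
#include <assert.h>
#include <string.h>

/* Quoted outline with the two buggy lines replaced (hypothetical name).
 * Returns a pointer into s, advanced past any leading ch characters. */
char *trim_char(char *s, char ch)
{
    char *p;

    if (s && ch) {                          /* NULL and ch == '\0' left alone */
        while (ch == *s) s++;               /* trim leading ch */
        p = s + strlen(s);                  /* p points at the '\0' */
        while (p > s && p[-1] == ch) p--;   /* back up over trailing ch */
        *p = '\0';                          /* no-op if nothing was trimmed */
    }
    return s;
}

/* Exercises the edge cases raised earlier in the thread. */
void trim_char_demo(void)
{
    char both[] = "  stuff  ";
    char all_ch[] = "aaaaaaa";
    char empty[] = "";

    assert(strcmp(trim_char(both, ' '), "stuff") == 0);
    assert(*trim_char(all_ch, 'a') == '\0');
    assert(*trim_char(empty, ' ') == '\0');
}
```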


    Note that if the character(s)-to-be-trimmed were passed as a string
    (allowing trimming of, e.g., " \t\n", which might be appropriate
    for a buffer obtained via fgets()), we could write the above as:

    #include <string.h>

    char *trim(char *s, const char *remove) {
        char *p;

        s += strspn(s, remove);  /* advance over leading unwanteds */
        p = s + strlen(s);
        while (p > s && strchr(remove, p[-1]) != NULL)
            p--;                 /* back up over trailing unwanteds */
        *p = '\0';               /* overwrite first trailing unwanted,
                                    or replace '\0' with '\0' */
        return s;
    }
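
A short usage sketch for the set-based version, assuming a line of the kind fgets() might leave in a buffer. The function is repeated under the hypothetical name trim_set so the example compiles on its own, and the sample strings are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Same approach as the set-based trim, under a hypothetical name. */
char *trim_set(char *s, const char *remove)
{
    char *p;

    s += strspn(s, remove);             /* skip leading unwanted characters */
    p = s + strlen(s);
    while (p > s && strchr(remove, p[-1]) != NULL)
        p--;                            /* back up over trailing unwanteds */
    *p = '\0';
    return s;
}

/* Trims blanks, tabs and the newline in one call. */
void trim_set_demo(void)
{
    char line[] = " \tname = value \n";
    char blank[] = " \t\n";

    assert(strcmp(trim_set(line, " \t\n"), "name = value") == 0);
    assert(*trim_set(blank, " \t\n") == '\0');  /* all-unwanted input */
}
```

Because p > s is tested first, p[-1] is never the terminator, so the fact that strchr() also matches '\0' does not come into play.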
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
     
    Chris Torek, Jun 23, 2005
    #12
  13. Nils Weller Guest

    Chris Torek wrote:
    > CBFalconer wrote:
    > [...]
    >> while (*p) p++; /* find end of string */

    >
    > So far, this is OK, although I would replace that last line with:
    >
    > p += strlen(p);
    >
    > and then simplify the two lines to:
    >
    > p = s + strlen(s);


    I like using strchr() for this purpose (though strlen() may potentially
    be implemented slightly faster):

    p = strchr(s, 0);

    (Interestingly, many people seem to be unaware of the fact that strchr()
    considers the terminating null character to be part of the string, which
    is why I have seen many buggy strchr() implementations, so one could
    argue that the strlen() version is safer and thus superior, after all
    :))
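
A minimal sketch showing that the two forms find the same place. The helper names end_by_strchr and end_by_strlen are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Both helpers return a pointer to the terminating null character. */
char *end_by_strchr(char *s) { return strchr(s, '\0'); }
char *end_by_strlen(char *s) { return s + strlen(s); }

/* Compares the two on a small buffer. */
void end_demo(void)
{
    char buf[] = "stuff";

    assert(end_by_strchr(buf) == end_by_strlen(buf));
    assert(end_by_strchr(buf) == buf + 5);
    assert(*end_by_strchr(buf) == '\0');
}
```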

    --
    Nils R. Weller, Bremen / Germany
    My real email address is ``nils<at>gnulinux<dot>nl''
    .... but I'm not speaking for the Software Libre Foundation!
     
    Nils Weller, Jun 23, 2005
    #13
  14. Rajan Guest

    Hi Al,
    Thanks for your thoughts on this.
    I thought this trim function was meant to print any string with the
    white space trimmed, taking str as " aaaa" or something of that sort
    and ch as white space, which is why I wrote this piece of code. But in
    any case the *last = '\0' would still eat up one char of the string
    if, say, I have "aaaa".
    So any string without white space would be printed as it is, except
    that it would lose one char. That is my mistake: *last = '\0' is
    executed without a condition.
     
    Rajan, Jun 25, 2005
    #14