strtok and strsep

Discussion in 'C Programming' started by Keith Thompson, Nov 4, 2011.

  1. "Bill Cunningham" <> writes:
    > I just read that some new function called strsep that I've never heard
    > of has replaced strtok and strtok is now evidently deprecated. Is this C99
    > or some other standard? strsep...got me.


    strsep() is a *proposed* replacement for strtok() (and I don't think
    it's particularly new). It differs in its handling of empty fields.

    strsep(), unlike strtok(), is not defined by the C standard (nor is it
    defined by POSIX). My man page says it conforms to 4.4BSD.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Nov 4, 2011
    #1

  2. I just read that some new function called strsep that I've never heard
    of has replaced strtok and strtok is now evidently deprecated. Is this C99
    or some other standard? strsep...got me.


    Bill
    Bill Cunningham, Nov 4, 2011
    #2

  3. Seebs Guest

    On 2011-11-04, Keith Thompson <> wrote:
    > strsep() is a *proposed* replacement for strtok() (and I don't think
    > it's particularly new).


    It's not.

    > It differs in its handling of empty fields.


    Also in that it is inherently thread-safe, etcetera.

    > strsep(), unlike strtok(), is not defined by the C standard (nor is it
    > defined by POSIX). My man page says it conforms to 4.4BSD.


    And to this day I think it should have been added in C99 when we had the
    proposal. It's trivial to write, and it's a VERY useful function... The
    latter being more than I feel I can say for strtok().
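
    To illustrate the "trivial to write" point, here is a minimal sketch of a
    strsep()-like function, assuming the behaviour described in the BSD man
    page. The name sep_field is hypothetical, chosen so it does not clash
    with a system-provided strsep(). All state lives in the caller's
    pointer, which is why a function of this shape needs no hidden static
    variable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strsep(): return the next field of *stringp
 * and advance *stringp past the delimiter that ended it. Empty fields
 * are returned as empty strings. When the last field has been returned,
 * *stringp is set to NULL, and the following call returns NULL.
 */
char *sep_field(char **stringp, const char *delim)
{
    char *start;
    char *end;

    assert(stringp != NULL && delim != NULL);

    start = *stringp;
    if (start == NULL)
        return NULL;                      /* no fields left */

    end = start + strcspn(start, delim);  /* first delimiter, or the '\0' */
    if (*end != '\0') {
        *end = '\0';                      /* terminate this field */
        *stringp = end + 1;               /* next field starts after it */
    } else {
        *stringp = NULL;                  /* that was the last field */
    }
    return start;
}
```

    Called repeatedly on a writable copy of "a,,b" with "," as the
    delimiter set, this returns "a", then an empty string, then "b", then
    a null pointer.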

    -s
    --
    Copyright 2011, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    I am not speaking for my employer, although they do rent some of my opinions.
    Seebs, Nov 4, 2011
    #3
  4. Kaz Kylheku Guest

    On 2011-11-04, Keith Thompson <> wrote:
    > "Bill Cunningham" <> writes:
    >> I just read that some new function called strsep that I've never heard
    >> of has replaced strtok and strtok is now evidently deprecated. Is this C99
    >> or some other standard? strsep...got me.
    >
    > strsep() is a *proposed* replacement for strtok() (and I don't think
    > it's particularly new). It differs in its handling of empty fields.
    >
    > strsep(), unlike strtok(), is not defined by the C standard (nor is it
    > defined by POSIX). My man page says it conforms to 4.4BSD.


    People working with strings in C using the standard library
    should familiarize themselves with the Snobol-inspired functions
    strspn, strcspn and strpbrk.

    (Snobol has the pattern matching primitives SPAN and BREAK,
    where we get the terminology for strspn and strpbrk: the pointer
    to the break!)

    Time and time again I have seen awful ad-hoc tokenizing C code that could have
    been reduced to like 1/5 the number of lines with strspn and strcspn, and made
    easy to understand at the same time.
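
    As a rough illustration of those three functions, here is a small
    sketch that never modifies its input. The helper names count_tokens
    and first_field_length are hypothetical; only strspn, strcspn, strpbrk
    and strlen come from <string.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the non-empty tokens in s. */
size_t count_tokens(const char *s, const char *delims)
{
    size_t count = 0;

    assert(s != NULL && delims != NULL);

    for (;;) {
        s += strspn(s, delims);     /* SPAN: skip a run of delimiters */
        if (*s == '\0')
            break;                  /* nothing but delimiters remained */
        s += strcspn(s, delims);    /* BREAK: skip to the end of the token */
        count++;
    }
    return count;
}

/* Hypothetical helper: number of characters before the first delimiter,
   or the length of the whole string if it contains no delimiter. */
size_t first_field_length(const char *s, const char *delims)
{
    const char *brk;

    assert(s != NULL && delims != NULL);

    brk = strpbrk(s, delims);       /* pointer to the break, or NULL */
    return brk != NULL ? (size_t)(brk - s) : strlen(s);
}
```

    For example, count_tokens("  foo bar   baz ", " ") gives 3, and
    first_field_length("key=value", "=") gives 3.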

    In 2001 I posted the following to comp.lang.c: a strtok function which lets you
    maintain your context to avoid the internal global variable.
    It retains the disadvantage of poking zeros in the original string.

    You can see that the task of pulling a token based on a
    set of separator characters is very easy. A call to strspn,
    a call to strcspn and taking care of some cases.

    #include <string.h>
    #include <stdio.h>

    /*
     * To use this function, initialize a pointer P
     * to point to the start of the string. Then extract
     * tokens T like this:
     * T = get_next_token(&P, delimiters);
     * When it returns a null pointer, there are no more,
     * and P is set to null value as well.
     */

    char *get_next_token(char **context, const char *delim)
    {
        char *ret;

        /* A null context indicates no more tokens. */
        if (*context == 0)
            return 0;

        /* Skip delimiters to find start of token */
        ret = (*context += strspn(*context, delim));

        /* skip to end of token */
        *context += strcspn(*context, delim);

        /* If the token has zero length, we just
           skipped past a run of trailing delimiters, or
           were at the end of the string already.
           There are no more tokens. */

        if (ret == *context) {
            *context = 0;
            return 0;
        }

        /* If the character past the end of the token is the end of the string,
           set context to 0 so next time we will report no more tokens.
           Otherwise put a 0 there, and advance one character past. */

        if (**context == 0) {
            *context = 0;
        } else {
            **context = 0;
            (*context)++;
        }

        return ret;
    }

    /*
     * Handy macro wrapper for get_next_token
     */

    #define FOR_EACH_TOKEN(CTX, I, S, D) \
        for (CTX = (S), (I) = get_next_token(&(CTX), D); \
             (I) != 0; \
             (I) = get_next_token(&(CTX), D))

    int main(int argc, char **argv)
    {
        char *context, *iter;

        if (argc >= 2)
            FOR_EACH_TOKEN (context, iter, argv[1], ":")
                puts(iter);

        return 0;
    }
    Kaz Kylheku, Nov 4, 2011
    #4
  5. "Bill Cunningham" <> wrote in message
    news:4eb329c1$0$14136$...
    > I just read that some new function called strsep that I've never heard
    > of has replaced strtok and strtok is now evidently deprecated. Is this C99
    > or some other standard? strsep...got me.
    >


    From my limited use of strsep(), one major difference between strsep() and
    strtok()... is that strtok() will skip over a run of characters that are
    delimiter characters, and strsep() will return an empty string for each
    additional delimiter character. Also, as pointed out by others, strtok()
    saves internal state and thus is *not* thread safe.
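
    The difference in counts can be sketched as follows. Because strsep()
    is not standard C, this sketch does not call it; instead it counts the
    fields that strsep()-style splitting would yield, which is the number
    of delimiter characters plus one. The function names and the buffer
    size are hypothetical, chosen only for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 128   /* arbitrary limit for this sketch */

/* Count the tokens strtok() reports. strtok() writes to its argument,
   so the input is copied into a local buffer first. */
size_t count_strtok_tokens(const char *input, const char *delim)
{
    char buf[DEMO_BUF_SIZE];
    size_t n = 0;
    char *tok;

    assert(strlen(input) < sizeof buf);
    strcpy(buf, input);
    for (tok = strtok(buf, delim); tok != NULL; tok = strtok(NULL, delim))
        n++;
    return n;
}

/* Count the fields that strsep()-style splitting would yield: every
   delimiter character ends one field and starts another, so the count
   is the number of delimiter characters plus one. */
size_t count_strsep_style_fields(const char *input, const char *delim)
{
    size_t n = 1;
    const char *p = input;

    while ((p = strpbrk(p, delim)) != NULL) {
        n++;
        p++;    /* step past the delimiter just found */
    }
    return n;
}
```

    For "a,,b" with "," as the delimiter set, the strtok() count is 2
    while the strsep()-style count is 3, the extra field being the empty
    one between the two commas.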

    --
    +<><><><><><><><><><><><><><><><><><><>+
    | Charles Richmond |
    +<><><><><><><><><><><><><><><><><><><>+
    Charles Richmond, Nov 4, 2011
    #5
  6. On Nov 4, 3:12 pm, "Charles Richmond" <> wrote:
    >
    > From my limited use of strsep(), one major difference between strsep() and
    > strtok()... is that strtok() will skip over a run of characters that are
    > delimiter characters, and strsep() will return an empty string for each
    > additional delimiter character.  Also, as pointed out by others, strtok()
    > saves internal state and thus is *not* thread safe.
    >

    The problem is that a run of spaces almost certainly means the same
    thing as just a single space; a run of commas almost certainly
    indicates missing data, unless it's a trailing comma at the end of a
    line; and a run of non-space whitespace, like tabs, could mean
    anything, depending on context.

    These rules are hard to code in a single function.
    --
    Basic Algorithms, the second book of C you should read after your
    introductory C primer.
    http://www.malcolmmclean.site11.com/www
    Malcolm McLean, Nov 4, 2011
    #6
  7. Seebs Guest

    On 2011-11-04, Malcolm McLean <> wrote:
    > The problem is that a run of spaces almost certainly means the same
    > thing as just a single space; a run of commas almost certainly
    > indicates missing data, unless it's a trailing comma at the end of a
    > line; and a run of non-space whitespace, like tabs, could mean
    > anything, depending on context.
    >
    > These rules are hard to code in a single function.


    But unnecessary, because it's easy to handle that using strsep() and
    knowing what rules you're using at any given time. Skipping the null
    substrings is trivial.
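
    One way this might look, as a sketch only: the loop below splits in
    the strsep() manner (one field per delimiter character) and lets the
    caller decide per call whether empty fields are kept or skipped. The
    name split_fields and its keep_empty parameter are hypothetical, and
    the strsep() step is written out inline with strcspn so the sketch
    stays within standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split the writable string s in place, storing
 * pointers to at most max fields in out. If keep_empty is zero, empty
 * fields (from runs of delimiters, or leading/trailing ones) are
 * skipped. Returns the number of fields stored.
 */
size_t split_fields(char *s, const char *delim, int keep_empty,
                    char **out, size_t max)
{
    size_t n = 0;
    char *field = s;

    assert(s != NULL && delim != NULL && out != NULL);

    while (field != NULL && n < max) {
        char *end = field + strcspn(field, delim);
        char *next = (*end != '\0') ? end + 1 : NULL;

        *end = '\0';                    /* terminate the field, as strsep() would */
        if (keep_empty || *field != '\0')
            out[n++] = field;           /* the caller's rule decides */
        field = next;
    }
    return n;
}
```

    With keep_empty set, "a,,b" split on "," yields three fields, the
    middle one empty (missing data). With keep_empty clear, " a  b "
    split on " " yields just "a" and "b".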

    -s
    --
    Copyright 2011, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    I am not speaking for my employer, although they do rent some of my opinions.
    Seebs, Nov 4, 2011
    #7