strtok

Discussion in 'C Programming' started by Bill Cunningham, Jan 1, 2012.

  1. Why does my man page for strtok and an extension for a thread safe
    version of this function say not to use these functions? Are they buggy or
    deprecated? What should one use instead?

    Bill
    Bill Cunningham, Jan 1, 2012
    #1

  2. On Jan 1, 8:30 pm, "Bill Cunningham" <> wrote:
    >     Why does my man page for strtok and an extension for a thread safe
    > version of this function say not to use these functions? Are they buggy or
    > deprecated?


    Neither. What does the man page say? Some people don't like what
    strtok() does. It modifies the string you pass to it and doesn't
    handle empty fields in the way you might like.

    > What should one use instead?


    Good question. There's no portable answer, so write your own, I
    suppose. I've used strtok(); it's OK, it does what it says on the can.
    Nick Keighley, Jan 1, 2012
    #2

  3. Bill Cunningham wrote:
    > Why does my man page for strtok and an extension for a thread safe
    > version of this function say not to use these functions? Are they buggy or
    > deprecated? What should one use instead?
    >
    > Bill
    >
    >


    To my knowledge, it uses internal state to maintain the current position
    which can generate race conditions if used from multiple threads
    concurrently.

    eglibc has strtok_r() which is a reentrant function.
    It allows the user to specify the address of a variable to use for
    storing position.
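
As a rough illustration of the caller-supplied state described above, here is a minimal sketch that runs two tokenizations at once, something plain strtok() cannot do with its single hidden position. The function name count_pairs and the "key=value;key=value" input format are made up for the example. strtok_r() is specified by POSIX rather than ISO C, so the sketch assumes a POSIX system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask <string.h> to declare strtok_r() */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count complete "key=value" items in a
   ';'-separated list. The outer and inner loops each keep their own
   position in a caller-owned variable, so they do not disturb each
   other. Like strtok(), strtok_r() writes '\0' bytes into buf. */
int count_pairs(char *buf)
{
    char *outer_state;
    char *inner_state;
    int pairs = 0;

    assert(buf != NULL);
    for (char *item = strtok_r(buf, ";", &outer_state);
         item != NULL;
         item = strtok_r(NULL, ";", &outer_state)) {
        /* Split the current item while the outer scan is still pending. */
        char *key = strtok_r(item, "=", &inner_state);
        char *value = strtok_r(NULL, "=", &inner_state);
        if (key != NULL && value != NULL)
            pairs++;
    }
    return pairs;
}
```

Called on a writable array holding "a=1;b=2;c", this should count two pairs, since the last item has no value. The buffer must be writable, so a string literal will not do.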

    [OT]
    This is similar to the errno races observed with threaded code.
    Some C libraries thus #define errno to be a function call returning a
    pointer, which then is dereferenced to effectively create a thread-local
    errno.
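
The macro trick can be sketched roughly as follows. The names my_errno, my_errno_location and my_errno_demo are hypothetical stand-ins, not any library's real identifiers, and the sketch assumes a C11 compiler for _Thread_local.

```c
#include <assert.h>

/* One copy of this variable exists per thread (C11 storage class). */
static _Thread_local int my_errno_storage;

/* Hypothetical accessor: returns the address of the calling thread's copy. */
int *my_errno_location(void)
{
    return &my_errno_storage;
}

/* The macro expands to an lvalue, so it can be read and assigned like
   an ordinary int variable, yet each thread touches only its own copy. */
#define my_errno (*my_errno_location())

/* Small self-check that the macro behaves like a plain variable. */
void my_errno_demo(void)
{
    my_errno = 0;
    assert(my_errno == 0);
    my_errno = 34;
    assert(*my_errno_location() == 34);
}
```

Existing code that writes to or tests the variable keeps compiling unchanged, which is presumably why libraries choose a macro over a new API.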
    Johann Klammer, Jan 1, 2012
    #3
  4. Bill Cunningham

    osmium Guest

    "Bill Cunningham" wrote:

    > Why does my man page for strtok and an extension for a thread safe
    > version of this function say not to use these functions? Are they buggy or
    > deprecated? What should one use instead?


    Bill, I suggest you put any concerns about thread safe operations on the
    back burner, next to "How will my program accommodate quantum computers?"

    If you would quit trying to learn everything you might possibly learn
    *something*.
    osmium, Jan 1, 2012
    #4
  5. Nick Keighley wrote:

    > Neither. What does the man page say? Some people don't like what
    > strtok() does. It modifies the string you pass to it and doesn't
    > handle empty fields in the way you might like.
    >
    >> What should one use instead?

    >
    > Good question. There's no portable answer, so write your own, I
    > suppose. I've used strtok(); it's OK, it does what it says on the can.


    Under BUGS I have:
    "Never use these functions. If you do, note that:
    These functions modify their first argument.
    These functions cannot be used on constant strings.
    The identity of the delimiting character is lost.
    strtok() uses a static buffer while parsing, so it is not thread safe.
    If using threads, use strtok_r()."

    The man pages online don't say this.

    Bill
    Bill Cunningham, Jan 1, 2012
    #5
  6. "Bill Cunningham" <> writes:

    > Why does my man page for strtok and an extension for a thread safe
    > version of this function say not to use these functions? Are they buggy or
    > deprecated? What should one use instead?


    If your question is still about reading your market data (as you posted
    on comp.programming) you don't need strtok.

    If you are sure that the data will be in the format you previously
    described, fscanf can do the job perfectly well. If you want a little
    more control, read each line using fgets and use sscanf to "parse" the
    data you want.

    Example:

    #include <stdio.h>

    int main(void)
    {
        double price;
        int line_no, day, month, year;
        char line[100];

        /* Read one line at a time; stop at end of input or at the first
           line that does not yield all five fields. */
        while (fgets(line, sizeof line, stdin) &&
               sscanf(line, "%d %lf %2d%2d%2d",
                      &line_no, &price, &day, &month, &year) == 5)
            printf("line number: %d, price=%f on %d/%02d/%02d\n",
                   line_no, price, day, month, year);
        return 0;
    }

    --
    Ben.
    Ben Bacarisse, Jan 1, 2012
    #6
  7. Ben Bacarisse wrote:
    > "Bill Cunningham" <> writes:
    >
    >> Why does my man page for strtok and an extension for a thread
    >> safe version of this function say not to use these functions? Are
    >> they buggy or deprecated? What should one use instead?

    >
    > If your question is still about reading your market data (as you
    > posted on comp.programming) you don't need strtok.


    Ok

    > If you are sure that the data will be in the format you previously
    > described, fscanf can do the job perfectly well.


    Ok

    If you want a little
    > more control, read each line using fgets and use sscanf to "parse" the
    > data you want.
    >
    > Example:
    >
    > #include <stdio.h>
    >
    > int main(void)
    > {
    > double price;
    > int line_no, day, month, year;
    > char line[100];
    > while (fgets(line, sizeof line, stdin) &&
    > sscanf(line, "%d %lf %2d%2d%2d",
    > &line_no, &price, &day, &month, &year) == 5)
    > printf("line number: %d, price=%f on %d/%02d/%02d\n",
    > line_no, price, day, month, year);
    > return 0;
    > }


    Thanks. That sscanf is a little hard to read, but then again I'm not
    familiar with that function.

    Bill
    Bill Cunningham, Jan 1, 2012
    #7
  8. Bill Cunningham

    jacob navia Guest

    Le 01/01/12 22:10, osmium a écrit :
    > "Bill Cunningham" wrote:
    >
    >> Why does my man page for strtok and an extension for a thread safe
    >> version of this function say not to use these functions? Are they buggy or
    >> deprecated? What should one use instead?

    >
    > Bill, I suggest you put any concerns about thread safe operations on the
    > back burner, next to "How will my program accommodate quantum computers?"
    >
    > If you would quit trying to learn everything you might possibly learn
    > *something*.
    >
    >

    Good words!
    jacob navia, Jan 1, 2012
    #8
  9. On 2012-01-01, Bill Cunningham <> wrote:
    > Nick Keighley wrote:
    >
    >> Neither. What does the man page say? Some people don't like what
    >> strtok() does. It modifies the string you pass to it and doesn't
    >> handle empty fields in the way you might like.
    >>
    >>> What should one use instead?

    >>
    >> Good question. There's no portable answer, so write your own, I
    >> suppose. I've used strtok(); it's OK, it does what it says on the can.

    >
    > Under bugs I have :
    > "Never use these functions. If you do note


    Retard manpage author. Nevermind that, use it if it does what you need,
    just be aware of its limitations as others pointed out.

    --
    John Tsiombikas
    http://nuclear.mutantstargoat.com/
    John Tsiombikas, Jan 2, 2012
    #9
  10. Bill Cunningham

    Ben Pfaff Guest

    "Bill Cunningham" <> writes:

    > Why does my man page for strtok and an extension for a thread safe
    > version of this function say not to use these functions? Are they buggy or
    > deprecated? What should one use instead?


    strtok() has at least these problems:

    * It merges adjacent delimiters. If you use a comma as your
    delimiter, then "a,,b,c" will be divided into three tokens,
    not four. This is often the wrong thing to do. In fact, it
    is only the right thing to do, in my experience, when the
    delimiter set contains white space (for dividing a string
    into "words") or it is known in advance that there will be
    no adjacent delimiters.

    * The identity of the delimiter is lost, because it is
    changed to a null terminator.

    * It modifies the string that it tokenizes. This is bad
    because it forces you to make a copy of the string if
    you want to use it later. It also means that you can't
    tokenize a string literal with it; this is not
    necessarily something you'd want to do all the time but
    it is surprising.

    * It can only be used once at a time. If a sequence of
    strtok() calls is ongoing and another one is started,
    the state of the first one is lost. This isn't a
    problem for small programs but it is easy to lose track
    of such things in hierarchies of nested functions in
    large programs. In other words, strtok() breaks
    encapsulation.
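
To make the first and third points concrete, here is a minimal sketch comparing strtok() with a scan that treats every delimiter as a separator. Both function names are made up for the illustration, and the second function is just one possible way to keep empty fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count tokens the way strtok() reports them. Runs of delimiters
   collapse into one, and buf is overwritten with '\0' bytes. */
size_t count_strtok_tokens(char *buf, const char *delims)
{
    size_t n = 0;

    assert(buf != NULL && delims != NULL);
    for (char *tok = strtok(buf, delims);
         tok != NULL;
         tok = strtok(NULL, delims))
        n++;
    return n;
}

/* Count fields so that every delimiter separates two fields, which
   means empty fields are counted. The input is left untouched. */
size_t count_fields(const char *s, const char *delims)
{
    size_t n = 1;  /* even an empty string holds one (empty) field */

    assert(s != NULL && delims != NULL);
    for (;;) {
        s += strcspn(s, delims);  /* advance to next delimiter or end */
        if (*s == '\0')
            break;
        n++;
        s++;                      /* step past that one delimiter */
    }
    return n;
}
```

For "a,,b,c" with a comma delimiter, count_fields() should report four fields and count_strtok_tokens() three. After the strtok() pass the array reads as just "a", because the first comma has been replaced by a null terminator.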

    --
    char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
    ={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
    =b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
    2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
    Ben Pfaff, Jan 2, 2012
    #10
  11. On Mon, 02 Jan 2012 10:04:24 -0800, Ben Pfaff wrote:

    > "Bill Cunningham" <> writes:
    >
    > > Why does my man page for strtok and an extension for a thread safe
    > > version of this function say not to use these functions? Are they buggy or
    > > deprecated? What should one use instead?

    >
    > strtok() has at least these problems:
    >
    > * It merges adjacent delimiters. If you use a comma as your
    > delimiter, then "a,,b,c" will be divided into three tokens,
    > not four. This is often the wrong thing to do. In fact, it
    > is only the right thing to do, in my experience, when the
    > delimiter set contains white space (for dividing a string
    > into "words") or it is known in advance that there will be
    > no adjacent delimiters.
    >

    Yes.

    > * The identity of the delimiter is lost, because it is
    > changed to a null terminator.
    >

    Yes but. IME at least half the time the delimiter is unique (so
    identity doesn't matter). There are a few cases where even nonunique
    delimiters don't matter; one I personally like -- for vanishingly
    small values of like -- is the punctuation in US telephone numbers.

    > * It modifies the string that it tokenizes. This is bad
    > because it forces you to make a copy of the string if
    > you want to use it later. It also means that you can't
    > tokenize a string literal with it; this is not
    > necessarily something you'd want to do all the time but
    > it is surprising.
    >

    Yes but. If you want to parse nondestructively and use the tokens as C
    strings, you have to copy anyway -- and often allocate each token
    separately which is almost certainly costlier. Although that second
    condition isn't a given: I worked on one largish project that
    religiously used {ptr,len} into existing/shared buffers. It required a
    rigid policy on lifetime of the shared buffers (which for this app was
    fairly easy) and a substantial custom library (basically duplicating
    string.h plus more) but after that was paid once it worked nicely.
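
A {ptr,len} tokenizer along those lines might look like the sketch below. The struct span type and next_span() function are hypothetical names for this illustration, not the API of the project mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token type: a view into the caller's buffer. It is
   not a copy and is not null-terminated, so the buffer must outlive
   every span that points into it. */
struct span {
    const char *ptr;
    size_t len;
};

/* Fetch the next field starting at *cursor. Returns 1 and fills *out
   when a field (possibly empty) was found, or 0 once the input is
   exhausted. The input string is never modified. */
int next_span(const char **cursor, const char *delims, struct span *out)
{
    assert(cursor != NULL && delims != NULL && out != NULL);
    if (*cursor == NULL)
        return 0;

    const char *start = *cursor;
    size_t len = strcspn(start, delims);

    out->ptr = start;
    out->len = len;
    /* start[len] is the delimiter that ended this field, still intact.
       Step past it, or mark the input as finished at the terminator. */
    *cursor = (start[len] != '\0') ? start + len + 1 : NULL;
    return 1;
}
```

A span can be printed with printf("%.*s", (int)s.len, s.ptr). Because nothing is overwritten, this works on string literals, keeps empty fields, and leaves the delimiter readable at s.ptr[s.len]. The cost is that ordinary string.h functions expecting a terminated string cannot be used on a span directly.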

    And that literals are not (safely) modifiable should be surprising at
    most once, and needs to be learned anyway even without strtok();
    consider mktemp and mkstemp, strupr and strlwr, or doing parity or
    rot13 or other simple ciphering in place.

    > * It can only be used once at a time. If a sequence of
    > strtok() calls is ongoing and another one is started,
    > the state of the first one is lost. This isn't a
    > problem for small programs but it is easy to lose track
    > of such things in hierarchies of nested functions in
    > large programs. In other words, strtok() breaks
    > encapsulation.


    Yes. Although it's often a good idea (sometimes even a requirement) to
    separate the parsing from the 'real' processing, and if the parsing by
    itself is so complicated you can't grok all the code in one fwoop,
    your input syntax is in danger of being unusable anyway.
    David Thompson, Jan 19, 2012
    #11
