problem with strtok()

Discussion in 'C Programming' started by Michael, Aug 12, 2006.

  1. Michael

    Michael Guest

    Hi,

    I have a problem I don't understand when using strtok(). It seems that if I
    make a call to strtok(), then make a call to another function that also
    makes use of strtok(), the original call is somehow confused or upset.

    I have the following code, which I am using to tokenise some input which is
    in the form x:y:1.2:

    int tokenize_input(Sale *sale, char *string){

        char *temp;
        int temp_int = 0;
        int result = TRUE;

        if((temp = strtok(string, ":")) == NULL){
            result = FALSE;
        } else {
            sale->sale_id = atoi(temp);
        }

        if((temp = strtok(NULL, ":")) == NULL){
            result = FALSE;
        } else {
            if(get_date(temp) > -1){ /* when I added this line, my problem started */
                strncpy(sale->date, temp, DATE_LENGTH);
            } else {                 /* this branch was added at the same time */
                result = FALSE;
            }
        }

        if((temp = strtok(NULL, ".")) == NULL){ /* this now returns NULL */
            result = FALSE;
        } else {
            temp_int = atoi(temp)*100;
        }

        if((temp = strtok(NULL, ":")) == NULL){
            result = FALSE;
        } else {
            temp_int = temp_int + atoi(temp);
            sale->price = temp_int;
        }

        return result;
    }

    get_date() is also using strtok(). It all worked fine until I added the
    marked lines in order to do some validation of input, at which point the
    later strtok() began returning NULL.

    Can anyone explain why this would occur and how I can get around it?

    Thanks for your help

    Michael
     
    Michael, Aug 12, 2006
    #1

  2. "Michael" <> writes:
    > I have a problem I don't understand when using strtok(). It seems that if I
    > make a call to strtok(), then make a call to another function that also
    > makes use of strtok(), the original call is somehow confused or upset.


    Yup. strtok() is not reentrant. It uses internal static data that
    makes it impossible to use more than once concurrently.
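    To make the failure concrete, here is a minimal sketch of what is
    probably happening. The source of get_date() was not posted, so
    count_date_parts() is a hypothetical stand-in for it, and the sample
    line format is an assumption as well. The only point is that a nested
    strtok() sequence replaces the position strtok() had saved for the
    caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for get_date(): counts the '-'-separated parts of
   a date field. It runs its own strtok() sequence to completion, which
   replaces the hidden position strtok() was keeping for the caller. */
static int count_date_parts(const char *field)
{
    static char copy[32];   /* static, so strtok()'s saved pointer stays valid */
    int parts = 0;
    char *p;

    assert(field != NULL);
    strncpy(copy, field, sizeof copy - 1);
    copy[sizeof copy - 1] = '\0';
    for (p = strtok(copy, "-"); p != NULL; p = strtok(NULL, "-")) {
        parts++;
    }
    return parts;
}

/* Returns 1 if a price token is still found after the nested call,
   0 if strtok() reports that no tokens are left. */
static int price_survives_nested_call(char *line)
{
    char *id;
    char *date;

    assert(line != NULL);
    id = strtok(line, ":");
    date = strtok(NULL, ":");
    if (id == NULL || date == NULL) {
        return 0;
    }
    (void)count_date_parts(date);        /* nested strtok() sequence */
    return strtok(NULL, ".") != NULL;    /* resumes the inner, finished scan */
}
```

    Called on a writable array holding "17:2006-08-12:1.25:",
    price_survives_nested_call() should return 0, because the last
    strtok(NULL, ".") continues from the end of the helper's string rather
    than from "1.25:". That matches the NULL reported in the question.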

    [...]

    > Can anyone explain why this would occur and how can get around it?


    Either serialize your calls to strtok(), so each use finishes before
    you start another one, or use something other than strtok().

    Some systems provide a strtok_r() function. It is not part of standard
    C (it comes from POSIX), so any code that uses it will be portable only
    to systems that provide it, but it might suit your purposes anyway.
    (strtok_r() is likely to be present on any non-ancient Unix-like system.)
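    As a rough sketch of the strtok_r() route, assuming a POSIX system: each
    tokenizing sequence gets its own save pointer, so the nested one cannot
    disturb the outer one. date_is_valid() and parse_price_cents() are
    made-up names, the three-part date rule is only an illustration, and
    the price arithmetic simply mirrors the code in the question.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX declaration of strtok_r() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical date check: accepts exactly three '-'-separated parts.
   It works on a private copy and keeps its position in its own pointer. */
static int date_is_valid(const char *field)
{
    char copy[32];
    char *save;
    char *p;
    int parts = 0;

    assert(field != NULL);
    if (strlen(field) >= sizeof copy) {
        return 0;
    }
    strcpy(copy, field);
    for (p = strtok_r(copy, "-", &save); p != NULL;
         p = strtok_r(NULL, "-", &save)) {
        parts++;
    }
    return parts == 3;
}

/* Parses "id:date:units.cents:" in place and returns the price in cents,
   or -1 if a field is missing or the date is rejected. */
static long parse_price_cents(char *line)
{
    char *save;
    char *id;
    char *date;
    char *units;
    char *cents;

    assert(line != NULL);
    id = strtok_r(line, ":", &save);
    date = strtok_r(NULL, ":", &save);
    if (id == NULL || date == NULL || !date_is_valid(date)) {
        return -1;
    }
    units = strtok_r(NULL, ".", &save);   /* outer position is untouched */
    cents = strtok_r(NULL, ":", &save);
    if (units == NULL || cents == NULL) {
        return -1;
    }
    return atol(units) * 100 + atol(cents);
}
```

    With a writable array holding "17:2006-08-12:1.25:", parse_price_cents()
    should return 125, since the date check no longer affects where the
    outer scan resumes.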

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Aug 12, 2006
    #2

  3. Michael

    pete Guest

    Keith Thompson wrote:
    >
    > "Michael" <> writes:
    > > I have a problem I don't understand when using strtok().
    > > It seems that if I
    > > make a call to strtok(),
    > > then make a call to another function that also
    > > makes use of strtok(),
    > > the original call is somehow confused or upset.
    >
    > Yup. strtok() is not reentrant. It uses internal static data that
    > makes it impossible to use more than once concurrently.
    >
    > [...]
    >
    > > Can anyone explain why this would occur and how I can get around it?
    >
    > Either serialize your calls to strtok(), so each use finishes before
    > you start another one, or use something other than strtok().
    >
    > Some systems provide a strtok_r() function. It is not part of standard
    > C (it comes from POSIX), so any code that uses it will be portable only
    > to systems that provide it, but it might suit your purposes anyway.
    > (strtok_r() is likely to be present on any non-ancient Unix-like system.)


    /* BEGIN new.c */

    #include <stdio.h>
    #include <string.h>

    #define STRING "\n\n\n\tThere's\n a\r beat in \r\tmy head.\n\n\n"
    #define WHITE "\n\r\t"

    char *str_tok_r(char *s1, const char *s2, char **s3);
    char *str_sep(char **s1, const char *s2);
    /*
    ** K&R2 Exercise 2-4
    ** alternate squeeze functions
    */
    char *str_squeeze(char *s1, const char *s2);
    char *str_squeeze_r(char *s1, const char *s2);
    char *str_squeeze_s(char *s1, const char *s2);

    int main(void)
    {
        char s1[sizeof STRING];

        puts(strcpy(s1, STRING));
        puts(str_squeeze(s1, WHITE));

        puts(strcpy(s1, STRING));
        puts(str_squeeze_r(s1, WHITE));

        puts(strcpy(s1, STRING));
        puts(str_squeeze_s(s1, WHITE));

        return 0;
    }

    char *str_tok_r(char *s1, const char *s2, char **s3)
    {
        if (s1 != NULL) {
            *s3 = s1;
        }
        s1 = *s3 + strspn(*s3, s2);
        if (*s1 == '\0') {
            return NULL;
        }
        *s3 = s1 + strcspn(s1, s2);
        if (**s3 != '\0') {
            *(*s3)++ = '\0';
        }
        return s1;
    }

    char *str_sep(char **s1, const char *s2)
    {
        char *const p1 = *s1;

        if (p1 != NULL) {
            *s1 = strpbrk(p1, s2);
            if (*s1 != NULL) {
                *(*s1)++ = '\0';
            }
        }
        return p1;
    }

    char *str_squeeze(char *s1, const char *s2)
    {
        char *const p1 = s1;
        const char *const p2 = s2;

        s2 = strtok(p1, p2);
        while (s2 != NULL) {
            do {
                *s1++ = *s2++;
            } while (*s2 != '\0');
            s2 = strtok(NULL, p2);
        }
        *s1 = '\0';
        return p1;
    }

    char *str_squeeze_r(char *s1, const char *s2)
    {
        char *const p1 = s1;
        const char *const p2 = s2;
        char *p3;

        s2 = str_tok_r(p1, p2, &p3);
        while (s2 != NULL) {
            do {
                *s1++ = *s2++;
            } while (*s2 != '\0');
            s2 = str_tok_r(NULL, p2, &p3);
        }
        *s1 = '\0';
        return p1;
    }

    char *str_squeeze_s(char *s1, const char *s2)
    {
        char *const p1 = s1;
        const char *const p2 = s2;
        char *p3 = s1;

        do {
            s2 = str_sep(&p3, p2);
            while (*s2 != '\0') {
                *s1++ = *s2++;
            }
        } while (p3 != NULL);
        *s1 = '\0';
        return p1;
    }

    /* END new.c */

    --
    pete
     
    pete, Aug 12, 2006
    #3
  4. Michael

    Stan Milam Guest

    Michael wrote:
    > I have a problem I don't understand when using strtok(). It seems that if I
    > make a call to strtok(), then make a call to another function that also
    > makes use of strtok(), the original call is somehow confused or upset.
    >
    > [...]
    >
    > get_date() is also using strtok(). It all worked fine until I added the
    > marked lines in order to do some validation of input, at which point the
    > later strtok() began returning NULL.


    The strtok() function uses a static char * to maintain the address of
    the string it is parsing. If a new initializing call to strtok() is
    made you will lose the address of the first string. Over the years I've
    written several replacement functions for strtok() (which I believe
    should be deprecated). My favorite is something I wrote a few years ago
    in another language and ported recently to C. Here it is, so enjoy.

    /**********************************************************************/
    /* File Name: gettoken.c. */
    /* Author: Stan Milam. */
    /* Date Written: 15-Jan-2000. */
    /* Description: */
    /* Extract and remove a token from a string. Handles empty */
    /* tokens. */
    /* (c) Copyright 2006 by Stan Milam. */
    /* All rights reserved. */
    /* */
    /**********************************************************************/

    #include <errno.h>
    #include <string.h>

    #define strzcpy(d,s,l) (strncpy((d), (s), (l))[(l)] = '\0', (d))

    /**********************************************************************/
    /* Name: */
    /* gettoken(). */
    /* */
    /* Synopsis: */
    /* #include "strtools.h" */
    /* char *gettoken( char *dest, char *source, const char *delimiters ); */
    /* */
    /* Description: */
    /* The gettoken() function will extract tokens separated by a */
    /* specified set of delimiters from a string and store the token */
    /* value in the dest argument. Furthermore, the token is removed */
    /* from the source string along with the delimiter. Empty token */
    /* fields cause the destination value to be an empty string. */
    /* */
    /* Arguments: */
    /* char *dest - Address of a buffer where the token will be */
    /* stored. */
    /* char *source - The address of the string containing one or */
    /* more tokens. */
    /* const char *delimiters - The address of a string of characters */
    /* used as token delimiters. */
    /* */
    /* Return Value: */
    /* The gettoken() function will return the address of the */
    /* destination argument upon successful completion, and will */
    /* return NULL when there are no tokens left to extract or any one */
    /* of the arguments is a NULL pointer. Should one of the arguments */
    /* be a NULL pointer the global errno variable will be set to */
    /* EINVAL. */
    /* */
    /**********************************************************************/

    char *
    gettoken( char *dest, char *source, const char *delimiters )
    {
        char *rv = NULL;

        if ( dest == NULL || source == NULL || delimiters == NULL )
            errno = EINVAL;
        else {
            *dest = '\0';
            if ( *source ) {
                char *ptr = strpbrk( source, delimiters );

                /******************************************************/
                /* At this point we know we have something, perhaps   */
                /* an empty token. Default the return value to the    */
                /* destination address. If the result of strpbrk() is */
                /* not NULL and not the same as the source, copy the  */
                /* token into the destination string.                 */
                /******************************************************/

                rv = dest;
                if ( ptr != NULL ) {
                    char *tmp = ptr++;
                    if ( source != tmp )
                        rv = strzcpy( dest, source, (size_t)(tmp-source) );
                }

                /******************************************************/
                /* If there are no delimiters the source is the token. */
                /******************************************************/

                else {
                    rv = strcpy( dest, source );
                    ptr = source + strlen( source );
                }

                /******************************************************/
                /* Copy the source string down past the token we just */
                /* found.                                             */
                /******************************************************/

                memmove( source, ptr, strlen( ptr ) + 1 );
            }
        }
        return rv;
    }


    #ifdef TEST

    #include <stdio.h>
    #include <assert.h>

    int
    main( void )
    {
        char dest[100];
        char delim[]="|;!";
        char a[] = "|B.B. Shagnasty|!Shagnasty, William B.|Billy Bob Shagnasty|;!";

        errno = 0;
        assert( gettoken( NULL, a, delim ) == NULL);
        assert( errno == EINVAL ); errno = 0;

        assert( gettoken( dest, NULL, delim ) == NULL);
        assert( errno == EINVAL ); errno = 0;

        assert( gettoken( dest, a, NULL ) == NULL );
        assert( errno == EINVAL ); errno = 0;

        while( gettoken( dest, a, delim ) )
            puts( dest );

        return 0;
    }
    #endif


    --
    Regards,
    Stan Milam
    =============================================================
    Charter Member of The Society for Mediocre Guitar Playing on
    Expensive Instruments, Ltd.
    =============================================================
     
    Stan Milam, Aug 13, 2006
    #4
  5. Michael

    Ben Pfaff Guest

    "Michael" <> writes:

    > I have a problem I don't understand when using strtok(). It seems that if I
    > make a call to strtok(), then make a call to another function that also
    > makes use of strtok(), the original call is somehow confused or upset.


    strtok() has at least these problems:

    * It merges adjacent delimiters. If you use a comma as your
      delimiter, then "a,,b,c" will be divided into three tokens,
      not four. This is often the wrong thing to do. In fact, it
      is only the right thing to do, in my experience, when the
      delimiter set contains white space (for dividing a string
      into "words") or it is known in advance that there will be
      no adjacent delimiters.

    * The identity of the delimiter is lost, because it is
      changed to a null terminator.

    * It modifies the string that it tokenizes. This is bad
      because it forces you to make a copy of the string if
      you want to use it later. It also means that you can't
      tokenize a string literal with it; this is not
      necessarily something you'd want to do all the time but
      it is surprising.

    * It can only be used once at a time. If a sequence of
      strtok() calls is ongoing and another one is started,
      the state of the first one is lost. This isn't a
      problem for small programs but it is easy to lose track
      of such things in hierarchies of nested functions in
      large programs. In other words, strtok() breaks
      encapsulation.
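
    The points above can be illustrated with a small sketch built on
    strcspn(). next_field(), count_fields() and count_strtok_tokens() are
    hypothetical names invented for this example, not library functions,
    and this is only one of several ways to write such a splitter. It never
    writes to the input, keeps its position in a caller-owned cursor,
    reports empty fields, and leaves the delimiter readable at start[len].

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports the field that begins at *cursor. Stores its
   start and length, advances *cursor past the delimiter, and returns 1.
   Returns 0 once every field has been reported. The input is never
   modified, and adjacent delimiters yield empty fields (length 0). */
static int next_field(const char **cursor, const char *delims,
                      const char **start, size_t *len)
{
    const char *s;

    assert(cursor != NULL && delims != NULL && start != NULL && len != NULL);
    s = *cursor;
    if (s == NULL) {
        return 0;                    /* the previous call consumed the last field */
    }
    *start = s;
    *len = strcspn(s, delims);       /* length up to a delimiter or the end */
    /* Step over the delimiter, or mark the scan finished at the terminator. */
    *cursor = (s[*len] != '\0') ? s + *len + 1 : NULL;
    return 1;
}

/* Counts fields with next_field(); all state lives in the local cursor. */
static size_t count_fields(const char *text, const char *delims)
{
    const char *cursor = text;
    const char *start;
    size_t len;
    size_t n = 0;

    while (next_field(&cursor, delims, &start, &len)) {
        n++;
    }
    return n;
}

/* Counts tokens the way strtok() sees them; scratch is overwritten. */
static size_t count_strtok_tokens(char *scratch, const char *delims)
{
    size_t n = 0;
    char *p;

    for (p = strtok(scratch, delims); p != NULL; p = strtok(NULL, delims)) {
        n++;
    }
    return n;
}
```

    For "a,,b,c" with "," as the delimiter, count_strtok_tokens() on a
    writable copy should give 3, while count_fields() on the string literal
    itself should give 4, the second field being empty.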

    --
    "I'm not here to convince idiots not to be stupid.
    They won't listen anyway."
    --Dann Corbit
     
    Ben Pfaff, Aug 13, 2006
    #5
