problem with strtok()

Michael

Hi,

I have a problem I don't understand when using strtok(). It seems that if I
make a call to strtok(), then make a call to another function that also
makes use of strtok(), the original call is somehow confused or upset.

I have the following code, which I am using to tokenise some input which is
in the form x:y:1.2:

int tokenize_input(Sale *sale, char *string){

    char *temp;
    int temp_int;
    int result = TRUE;

    if((temp = strtok(string, ":")) == NULL){
        result = FALSE;
    } else {
        sale -> sale_id = atoi(temp);
    }

    if((temp = strtok(NULL, ":")) == NULL){
        result = FALSE;
    } else {
        if(get_date(temp) > -1){ /* when I added this line, my problem started */
            strncpy(sale -> date, temp, DATE_LENGTH);
        } else {              /* these were added at the same time */
            result = FALSE;   /**/
        }                     /**/
    }

    if((temp = strtok(NULL, ".")) == NULL){ /* this now returns NULL */
        result = FALSE;
    } else {
        temp_int = atoi(temp)*100;
    }

    if((temp = strtok(NULL, ":")) == NULL){
        result = FALSE;
    } else {
        temp_int = temp_int + atoi(temp);
        sale -> price = temp_int;
    }

    return result;
}

get_date() is also using strtok(). It all worked fine until I added the
marked lines in order to do some validation of input, at which point the
later strtok() began returning NULL.

Can anyone explain why this would occur and how I can get around it?

Thanks for your help

Michael
 
Keith Thompson

Michael said:
I have a problem I don't understand when using strtok(). It seems that if I
make a call to strtok(), then make a call to another function that also
makes use of strtok(), the original call is somehow confused or upset.

Yup. strtok() is not reentrant. It uses internal static data that
makes it impossible to use more than once concurrently.
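
To make the effect concrete, here is a minimal sketch of the situation described in the question. The helper names (count_date_fields, count_outer_tokens) are made up for illustration and are not from the original code; the only assumption is that the inner helper starts its own strtok() sequence, as get_date() is said to do.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the '/'-separated parts of a date such as
   "01/02/2006". It starts its own strtok() sequence, which overwrites
   the saved position of any sequence the caller had in progress. */
static int count_date_fields(char *date)
{
    int n = 0;
    char *p;

    for (p = strtok(date, "/"); p != NULL; p = strtok(NULL, "/")) {
        n++;
    }
    return n;
}

/* Returns how many ':'-separated tokens the outer loop manages to see.
   When validate is nonzero, the helper above is called on each token. */
int count_outer_tokens(char *line, int validate)
{
    int seen = 0;
    char *tok;

    for (tok = strtok(line, ":"); tok != NULL; tok = strtok(NULL, ":")) {
        seen++;
        if (validate) {
            count_date_fields(tok);   /* replaces strtok()'s hidden state */
        }
    }
    return seen;
}
```

Given a writable copy of "1:01/02/2006:3.50:", count_outer_tokens() sees three tokens when validate is 0 but only one when validate is 1: the inner sequence runs to the end of the first token, so the next strtok(NULL, ":") in the outer loop has nothing left to search.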

[...]
Can anyone explain why this would occur and how I can get around it?

Either serialize your calls to strtok(), so each use finishes before
you start another one, or use something other than strtok().
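
One way to serialize, sketched under the assumption that the record is a handful of ':'-separated fields as in the question: finish the whole strtok() sequence first, keep the token pointers, and only then call anything else that might use strtok(). The name split_fields and the MAX_FIELDS limit are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 8   /* arbitrary limit chosen for this sketch */

/* Splits line in place on ':' and stores up to max token pointers in
   fields[]. The whole strtok() sequence finishes inside this function,
   so the caller is free to call other strtok()-based code afterwards. */
size_t split_fields(char *line, char *fields[], size_t max)
{
    size_t n = 0;
    char *tok;

    for (tok = strtok(line, ":"); tok != NULL && n < max;
         tok = strtok(NULL, ":")) {
        fields[n++] = tok;
    }
    return n;
}
```

After the call, something like get_date(fields[1]) can run its own strtok() loop without disturbing anything, because no outer sequence is still in progress.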

Some systems provide a strtok_r() function. It is not part of standard
C (it comes from POSIX), so any code that uses it will be portable only
to systems that provide it, but it might suit your purposes anyway.
(strtok_r() is likely to be present on any non-ancient Unix-like system.)
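
As a rough sketch of how that looks, assuming a POSIX system where <string.h> declares strtok_r(): each tokenizing loop owns its save pointer, so an inner loop cannot disturb an outer one. The function names count_date_parts and scan_record are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations such as strtok_r() */
#include <string.h>

/* Counts the '/'-separated parts of date, using its own save pointer. */
static int count_date_parts(char *date)
{
    char *save;
    char *p;
    int n = 0;

    for (p = strtok_r(date, "/", &save); p != NULL;
         p = strtok_r(NULL, "/", &save)) {
        n++;
    }
    return n;
}

/* Walks the ':'-separated fields of line and returns how many it saw.
   *date_parts receives the part count of the second field. */
int scan_record(char *line, int *date_parts)
{
    char *save;
    char *tok;
    int fields = 0;

    *date_parts = 0;
    for (tok = strtok_r(line, ":", &save); tok != NULL;
         tok = strtok_r(NULL, ":", &save)) {
        fields++;
        if (fields == 2) {
            /* Nested tokenizing is safe: it uses a different save pointer. */
            *date_parts = count_date_parts(tok);
        }
    }
    return fields;
}
```

Note that the inner loop still writes null characters into the date token, so copy the token first if the original text is needed afterwards.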
 
pete

Keith said:
Some systems provide a strtok_r() function. It is not part of standard
C (it comes from POSIX), so any code that uses it will be portable only
to systems that provide it, but it might suit your purposes anyway.
(strtok_r() is likely to be present on any non-ancient Unix-like system.)

/* BEGIN new.c */

#include <stdio.h>
#include <string.h>

#define STRING "\n\n\n\tThere's\n a\r beat in \r\tmy head.\n\n\n"
#define WHITE "\n\r\t"

char *str_tok_r(char *s1, const char *s2, char **s3);
char *str_sep(char **s1, const char *s2);
/*
** K&R2 Exercise 2-4
** alternate squeeze functions
*/
char *str_squeeze(char *s1, const char *s2);
char *str_squeeze_r(char *s1, const char *s2);
char *str_squeeze_s(char *s1, const char *s2);

int main(void)
{
    char s1[sizeof STRING];

    puts(strcpy(s1, STRING));
    puts(str_squeeze(s1, WHITE));

    puts(strcpy(s1, STRING));
    puts(str_squeeze_r(s1, WHITE));

    puts(strcpy(s1, STRING));
    puts(str_squeeze_s(s1, WHITE));

    return 0;
}

/*
** Reentrant strtok(): the saved position lives in *s3
** instead of in hidden static storage.
*/
char *str_tok_r(char *s1, const char *s2, char **s3)
{
    if (s1 != NULL) {
        *s3 = s1;
    }
    s1 = *s3 + strspn(*s3, s2);
    if (*s1 == '\0') {
        return NULL;
    }
    *s3 = s1 + strcspn(s1, s2);
    if (**s3 != '\0') {
        *(*s3)++ = '\0';
    }
    return s1;
}

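/*
** Like BSD strsep(): splits at the first delimiter,
** so empty fields are returned rather than skipped.
*/
char *str_sep(char **s1, const char *s2)
{
    char *const p1 = *s1;

    if (p1 != NULL) {
        *s1 = strpbrk(p1, s2);
        if (*s1 != NULL) {
            *(*s1)++ = '\0';
        }
    }
    return p1;
}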

char *str_squeeze(char *s1, const char *s2)
{
    char *const p1 = s1;
    const char *const p2 = s2;

    s2 = strtok(p1, p2);
    while (s2 != NULL) {
        do {
            *s1++ = *s2++;
        } while (*s2 != '\0');
        s2 = strtok(NULL, p2);
    }
    *s1 = '\0';
    return p1;
}

char *str_squeeze_r(char *s1, const char *s2)
{
    char *const p1 = s1;
    const char *const p2 = s2;
    char *p3;

    s2 = str_tok_r(p1, p2, &p3);
    while (s2 != NULL) {
        do {
            *s1++ = *s2++;
        } while (*s2 != '\0');
        s2 = str_tok_r(NULL, p2, &p3);
    }
    *s1 = '\0';
    return p1;
}

char *str_squeeze_s(char *s1, const char *s2)
{
    char *const p1 = s1;
    const char *const p2 = s2;
    char *p3 = s1;

    do {
        s2 = str_sep(&p3, p2);
        while (*s2 != '\0') {
            *s1++ = *s2++;
        }
    } while (p3 != NULL);
    *s1 = '\0';
    return p1;
}

/* END new.c */
 
Stan Milam

Michael said:
Hi,

I have a problem I don't understand when using strtok(). It seems that if I
make a call to strtok(), then make a call to another function that also
makes use of strtok(), the original call is somehow confused or upset.

The strtok() function uses a static char * to maintain the address of
the string it is parsing. If a new initializing call to strtok() is
made you will lose the address of the first string. Over the years I've
written several replacement functions for strtok() (which I believe
should be deprecated). My favorite is something I wrote a few years ago
in another language and ported recently to C. Here it is, so enjoy.

/**********************************************************************/
/* File Name: gettoken.c. */
/* Author: Stan Milam. */
/* Date Written: 15-Jan-2000. */
/* Description: */
/* Extract and remove a token from a string. Handles empty */
/* tokens. */
/* (c) Copyright 2006 by Stan Milam. */
/* All rights reserved. */
/* */
/**********************************************************************/

#include <errno.h>
#include <string.h>

#define strzcpy(d,s,l) (strncpy((d), (s), (l))[(l)] = '\0', (d))

/**********************************************************************/
/* Name: */
/* gettoken(). */
/* */
/* Synopsis: */
/* #include "strtools.h" */
/* char *gettoken( char *dest, char *source, const char *delimiters ); */
/* */
/* Description: */
/* The gettoken() function will extract tokens separated by a */
/* specified set of delimiters from a string and store the token */
/* value in the dest argument. Furthermore, the token is removed */
/* from the source string along with the delimiter. Empty token */
/* fields cause the destination value to be an empty string. */
/* */
/* Arguments: */
/* char *dest - Address of a buffer where the token will be */
/* stored. */
/* char *source - The address of the string containing one or */
/* more tokens. */
/* char *delimiters - The address of a string of characters used */
/* as token delimiters. */
/* */
/* Return Value: */
/* The gettoken() function will return the address of the */
/* destination argument upon successful completion, and will */
/* return NULL when there are no tokens left to extract or any */
/* one of the arguments is a NULL pointer. Should one of the */
/* arguments be a NULL pointer the global errno variable will be */
/* set to EINVAL. */
/* */
/**********************************************************************/

char *
gettoken( char *dest, char *source, const char *delimiters )
{
    char *rv = NULL;

    if ( dest == NULL || source == NULL || delimiters == NULL )
        errno = EINVAL;
    else {
        *dest = '\0';
        if ( *source ) {
            char *ptr = strpbrk( source, delimiters );

            /**********************************************************/
            /* At this point we know we have something, perhaps an    */
            /* empty token. Default the return value to the           */
            /* destination address. If the result of strpbrk() is not */
            /* NULL and not the same as the source, copy the token    */
            /* into the destination string.                           */
            /**********************************************************/

            rv = dest;
            if ( ptr != NULL ) {
                char *tmp = ptr++;
                if ( source != tmp )
                    rv = strzcpy( dest, source, (size_t)(tmp-source) );
            }

            /**********************************************************/
            /* If there are no delimiters the source is the token.    */
            /**********************************************************/

            else {
                rv = strcpy( dest, source );
                ptr = (char *) source + strlen( source );
            }

            /**********************************************************/
            /* Copy the source string down past the token we just     */
            /* found.                                                 */
            /**********************************************************/

            memmove( (char *)source, ptr, strlen( ptr ) + 1 );
        }
    }
    return rv;
}


#ifdef TEST

#include <stdio.h>
#include <assert.h>

int
main( void )
{
    char dest[100];
    char delim[]="|;!";
    char a[] = "|B.B. Shagnasty|!Shagnasty, William B.|Billy Bob Shagnasty|;!";

    errno = 0;
    assert( gettoken( NULL, a, delim ) == NULL);
    assert( errno == EINVAL ); errno = 0;

    assert( gettoken( dest, NULL, delim ) == NULL);
    assert( errno == EINVAL ); errno = 0;

    assert( gettoken( dest, a, NULL ) == NULL );
    assert( errno == EINVAL ); errno = 0;

    while( gettoken( dest, a, delim ) )
        puts( dest );

    return 0;
}
#endif


--
Regards,
Stan Milam
=============================================================
Charter Member of The Society for Mediocre Guitar Playing on
Expensive Instruments, Ltd.
=============================================================
 
Ben Pfaff

Michael said:
I have a problem I don't understand when using strtok(). It seems that if I
make a call to strtok(), then make a call to another function that also
makes use of strtok(), the original call is somehow confused or upset.

strtok() has at least these problems:

* It merges adjacent delimiters. If you use a comma as your
delimiter, then "a,,b,c" will be divided into three tokens,
not four. This is often the wrong thing to do. In fact, it
is only the right thing to do, in my experience, when the
delimiter set contains white space (for dividing a string
into "words") or it is known in advance that there will be
no adjacent delimiters.

* The identity of the delimiter is lost, because it is
changed to a null terminator.

* It modifies the string that it tokenizes. This is bad
because it forces you to make a copy of the string if
you want to use it later. It also means that you can't
tokenize a string literal with it; this is not
necessarily something you'd want to do all the time but
it is surprising.

* It can only be used once at a time. If a sequence of
strtok() calls is ongoing and another one is started,
the state of the first one is lost. This isn't a
problem for small programs but it is easy to lose track
of such things in hierarchies of nested functions in
large programs. In other words, strtok() breaks
encapsulation.
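
To illustrate the first three points, here is a small sketch that counts the fields of "a,,b,c" two ways. Both helper names are invented for this example. The second helper is one possible alternative built on strcspn(): it keeps empty fields, never writes to the string, and leaves the delimiter in place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts tokens the way strtok() sees them.
   Adjacent delimiters are merged, and s is modified in place. */
size_t count_strtok_fields(char *s, const char *delims)
{
    size_t n = 0;
    char *tok;

    for (tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims)) {
        n++;
    }
    return n;
}

/* Hypothetical helper: returns the start of the next field and stores its
   length in *len, or returns NULL when no fields remain. The string is
   never written to, so a string literal is acceptable input, and the
   delimiter that ended the field is still readable at field[*len]. */
const char *next_field(const char **cursor, const char *delims, size_t *len)
{
    const char *start = *cursor;

    if (start == NULL) {
        return NULL;
    }
    *len = strcspn(start, delims);
    /* Step over one delimiter only, so an empty field is reported. */
    *cursor = (start[*len] != '\0') ? start + *len + 1 : NULL;
    return start;
}
```

For "a,,b,c" with "," as the delimiter, count_strtok_fields() reports three tokens, while a loop over next_field() visits four fields, the second of which has length zero. Because the fields are not null-terminated, the caller has to use the returned length when copying or printing them.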
 
