using strtok

Mr John FO Evans

I came across an interesting limitation to the use of strtok.

I have two strings on which I want strtok to operate.
However since strtok has only one memory of the residual string I must
complete one set of operations before starting on the second. This
is inconvenient in the context of my program!

So far the only solution I can see is to write a replacement for strtok
to use on one of the strings. Can anyone offer an alternative?
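
To make the limitation concrete, here is a minimal sketch. The helper name third_call_token and the sample strings are made up for illustration. It starts tokenizing one string, then a second, and returns what the next strtok(NULL, ...) call yields. Because strtok keeps a single saved position, that call continues in the second string.

```c
#include <string.h>

/* Hypothetical helper for illustration: starts tokenizing a, then b,
   and returns whatever the next strtok(NULL, ...) call yields. */
const char *third_call_token(char *a, char *b)
{
    strtok(a, ",");           /* first token of a; strtok now remembers a */
    strtok(b, ",");           /* first token of b; the position in a is lost */
    return strtok(NULL, ","); /* continues in b, not in a */
}
```

Called with writable arrays holding "a1,a2,a3" and "b1,b2,b3", this returns "b2" rather than "a2".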
 
matevzb

Mr John FO Evans said:
I came across an interesting limitation to the use of strtok.

I have two strings on which I want strtok to operate.
However since strtok has only one memory of the residual string I must
complete one set of operations before starting on the second. This
is inconvenient in the context of my program!

So far the only solution I can see is to write a replacement for strtok
to use on one of the strings. Can anyone offer an alternative?

Not really, unless you'd be willing to use <OT>POSIX/SUS's strtok_r()</OT>
and skip on the portability. Otherwise, look at CBFalconer's
replacement toksplit(), discussed at
http://groups.google.com/group/comp.lang.c/browse_frm/thread/7b58085dd57c3a5b.
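
For readers who can accept the POSIX dependency, a sketch of the strtok_r() route might look like the following. The helper name pair_tokens and its output format are assumptions made for this example, and _POSIX_C_SOURCE is defined on the assumption that the platform needs it to declare strtok_r(). Each string gets its own saved position, so the two can be tokenized in alternation.

```c
#define _POSIX_C_SOURCE 200809L /* ask for POSIX declarations such as strtok_r */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: walks two comma-separated strings in lockstep and
   writes pairs such as "a1=b1 a2=b2 " into out (outsz must be at least 1). */
void pair_tokens(char *a, char *b, char *out, size_t outsz)
{
    char *save_a, *save_b;   /* one saved position per string */
    char *ta = strtok_r(a, ",", &save_a);
    char *tb = strtok_r(b, ",", &save_b);
    size_t used = 0;

    out[0] = '\0';
    while (ta != NULL && tb != NULL) {
        int n = snprintf(out + used, outsz - used, "%s=%s ", ta, tb);
        if (n < 0 || (size_t)n >= outsz - used) {
            out[used] = '\0';   /* drop the pair that did not fit */
            break;
        }
        used += (size_t)n;
        ta = strtok_r(NULL, ",", &save_a);
        tb = strtok_r(NULL, ",", &save_b);
    }
}
```

Like strtok, strtok_r writes '\0' characters into the strings it scans, so both arguments must be writable arrays rather than string literals.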
 
santosh

Mr John FO Evans said:
I came across an interesting limitation to the use of strtok.

I have two strings on which I want strtok to operate.
However since strtok has only one memory of the residual string I must
complete one set of operations before starting on the second. This
is inconvenient in the context of my program!

So far the only solution I can see is to write a replacement for strtok
to use on one of the strings. Can anyone offer an alternative?

POSIX specifies a strtok_r that was designed to work around the
reentrancy issue of strtok. If you want full portability however,
you'll have to roll your own version. It's not difficult and can be
done in completely standard C. CBFalconer periodically publishes his
toksplit function to this group. Use Google Group's search facility to
locate the source.
 
CBFalconer

Mr John FO Evans said:
I came across an interesting limitation to the use of strtok.

I have two strings on which I want strtok to operate.
However since strtok has only one memory of the residual string I
must complete one set of operations before starting on the second.
This is inconvenient in the context of my program!

So far the only solution I can see is to write a replacement for
strtok to use on one of the strings. Can anyone offer an
alternative?

Try this:

/* ------- file toksplit.c ----------*/
#include "toksplit.h"

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.

The caller must supply sufficient space in token to
receive any token; otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.

A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13
*/

const char *toksplit(const char *src, /* Source of tokens */
                     char tokchar,    /* token delimiting char */
                     char *token,     /* receiver of parsed token */
                     size_t lgh)      /* length token can receive */
                                      /* not including final '\0' */
{
    if (src) {
        while (' ' == *src) src++;

        while (*src && (tokchar != *src)) {
            if (lgh) {
                *token++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src && (tokchar == *src)) src++;
    }
    *token = '\0';
    return src;
} /* toksplit */

#ifdef TESTING
#include <stdio.h>

#define ABRsize 6 /* length of acceptable token abbreviations */

/* ---------------- */

static void showtoken(int i, char *tok)
{
    putchar(i + '1'); putchar(':');
    puts(tok);
} /* showtoken */

/* ---------------- */

int main(void)
{
    char teststring[] = "This is a test, ,, abbrev, more";

    const char *t, *s = teststring;
    int i;
    char token[ABRsize + 1];

    puts(teststring);
    t = s;
    for (i = 0; i < 4; i++) {
        t = toksplit(t, ',', token, ABRsize);
        showtoken(i, token);
    }

    puts("\nHow to detect 'no more tokens' while truncating");
    t = s; i = 0;
    while (*t) {
        t = toksplit(t, ',', token, 3);
        showtoken(i, token);
        i++;
    }

    puts("\nUsing blanks as token delimiters");
    t = s; i = 0;
    while (*t) {
        t = toksplit(t, ' ', token, ABRsize);
        showtoken(i, token);
        i++;
    }
    return 0;
} /* main */

#endif
/* ------- end file toksplit.c ----------*/

/* ------- file toksplit.h ----------*/
#ifndef H_toksplit_h
# define H_toksplit_h

# ifdef __cplusplus
extern "C" {
# endif

#include <stddef.h>

/* copy over the next token from an input string, after
skipping leading blanks (or other whitespace?). The
token is terminated by the first appearance of tokchar,
or by the end of the source string.

The caller must supply sufficient space in token to
receive any token; otherwise tokens will be truncated.

Returns: a pointer past the terminating tokchar.

This will happily return an infinity of empty tokens if
called with src pointing to the end of a string. Tokens
will never include a copy of tokchar.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
*/

const char *toksplit(const char *src, /* Source of tokens */
                     char tokchar,    /* token delimiting char */
                     char *token,     /* receiver of parsed token */
                     size_t lgh);     /* length token can receive */
                                      /* not including final '\0' */

# ifdef __cplusplus
}
# endif
#endif
/* ------- end file toksplit.h ----------*/
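
Since toksplit keeps no hidden state, two strings can be walked in alternation, which is what the original question asked for. The following is only a sketch: toksplit is reproduced from the post so that the block compiles on its own, while the helper count_matching, its 15-character token buffers, and the sample data are assumptions made for illustration. Both arguments are assumed to be non-NULL.

```c
#include <stddef.h>
#include <string.h>

/* toksplit as posted above (C.B. Falconer, public domain), reproduced so
   that this sketch compiles on its own. */
const char *toksplit(const char *src, char tokchar, char *token, size_t lgh)
{
    if (src) {
        while (' ' == *src) src++;
        while (*src && (tokchar != *src)) {
            if (lgh) {
                *token++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src && (tokchar == *src)) src++;
    }
    *token = '\0';
    return src;
}

/* Hypothetical helper: counts the positions at which two comma-separated
   lists hold the same token, reading both lists in alternation.
   Tokens longer than 15 characters are truncated before comparison. */
int count_matching(const char *a, const char *b)
{
    char ta[16], tb[16];
    int same = 0;

    while (*a && *b) {
        a = toksplit(a, ',', ta, sizeof ta - 1);
        b = toksplit(b, ',', tb, sizeof tb - 1);
        if (strcmp(ta, tb) == 0)
            same++;
    }
    return same;
}
```

For example, count_matching("red, green, blue", "red, grey, blue") returns 2. Because toksplit copies tokens out instead of modifying the source, string literals are acceptable as input here.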

 
Servé Laurijssen

matevzb said:
Not really, unless you'd be willing to use <OT>POSIX/SUS's strtok_r()</OT>
and skip on the portability. Otherwise, look at CBFalconer's
replacement toksplit(), discussed at

Are POSIX sources available?
 
CBFalconer

Servé Laurijssen said:
Are POSIX sources available?

Anything published here, as ready to go, will either run on POSIX
or there will be much wailing, teeth gnashing and berating from the
regulars. We deal with portable code here. Which, in turn, is why
POSIX is off-topic.

Please do not remove attribution lines for material you quote.
Those are the initial lines that say "Joe wrote:" or similar.
 
matevzb

Servé Laurijssen said:
Are POSIX sources available?

POSIX/SUS, similar to ISO C, is a specification, so the answer would
be no. Source code for specific implementations may be available (e.g.
GNU libc), but whether or not they are portable and/or conform to
POSIX is another question. I'd say you're better off with toksplit().
 
Servé Laurijssen

CBFalconer said:
Anything published here, as ready to go, will either run on POSIX
or there will be much wailing, teeth gnashing and berating from the
regulars. We deal with portable code here. Which, in turn, is why
POSIX is off-topic.

I was just wondering why your function is not considered off-topic, but every
time POSIX is mentioned it's off-topic. One could mention that to get a
portable version of strtok_r you can strip it from POSIX.

CBFalconer said:
Please do not remove attribution lines for material you quote.
Those are the initial lines that say "Joe wrote:" or similar.

That was a mistake, sorry.
 
Ben Pfaff

Servé Laurijssen said:
I was just wondering why your function is not considered off-topic, but every
time POSIX is mentioned it's off-topic. One could mention that to get a
portable version of strtok_r you can strip it from POSIX.

POSIX is a standard. It's not a collection of source code from
which you can strip anything.
 
pete

santosh wrote:
POSIX specifies a strtok_r that was designed to work around the
reentrancy issue of strtok. If you want full portability however,
you'll have to roll your own version. It's not difficult and can be
done in completely standard C.

#include <stddef.h>

char *str_tok_r(char *s1, const char *s2, char **s3);
char *str_chr(const char *s, int c);
size_t str_spn(const char *s1, const char *s2);
size_t str_cspn(const char *s1, const char *s2);

char *str_tok_r(char *s1, const char *s2, char **s3)
{
    if (s1 != NULL) {
        *s3 = s1;
    }
    s1 = *s3 + str_spn(*s3, s2);
    if (*s1 == '\0') {
        return NULL;
    }
    *s3 = s1 + str_cspn(s1, s2);
    if (**s3 != '\0') {
        *(*s3)++ = '\0';
    }
    return s1;
}

size_t str_spn(const char *s1, const char *s2)
{
    size_t n;

    for (n = 0; *s1 != '\0' && str_chr(s2, *s1) != NULL; ++s1) {
        ++n;
    }
    return n;
}

size_t str_cspn(const char *s1, const char *s2)
{
    size_t n;

    for (n = 0; str_chr(s2, *s1) == NULL; ++s1) {
        ++n;
    }
    return n;
}

char *str_chr(const char *s, int c)
{
    while (*s != (char)c) {
        if (*s == '\0') {
            return NULL;
        }
        ++s;
    }
    return (char *)s;
}
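
A caller-supplied save pointer also makes nested tokenizing possible, which plain strtok cannot do. The sketch below is not pete's code: tok_r is a made-up name for a compact variant with the same calling convention, built on the standard strspn() and strcspn(), and count_fields with its record format is a hypothetical example.

```c
#include <stddef.h>
#include <string.h>

/* Same calling convention as str_tok_r above, but built on the standard
   strspn/strcspn; the name tok_r is made up for this sketch. */
char *tok_r(char *s, const char *delims, char **save)
{
    if (s == NULL)
        s = *save;
    s += strspn(s, delims);          /* skip leading delimiters */
    if (*s == '\0') {
        *save = s;
        return NULL;                 /* nothing left */
    }
    *save = s + strcspn(s, delims);  /* find the end of the token */
    if (**save != '\0')
        *(*save)++ = '\0';           /* terminate it and step past */
    return s;
}

/* Hypothetical helper: counts the comma-separated fields in every
   semicolon-separated record, using one saved position per level. */
int count_fields(char *text)
{
    char *rec_save, *fld_save;
    char *rec, *fld;
    int fields = 0;

    for (rec = tok_r(text, ";", &rec_save); rec != NULL;
         rec = tok_r(NULL, ";", &rec_save)) {
        for (fld = tok_r(rec, ",", &fld_save); fld != NULL;
             fld = tok_r(NULL, ",", &fld_save)) {
            fields++;
        }
    }
    return fields;
}
```

With a writable array holding "a,b;c,d,e;f", count_fields returns 6. The inner loop does not disturb the outer one because each level passes its own save pointer.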
 
Richard Bos

Mr John FO Evans said:
I came across an interesting limitation to the use of strtok.

I have two strings on which I want strtok to operate.
However since strtok has only one memory of the residual string I must
complete one set of operations before starting on the second. This
is inconvenient in the context of my program!

So far the only solution I can see is to write a replacement for strtok
to use on one of the strings. Can anyone offer an alternative?

No. It's one of the many ways in which strtok() is unsuitable for most
of the jobs it was intended for. I've only come across one situation in
which strtok() was the right tool for the job, and I've since forgotten
what it was.

Richard
 
Flash Gordon

Servé Laurijssen wrote, On 11/03/07 19:02:
I was just wondering why your function is not considered off-topic, but every
time POSIX is mentioned it's off-topic. One could mention that to get a
portable version of strtok_r you can strip it from POSIX.

Because Chuck provides the source for it in standard C when he mentions
it, and code written in standard C is topical here. If you provide the
code for a POSIX function in standard C then we can talk about that, but
a number of things POSIX provides *cannot* be implemented in standard C.
 
Joe Wright

Richard Bos said:
No. It's one of the many ways in which strtok() is unsuitable for most
of the jobs it was intended for. I've only come across one situation in
which strtok() was the right tool for the job, and I've since forgotten
what it was.

Richard

Hear. strtok() bit me several years ago. I investigated and determined
why. As a result, I haven't used strtok() again.
 
Al Balmer

Nope, it's fine for the jobs it was intended for. It has problems when
used for jobs it wasn't intended for.

Richard Bos said:
I've only come across one situation in which strtok() was the right
tool for the job, and I've since forgotten what it was.

I've used it a number of times. If you are reading, tokenizing and
discarding a line, and it's guaranteed not to have missing items
(consecutive delimiters), or you don't care if it does, it works just
fine.

Joe Wright said:
Hear. strtok() bit me several years ago. I investigated and determined
why. As a result, I haven't used strtok() again.

Perhaps you haven't had occasion to use it, but having determined what
it does, you certainly shouldn't be afraid to.
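
The caveat about missing items can be seen in a small sketch. The helper name count_tokens is made up for illustration. strtok treats a run of consecutive delimiters as a single separator, so an empty field between two commas is never reported.

```c
#include <string.h>

/* Hypothetical helper: counts the tokens strtok finds in a writable line. */
int count_tokens(char *line, const char *delims)
{
    int n = 0;
    char *tok;

    for (tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}
```

A writable array holding "a,b,c" yields 3 tokens, but "a,,c" yields only 2, so code that needs to know the second field was empty has to use something other than strtok.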