using strtok

Discussion in 'C Programming' started by Mr John FO Evans, Mar 10, 2007.

  1. I came across an interesting limitation to the use of strtok.

    I have two strings on which I want strtok to operate.
    However since strtok has only one memory of the residual string I must
    complete one set of operations before starting on the second. This
    is inconvenient in the context of my program!

    So far the only solution I can see is to write a replacement for strtok
    to use on one of the strings. Can anyone offer an alternative?



    --
    _ _________________________________________
    / \._._ |_ _ _ /' Orpheus Internet Services
    \_/| |_)| |(/_|_|_> / 'Internet for Everyone'
    _______ | ___________./ http://www.orpheusinternet.co.uk
     
    Mr John FO Evans, Mar 10, 2007
    #1

  2. Mr John FO Evans

    matevzb Guest

    On Mar 10, 12:35 pm, Mr John FO Evans <> wrote:
    > I came across an interesting limitation to the use of strtok.
    >
    > I have two strings on which I want strtok to operate.
    > However since strtok has only one memory of the residual string I must
    > complete one set of operations before starting on the second. This
    > is inconvenient in the context of my program!
    >
    > So far the only solution I can see is to write a replacement for strtok
    > to use on one of the strings. Can anyone offer an alternative?

    Not really, unless you'd be willing to use <OT>POSIX/SUS's strtok_r()</OT>
    and give up on portability. Otherwise, look at CBFalconer's
    replacement toksplit(), discussed at
    http://groups.google.com/group/comp.lang.c/browse_frm/thread/7b58085dd57c3a5b.
    --
    WYCIWYG - what you C is what you get
     
    matevzb, Mar 10, 2007
    #2

  3. Mr John FO Evans

    santosh Guest

    Mr John FO Evans wrote:
    > I came across an interesting limitation to the use of strtok.
    >
    > I have two strings on which I want strtok to operate.
    > However since strtok has only one memory of the residual string I must
    > complete one set of operations before starting on the second. This
    > is inconvenient in the context of my program!
    >
    > So far the only solution I can see is to write a replacement for strtok
    > to use on one of the strings. Can anyone offer an alternative?


    POSIX specifies a strtok_r that was designed to work around the
    reentrancy issue of strtok. If you want full portability however,
    you'll have to roll your own version. It's not difficult and can be
    done in completely standard C. CBFalconer periodically publishes his
    toksplit function to this group. Use Google Group's search facility to
    locate the source.
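
As a rough illustration of the strtok_r route mentioned above: each scan keeps its position in a caller-supplied `char *`, so two strings can be tokenized in lockstep. This is only a sketch, assuming a POSIX system; the helper name `count_matching_tokens` is made up for the example and is not part of any library.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations such as strtok_r */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: walk two delimited strings in lockstep, taking one
   token from each per step, and count the positions where the tokens are
   equal. Both buffers are modified, just as strtok would modify them. */
size_t count_matching_tokens(char *a, char *b, const char *delims)
{
    char *save_a = NULL;  /* scan position for a, held by the caller */
    char *save_b = NULL;  /* independent scan position for b */
    char *ta = strtok_r(a, delims, &save_a);
    char *tb = strtok_r(b, delims, &save_b);
    size_t matches = 0;

    while (ta != NULL && tb != NULL) {
        if (strcmp(ta, tb) == 0) {
            ++matches;
        }
        /* Each scan resumes from its own saved position. */
        ta = strtok_r(NULL, delims, &save_a);
        tb = strtok_r(NULL, delims, &save_b);
    }
    return matches;
}
```

Called on writable copies of "red,green,blue" and "red,grey,blue,black" with "," as the delimiter set, the first and third positions agree. The same loop written with plain strtok would not work, because the second call with a non-null string would overwrite the only saved position.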
     
    santosh, Mar 10, 2007
    #3
  4. Mr John FO Evans

    CBFalconer Guest

    Mr John FO Evans wrote:
    >
    > I came across an interesting limitation to the use of strtok.
    >
    > I have two strings on which I want strtok to operate.
    > However since strtok has only one memory of the residual string I
    > must complete one set of operations before starting on the second.
    > This is inconvenient in the context of my program!
    >
    > So far the only solution I can see is to write a replacement for
    > strtok to use on one of the strings. Can anyone offer an
    > alternative?


    Try this:

    /* ------- file toksplit.c ----------*/
    #include "toksplit.h"

    /* copy over the next token from an input string, after
    skipping leading blanks (or other whitespace?). The
    token is terminated by the first appearance of tokchar,
    or by the end of the source string.

    The caller must supply sufficient space in token to
    receive any token. Otherwise tokens will be truncated.

    Returns: a pointer past the terminating tokchar.

    This will happily return an infinity of empty tokens if
    called with src pointing to the end of a string. Tokens
    will never include a copy of tokchar.

    A better name would be "strtkn", except that is reserved
    for the system namespace. Change to that at your risk.

    released to Public Domain, by C.B. Falconer.
    Published 2006-02-20. Attribution appreciated.
    Revised 2006-06-13
    */

    const char *toksplit(const char *src, /* Source of tokens */
                         char tokchar,    /* token delimiting char */
                         char *token,     /* receiver of parsed token */
                         size_t lgh)      /* length token can receive */
                                          /* not including final '\0' */
    {
        if (src) {
            while (' ' == *src) src++;

            while (*src && (tokchar != *src)) {
                if (lgh) {
                    *token++ = *src;
                    --lgh;
                }
                src++;
            }
            if (*src && (tokchar == *src)) src++;
        }
        *token = '\0';
        return src;
    } /* toksplit */

    #ifdef TESTING
    #include <stdio.h>

    #define ABRsize 6 /* length of acceptable token abbreviations */

    /* ---------------- */

    static void showtoken(int i, char *tok)
    {
        putchar(i + '1'); putchar(':');
        puts(tok);
    } /* showtoken */

    /* ---------------- */

    int main(void)
    {
        char teststring[] = "This is a test, ,, abbrev, more";

        const char *t, *s = teststring;
        int i;
        char token[ABRsize + 1];

        puts(teststring);
        t = s;
        for (i = 0; i < 4; i++) {
            t = toksplit(t, ',', token, ABRsize);
            showtoken(i, token);
        }

        puts("\nHow to detect 'no more tokens' while truncating");
        t = s; i = 0;
        while (*t) {
            t = toksplit(t, ',', token, 3);
            showtoken(i, token);
            i++;
        }

        puts("\nUsing blanks as token delimiters");
        t = s; i = 0;
        while (*t) {
            t = toksplit(t, ' ', token, ABRsize);
            showtoken(i, token);
            i++;
        }
        return 0;
    } /* main */

    #endif
    /* ------- end file toksplit.c ----------*/

    /* ------- file toksplit.h ----------*/
    #ifndef H_toksplit_h
    # define H_toksplit_h

    # ifdef __cplusplus
    extern "C" {
    # endif

    #include <stddef.h>

    /* copy over the next token from an input string, after
    skipping leading blanks (or other whitespace?). The
    token is terminated by the first appearance of tokchar,
    or by the end of the source string.

    The caller must supply sufficient space in token to
    receive any token. Otherwise tokens will be truncated.

    Returns: a pointer past the terminating tokchar.

    This will happily return an infinity of empty tokens if
    called with src pointing to the end of a string. Tokens
    will never include a copy of tokchar.

    released to Public Domain, by C.B. Falconer.
    Published 2006-02-20. Attribution appreciated.
    */

    const char *toksplit(const char *src, /* Source of tokens */
                         char tokchar,    /* token delimiting char */
                         char *token,     /* receiver of parsed token */
                         size_t lgh);     /* length token can receive */
                                          /* not including final '\0' */

    # ifdef __cplusplus
    }
    # endif
    #endif
    /* ------- end file toksplit.h ----------*/
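
For the original question (two strings scanned at the same time), a possible way to use toksplit is sketched below. Because toksplit keeps no hidden state and the caller holds the returned pointer, two scans can be interleaved freely. The function is repeated from the post so the fragment compiles on its own; the helper `count_equal_fields` and the 32-byte field buffers are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <string.h>

/* toksplit() as posted above (public domain, C.B. Falconer), repeated
   here only so that this sketch is self-contained. */
const char *toksplit(const char *src, char tokchar, char *token, size_t lgh)
{
    if (src) {
        while (' ' == *src) src++;
        while (*src && (tokchar != *src)) {
            if (lgh) {
                *token++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src && (tokchar == *src)) src++;
    }
    *token = '\0';
    return src;
}

/* Hypothetical helper: step through two comma-separated strings together
   and count the field positions whose contents are equal. Neither input
   is modified. Fields longer than 31 characters are truncated. */
size_t count_equal_fields(const char *a, const char *b)
{
    char fa[32], fb[32];
    size_t equal = 0;

    while (*a && *b) {
        a = toksplit(a, ',', fa, sizeof fa - 1);  /* scan position for a */
        b = toksplit(b, ',', fb, sizeof fb - 1);  /* scan position for b */
        if (strcmp(fa, fb) == 0) {
            ++equal;
        }
    }
    return equal;
}
```

Note that, unlike strtok, an empty field between two adjacent commas is reported as an empty token, so "a,,c" is treated as three fields here.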

    --
    <http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
    <http://www.securityfocus.com/columnists/423>

    "A man who is right every time is not likely to do very much."
    -- Francis Crick, co-discoverer of DNA
    "There is nothing more amazing than stupidity in action."
    -- Thomas Matthews



    --
    Posted via a free Usenet account from http://www.teranews.com
     
    CBFalconer, Mar 10, 2007
    #4
  5. "matevzb" <> wrote in message
    news:...
    >> So far the only solution I can see is to write a replacement for strtok
    >> to use on one of the strings. Can anyone offer an alternative?

    > Not really, unless you'd be willing to use <OT>POSIX/SUS's strtok_r()</
    > OT> and skip on the portability. Otherwise, look at CBFalconer's
    > replacement toksplit(), discussed at


    are POSIX sources available?
     
    Servé Laurijssen, Mar 11, 2007
    #5
  6. Mr John FO Evans

    CBFalconer Guest

    "Servé Laurijssen" wrote:
    > "matevzb" <> wrote in message
    >
    >>> So far the only solution I can see is to write a replacement for
    >>> strtok to use on one of the strings. Can anyone offer an alternative?

    >>
    >> Not really, unless you'd be willing to use <OT>POSIX/SUS's strtok_r()<
    >> /OT> and skip on the portability. Otherwise, look at CBFalconer's
    >> replacement toksplit(), discussed at

    >
    > are POSIX sources available?


    Anything published here, as ready to go, will either run on POSIX
    or there will be much wailing, teeth gnashing and berating from the
    regulars. We deal with portable code here. Which, in turn, is why
    POSIX is off-topic.

    Please do not remove attribution lines for material you quote.
    Those are the initial lines that say "Joe wrote:" or similar.

    --
    Chuck F (cbfalconer at maineline dot net)
    Available for consulting/temporary embedded and systems.
    <http://cbfalconer.home.att.net>



    --
    Posted via a free Usenet account from http://www.teranews.com
     
    CBFalconer, Mar 11, 2007
    #6
  7. Mr John FO Evans

    matevzb Guest

    On Mar 11, 3:59 pm, "Servé Laurijssen" <> wrote:
    > "matevzb" <> wrote in message
    >
    > news:...
    >
    > >> So far the only solution I can see is to write a replacement for strtok
    > >> to use on one of the strings. Can anyone offer an alternative?

    > > Not really, unless you'd be willing to use <OT>POSIX/SUS's strtok_r()</
    > > OT> and skip on the portability. Otherwise, look at CBFalconer's
    > > replacement toksplit(), discussed at

    >
    > are POSIX sources available?

    POSIX/SUS, similar to ISO C, is a specification, so the answer would
    be no. Source code for specific implementations may be available (e.g.
    GNU libc), but whether or not they are portable and/or conform to
    POSIX is another question. I'd say you're better off with toksplit().
    --
    WYCIWYG - what you C is what you get
     
    matevzb, Mar 11, 2007
    #7
  8. "CBFalconer" <> wrote in message
    news:...
    > Anything published here, as ready to go, will either run on POSIX
    > or there will be much wailing, teeth gnashing and berating from the
    > regulars. We deal with portable code here. Which, in turn, is why
    > POSIX is off-topic.


    I was just wondering why your function is not considered off-topic, but
    every time POSIX is mentioned it's off-topic. One could mention that to
    get a portable version of strtok_r you can strip it from POSIX.

    > Please do not remove attribution lines for material you quote.
    > Those are the initial lines that say "Joe wrote:" or similar.


    That was a mistake, sorry.
     
    Servé Laurijssen, Mar 11, 2007
    #8
  9. Mr John FO Evans

    Ben Pfaff Guest

    "Servé Laurijssen" <> writes:

    > I was just wondering why your function is not considered off-topic but every
    > time posix is mentioned its off topic. One could mention that to get a
    > portable version of strtok_r you can strip it from posix.


    POSIX is a standard. It's not a collection of source code from
    which you can strip anything.
    --
    int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
    \n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
    );while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
    );}return 0;}
     
    Ben Pfaff, Mar 11, 2007
    #9
  10. Mr John FO Evans

    pete Guest

    santosh wrote:

    > POSIX specifies a strtok_r that was designed to work around the
    > reentrancy issue of strtok. If you want full portability however,
    > you'll have to roll your own version. It's not difficult and can be
    > done in completely standard C.


    #include <stddef.h>

    char *str_tok_r(char *s1, const char *s2, char **s3);
    char *str_chr(const char *s, int c);
    size_t str_spn(const char *s1, const char *s2);
    size_t str_cspn(const char *s1, const char *s2);

    char *str_tok_r(char *s1, const char *s2, char **s3)
    {
        if (s1 != NULL) {
            *s3 = s1;
        }
        s1 = *s3 + str_spn(*s3, s2);
        if (*s1 == '\0') {
            return NULL;
        }
        *s3 = s1 + str_cspn(s1, s2);
        if (**s3 != '\0') {
            *(*s3)++ = '\0';
        }
        return s1;
    }

    size_t str_spn(const char *s1, const char *s2)
    {
        size_t n;

        for (n = 0; *s1 != '\0' && str_chr(s2, *s1) != NULL; ++s1) {
            ++n;
        }
        return n;
    }

    size_t str_cspn(const char *s1, const char *s2)
    {
        size_t n;

        for (n = 0; str_chr(s2, *s1) == NULL; ++s1) {
            ++n;
        }
        return n;
    }

    char *str_chr(const char *s, int c)
    {
        while (*s != (char)c) {
            if (*s == '\0') {
                return NULL;
            }
            ++s;
        }
        return (char *)s;
    }
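
One situation where a reentrant tokenizer of this kind matters is a nested scan, e.g. records separated by ';' with fields separated by ','. The sketch below is not the code from the post: it is a shorter variant that assumes the standard strspn and strcspn from <string.h> are acceptable, and the names `tok_r` and `count_fields` are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of a reentrant tokenizer built on the standard
   strspn/strcspn. The scan position lives in *save, not in the function. */
char *tok_r(char *s, const char *delims, char **save)
{
    if (s == NULL) {
        s = *save;                 /* continue an earlier scan */
    }
    s += strspn(s, delims);        /* skip leading delimiters */
    if (*s == '\0') {
        *save = s;
        return NULL;               /* no more tokens */
    }
    *save = s + strcspn(s, delims);  /* find the end of this token */
    if (**save != '\0') {
        *(*save)++ = '\0';         /* terminate token, step past delimiter */
    }
    return s;
}

/* Count every field in text such as "a,b;c;d,e,f". The outer and inner
   scans each have their own state, so they do not disturb one another.
   The buffer is modified in place. */
size_t count_fields(char *text)
{
    char *rec_state, *fld_state;
    char *rec, *fld;
    size_t n = 0;

    for (rec = tok_r(text, ";", &rec_state); rec != NULL;
         rec = tok_r(NULL, ";", &rec_state)) {
        for (fld = tok_r(rec, ",", &fld_state); fld != NULL;
             fld = tok_r(NULL, ",", &fld_state)) {
            ++n;
        }
    }
    return n;
}
```

With plain strtok the inner loop would overwrite the outer loop's saved position, which is the limitation the original question describes.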

    --
    pete
     
    pete, Mar 11, 2007
    #10
  11. In article <>,
    Ben Pfaff <> wrote:
    >> One could mention that to get a
    >> portable version of strtok_r you can strip it from posix.


    >POSIX is a standard. It's not a collection of source code from
    >which you can strip anything.


    s/posix/one of the free posix implementations/

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
     
    Richard Tobin, Mar 11, 2007
    #11
  12. Mr John FO Evans

    Richard Bos Guest

    Mr John FO Evans <> wrote:

    > I came across an interesting limitation to the use of strtok.
    >
    > I have two strings on which I want strtok to operate.
    > However since strtok has only one memory of the residual string I must
    > complete one set of operations before starting on the second. This
    > is inconvenient in the context of my program!
    >
    > So far the only solution I can see is to write a replacement for strtok
    > to use on one of the strings. Can anyone offer an alternative?


    No. It's one of the many ways in which strtok() is unsuitable for most
    of the jobs it was intended for. I've only once come across a situation in
    which strtok() was the right tool for the job, and I've since forgotten
    what it was.

    Richard
     
    Richard Bos, Mar 12, 2007
    #12
  13. Mr John FO Evans

    Flash Gordon Guest

    Servé Laurijssen wrote, On 11/03/07 19:02:
    > "CBFalconer" <> wrote in message
    > news:...
    >> Anything published here, as ready to go, will either run on POSIX
    >> or there will be much wailing, teeth gnashing and berating from the
    >> regulars. We deal with portable code here. Which, in turn, is why
    >> POSIX is off-topic.

    >
    > I was just wondering why your function is not considered off-topic but every
    > time posix is mentioned its off topic. One could mention that to get a
    > portable version of strtok_r you can strip it from posix.


    Because Chuck provides the source for it in standard C when he mentions
    it, and code written in standard C is topical here. If you provide the
    code for a POSIX function in standard C then we can talk about that, but
    a number of things POSIX provides *cannot* be implemented in standard C.
    --
    Flash Gordon
     
    Flash Gordon, Mar 12, 2007
    #13
  14. Mr John FO Evans

    Joe Wright Guest

    Richard Bos wrote:
    > Mr John FO Evans <> wrote:
    >
    >> I came across an interesting limitation to the use of strtok.
    >>
    >> I have two strings on which I want strtok to operate.
    >> However since strtok has only one memory of the residual string I must
    >> complete one set of operations before starting on the second. This
    >> is inconvenient in the context of my program!
    >>
    >> So far the only solution I can see is to write a replacement for strtok
    >> to use on one of the strings. Can anyone offer an alternative?

    >
    > No. It's one of the many ways in which strtok() is unsuitable for most
    > of the jobs it was intended for. I've only come across a situation in
    > which strtok() was the right tool for the job, and I've since forgotten
    > what it was.
    >
    > Richard


    Hear. strtok() bit me several years ago. I investigated and determined
    why. As a result, I haven't used strtok() again.

    --
    Joe Wright
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Mar 13, 2007
    #14
  15. Mr John FO Evans

    Al Balmer Guest

    On Tue, 13 Mar 2007 15:58:32 -0400, Joe Wright
    <> wrote:

    >> No. It's one of the many ways in which strtok() is unsuitable for most
    >> of the jobs it was intended for.


    Nope, it's fine for the jobs it was intended for. It has problems when
    used for jobs it wasn't intended for.

    > I've only come across a situation in
    >> which strtok() was the right tool for the job, and I've since forgotten
    >> what it was.


    I've used it a number of times. If you are reading, tokenizing and
    discarding a line, and it's guaranteed not to have missing items
    (consecutive delimiters), or you don't care if it does, it works just
    fine.
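
The point about missing items can be seen with a small counting function. This is only a sketch; `count_tokens` is a made-up name. strtok treats a run of consecutive delimiters as a single separator, so an empty item between two commas is silently skipped rather than reported.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the tokens strtok finds in line.
   The buffer is modified, and strtok's single internal scan position is
   used, so no other strtok scan may be in progress at the same time. */
size_t count_tokens(char *line, const char *delims)
{
    size_t n = 0;
    char *tok;

    for (tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims)) {
        ++n;
    }
    return n;
}
```

For a writable copy of "a,,b" with "," as the delimiter set this counts two tokens, not three, which is fine if empty items cannot occur or do not matter, and a problem if field positions are significant.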
    >>
    >> Richard

    >
    >Hear. strtok() bit me several years ago. I investigated and determined
    >why. As a result, I haven't used strtok() again.


    Perhaps you haven't had occasion to use it, but having determined what
    it does, you certainly shouldn't be afraid to.

    --
    Al Balmer
    Sun City, AZ
     
    Al Balmer, Mar 13, 2007
    #15