tokenize a string

Discussion in 'C++' started by Kelvin@!!!, Feb 24, 2005.

  1. Kelvin@!!!

    Kelvin@!!! Guest

    hi:
    in C, we can use strtok() to tokenize a char*
    but i can't find any similar member function of string that can tokenize a
    string
    so how do i tokenize a string in C++?
    do it the C way?

    thanks
    --
    { Kelvin@!!! }
    remove the last .hk to reply
    thanks
     
    Kelvin@!!!, Feb 24, 2005
    #1

  2. Kelvin@!!!

    red floyd Guest

    Kelvin@!!! wrote:
    > hi:
    > in C, we can use strtok() to tokenize a char*
    > but i can't find any similar member function of string that can tokenize a
    > string
    > so how do i tokenize a string in C++?
    > do it the C way?
    >
    > thanks


    Look up std::istringstream in your favorite reference book.
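
    A minimal sketch of what that might look like, assuming the tokens are
    separated by whitespace. The name split_ws is made up for illustration
    and is not from any library. operator>> on a stream skips leading
    whitespace and reads up to the next whitespace character, which is
    roughly what strtok() does with " \t\n" as the delimiter set.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): split on whitespace
// by letting operator>> do the scanning.
std::vector<std::string> split_ws(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string word;
    while (in >> word) {      // extraction fails at end of input, ending the loop
        tokens.push_back(word);
    }
    return tokens;
}
```

    Calling split_ws("  in C  we can\tuse strtok ") gives six words, "in"
    through "strtok". For a single delimiter character such as ',' you can
    use std::getline(in, word, ',') in the loop instead, but note that it
    does not merge consecutive delimiters, so you get empty tokens.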
     
    red floyd, Feb 24, 2005
    #2

  3. Kelvin@!!!

    ulrich Guest

    On Thu, 24 Feb 2005 06:24:31 GMT, Kelvin@!!!
    <> wrote:

    > hi:
    > in C, we can use strtok() to tokenize a char*
    > but i can't find any similar member function of string that can tokenize
    > a
    > string
    > so how do i tokenize a string in C++?


    you may want to try boost::tokenizer and relatives.

    http://www.boost.org/libs/tokenizer/index.html
     
    ulrich, Feb 24, 2005
    #3
  4. Kelvin@!!!

    rossum Guest

    On Thu, 24 Feb 2005 06:24:31 GMT, "Kelvin@!!!"
    <> wrote:

    >hi:
    >in C, we can use strtok() to tokenize a char*
    >but i can't find any similar member function of string that can tokenize a
    >string
    >so how do i tokenize a string in C++?
    >do it the C way?
    >
    >thanks


    There is a sample chapter from Accelerated C++ on the web at
    http://www.awprofessional.com/articles/article.asp?p=25333

    The chapter has a function called split() that does what you seem to
    want. It takes a string and returns a vector of all the individual
    words:

    #include <algorithm>
    #include <cctype>
    #include <string>
    #include <vector>

    using std::find_if; using std::string; using std::vector;

    // true if the argument is whitespace, false otherwise
    // (the cast avoids undefined behaviour for negative char values)
    bool space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }

    // false if the argument is whitespace, true otherwise
    bool not_space(char c) { return !space(c); }

    vector<string> split(const string& str) {
        typedef string::const_iterator iter;
        vector<string> ret;
        iter i = str.begin();
        while (i != str.end()) {
            // ignore leading blanks
            i = find_if(i, str.end(), not_space);
            // find end of next word
            iter j = find_if(i, str.end(), space);
            // copy the characters in [i, j)
            if (i != str.end()) ret.push_back(string(i, j));
            i = j;
        }
        return ret;
    }

    There is a detailed explanation of the function in the text.

    rossum



    --

    The ultimate truth is that there is no Ultimate Truth
     
    rossum, Feb 24, 2005
    #4
  5. Kelvin@!!!

    Guest

    rossum wrote:

    > // true if the argument is whitespace, false otherwise
    > bool space(char c) { return isspace(c); }
    >
    > // false if the argument is whitespace, true otherwise
    > bool not_space(char c) { return !isspace(c); }
    >
    > vector<string> split(const string& str) {
    > typedef string::const_iterator iter;
    > vector<string> ret;
    > iter i = str.begin();
    > while (i != str.end()) {
    > // ignore leading blanks
    > i = find_if(i, str.end(), not_space);
    > // find end of next word
    > iter j = find_if(i, str.end(), space);
    > // copy the characters in [i, j)
    > if (i != str.end()) ret.push_back(string(i, j));
    > i = j;
    > }
    > return ret;
    > }


    This would be better if it was templatized by an insertion iterator
    rather than returning a vector by value. Something along the lines of
    (untested)

    #include <string>

    template <typename InsertIter>
    int
    tokenize(const std::string& buf,
             const std::string& delims,
             InsertIter it)
    {
        int numTokens = 0;

        // start position: first character that is not a delimiter
        std::string::size_type sp = buf.find_first_not_of(delims, 0);

        while (sp != std::string::npos) {
            // end position: next delimiter, or end of buffer if there is none
            std::string::size_type ep = buf.find_first_of(delims, sp);
            if (ep == std::string::npos) {
                ep = buf.length();
            }
            *it++ = buf.substr(sp, ep - sp);
            ++numTokens;
            // skip the run of delimiters to the start of the next token
            sp = buf.find_first_not_of(delims, ep);
        }

        return numTokens;
    }

    called as (needs <deque> and <iterator>)

    std::deque<std::string> tokens;
    int numTokens = tokenize(buf, delims, std::back_inserter(tokens));
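
    The point of taking an insertion iterator is that the caller picks the
    destination. A rough sketch of that, using a stand-in called
    tokenize_into with the same shape as tokenize() above; the names
    tokenize_into, as_vector and as_set are invented here for illustration.

```cpp
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in with the same shape as tokenize() in the post:
// writes each token through an output iterator and returns the count.
template <typename OutIter>
int tokenize_into(const std::string& buf, const std::string& delims, OutIter out)
{
    int count = 0;
    std::string::size_type sp = buf.find_first_not_of(delims);
    while (sp != std::string::npos) {
        std::string::size_type ep = buf.find_first_of(delims, sp);
        if (ep == std::string::npos) ep = buf.length();
        *out++ = buf.substr(sp, ep - sp);
        ++count;
        sp = buf.find_first_not_of(delims, ep);
    }
    return count;
}

// Same call, two destinations: a vector keeps order and duplicates...
std::vector<std::string> as_vector(const std::string& s, const std::string& d)
{
    std::vector<std::string> v;
    tokenize_into(s, d, std::back_inserter(v));
    return v;
}

// ...while a set keeps only the unique tokens, in sorted order.
std::set<std::string> as_set(const std::string& s, const std::string& d)
{
    std::set<std::string> u;
    tokenize_into(s, d, std::inserter(u, u.end()));
    return u;
}
```

    With the input "a,b;;a" and delimiters ",;" the vector version holds
    three tokens and the set version two, without touching the tokenizer.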

    /david
     
    , Feb 25, 2005
    #5