Splitting strings

Discussion in 'C++' started by Alan Woodland, Nov 19, 2009.

  1. Hi,

    I was looking for a clean, generic way to split strings around a
    character using STL algorithms. The best I could manage was this
    example, which isn't exactly great to say the least.

    #include <cassert>
    #include <vector>
    #include <algorithm>
    #include <string>
    #include <sstream>
    #include <iterator>
    #include <functional>
    #include <iostream>

    namespace {
      template <typename T>
      struct SplitHelper {
        std::basic_ostringstream<typename T::value_type> next;
        std::vector<T> result;
        typename T::value_type match;

        static bool test(SplitHelper& h, const typename T::value_type c) {
          if (c == h.match) {
            h.result.push_back(h.next.str());
            h.next.str(T());
          }

          return c == h.match;
        }
      };
    }

    template <typename T>
    std::vector<T> split(const T& str, const typename T::value_type c='/') {
      SplitHelper<T> h;
      h.match = c;
      h.result.reserve(std::count(str.begin(), str.end(), c));
      std::remove_copy_if(str.begin(), str.end(),
                          std::ostream_iterator<typename T::value_type>(h.next),
                          std::bind1st(std::ptr_fun(&h.test), h));
      h.result.push_back(h.next.str());
      return h.result;
    }

    #include <iostream>
    int main() {
      const std::string path = "Hello/cruel/world";
      const std::vector<std::string>& result = split(path);
      std::cout << result.size() << std::endl;
      assert(3==result.size());
      std::cout << result[0] << std::endl;
      std::cout << result[1] << std::endl;
      std::cout << result[2] << std::endl;
      return 0;
    }

    Is this really the tidiest way to do this using STL algorithms?
    Obviously it wouldn't be hard at all to do just using a for loop and two
    pointers, but I was trying to do this 'the STL way'.

    Thanks for any suggestions,
    Alan
    Alan Woodland, Nov 19, 2009
    #1

  2. Alan Woodland

    Jeff Flinn Guest

    Alan Woodland wrote:
    > Hi,
    >
    > I was looking for a clean, generic way to split strings around a
    > character using STL algorithms. The best I could manage was this
    > example, which isn't exactly great to say the least.
    >
    > #include <cassert>
    > #include <vector>
    > #include <algorithm>
    > #include <string>
    > #include <sstream>
    > #include <iterator>
    > #include <iostream>
    >
    > namespace {
    > template <typename T>
    > struct SplitHelper {
    > std::basic_ostringstream<typename T::value_type> next;
    > std::vector<T> result;
    > typename T::value_type match;
    >
    > static bool test(SplitHelper& h, const typename T::value_type c) {
    > if (c == h.match) {
    > h.result.push_back(h.next.str());
    > h.next.str(T());
    > }
    >
    > return c == h.match;
    > }
    > };
    > }
    >
    > std::vector<T> split(const T& str, const typename T::value_type c='/') {
    > SplitHelper<T> h;
    > h.match = c;
    > h.result.reserve(std::count(str.begin(), str.end(), c));
    > std::remove_copy_if(str.begin(), str.end(),
    > std::ostream_iterator<typename T::value_type>(h.next),
    > std::bind1st(std::ptr_fun(&h.test), h));
    > h.result.push_back(h.next.str());
    > return h.result;
    > }
    >
    > #include <iostream>
    > int main() {
    > const std::string path = "Hello/cruel/world";
    > const std::vector<std::string>& result = split(path);
    > std::cout << result.size() << std::endl;
    > assert(3==result.size());
    > std::cout << result[0] << std::endl;
    > std::cout << result[1] << std::endl;
    > std::cout << result[2] << std::endl;
    > return 0;
    > }
    >
    > Is this really the tidiest way to do this using STL algorithms?
    > Obviously it wouldn't be hard at all to do just using a for loop and two
    > pointers, but I was trying to do this 'the STL way'.
    >
    > Thanks for any suggestions,
    > Alan


    If you want to 'split' any string based on any character use
    boost::tokenizer, regex, xpressive or spirit. If you want to walk a file
    path use boost::filesystem.

    See www.boost.org

    Jeff
    Jeff Flinn, Nov 19, 2009
    #2

  3. Alan Woodland

    White Wolf Guest

    Jeff Flinn wrote:
    > Alan Woodland wrote:
    >> Hi,
    >>
    >> I was looking for a clean, generic way to split strings around a
    >> character using STL algorithms. The best I could manage was this
    >> example, which isn't exactly great to say the least.

    [SNIP]
    >> Is this really the tidiest way to do this using STL algorithms?
    >> Obviously it wouldn't be hard at all to do just using a for loop and two
    >> pointers, but I was trying to do this 'the STL way'.
    >>
    >> Thanks for any suggestions,
    >> Alan

    >
    > If you want to 'split' any string based on any character use
    > boost::tokenizer, regex, xpressive or spirit. If you want to walk a file
    > path use boost::filesystem.
    >
    > See www.boost.org



    While I agree that boost.org provides the solution, if we are into
    that, then it is string_algo and split, and not the rest you mention.

    However the OP's question was not how to split a path, or where he could
    find a library that provides split functionality. He (see quotes) very
    clearly said that a) he wants to make this and b) using STL algorithms
    (not Boost, not for loops).

    BR, WW
    White Wolf, Nov 19, 2009
    #3
  4. Alan Woodland

    White Wolf Guest

    Alan Woodland wrote:
    [SNIP]
    > Is this really the tidiest way to do this using STL algorithms?
    > Obviously it wouldn't be hard at all to do just using a for loop and two
    > pointers, but I was trying to do this 'the STL way'.


    I rarely use the STL algorithms, so my approach may be stupid. But I
    believe that what you need is not a copy_if kind of thing. What you are
    doing is a "sort of copying" of all elements, because even the separator
    changes the output: it starts a new string. So the algorithm is plain copy:

    output_iterator copy( input_iterator start, input_iterator end,
    output_iterator dest );

    Where start and end are the begin/end of your string.

    The output iterator needs to be a special insert iterator that adapts
    our "container" (std::vector<std::string>) into an output iterator that
    will add the character to the last existing string if the character is
    not a delimiter and add a new empty string otherwise.

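    struct split_result {
      typedef char value_type;              // required by std::back_inserter
      typedef char const& const_reference;  // used by pre-C++11 back_insert_iterator
      explicit split_result(char splitchar) : r(1), sc(splitchar) { }
      void push_back(char const &c) {
        if (c==sc) {
          r.push_back(std::string());       // separator: start a new string
        } else {
          r.back()+=c;                      // otherwise extend the last string
        }
      }
      std::vector<std::string> const& result() const { return r; }
    private:
      std::vector<std::string> r;
      const char sc;
    };

    split_result sr('/');
    std::copy(str.begin(), str.end(),std::back_inserter(sr));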

    I have not tried to compile this, it is off the top of my head, but I
    think it demonstrates the idea.

    BR, WW
    White Wolf, Nov 19, 2009
    #4
  5. Alan Woodland

    red floyd Guest

    On Nov 19, 3:59 am, Alan Woodland wrote:
    > Hi,
    >
    > I was looking for a clean, generic way to split strings around a
    > character using STL algorithms. The best I could manage was this
    > example, which isn't exactly great to say the least.
    >
    > #include <cassert>
    > #include <vector>
    > #include <algorithm>
    > #include <string>
    > #include <sstream>
    > #include <iterator>
    > #include <iostream>
    >
    > namespace {
    >   template <typename T>
    >   struct SplitHelper {
    >     std::basic_ostringstream<typename T::value_type> next;
    >     std::vector<T> result;
    >     typename T::value_type match;
    >
    >     static bool test(SplitHelper& h, const typename T::value_type c) {
    >       if (c == h.match) {
    >         h.result.push_back(h.next.str());
    >         h.next.str(T());
    >       }
    >
    >       return c == h.match;
    >     }
    >   };
    >
    > }
    >
    > std::vector<T> split(const T& str, const typename T::value_type c='/') {
    >   SplitHelper<T> h;
    >   h.match = c;
    >   h.result.reserve(std::count(str.begin(), str.end(), c));
    >   std::remove_copy_if(str.begin(), str.end(),
    > std::ostream_iterator<typename T::value_type>(h.next),
    > std::bind1st(std::ptr_fun(&h.test), h));
    >   h.result.push_back(h.next.str());
    >   return h.result;
    >
    > }
    >
    > #include <iostream>
    > int main() {
    >   const std::string path = "Hello/cruel/world";
    >   const std::vector<std::string>& result = split(path);
    >   std::cout << result.size() << std::endl;
    >   assert(3==result.size());
    >   std::cout << result[0] << std::endl;
    >   std::cout << result[1] << std::endl;
    >   std::cout << result[2] << std::endl;
    >   return 0;
    >
    > }
    >
    > Is this really the tidiest way to do this using STL algorithms?
    > Obviously it wouldn't be hard at all to do just using a for loop and two
    > pointers, but I was trying to do this 'the STL way'.


    What's wrong with using an istringstream?
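
    For readers following along, here is a minimal sketch of the
    istringstream idea, assuming it means std::getline with a custom
    delimiter. The name split_getline is made up for illustration and
    appears nowhere in the thread.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name not from the thread): split `text` on
// `separator` by reading fields from a string stream.
std::vector<std::string> split_getline(const std::string& text,
                                       char separator = '/')
{
    std::vector<std::string> result;
    std::istringstream in(text);
    std::string token;
    // std::getline reads up to the next separator and discards it.
    while (std::getline(in, token, separator)) {
        result.push_back(token);
    }
    return result;
}
```

    Calling split_getline("Hello/cruel/world") should give the three
    strings from the original example. Note that with this approach an
    empty input yields no tokens and a trailing separator does not
    produce a trailing empty string, which may or may not be the
    behaviour wanted.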
    red floyd, Nov 19, 2009
    #5
  6. Alan Woodland

    Jeff Flinn Guest

    White Wolf wrote:
    > Jeff Flinn wrote:
    >> Alan Woodland wrote:
    >>> Hi,
    >>>
    >>> I was looking for a clean, generic way to split strings around a
    >>> character using STL algorithms. The best I could manage was this
    >>> example, which isn't exactly great to say the least.

    > [SNIP]
    >>> Is this really the tidiest way to do this using STL algorithms?
    >>> Obviously it wouldn't be hard at all to do just using a for loop and two
    >>> pointers, but I was trying to do this 'the STL way'.
    >>>
    >>> Thanks for any suggestions,
    >>> Alan

    >>
    >> If you want to 'split' any string based on any character use
    >> boost::tokenizer, regex, xpressive or spirit. If you want to walk a
    >> file path use boost::filesystem.
    >>
    >> See www.boost.org

    >
    >
    > While I agree that boost.org provides the solution, if we are into
    > that, then it is string_algo and split, and not the rest you mention.


    Aah, forgot that, thanks.

    > However the OP's question was not how to split a path, or where he could


    But his example is exactly that.

    > find a library that provides split functionality. He (see quotes) very
    > clearly said that a) he wants to make this and b) using STL algorithms
    > (not Boost, not for loops).


    So why not broaden the OP's knowledge of the solution domain? If there
    were a direct and easy way of doing this with standard algorithms (just
    what does one mean by STL these days), there would not have been all of
    the aforementioned ways of skinning this cat. If boost is usable the OP
    will use it, if not it's a source for alternative methods that may or
    may not be doable with C++ library or language facilities. As a matter
    of fact regex is part of C++ tr1.
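
    As an illustration of the regex route, here is a minimal sketch
    assuming a compiler with the C++11 <regex> header (the
    standardised successor of the TR1 component mentioned above). The
    name split_regex and its default pattern are made up for this
    example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name not from the thread): `pattern` is a regular
// expression that matches the separator.
std::vector<std::string> split_regex(const std::string& text,
                                     const std::string& pattern = "/")
{
    // The regex must outlive the iterators, so it is a named local.
    const std::regex separator(pattern);
    // Submatch index -1 selects the text between matches rather than
    // the matches themselves.
    std::sregex_token_iterator first(text.begin(), text.end(), separator, -1);
    std::sregex_token_iterator last;
    return std::vector<std::string>(first, last);
}
```

    split_regex("Hello/cruel/world") should return the same three
    strings as the original example. Because the separator is a
    pattern, something like split_regex(path, "[/\\\\]") could accept
    either slash, at the cost of pulling in a regex engine for a
    one-character split.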

    Jeff Flinn
    Jeff Flinn, Nov 19, 2009
    #6
  7. Alan Woodland

    James Kanze Guest

    On Nov 19, 12:59 pm, Alan Woodland wrote:

    > I was looking for a clean, generic way to split strings around
    > a character using STL algorithms. The best I could manage was
    > this example, which isn't exactly great to say the least.


    > #include <cassert>
    > #include <vector>
    > #include <algorithm>
    > #include <string>
    > #include <sstream>
    > #include <iterator>
    > #include <iostream>


    > namespace {
    > template <typename T>
    > struct SplitHelper {
    > std::basic_ostringstream<typename T::value_type> next;
    > std::vector<T> result;
    > typename T::value_type match;
    >
    > static bool test(SplitHelper& h, const typename T::value_type c) {
    > if (c == h.match) {
    > h.result.push_back(h.next.str());
    > h.next.str(T());
    > }
    > return c == h.match;
    > }
    > };
    > }


    > std::vector<T> split(const T& str, const typename T::value_type c='/') {
    > SplitHelper<T> h;
    > h.match = c;
    > h.result.reserve(std::count(str.begin(), str.end(), c));
    > std::remove_copy_if(str.begin(), str.end(),
    > std::ostream_iterator<typename T::value_type>(h.next),
    > std::bind1st(std::ptr_fun(&h.test), h));
    > h.result.push_back(h.next.str());
    > return h.result;
    > }


    [...]
    > Is this really the tidiest way to do this using STL algorithms?


    Certainly not. If it were, I don't think anyone would use the
    STL. I haven't understood all of it, but I don't see why you
    would need a stringstream, for example. Something as simple as:

    std::vector< std::string >
    split( std::string const& original, char separator = ':' )
    {
        std::vector< std::string >
                            result;
        typedef std::string::const_iterator
                            TextIter;
        TextIter            end = original.end();
        TextIter            current
            = std::find( original.begin(), end, separator );
        result.push_back( std::string( original.begin(), current ) );
        while ( current != end ) {
            ++ current;
            TextIter            next
                = std::find( current, end, separator );
            result.push_back( std::string( current, next ) );
            current = next;
        }
        return result;
    }

    (This is just off the top of my head, so there may be some
    issues with border conditions. For that matter, you haven't
    really defined adequately what the function should do to be able
    to write it correctly.)

    > Obviously it wouldn't be hard at all to do just using a for
    > loop and two pointers, but I was trying to do this 'the STL
    > way'.


    The STL way is to use iterators (instead of pointers) and
    algorithms. You still need the outer loop; you could probably
    design a special iterator, based on std::string::const_iterator,
    and returning an std::string when dereferenced, then use
    something like std::copy and a back inserter, but that's really
    more complexity than you want. (If the standard iterators used
    the GoF idiom, it would be very simple, but the STL idiom is
    designed to make everything twice as complex as needs be.)
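
    To make the "special iterator" idea concrete, here is a rough
    sketch under stated assumptions: field_iterator and split_copy are
    hypothetical names, the iterator is only an input iterator, and an
    empty string is treated as a single empty field (the same border
    behaviour as the find-based function above). It is one possible
    reading of the suggestion, not a definitive design.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical input iterator: dereferencing yields the current field.
class field_iterator
{
public:
    typedef std::input_iterator_tag iterator_category;
    typedef std::string             value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef const std::string*      pointer;
    typedef std::string             reference;   // fields are returned by value

    // Default construction gives the end-of-sequence iterator.
    field_iterator() : first_(), next_(), end_(), sep_(), done_(true) {}

    field_iterator(const std::string& text, char sep)
        : first_(text.begin()),
          next_(std::find(text.begin(), text.end(), sep)),
          end_(text.end()), sep_(sep), done_(false) {}

    std::string operator*() const { return std::string(first_, next_); }

    field_iterator& operator++()
    {
        if (next_ == end_) {
            done_ = true;                 // the last field was just consumed
        } else {
            first_ = next_ + 1;           // step over the separator
            next_ = std::find(first_, end_, sep_);
        }
        return *this;
    }

    field_iterator operator++(int)
    {
        field_iterator old(*this);
        ++*this;
        return old;
    }

    friend bool operator==(const field_iterator& a, const field_iterator& b)
    {
        if (a.done_ || b.done_) return a.done_ == b.done_;
        return a.first_ == b.first_;
    }
    friend bool operator!=(const field_iterator& a, const field_iterator& b)
    {
        return !(a == b);
    }

private:
    std::string::const_iterator first_, next_, end_;
    char sep_;
    bool done_;
};

// The outer loop is now hidden inside std::copy.
std::vector<std::string> split_copy(const std::string& text, char sep = '/')
{
    std::vector<std::string> result;
    std::copy(field_iterator(text, sep), field_iterator(),
              std::back_inserter(result));
    return result;
}
```

    The call site does shrink to a single std::copy, but the iterator
    class is several times longer than the explicit loop, which is
    roughly the complexity trade-off described above.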

    --
    James Kanze
    James Kanze, Nov 19, 2009
    #7
