Re: Splitting text at whitespace but keeping the whitespace in the returned list

Discussion in 'Python' started by MRAB, Jan 24, 2010.

  1. MRAB

    MRAB Guest

    wrote:
    > I need to parse some ASCII text into 'word' sized chunks of text AND
    > collect the whitespace that separates the split items. By 'word' I mean
    > any string of characters separated by whitespace (newlines, carriage
    > returns, tabs, spaces, soft-spaces, etc). This means that my split text
    > can contain punctuation and numbers - just not whitespace.
    >
    > The split( None ) method works fine for returning the word sized chunks
    > of text, but destroys the whitespace separators that I need.
    >
    > Is there a variation of split() that returns delimiters as well as tokens?
    >

    I'd use the re module:

    >>> import re
    >>> re.split(r'(\s+)', "Hello world!")
    ['Hello', ' ', 'world!']
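
As a slightly fuller sketch of the same idea: the capturing group in the pattern is what makes re.split() return the separators. The helper name split_keep_whitespace and the sample text are only illustrative, not from the original post.

```python
import re

def split_keep_whitespace(text):
    """Return alternating word and whitespace chunks of text."""
    # Because \s+ is inside a capturing group, re.split() includes
    # each matched whitespace run in the result list.
    return re.split(r'(\s+)', text)

text = "Hello,\tworld!\r\nBye  now"
chunks = split_keep_whitespace(text)
print(chunks)
# ['Hello,', '\t', 'world!', '\r\n', 'Bye', '  ', 'now']

# Nothing is lost: joining the chunks reproduces the input exactly.
assert ''.join(chunks) == text
```

Note that text which starts or ends with whitespace yields an empty string as the first or last item, for example ['', ' ', 'a', ' ', ''] for " a ".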
     
    MRAB, Jan 24, 2010
    #1

  2. Tim Arnold

    Tim Arnold Guest

    "MRAB" <> wrote in message
    news:...
    > wrote:
    >> I need to parse some ASCII text into 'word' sized chunks of text AND
    >> collect the whitespace that separates the split items. By 'word' I mean
    >> any string of characters separated by whitespace (newlines, carriage
    >> returns, tabs, spaces, soft-spaces, etc). This means that my split text
    >> can contain punctuation and numbers - just not whitespace.
    >> The split( None ) method works fine for returning the word sized chunks
    >> of text, but destroys the whitespace separators that I need.
    >> Is there a variation of split() that returns delimiters as well as
    >> tokens?
    >>

    > I'd use the re module:
    >
    > >>> import re
    > >>> re.split(r'(\s+)', "Hello world!")
    > ['Hello', ' ', 'world!']


    Also, partition() works, though it returns a tuple instead of a list.
    >>> s = 'hello world'
    >>> s.partition(' ')
    ('hello', ' ', 'world')
    >>>
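
One caveat worth a quick sketch: partition() splits only at the first occurrence of a single literal separator, so on text with several words or mixed whitespace it behaves differently from the re.split() approach. The sample string below is made up for illustration.

```python
import re

text = "hello  big\tworld"

# partition() stops after the first match of one literal separator.
first = text.partition(' ')
print(first)
# ('hello', ' ', ' big\tworld')

# re.split() with a capturing group keeps every whitespace run.
chunks = re.split(r'(\s+)', text)
print(chunks)
# ['hello', '  ', 'big', '\t', 'world']
```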


    --Tim Arnold
     
    Tim Arnold, Jan 25, 2010
    #2

  3. Roy Smith

    Roy Smith Guest

    In article <hjklfd$llm$>,
    "Tim Arnold" <> wrote:

    > also, partition works though it returns a tuple instead of a list.
    > >>> s = 'hello world'
    > >>> s.partition(' ')
    > ('hello', ' ', 'world')


    I've never used partition() before; my first thought on reading the above
    was, "That's weird, it should be returning a list". Then I went and looked
    at the docs. Given the description (returns specifically a 3-tuple), I
    guess a tuple makes sense, but now I'm wondering what the use case was for
    this method when it was invented?

    Having a variant of split() which either leaves the delimiter on the end of
    each word, or returns a list of alternating [word, delimiter, word,
    delimiter, word] seems logical and orthogonal. In fact, partition() is
    really just the hypothetical whitespace-preserving variant of split(), with
    maxsplit=1, except that it returns a tuple instead of a list.
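
The equivalence described above can be checked with a rough sketch. There is no built-in delimiter-preserving str.split(), so re.split() with a capturing group stands in for the hypothetical variant here; the sample string and variable names are illustrative only.

```python
import re

s = 'hello world again'

# Stand-in for the hypothetical delimiter-preserving split, limited to one split.
one_split = re.split(r'( )', s, maxsplit=1)
print(one_split)
# ['hello', ' ', 'world again']

# Apart from the container type, this matches partition().
assert tuple(one_split) == s.partition(' ')

# The other variant mentioned: leave the delimiter on the end of each word.
# Note: this pattern would drop any whitespace at the very start of the input.
words_with_delims = re.findall(r'\S+\s*', s)
print(words_with_delims)
# ['hello ', 'world ', 'again']
```

The two differ when the separator is absent: 'abc'.partition(' ') still returns a 3-tuple, ('abc', '', ''), whereas re.split() returns the one-item list ['abc'].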

    So, what was the original problem partition() was trying to solve?
     
    Roy Smith, Jan 26, 2010
    #3
  4. Aahz

    Aahz Guest

    In article <>,
    Roy Smith <> wrote:
    >
    >I've never used partition() before; my first thought on reading the above
    >was, "That's weird, it should be returning a list". Then I went and looked
    >at the docs. Given the description (returns specifically a 3-tuple), I
    >guess a tuple makes sense, but now I'm wondering what the use case was for
    >this method when it was invented?


    http://docs.python.org/whatsnew/2.5.html
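
As far as I recall, the linked notes present partition() as a shortcut for the common find-then-slice pattern. A minimal sketch of that comparison follows; the header string and variable names are invented for illustration.

```python
header = 'Content-Type: text/plain'

# Older idiom: find the separator index, then slice around it.
i = header.find(':')
if i >= 0:
    name, value = header[:i], header[i + 1:]
else:
    name, value = header, ''

# partition() does the same in one call; if ':' were missing it would
# return (header, '', '') instead of needing the explicit check above.
name2, sep, value2 = header.partition(':')
assert (name, value) == (name2, value2)

print(name2)           # Content-Type
print(value2.strip())  # text/plain
```

Since the result always has exactly three items, it can be unpacked directly, which may be why a tuple rather than a list is returned.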
    --
    Aahz () <*> http://www.pythoncraft.com/

    import antigravity
     
    Aahz, Jan 26, 2010
    #4
