Question on Python Split

Discussion in 'Python' started by subhabangalore@gmail.com, Oct 7, 2012.

  1. Guest

    Dear Group,

    Suppose I have a string as,

    "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."

    I am terming it as,

    str1= "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."

    I am working now with a split function,

    str_words=str1.split()
    so, I would get the result as,
    ['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']

    But I am looking for,

    ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']

    This can be done if we assign the string as,

    str1= "Project Gutenberg, has 36000, free ebooks, for Kindle, Android iPad, iPhone,"

    and then assign the split statement as,

    str1_word=str1.split(",")

    would produce,

    ['Project Gutenberg', ' has 36000', ' free ebooks', ' for Kindle', ' Android iPad', ' iPhone', '']

    My objective is generally achieved, but I want to convert each group here into a tuple so that it can be embedded, like,

    [(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']

    but I see that if I write it as

    for i in str1_word:
        print i
        ti=tuple(i)
        print ti

    I am not getting the desired result.

    If I work again from tuple point, I get it as,
    >>> tup1=('Project Gutenberg')
    >>> tup2=('has 36000')
    >>> tup3=('free ebooks')
    >>> tup4=('for Kindle')
    >>> tup5=('Android iPad')
    >>> tup6=tup1+tup2+tup3+tup4+tup5
    >>> print tup6

    Project Gutenberghas 36000free ebooksfor KindleAndroid iPad

    How may I achieve this? I would be grateful if any of the learned members could kindly guide me.
    Thanks in Advance,
    Regards,
    Subhabrata.

    NB: Apology for some minor errors.
    , Oct 7, 2012
    #1

  2. MRAB Guest

    On 2012-10-07 20:30, wrote:
    > Dear Group,
    >
    > Suppose I have a string as,
    >
    > "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
    >
    > I am terming it as,
    >
    > str1= "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
    >
    > I am working now with a split function,
    >
    > str_words=str1.split()
    > so, I would get the result as,
    > ['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
    >
    > But I am looking for,
    >
    > ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']
    >
    > This can be done if we assign the string as,
    >
    > str1= "Project Gutenberg, has 36000, free ebooks, for Kindle, Android iPad, iPhone,"
    >
    > and then assign the split statement as,
    >
    > str1_word=str1.split(",")
    >
    > would produce,
    >
    > ['Project Gutenberg', ' has 36000', ' free ebooks', ' for Kindle', ' Android iPad', ' iPhone', '']
    >

    It can also be done like this:

    >>> str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
    >>> # Splitting into words:
    >>> s = str1.split()
    >>> s
    ['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for',
    'Kindle', 'Android', 'iPad', 'iPhone.']
    >>> # Using slicing with a stride of 2 gives:
    >>> s[0 : : 2]
    ['Project', 'has', 'free', 'for', 'Android', 'iPhone.']
    >>> # Similarly for the other words gives:
    >>> s[1 : : 2]
    ['Gutenberg', '36000', 'ebooks', 'Kindle', 'iPad']
    >>> # Combining them in pairs, and adding an extra empty string in case
    >>> # there's an odd number of words:
    >>> [(x + ' ' + y).rstrip() for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
    ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone.']
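The pairing step above can also be wrapped in a small function. This is only a sketch, not the poster's code: the name `pair_words` is made up for illustration, it is written for Python 3, and it swaps the `+ ['']` padding for `itertools.zip_longest`, which pads the shorter slice automatically.

```python
from itertools import zip_longest


def pair_words(text):
    """Join every two consecutive words of text into one string."""
    words = text.split()
    # words[0::2] holds the 1st, 3rd, 5th... words; words[1::2] the others.
    # zip_longest fills in '' when the word count is odd.
    pairs = zip_longest(words[0::2], words[1::2], fillvalue='')
    # rstrip() drops the trailing space left by a padded last pair.
    return [(first + ' ' + second).rstrip() for first, second in pairs]


str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
print(pair_words(str1))
```

With the sentence from the question this should give the same six strings as the list comprehension above, including the trailing period on 'iPhone.'.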

    > My objective generally is achieved, but I want to convert each group here in tuple so that it can be embedded, like,
    >
    > [(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']
    >
    > as I see if I assign it as
    >
    > for i in str1_word:
    >     print i
    >     ti=tuple(i)
    >     print ti
    >
    > I am not getting the desired result.
    >
    > If I work again from tuple point, I get it as,
    > >>> tup1=('Project Gutenberg')
    > >>> tup2=('has 36000')
    > >>> tup3=('free ebooks')
    > >>> tup4=('for Kindle')
    > >>> tup5=('Android iPad')
    > >>> tup6=tup1+tup2+tup3+tup4+tup5
    > >>> print tup6

    > Project Gutenberghas 36000free ebooksfor KindleAndroid iPad
    >

    It's the comma that makes the tuple, not the parentheses, except for the
    empty tuple which is just empty parentheses, i.e. ().
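A short illustration of that point, written for Python 3 with variable names chosen only for this example. It also shows why `tuple(i)` in the question did not help: `tuple()` iterates over the string and produces one element per character.

```python
phrase = 'has 36000'

not_a_tuple = ('has 36000')   # parentheses only group; this is still a str
one_tuple = ('has 36000',)    # the trailing comma makes a one-element tuple
chars = tuple(phrase)         # tuple() walks the string character by character

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)
print(chars[:3])

# "+" on strings concatenates text; "+" on tuples concatenates elements.
joined_text = ('Project Gutenberg') + ('has 36000')
joined_tuple = ('Project Gutenberg',) + ('has 36000',)
print(joined_text)
print(joined_tuple)
```

To wrap a whole string in a tuple, write `(i,)` rather than `tuple(i)`.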

    > Then how may I achieve it? If any one of the learned members can kindly guide me.


    >>> [((x + ' ' + y).rstrip(), ) for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
    [('Project Gutenberg',), ('has 36000',), ('free ebooks',), ('for Kindle',), ('Android iPad',), ('iPhone.',)]

    Is this what you want?

    If you want it to be a list of pairs of words, then:

    >>> [(x, y) for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]

    [('Project', 'Gutenberg'), ('has', '36000'), ('free', 'ebooks'), ('for',
    'Kindle'), ('Android', 'iPad'), ('iPhone.', '')]
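If the group size might change later, the same idea can be written with plain slicing. This is a sketch under assumptions: `chunk_words` and its `size` parameter are hypothetical names, the code targets Python 3, and unlike the `zip` version above it does not pad the last group with an empty string, so a leftover word ends up in a shorter tuple.

```python
def chunk_words(text, size=2):
    """Split text into words and group them into tuples of `size` words."""
    words = text.split()
    # Step through the word list `size` items at a time; the final slice
    # is simply shorter when the words do not divide evenly.
    return [tuple(words[i:i + size]) for i in range(0, len(words), size)]


str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
print(chunk_words(str1))
print(chunk_words(str1, size=3))
```

Whether to pad the final group or leave it short depends on what the later code expects.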
    MRAB, Oct 7, 2012
    #2

  3. Terry Reedy Guest

    On 10/7/2012 3:30 PM, wrote:

    > If I work again from tuple point, I get it as,
    > >>> tup1=('Project Gutenberg')
    > >>> tup2=('has 36000')
    > >>> tup3=('free ebooks')
    > >>> tup4=('for Kindle')
    > >>> tup5=('Android iPad')


    These are strings, not tuples. Numbered names like this are a bad idea.

    > >>> tup6=tup1+tup2+tup3+tup4+tup5
    > >>> print tup6

    > Project Gutenberghas 36000free ebooksfor KindleAndroid iPad


    tup1=('Project Gutenberg')
    tup2=('has 36000')
    tup3=('free ebooks')
    tup4=('for Kindle')
    tup5=('Android iPad')
    print(' '.join((tup1,tup2,tup3,tup4,tup5)))

    >>>

    Project Gutenberg has 36000 free ebooks for Kindle Android iPad
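One possible way to avoid the numbered names is to keep the phrases in a single list. A minimal Python 3 sketch, with `phrases` and the other names chosen only for this example:

```python
# One list instead of tup1 ... tup5; adding a phrase needs no new name.
phrases = ['Project Gutenberg', 'has 36000', 'free ebooks',
           'for Kindle', 'Android iPad']

# Rebuild the sentence with single spaces between the phrases.
sentence = ' '.join(phrases)
print(sentence)

# Wrap each phrase in a real one-element tuple (note the trailing comma).
as_tuples = [(phrase,) for phrase in phrases]
print(as_tuples)
```

The list can then be looped over, sliced, or extended without touching the rest of the code.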

    --
    Terry Jan Reedy
    Terry Reedy, Oct 7, 2012
    #3
  4. Dennis Lee Bieber

    On Sun, 7 Oct 2012 12:30:52 -0700 (PDT),
    declaimed the following in gmane.comp.python.general:

    >
    > But I am looking for,
    >
    > ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']
    >


    Is splitting a sentence at every other word really what you want? Or
    are you intending, at some point, to have the splitting take place on
    syntactic/semantic features (subject, verb, object...)?

    If the latter, you may be in need of some Natural Language
    Processing (NLP) libraries/algorithms. (First Google hit:
    http://nltk.org/ )
    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Oct 8, 2012
    #4
  5. Guest

    Thank you for the nice answer. Your code and discussions always inspire me.

    Regards,
    Subhabrata.
    , Oct 8, 2012
    #5
