Question on Python Split

subhabangalore

Dear Group,

Suppose I have a string as,

"Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."

I am terming it as,

str1= "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."

I am working now with a split function,

str_words=str1.split()
so, I would get the result as,
['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']

But I am looking for,

['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']

This can be done if we assign the string as,

str1= "Project Gutenberg, has 36000, free ebooks, for Kindle, Android iPad, iPhone,"

and then assign the split statement as,

str1_word=str1.split(",")

would produce,

['Project Gutenberg', ' has 36000', ' free ebooks', ' for Kindle', ' Android iPad', ' iPhone', '']

My objective is generally achieved, but I want to convert each group here into a tuple so that it can be embedded, like,

[(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']

As I see, if I assign it as

for i in str1_word:
    print(i)
    ti=tuple(i)
    print(ti)

I am not getting the desired result.

If I work again from tuple point, I get it as,

Project Gutenberghas 36000free ebooksfor KindleAndroid iPad

Then how may I achieve it? If any one of the learned members can kindly guide me.
Thanks in Advance,
Regards,
Subhabrata.

NB: Apology for some minor errors.
 
MRAB

It can also be done like this:

>>> s = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone.".split()
>>> s
['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
>>> # Using slicing with a stride of 2 gives:
>>> s[0::2]
['Project', 'has', 'free', 'for', 'Android', 'iPhone.']
>>> # Similarly for the other words gives:
>>> s[1::2]
['Gutenberg', '36000', 'ebooks', 'Kindle', 'iPad']
>>> # Combining them in pairs, and adding an extra empty string in case there's an odd number of words:
>>> [(x + ' ' + y).rstrip() for x, y in zip(s[0::2], s[1::2] + [''])]
['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone.']

> My objective generally is achieved, but I want to convert each group here in tuple so that it can be embedded, like,
>
> [(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']
>
> as I see if I assign it as
>
> for i in str1_word:
>     print(i)
>     ti=tuple(i)
>     print(ti)
>
> I am not getting the desired result.
>
> If I work again from tuple point, I get it as,
> Project Gutenberghas 36000free ebooksfor KindleAndroid iPad

It's the comma that makes the tuple, not the parentheses, except for the empty tuple which is just empty parentheses, i.e. ().

> Then how may I achieve it? If any one of the learned members can kindly guide me.

>>> [((x + ' ' + y).rstrip(), ) for x, y in zip(s[0::2], s[1::2] + [''])]
[('Project Gutenberg',), ('has 36000',), ('free ebooks',), ('for Kindle',), ('Android iPad',), ('iPhone.',)]

Is this what you want?

If you want it to be a list of pairs of words, then:

>>> [(x, y) for x, y in zip(s[0::2], s[1::2] + [''])]
[('Project', 'Gutenberg'), ('has', '36000'), ('free', 'ebooks'), ('for', 'Kindle'), ('Android', 'iPad'), ('iPhone.', '')]
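
The pairing idea above can be wrapped in a small helper. This is only a sketch, assuming Python 3; pair_words is a made-up name for illustration, not a standard-library function.

```python
def pair_words(text):
    """Group whitespace-separated words into strings of two words each."""
    words = text.split()
    firsts = words[0::2]            # words at even positions
    seconds = words[1::2] + ['']    # words at odd positions, padded for an odd count
    # zip stops at the shorter sequence, so the padding is ignored
    # when the number of words is even.
    return [(x + ' ' + y).rstrip() for x, y in zip(firsts, seconds)]


str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."

groups = pair_words(str1)
print(groups)

# Wrap each group in a one-element tuple; the trailing comma makes the tuple.
tuples = [(g,) for g in groups]
print(tuples)
```

The final period stays attached to 'iPhone.' here; remove it before splitting if it is not wanted.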
 
Terry Reedy

> If I work again from tuple point, I get it as,

These are strings, not tuples. Numbered names like this are a bad idea.

> Project Gutenberghas 36000free ebooksfor KindleAndroid iPad

tup1=('Project Gutenberg')
tup2=('has 36000')
tup3=('free ebooks')
tup4=('for Kindle')
tup5=('Android iPad')
print(' '.join((tup1,tup2,tup3,tup4,tup5)))
Project Gutenberg has 36000 free ebooks for Kindle Android iPad
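
A minimal sketch, assuming Python 3, of why the names above hold strings and why tuple(i) in the original loop did not give one tuple per group. The variable names (group, not_a_tuple, one_tuple, chars, groups) are made up for illustration.

```python
group = 'has 36000'

not_a_tuple = ('has 36000')   # parentheses alone: this is still a str
one_tuple = ('has 36000',)    # the trailing comma makes a one-element tuple
chars = tuple(group)          # tuple() iterates the string, one item per character

print(type(not_a_tuple).__name__)   # str
print(one_tuple)                    # ('has 36000',)
print(chars[:3])                    # ('h', 'a', 's')

# Keeping the groups in a list avoids numbered names such as tup1, tup2, ...
groups = ['Project Gutenberg', 'has 36000', 'free ebooks']
print(' '.join(groups))             # Project Gutenberg has 36000 free ebooks
```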
 
Dennis Lee Bieber

> But I am looking for,
>
> ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']

Is splitting a sentence at every other word really what you want? Or
are you intending, at some point, to have the splitting take place on
syntactic/semantic features (subject, verb, object...)?

If the latter, you may be in need of some Natural Language
Processing (NLP) libraries/algorithms. (First Google hit:
http://nltk.org/ )
 
subhabangalore

Thank you for the nice answer. Your code and discussions always inspire me.

Regards,
Subhabrata.
 
