trying to find repeated substrings with regular expression

Robert Dodier

Hello all,

I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'.

I've tried numerous variations on '.*(FOO((?!FOO).)*)+.*'
and everything I've tried either matches too much or too little.

I've decided it's easier for me just to search for FOO, and then
break up the string based on the locations of FOO.

But I'd like to better understand regular expressions.
Can someone suggest a regular expression which will return
groups corresponding to the FOO substrings above?

Thanks for any insights, I appreciate it a lot.

Robert Dodier
 
Giovanni Bajo

Robert said:
Hello all,

I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'.

Can someone suggest a regular expression which will return
groups corresponding to the FOO substrings above?

FOO.*?(?=(?:FOO|$))
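
A minimal sketch of how this pattern might be applied with re.findall, assuming the sample string from the question sits on a single line (the variable names are illustrative, not from the post):

```python
import re

# Sample text from the question, assumed here to be one line.
txt = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

# Lazy match from each FOO up to (but not including) the next FOO,
# or up to the end of the string for the last piece.
pattern = re.compile(r"FOO.*?(?=(?:FOO|$))")

pieces = pattern.findall(txt)

# The space before each following FOO is part of the match;
# strip it if it is not wanted.
stripped = [p.strip() for p in pieces]
print(stripped)
```

If the real input contains newlines, note that `.` does not cross them unless re.DOTALL is passed.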
 
Kent Johnson

Robert said:
Hello all,

I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'.

I've tried numerous variations on '.*(FOO((?!FOO).)*)+.*'
and everything I've tried either matches too much or too little.

FOO(.*?)(?=FOO|$)
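
One thing worth noting about this form: the parentheses capture only the text after FOO. A rough sketch of the difference between the captured group and the whole match, again assuming the sample string is on one line (variable names are illustrative):

```python
import re

txt = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"
pattern = re.compile(r"FOO(.*?)(?=FOO|$)")

# With exactly one capturing group, findall returns that group only,
# so the leading FOO is not part of each result.
tails = pattern.findall(txt)

# finditer gives match objects; group(0) is the whole match, FOO included.
whole = [m.group(0) for m in pattern.finditer(txt)]

print(tails)
print(whole)
```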


I've decided it's easier for me just to search for FOO, and then
break up the string based on the locations of FOO.

Use re.split() for this.
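
A short sketch of what that could look like, assuming FOO is a literal marker and the text before the first FOO is not wanted:

```python
import re

txt = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

# re.split drops the separator, so FOO is put back on each piece.
# parts[0] is whatever came before the first FOO and is discarded.
parts = re.split(r"FOO", txt)
pieces = ["FOO" + tail for tail in parts[1:]]
print(pieces)
```

For a fixed marker like this, the string method split() gives the same result; re.split() earns its keep when the marker is itself a pattern.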

Kent
 
johnzenger

Robert said:
I've decided it's easier for me just to search for FOO, and then
break up the string based on the locations of FOO.

But I'd like to better understand regular expressions.

Those who cannot learn regular expressions are doomed to repeat string
searches. Which is not such a bad thing.

txt = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

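def fa(s, pat):
    # Peel off the last occurrence of pat repeatedly, working right to left.
    retlist = []
    try:
        while True:
            i = s.rindex(pat)  # raises ValueError once pat is no longer found
            retlist.insert(0, s[i:])
            s = s[:i]
    except ValueError:
        return retlist

print(fa(txt, "FOO"))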
 
Raymond Hettinger

[Robert Dodier]
I'm trying to find substrings that look like 'FOO blah blah blah'
in a string. For example, given 'blah FOO blah1a blah1b FOO blah2
FOO blah3a blah3b blah3b' I want to get three substrings,
'FOO blah1a blah1b', 'FOO blah2', and 'FOO blah3a blah3b blah3b'.

No need for regular expressions on this one:

>>> s = 'blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b'
>>> ['FOO' + tail for tail in s.split('FOO')[1:]]
['FOO blah1a blah1b ', 'FOO blah2 ', 'FOO blah3a blah3b blah3b']

I've tried numerous variations on '.*(FOO((?!FOO).)*)+.*'
and everything I've tried either matches too much or too little.

The regular expression way is to find the target phrase followed by any
text followed by the target phrase. The first two are in a group and
the last is not included in the result group. The any-text section is
non-greedy:
['FOO blah1a blah1b ', 'FOO blah2 ', 'FOO blah3a blah3b blah3b']
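
The expression that produced the list above is not shown in this copy of the post. A sketch consistent with the description, assuming the lookahead also accepts the end of the string so that the final piece is returned, and with the attempt from the question alongside for comparison:

```python
import re

s = "blah FOO blah1a blah1b FOO blah2 FOO blah3a blah3b blah3b"

# Group: the target phrase plus non-greedy text.
# Lookahead: the next target phrase or end of string, checked but not consumed.
found = re.findall(r"(FOO.*?)(?=FOO|$)", s)

# The attempt from the question: the leading greedy .* runs to the end and
# backtracks only as far as the last FOO, and a repeated group keeps only
# its final repetition, so just one substring comes back.
attempt = re.match(r".*(FOO((?!FOO).)*)+.*", s)

print(found)
print(attempt.group(1))
```

This suggests why the original pattern seemed to match too little: it was asked to match the whole string once, rather than being applied repeatedly with findall.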


Raymond
 
