Searching and replacing text?

Oltmans

Hi,

I'm new to Python (and admittedly not a very good programmer) and I've
come across a scenario where I have to search and replace text in a
file.

For the sake of an example, I'm searching for every occurrence of the
text
[[http://www.hotmail.com -> Hotmail]]

I have to replace it with
[http://www.hotmail.com Hotmail]

I've come up with the following scheme
p=re.compile(r'\[\[')
q=re.compile(r'->')

p.sub('[',txt)
q.sub('\b',txt)

Given that I don't have a very strong regex background, this doesn't look
very elegant. Is there some other way I can accomplish the same thing?
Also, please note that I'm using two regexes, 'p' and 'q', and calling
sub() on each of them. Can't I do this with a single regex and a single
call to sub()?

Please enlighten me. Thanks in advance.
 
Gabriel Genellina

Is it constant text? Then use the replace method of string objects; no
re is needed.

text = "This is [[http://www.hotmail.com -> Hotmail]] some text"
print(text.replace("[[http://www.hotmail.com -> Hotmail]]",
                   "[http://www.hotmail.com Hotmail]"))
output: This is [http://www.hotmail.com Hotmail] some text
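Since the original question is about text in a file, here is a minimal sketch of applying the same constant-text replacement to a file's contents. The function name replace_in_file and its parameters are hypothetical, and the sketch assumes Python 3 and a UTF-8 file small enough to read into memory. Note that replace() returns a new string and leaves the original untouched, so the result has to be kept.

```python
from pathlib import Path


def replace_in_file(path, old, new):
    """Replace every occurrence of `old` with `new` in the file at `path`.

    Hypothetical helper: reads the whole file, so it assumes the file
    fits comfortably in memory and is encoded as UTF-8.
    """
    p = Path(path)
    original = p.read_text(encoding="utf-8")
    # str.replace returns a new string; the original is not modified.
    updated = original.replace(old, new)
    # Only rewrite the file when something actually changed.
    if updated != original:
        p.write_text(updated, encoding="utf-8")
    return updated
```

It could be called as replace_in_file("page.txt", "[[http://www.hotmail.com -> Hotmail]]", "[http://www.hotmail.com Hotmail]"), where page.txt is a placeholder file name.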

Or, do you want to find '[[' followed by some word followed by ' -> '
followed by another word followed by ']]'?

import re
r = re.compile(r'\[\[(.+?)\s*->\s*(.+?)\]\]')
print(r.sub(r'[\1 \2]', text))
(same output as above)

Parentheses in the regular expression define groups, which you can
reference as \1 and \2 in the replacement string.
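To show that a single pattern and a single sub() call handle every link in one pass, here is a small sketch using the same pattern on text with two links. The helper name convert_links and the second URL are made up for illustration.

```python
import re

# Same pattern as above: group 1 is the URL, group 2 is the label.
LINK = re.compile(r'\[\[(.+?)\s*->\s*(.+?)\]\]')


def convert_links(text):
    """Rewrite every [[url -> label]] as [url label] (hypothetical helper)."""
    # sub() returns a new string; it does not change `text` in place.
    return LINK.sub(r'[\1 \2]', text)


sample = ("See [[http://www.hotmail.com -> Hotmail]] and "
          "[[http://www.python.org -> Python]].")
print(convert_links(sample))
# prints: See [http://www.hotmail.com Hotmail] and [http://www.python.org Python].
```

The non-greedy (.+?) matters here: with a greedy (.+), the first group would run on to the last ' -> ' in the line and merge the two links into one match.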
 
Jeff

If you know that, for instance, every occurrence of '[[' needs to be
replaced with '[', then it is much faster to use regular string
methods. If you have the text stored in the variable foo:

foo = foo.replace('[[', '[').replace(']]', ']').replace(' -> ', ' ')

If you need to recognize the actual pattern, use REs:

import re
regex = re.compile(r'\[\[(.+?) -> (.+?)\]\]')
foo = regex.sub(r'[\1 \2]', foo)
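The difference between the two approaches shows up when the separate pieces also occur outside a link. A rough sketch, with a made-up sample string, comparing chained string replacement against the pattern-based version:

```python
import re

# Made-up sample: an arrow outside any link, followed by one real link.
text = "a -> b, see [[http://www.hotmail.com -> Hotmail]]"

# Chained replace touches every '[[', ']]' and ' -> ', wherever they occur.
chained = text.replace('[[', '[').replace(']]', ']').replace(' -> ', ' ')

# The regex only rewrites text that matches the whole [[... -> ...]] shape.
pattern = re.compile(r'\[\[(.+?) -> (.+?)\]\]')
targeted = pattern.sub(r'[\1 \2]', text)

print(chained)   # a b, see [http://www.hotmail.com Hotmail]
print(targeted)  # a -> b, see [http://www.hotmail.com Hotmail]
```

So the string methods are fine when those character sequences never appear anywhere else in the file, and the regex is the safer choice otherwise.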
 
