Replacing large number of substrings

Will McGugan

Hi,

Is there a simple way of replacing a large number of substrings in a
string? I was hoping that str.replace could take a dictionary and use it
to replace the occurrences of the keys with the dict values, but that
doesn't seem to be the case.

To clarify, something along these lines..
"x y c"


Regards,

Will McGugan
 
tiissa

You can look at re.sub [1] and try:

import re

d = {'a': 'x', 'b': 'y'}

def repl(match):
    # Look up the matched text in the dict.
    return d.get(match.group(0), '')

print(re.sub("(a|b)", repl, "a b c"))



Above, I gave the pattern myself but you can try to have it generated
from the keys:


def dict_replace(s, d):
    # Escape each key so regex metacharacters in it are matched literally.
    pattern = '(%s)' % '|'.join(re.escape(k) for k in d.keys())
    def repl(match):
        return d.get(match.group(0), '')
    return re.sub(pattern, repl, s)
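
One caveat with a generated alternation: Python's re module tries the alternatives from left to right, so when one key is a prefix of another (say 'a' and 'ab'), which one wins depends on the order in which the keys are joined. A minimal sketch, assuming the longest key should win; the name dict_replace_longest is hypothetical and not part of the code above:

```python
import re

def dict_replace_longest(s, d):
    # An empty dict would produce an empty pattern, so return early.
    if not d:
        return s
    # Sort keys longest first so that 'ab' is tried before 'a'.
    keys = sorted(d, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # Every match is one of the keys, so a plain lookup is enough.
    return pattern.sub(lambda m: d[m.group(0)], s)

print(dict_replace_longest("ab a", {'a': 'x', 'ab': 'y'}))  # prints: y x
```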





[1] http://python.org/doc/2.4.1/lib/node114.html
 
Robert Kern

(n.b. untested!)

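def dict_replace(string, replacements):
    # Replacements are applied one after another, so the output of one
    # replacement can be changed again by a later one.
    for key, value in replacements.items():
        string = string.replace(key, value)
    return string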
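
One thing to watch with chained str.replace calls is that each pass sees the output of the previous one. A small sketch of the effect, assuming a dict that swaps two letters; chained_replace is a hypothetical name for the same loop as above:

```python
def chained_replace(string, replacements):
    # Apply each replacement in turn to the whole string.
    for key, value in replacements.items():
        string = string.replace(key, value)
    return string

swap = {'a': 'b', 'b': 'a'}
result = chained_replace("a b", swap)
# The first pass turns every 'a' into 'b' (or the reverse, depending on
# dict order); the second pass then rewrites all of them, so the swap is lost.
print(result)
```

A single-pass approach, such as re.sub with a callback, does not have this problem because replaced text is never rescanned.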

How well this works depends on how large "large" is. If it is
really very large, then you might want to build something using a more
suitable algorithm, such as Aho-Corasick.

http://www.lehuen.com/nicolas/download/pytst/
http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/
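
To give a rough idea of the data-structure route without any third-party package, here is a sketch using a plain trie with longest-match scanning. It is a simpler stand-in, not Aho-Corasick itself (there are no failure links, so the worst case is proportional to the text length times the longest key), and the names build_trie and trie_replace are hypothetical:

```python
def build_trie(replacements):
    """Build a nested-dict trie; the key None marks the end of a
    search string and stores its replacement."""
    root = {}
    for key, value in replacements.items():
        if not key:
            continue  # ignore empty search strings
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[None] = value
    return root

def trie_replace(text, replacements):
    root = build_trie(replacements)
    out = []
    i = 0
    n = len(text)
    while i < n:
        node = root
        j = i
        best_end = -1
        best_value = None
        # Walk the trie as far as the text allows, remembering the
        # longest key that ends along the way.
        while j < n and text[j] in node:
            node = node[text[j]]
            j += 1
            if None in node:
                best_end = j
                best_value = node[None]
        if best_end != -1:
            out.append(best_value)
            i = best_end  # continue after the replaced key
        else:
            out.append(text[i])
            i += 1
    return ''.join(out)

print(trie_replace("a b c", {'a': 'x', 'b': 'y'}))  # prints: x y c
```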

--
Robert Kern

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Michael J. Fromberger

Hi, Will,

Perhaps the following solution might appeal to you:

import re

def replace_many(s, r):
    """Replace substrings of s. The parameter r is a dictionary in
    which each key is a substring of s to be replaced and the
    corresponding value is the string to replace it with.
    """
    exp = re.compile('|'.join(re.escape(x) for x in r.keys()))
    return exp.sub(lambda m: r.get(m.group()), s)
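
The re.escape call matters as soon as a key contains regex metacharacters. A short usage sketch, with the function repeated so the snippet runs on its own; the sample strings are made up for illustration:

```python
import re

def replace_many(s, r):
    # Escape every key so characters like '$' or '(' are matched literally.
    exp = re.compile('|'.join(re.escape(x) for x in r.keys()))
    return exp.sub(lambda m: r.get(m.group()), s)

text = "cost: $5 (approx.)"
table = {'$': 'USD ', '(approx.)': '[estimate]'}
print(replace_many(text, table))  # prints: cost: USD 5 [estimate]
```

Without the escaping, '$' would be treated as an end-of-string anchor and the parentheses as a group, so neither key would match as intended.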

Cheers,
-M
 
