Performing a number of substitutions on a unicode string

A

Arnaud Delobelle

Hi all,

I've got to escape some unicode text according to the following map:

escape_map = {
u'\n': u'\\n',
u'\t': u'\\t',
u'\r': u'\\r',
u'\f': u'\\f',
u'\\': u'\\\\'
}

The simplest solution is to use str.replace:

def escape_text(text):
return text.replace('\\', '\\\\').replace('\n',
'\\n').replace('\t', '\\t').replace('\r', '\\r').replace('\f', '\\f')

But it creates 4 intermediate strings, which is quite inefficient
(I've got 10s of MB's worth of unicode strings to escape)

I can think of another way using regular expressions:

escape_ptn = re.compile(r"[\n\t\f\r\\]")

# escape_map is defined above
def escape_match(m, map=escape_map):
return map[m.group(0)]

def escape_text(text, sub=escape_match):
return escape_ptn.sub(sub, text)

Is there a better way?

Thanks,
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top