Replace every n instances of a string

Tom Cross

Hello-

I have a function that returns to me a text representation of Unicode
data, which looks like this:

\u0013\u0021\u003c\u003f\u0044\u001f\u006a\u005a\u0050\u0015\u0018\u001d\u007e\u006b\u004e\u007d\u006a\u006e\u0068\u0042\u0026\u003c\u004f\u0059\u0056\u002b\u001a\u0077\u0065\u006a\u000a\u0021\u005f\u0025\u003f\u0025\u0024\u007e\u0020\u0011\u0060\u002c\u0037\u0067\u007a\u0074\u0074\u0003\u0003\u000f\u0039\u0018\u0059\u0038\u0029\u0001\u0073\u0034\u0009\u0069\u005e\u0003\u006e\u000d\u004c\u001d\u00
f\u006e\u001b\u006e\u0063\u000b\u0014\u0071\u007c\u004e\u006a\u0011\u004a\u001f\u0063\u0016\u003d\u0020\u0065\u003e\u0043\u0012\u0047\u0026\u0062\u0004\u0025\u003b\u0005\u004c\u002e\u005a\u0070\u0048

I would like to add carriage returns to this for usability. But I
don't want to add a return after each "\u" I encounter in the text
(regexp comes to mind if I did). I want to add a return after each 12
"\\u"s I encounter in the string.

Any ideas? Do I not want to search for "\\u" but instead just insert
a \n after each 72 characters (equivalent to 12 \uXXXX codes)? Would
this provide better performance? If so, what would be the easiest way
to do that?

Thanks much!
 
Terry Reedy

Tom Cross said:
I have a function that returns to me a text representation of Unicode
data, which looks like this: ....
I would like to add carriage returns...after each 12
"\\u"s I encounter in the string.

Any ideas? Do I not want to search for "\\u" but instead just insert
a \n after each 72 characters (equivalent to 12 \uXXXX codes)? Would
this provide better performance? If so, what would be the easiest way
to do that?

Split the string into a list of 6*n-character chunks (72 characters for n = 12) and join them with \n:

# unirep = textrep(unidata)  # i.e., call your function and store the result
# For illustration:
unirep = r'\u0013\u0021\u003c\u003f\u0044\u001f\u006a\u005a\u0050\u0015\u0018'

# 6*4 instead of 6*12, to get multiple lines from this short unirep
blocklen = 6*4
unilist = []
for i in range(0, len(unirep), blocklen):
    unilist.append(unirep[i:i+blocklen])

unilines = '\n'.join(unilist)
print(unilines)

Output:

\u0013\u0021\u003c\u003f
\u0044\u001f\u006a\u005a
\u0050\u0015\u0018
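
For comparison, here is a rough sketch of the regexp route the question mentions, which counts the escapes themselves rather than characters. The function name wrap_escapes and its per_line parameter are made up for illustration. I have not timed it against plain slicing, so treat the performance question as open; slicing is at least the simpler of the two.

```python
import re

def wrap_escapes(text, per_line=12):
    # Hypothetical helper: group the text into runs of up to per_line
    # backslash-u escapes (4 hex digits each) and join the runs with newlines.
    # Assumption: the text consists only of such escapes; anything that does
    # not match the pattern is silently dropped by findall.
    pattern = r'(?:\\u[0-9a-fA-F]{4}){1,%d}' % per_line
    return '\n'.join(re.findall(pattern, text))

unirep = r'\u0013\u0021\u003c\u003f\u0044\u001f\u006a\u005a\u0050\u0015\u0018'
wrapped = wrap_escapes(unirep, per_line=4)  # 4 per line to suit the short sample
print(wrapped)
```

If the input might contain anything other than well-formed escapes, the fixed-width slicing version is the safer choice, since it never discards characters.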

Consider whether you want to change r'\u' to something else, such as spaces, for easier viewing. If so, do the replacement on unirep before chopping it into blocks, and adjust blocklen if the replacement is not two characters long.

Terry J. Reedy



