change only the nth occurrence of a pattern in a string

TP · Dec 31, 2008

Hi everybody,

I would like to change only the nth occurrence of a pattern in a string. The
problem with the "replace" method of strings, and with "re.sub", is that we can
only specify how many occurrences to change, counting from the first one.
>>> 'coucou'.replace('o', 'i', 1)
'ciucou'

What is the best way to change only the nth occurrence (occurrence number n)?

Why this default behavior? For the user, it would be easier to put re.sub or
replace in a loop to change the first n occurrences.

Thanks

Julien
--
python -c "print ''.join([chr(154 - ord(c)) for c in '*9(9&(18%.\
9&1+,\'Z4(55l4('])"

"When a distinguished but elderly scientist states that something is
possible, he is almost certainly right. When he states that something is
impossible, he is very probably wrong." (first law of AC Clarke)

Roy Smith · Dec 31, 2008

TP said:
> Hi everybody,
>
> I would like to change only the nth occurrence of a pattern in a string.

It's a little ugly, but the following looks like it works. The gist is to
split the string on your pattern, then re-join the pieces using the
original delimiter everywhere except for the n'th splice. Split() is a
wonderful tool. I'm a hard-core regex geek, but I find that most things I
might have written a big hairy regex for are easier solved by doing split()
and then attacking the pieces.

There may be some fencepost errors here. I got the basics working, and
left the details as an exercise for the reader.

This version assumes the pattern is a literal string. If it's really a
regex, you'll need to put the pattern in parens when you call split(); this
will return the exact text matched each time as elements of the list. And
then your post-processing gets a little more complicated, but nothing
that's too bad.

This does a couple of passes over the data, but at least all the operations
are O(n), so the whole thing is O(n).

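#!/usr/bin/python

import re

v = "coucoucoucou"

pattern = "o"
n = 2  # replace the n'th occurrence (1-based)

# Split on the pattern; the delimiters themselves are dropped.
parts = re.split(pattern, v)
print(parts)

# Pieces before and after the n'th delimiter.
first = parts[:n]
last = parts[n:]
print(first)
print(last)

# Re-join each side with the original delimiter...
j1 = pattern.join(first)
j2 = pattern.join(last)
print(j1)
print(j2)

# ...and splice the two sides together with the replacement.
print("i".join([j1, j2]))
print(v)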
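
For the regex case mentioned above, a minimal sketch in Python 3 might look like the following. The name replace_nth_regex is made up for illustration. The sketch assumes the pattern contains no capturing groups of its own and never matches the empty string; otherwise the list returned by re.split() no longer simply alternates text and delimiter.

```python
import re

def replace_nth_regex(text, pattern, replacement, n):
    """Replace the n'th (1-based) match of `pattern` in `text`.

    Hypothetical helper, not part of the re module.
    """
    # The capturing group makes re.split() keep the matched delimiters:
    # [text0, match1, text1, match2, text2, ...]
    parts = re.split("(" + pattern + ")", text)
    index = 2 * n - 1  # position of the n'th match in `parts`
    if n < 1 or index >= len(parts):
        return text  # fewer than n matches: leave the string alone
    parts[index] = replacement
    return "".join(parts)

print(replace_nth_regex("coucoucoucou", "o", "i", 2))  # couciucoucou
print(replace_nth_regex("coucou", "[ou]", "X", 3))     # coucXu
```

Since the delimiters are kept in the list, a single "".join() puts the string back together and no second join with the original pattern is needed.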

Steven D'Aprano · Dec 31, 2008

> Hi everybody,
>
> I would like to change only the nth occurrence of a pattern in a string.
> The problem with "replace" method of strings, and "re.sub" is that we
> can only define the number of occurrences to change from the first one.
>
> 'ciucou'
>
> What is the best way to change only the nth occurrence (occurrence number
> n)?

Step 1: Find the nth occurrence.
Step 2: Change it.

def findnth(source, target, n):
    num = 0
    start = -1
    while num < n:
        start = source.find(target, start+1)
        if start == -1: return -1
        num += 1
    return start

def replacenth(source, old, new, n):
    p = findnth(source, old, n)
    if p == -1: return source
    return source[:p] + new + source[p+len(old):]

And in use:

>>> replacenth("abcabcabcabcabc", "abc", "WXYZ", 3)
'abcabcWXYZabcabc'

> Why this default behavior? For the user, it would be easier to put
> re.sub or replace in a loop to change the first n occurrences.

Easier than just calling a function? I don't think so.

I've never needed to replace only the nth occurrence of a string, and I
guess the Python Development team never did either. Or they thought that
the above two functions were so trivial that anyone could write them.

Tim Chase · Dec 31, 2008

> I would like to change only the nth occurrence of a pattern in
> a string. The problem with "replace" method of strings, and
> "re.sub" is that we can only define the number of occurrences
> to change from the first one.
>
> 'ciucou'
>
> What is the best way to change only the nth occurrence
> (occurrence number n)?

Well, there are multiple ways of doing this, including munging
the regexp to skip over the first instances of a match.
Something like the following untested:

re.sub("((?:[^o]*o){2}[^o]*)o", r"\1i", s, count=1)

However, for a more generic solution, you could use something like

import re
class Nth(object):
    def __init__(self, n_min, n_max, replacement):
        #assert n_min <= n_max, \
        #    "Hey, look, I don't know what I'm doing!"
        if n_max < n_min:
            # don't be a dope
            n_min, n_max = n_max, n_min
        self.n_min = n_min
        self.n_max = n_max
        self.replacement = replacement
        self.calls = 0
    def __call__(self, matchobj):
        self.calls += 1
        if self.n_min <= self.calls <= self.n_max:
            return self.replacement
        return matchobj.group(0)

s = 'coucoucoucou'
print("Initial:")
print(s)
print("Just positions 3-4:")
print(re.sub('o', Nth(3,4,'i'), s))
for params in [
        (1, 1, 'i'), # just the 1st
        (1, 2, 'i'), # 1-2
        (2, 2, 'i'), # just the 2nd
        (2, 3, 'i'), # 2-3
        (2, 4, 'i'), # 2-4
        (4, 4, 'i'), # just the 4th
        ]:
    print("Nth(%i, %i, %s)" % params)
    print(re.sub('o', Nth(*params), s))

> Why this default behavior?

Can't answer that one, but with so many easy solutions, it's not
been a big concern of mine.

-tkc

Antoon Pardon · Jan 12, 2009

> Hi everybody,
>
> I would like to change only the nth occurrence of a pattern in a string. The
> problem with "replace" method of strings, and "re.sub" is that we can only
> define the number of occurrences to change from the first one.
>
> 'ciucou'
>
> What is the best way to change only the nth occurrence (occurrence number n)?
>
> Why this default behavior? For the user, it would be easier to put re.sub or
> replace in a loop to change the first n occurrences.

I would do it as follows:

1) Change the first n occurrences of the pattern to something that doesn't occur in your string
2) Change the first n-1 of those back
3) Change the remaining one to what you want.
'couciu'

MRAB · Jan 14, 2009

Antoon said:
>
> I would do it as follows:
>
> 1) Change the pattern n times to something that doesn't occur in your string
> 2) Change it back n-1 times
> 3) Change the remaining one to what you want.
>
> 'couciu'
>

Sorry for the last posting, but it did occur to me that str.replace()
could grow another parameter 'start', so it would become:

s.replace(old, new[, [start,] end]) -> string

(In Python 2.x the method doesn't accept keyword arguments, so that
isn't a problem.)

If the possible replacements are numbered from 0, then 'start' is the
first one actually to perform and 'end' the one after the last to perform.

The 2-argument form would be s.replace(old, new) with 'start' defaulting
to 0 and 'end' to None => replacing all occurrences, same as now.

The 3-argument form would be s.replace(old, new, end) with 'start'
defaulting to 0 => equivalent to replacing the first 'end' occurrences,
same as now.

The 4-argument form would be s.replace(old, new, start, end) =>
replacing from the 'start'th to before the 'end'th occurrence,
additional behaviour as requested.
