change only the nth occurrence of a pattern in a string

Discussion in 'Python' started by TP, Dec 31, 2008.

  1. TP

    TP Guest

    Hi everybody,

    I would like to change only the nth occurrence of a pattern in a string. The
    problem with the "replace" method of strings and with "re.sub" is that we can
    only set how many occurrences to change, counting from the first one.

    >>> v="coucou"
    >>> v.replace("o","i",2)
    'ciuciu'
    >>> import re
    >>> re.sub( "o", "i", v,2)
    'ciuciu'
    >>> re.sub( "o", "i", v,1)
    'ciucou'

    What is the best way to change only the nth occurrence (occurrence number n)?

    Why this default behavior? For the user, it would be easier to put re.sub or
    replace in a loop to change the first n occurrences.

    Thanks

    Julien
    --
    python -c "print ''.join([chr(154 - ord(c)) for c in '*9(9&(18%.\
    9&1+,\'Z4(55l4('])"

    "When a distinguished but elderly scientist states that something is
    possible, he is almost certainly right. When he states that something is
    impossible, he is very probably wrong." (first law of AC Clarke)
     
    TP, Dec 31, 2008
    #1

  2. Roy Smith

    Roy Smith Guest

    TP wrote:

    > Hi everybody,
    >
    > I would like to change only the nth occurrence of a pattern in a string.


    It's a little ugly, but the following looks like it works. The gist is to
    split the string on your pattern, then re-join the pieces using the
    original delimiter everywhere except for the n'th splice. Split() is a
    wonderful tool. I'm a hard-core regex geek, but I find that most things I
    might have written a big hairy regex for are easier solved by doing split()
    and then attacking the pieces.

    There may be some fencepost errors here. I got the basics working, and
    left the details as an exercise for the reader :)

    This version assumes the pattern is a literal string. If it's really a
    regex, you'll need to put the pattern in parens when you call split(); this
    will return the exact text matched each time as elements of the list. And
    then your post-processing gets a little more complicated, but nothing
    that's too bad.

    This does a couple of passes over the data, but at least all the operations
    are O(n), so the whole thing is O(n).


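    #!/usr/bin/python

    import re

    v = "coucoucoucou"

    pattern = "o"  # literal text; use re.escape() if it may hold regex metacharacters
    n = 2          # 1-based number of the occurrence to replace
    parts = re.split(pattern, v)
    print(parts)

    # The n-th occurrence sits between parts[n-1] and parts[n].
    first = parts[:n]
    last = parts[n:]
    print(first)
    print(last)

    # Re-join each half with the original delimiter...
    j1 = pattern.join(first)
    j2 = pattern.join(last)
    print(j1)
    print(j2)
    # ...then splice the halves together with the replacement.
    # Fencepost caveat: with fewer than n occurrences, a stray "i" is appended.
    print("i".join([j1, j2]))
    print(v)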
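
A minimal sketch of the regex case described above, assuming the pattern has no capturing groups of its own and never matches the empty string. The function name replace_nth_regex is made up for illustration. Putting the pattern in parentheses makes re.split() return the matched text at the odd indexes of the list, so the n-th match sits at index 2*n - 1.

```python
import re

def replace_nth_regex(text, pattern, repl, n):
    """Replace only the n-th (1-based) match of regex `pattern` in `text`.

    Hypothetical helper; assumes `pattern` contains no capturing groups
    and does not match the empty string.
    """
    # With a capturing group, re.split() keeps the delimiters:
    # even indexes hold the text between matches, odd indexes the matches.
    parts = re.split("(" + pattern + ")", text)
    index = 2 * n - 1
    if n < 1 or index >= len(parts):
        return text  # fewer than n matches: leave the string alone
    parts[index] = repl
    return "".join(parts)

print(replace_nth_regex("coucoucoucou", "o", "i", 2))
```

Checking the index against the list length also takes care of the fencepost case where the string has fewer than n matches.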
     
    Roy Smith, Dec 31, 2008
    #2

  3. On Wed, 31 Dec 2008 15:40:32 +0100, TP wrote:

    > Hi everybody,
    >
    > I would like to change only the nth occurrence of a pattern in a string.
    > The problem with "replace" method of strings, and "re.sub" is that we
    > can only define the number of occurrences to change from the first one.
    >
    > >>> v="coucou"
    > >>> v.replace("o","i",2)
    > 'ciuciu'
    > >>> import re
    > >>> re.sub( "o", "i", v,2)
    > 'ciuciu'
    > >>> re.sub( "o", "i", v,1)
    > 'ciucou'
    >
    > What is the best way to change only the nth occurrence (occurrence number
    > n)?


    Step 1: Find the nth occurrence.
    Step 2: Change it.


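    def findnth(source, target, n):
        """Return the index of the nth (1-based) occurrence of target, or -1."""
        num = 0
        start = -1
        while num < n:
            start = source.find(target, start+1)
            if start == -1: return -1
            num += 1
        return start

    def replacenth(source, old, new, n):
        """Replace only the nth occurrence of old; return source if there is none."""
        p = findnth(source, old, n)
        if p == -1: return source  # test p (the search result), not n
        return source[:p] + new + source[p+len(old):]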


    And in use:

    >>> replacenth("abcabcabcabcabc", "abc", "WXYZ", 3)
    'abcabcWXYZabcabc'
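
The same two steps (find the nth occurrence, then change it) can be sketched for a regex pattern as well. This is only an illustration: the name replace_nth_match is hypothetical, and it counts non-overlapping matches as re.finditer() reports them.

```python
import re
from itertools import islice

def replace_nth_match(text, pattern, repl, n):
    """Replace only the n-th (1-based) regex match; hypothetical helper."""
    if n < 1:
        return text
    # Step 1: find the n-th match by skipping the first n - 1.
    matches = re.finditer(pattern, text)
    match = next(islice(matches, n - 1, None), None)
    if match is None:
        return text  # fewer than n matches
    # Step 2: change it by slicing around the match.
    return text[:match.start()] + repl + text[match.end():]

print(replace_nth_match("abcabcabcabcabc", "abc", "WXYZ", 3))
```

Here repl is inserted literally, so backreferences such as \1 are not expanded.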


    > Why this default behavior? For the user, it would be easier to put
    > re.sub or replace in a loop to change the first n occurrences.


    Easier than just calling a function? I don't think so.

    I've never needed to replace only the nth occurrence of a string, and I
    guess the Python Development team never did either. Or they thought that
    the above two functions were so trivial that anyone could write them.



    --
    Steven
     
    Steven D'Aprano, Dec 31, 2008
    #3
  4. Tim Chase

    Tim Chase Guest

    > I would like to change only the nth occurrence of a pattern in
    > a string. The problem with "replace" method of strings, and
    > "re.sub" is that we can only define the number of occurrences
    > to change from the first one.
    >
    > >>> v="coucou"
    > >>> v.replace("o","i",2)
    > 'ciuciu'
    > >>> import re
    > >>> re.sub( "o", "i", v,2)
    > 'ciuciu'
    > >>> re.sub( "o", "i", v,1)
    > 'ciucou'
    >
    > What is the best way to change only the nth occurrence
    > (occurrence number n)?


    Well, there are multiple ways of doing this, including munging
    the regexp to skip over the first instances of a match.
    Something like the following, which skips the first two matches
    (and the text after them) and replaces only the third:

    re.sub("((?:[^o]*o){2}[^o]*)o", r"\1i", s, count=1)

    However, for a more generic solution, you could use something like

    import re
    class Nth(object):
        """Replacement callable for re.sub: replaces matches n_min..n_max (1-based)."""
        def __init__(self, n_min, n_max, replacement):
            #assert n_min <= n_max, \
            # "Hey, look, I don't know what I'm doing!"
            if n_min > n_max:
                # don't be a dope: swap bounds given in the wrong order
                n_min, n_max = n_max, n_min
            self.n_min = n_min
            self.n_max = n_max
            self.replacement = replacement
            self.calls = 0
        def __call__(self, matchobj):
            # re.sub calls this once per match, in order
            self.calls += 1
            if self.n_min <= self.calls <= self.n_max:
                return self.replacement
            return matchobj.group(0)

    s = 'coucoucoucou'
    print("Initial:")
    print(s)
    print("Just positions 3-4:")
    print(re.sub('o', Nth(3,4,'i'), s))
    for params in [
            (1, 1, 'i'), # just the 1st
            (1, 2, 'i'), # 1-2
            (2, 2, 'i'), # just the 2nd
            (2, 3, 'i'), # 2-3
            (2, 4, 'i'), # 2-4
            (4, 4, 'i'), # just the 4th
            ]:
        print("Nth(%i, %i, %s)" % params)
        print(re.sub('o', Nth(*params), s))
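
For the single-occurrence case, the same callback idea can be sketched with a closure instead of a class. The name sub_nth is hypothetical; the sketch relies only on re.sub() accepting a function as the replacement and calling it once per match, in order.

```python
import re
from itertools import count

def sub_nth(pattern, repl, text, n):
    """Replace only the n-th (1-based) match of `pattern`; hypothetical helper."""
    counter = count(1)  # numbers the matches 1, 2, 3, ...

    def choose(match):
        # Return the replacement for the n-th match, the original text otherwise.
        return repl if next(counter) == n else match.group(0)

    return re.sub(pattern, choose, text)

print(sub_nth("o", "i", "coucoucoucou", 3))
```

Because the replacement comes from a function, repl is used literally and backreferences in it are not expanded.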

    > Why this default behavior?


    Can't answer that one, but with so many easy solutions, it's not
    been a big concern of mine.

    -tkc
     
    Tim Chase, Dec 31, 2008
    #4
  5. On 2008-12-31, TP wrote:
    > Hi everybody,
    >
    > I would like to change only the nth occurrence of a pattern in a string. The
    > problem with "replace" method of strings, and "re.sub" is that we can only
    > define the number of occurrences to change from the first one.
    >
    > >>> v="coucou"
    > >>> v.replace("o","i",2)
    > 'ciuciu'
    > >>> import re
    > >>> re.sub( "o", "i", v,2)
    > 'ciuciu'
    > >>> re.sub( "o", "i", v,1)
    > 'ciucou'
    >
    > What is the best way to change only the nth occurrence (occurrence number n)?
    >
    > Why this default behavior? For the user, it would be easier to put re.sub or
    > replace in a loop to change the first n occurrences.


    I would do it as follows:

    1) Change the pattern n times to something that doesn't occur in your string
    2) Change it back n-1 times
    3) Change the remaining one to what you want.

    >>> v="coucou"
    >>> v.replace('o', 'O', 2).replace('O', 'o', 1).replace('O', 'i')
    'couciu'
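
The three steps above can be wrapped in a function, sketched here under the assumption that a NUL character is a safe placeholder. The function name and the placeholder parameter are made up for illustration; the check at the top guards the one thing the trick depends on, namely that the placeholder does not already occur in the string.

```python
def replace_nth_via_placeholder(text, old, new, n, placeholder="\x00"):
    """Replace only the n-th (1-based) occurrence of `old`; hypothetical helper."""
    if n < 1:
        return text
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    # 1) Mark the first n occurrences with the placeholder.
    marked = text.replace(old, placeholder, n)
    # 2) Change the first n - 1 of them back.
    restored = marked.replace(placeholder, old, n - 1)
    # 3) Change the remaining one (if any) to the new text.
    return restored.replace(placeholder, new)

print(replace_nth_via_placeholder("coucou", "o", "i", 2))
```

If the string has fewer than n occurrences, step 2 restores all of them and the result equals the input.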

    --
    Antoon Pardon
     
    Antoon Pardon, Jan 12, 2009
    #5
  6. MRAB

    MRAB Guest

    Antoon Pardon wrote:
    > On 2008-12-31, TP wrote:
    >> Hi everybody,
    >>
    >> I would like to change only the nth occurrence of a pattern in a string. The
    >> problem with "replace" method of strings, and "re.sub" is that we can only
    >> define the number of occurrences to change from the first one.
    >>
    >> >>> v="coucou"
    >> >>> v.replace("o","i",2)
    >> 'ciuciu'
    >> >>> import re
    >> >>> re.sub( "o", "i", v,2)
    >> 'ciuciu'
    >> >>> re.sub( "o", "i", v,1)
    >> 'ciucou'
    >>
    >> What is the best way to change only the nth occurrence (occurrence number n)?
    >>
    >> Why this default behavior? For the user, it would be easier to put re.sub or
    >> replace in a loop to change the first n occurrences.
    >
    > I would do it as follows:
    >
    > 1) Change the pattern n times to something that doesn't occur in your string
    > 2) Change it back n-1 times
    > 3) Change the remaining one to what you want.
    >
    > >>> v="coucou"
    > >>> v.replace('o', 'O', 2).replace('O', 'o', 1).replace('O', 'i')
    > 'couciu'
    >

    Sorry for the last posting, but it did occur to me that str.replace()
    could grow another parameter 'start', so it would become:

    s.replace(old, new[, [start,] end]) -> string

    (In Python 2.x the method doesn't accept keyword arguments, so that
    isn't a problem.)

    If the possible replacements are numbered from 0, then 'start' is the
    first one actually to perform and 'end' the one after the last to perform.

    The 2-argument form would be s.replace(old, new) with 'start' defaulting
    to 0 and 'end' to None => replacing all occurrences, same as now.

    The 3-argument form would be s.replace(old, new, end) with 'start'
    defaulting to 0 => equivalent to replacing the first 'end' occurrences,
    same as now.

    The 4-argument form would be s.replace(old, new, start, end) =>
    replacing from the 'start'th to before the 'end'th occurrence,
    additional behaviour as requested.
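
The proposed semantics can be tried out today with a small stand-alone function. This is only a sketch: replace_range is a hypothetical name, not an existing str method, and it takes start and end as ordinary parameters rather than reproducing the positional overloading of the proposal. Occurrences are numbered from 0, 'start' is the first one replaced, and 'end' is the one after the last replaced.

```python
def replace_range(s, old, new, start=0, end=None):
    """Replace occurrences number start..end-1 (0-based) of `old` in `s`.

    Hypothetical helper that emulates the proposed start/end behaviour.
    """
    if not old:
        raise ValueError("old must be a non-empty string")
    pieces = []
    pos = 0    # where the next search begins
    index = 0  # number of the occurrence being examined
    while True:
        hit = s.find(old, pos)
        if hit == -1 or (end is not None and index >= end):
            break
        pieces.append(s[pos:hit])
        # Occurrences before 'start' are copied through unchanged.
        pieces.append(new if index >= start else old)
        pos = hit + len(old)
        index += 1
    pieces.append(s[pos:])
    return "".join(pieces)

print(replace_range("coucoucoucou", "o", "i", 1, 3))
```

Replacing only the nth occurrence (counting from 1) then becomes replace_range(s, old, new, n - 1, n).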
     
    MRAB, Jan 14, 2009
    #6