Regular expression confusion

Discussion in 'Python' started by York, Sep 24, 2006.

  1. York

    York Guest

    I have two backslashes in a string, a, and I want to replace them with
    one backslash, but it fails:

    >>> import re
    >>> a = '\\\\'
    >>> re.sub(r'\\\\', '\\', '\\\\')

    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/usr/lib/python2.3/sre.py", line 143, in sub
        return _compile(pattern, 0).sub(repl, string, count)
      File "/usr/lib/python2.3/sre.py", line 258, in _subx
        template = _compile_repl(template, pattern)
      File "/usr/lib/python2.3/sre.py", line 245, in _compile_repl
        raise error, v # invalid expression
    sre_constants.error: bogus escape (end of line)
    >>>


    Does anybody know why?

    Thanks,

    York
    York, Sep 24, 2006
    #1

  2. John Machin

    John Machin Guest

    York wrote:
    > I have two backslashes in a string, a, and I want to replace them with
    > one backslash, but it fails:
    >
    > >>> import re
    > >>> a = '\\\\'
    > >>> re.sub(r'\\\\', '\\', '\\\\')
    >
    > Traceback (most recent call last):
    >   File "<stdin>", line 1, in ?
    >   File "/usr/lib/python2.3/sre.py", line 143, in sub
    >     return _compile(pattern, 0).sub(repl, string, count)
    >   File "/usr/lib/python2.3/sre.py", line 258, in _subx
    >     template = _compile_repl(template, pattern)
    >   File "/usr/lib/python2.3/sre.py", line 245, in _compile_repl
    >     raise error, v # invalid expression
    > sre_constants.error: bogus escape (end of line)
    > >>>
    >
    > Does anybody know why?


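    Yep. There are *two* levels of escaping happening: (1) the Python
    compiler and (2) the re compiler (in the first two args, but of course
    only Python in the 3rd). Your replacement '\\' reaches re as a single
    backslash, which re reads as the start of an unfinished escape.
    To get your single backslash you need to start out with four
    backslashes in a cooked (ordinary) string literal or two in a raw one: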

    | >>> re.sub(r'\\\\', '\\\\', '\\\\')
    | '\\'
    | >>> re.sub(r'\\\\', r'\\', '\\\\')
    | '\\'
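
The two levels can be made visible by counting how many backslash characters each literal actually holds before re sees it. This is a minimal sketch, assuming Python 3 (the thread itself uses Python 2.3, where the exception is sre_constants.error instead of re.error); all variable names are illustrative only.

```python
import re

subject = '\\\\'   # ordinary literal, four typed backslashes -> 2 characters
pattern = r'\\\\'  # raw literal -> 4 characters, which re reads as 2 literal backslashes
good_repl = r'\\'  # raw literal -> 2 characters, which re's template reads as 1 backslash
bad_repl = '\\'    # ordinary literal -> 1 character, a dangling escape for re

print(len(subject), len(pattern), len(good_repl), len(bad_repl))

result = re.sub(pattern, good_repl, subject)
print(len(result))  # number of backslashes left after the substitution

try:
    re.sub(pattern, bad_repl, subject)
    bad_repl_rejected = False
except re.error as exc:
    # The lone backslash is rejected when re parses the replacement template
    bad_repl_rejected = True
    print("re.error:", exc)
```

Checking len() avoids the extra confusion of repr(), which doubles every backslash again when the interpreter echoes a string.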

    Cheers,
    John
    John Machin, Sep 24, 2006
    #2

  3. York

    York Guest

    Oh, that's right, the second arg is escaped by the re compiler too.

    Thank you, John.

    York



    York, Sep 24, 2006
    #3
  4. John Machin

    John Machin Guest

    York wrote:
    > Oh, that's right, the second arg is escaped by the re compiler too.

    No, that's wrong: you [should] do the escaping, the compiler unescapes :)
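
One way to let the library do the escaping for you, sketched here under the assumption of Python 3: re.escape() adds the regex-level escaping to a literal search string, and a callable replacement is used as returned, without going through re's template unescaping. The names below are illustrative, not from the thread.

```python
import re

literal = '\\\\'               # the two backslash characters to search for
pattern = re.escape(literal)   # re.escape adds the regex-level escaping
print(len(literal), len(pattern))

# A callable replacement is inserted as-is, so one backslash character
# can be returned directly without a second layer of escaping.
result = re.sub(pattern, lambda match: '\\', 'a\\\\b')
print(result)
```

The callable form sidesteps the replacement template entirely, which is why the single-character string '\\' is accepted here but not when passed directly as the second argument.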

    Cheers,
    John
    John Machin, Sep 24, 2006
    #4
  5. Fredrik Lundh

    York wrote:
    > I have two backslashes in a string, a, and I want to replace them with
    > one backslash, but it fails:
    >
    > >>> import re
    > >>> a = '\\\\'
    > >>> re.sub(r'\\\\', '\\', '\\\\')


    John has already sorted the RE-specific part of the problem, but it's
    also worth noting that using the RE engine for literal strings is
    overkill; an ordinary replace is easier to use and faster:

    >>> a = "\\\\"
    >>> a.replace("\\\\", "\\")
    '\\'
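
A slightly longer comparison of the two approaches, as a rough sketch assuming Python 3. The sample path is made up for illustration; str.replace needs only Python-level escaping, while re.sub needs both levels.

```python
import re

# Hypothetical sample string containing doubled backslashes
text = 'C:\\\\temp\\\\file.txt'

# str.replace: only the Python compiler interprets the literals
via_replace = text.replace('\\\\', '\\')

# re.sub: the pattern and the replacement are interpreted again by re
via_re = re.sub(r'\\\\', r'\\', text)

print(via_replace)
print(via_replace == via_re)
```

Both calls collapse every doubled backslash to a single one; the str.replace version has one less layer of escaping to get wrong.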

    </F>
    Fredrik Lundh, Sep 24, 2006
    #5