problem with regex

Discussion in 'Python' started by abcd, Jul 28, 2006.

  1. abcd

    abcd Guest

    I have a regex: '[A-Za-z]:\\([^/:\*\?"<>\|])*'

    when I do, re.compile('[A-Za-z]:\\([^/:\*\?"<>\|])*') ...I get

    sre_constants.error: unbalanced parenthesis

    do i need to escape something else? i see that i have matching
    parenthesis.

    thx
    abcd, Jul 28, 2006
    #1

  2. abcd

    Rob Wolfe Guest

    abcd wrote:
    > I have a regex: '[A-Za-z]:\\([^/:\*\?"<>\|])*'
    >
    > when I do, re.compile('[A-Za-z]:\\([^/:\*\?"<>\|])*') ...I get
    >
    > sre_constants.error: unbalanced parenthesis
    >
    > do i need to escape something else? i see that i have matching
    > parenthesis.


    You should use raw string:

    re.compile(r'[A-Za-z]:\\([^/:\*\?"<>\|])*')

    Regards,
    Rob
    Rob Wolfe, Jul 28, 2006
    #2

  3. abcd

    Barry Guest

    On 28 Jul 2006 05:45:05 -0700, abcd wrote:
    > I have a regex: '[A-Za-z]:\\([^/:\*\?"<>\|])*'
    >
    > when I do, re.compile('[A-Za-z]:\\([^/:\*\?"<>\|])*') ...I get
    >
    > sre_constants.error: unbalanced parenthesis
    >
    > do i need to escape something else? i see that i have matching
    > parenthesis.
    >
    > thx
    >
    > --


    Try making the argument a raw string:
    re.compile(r'[A-Za-z]:\\([^/:\*\?"<>\|])*')
    Barry, Jul 28, 2006
    #3
  4. abcd

    Tim Chase Guest

    > when I do, re.compile('[A-Za-z]:\\([^/:\*\?"<>\|])*') ...I get
    >
    > sre_constants.error: unbalanced parenthesis



    Because you're not using raw strings, the escapables become
    escaped, making your regexp something like

    [A-Za-z]:\([^/:\*\?"<>\|])*

    (because it knows what "\\" is, but likely doesn't attribute
    significance to "\?" or "\|", and thus leaves them alone).

    Thus, you have "\(" in your regexp, which is a literal
    open-paren. But you have a ")", which is a "close a grouping"
    paren. The error is indicating that the "close a grouping" paren
    doesn't close some previously opened paren.

    General good practice shoves all this stuff in a raw string
    (single-quoted here, because the pattern itself contains a "):

    r'[A-Za-z]:\\([^/:\*\?"<>\|])*'

    which solves much of the headache.

    -tkc
    Tim Chase, Jul 28, 2006
    #4
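
    The explanation above can be checked with a small sketch. It assumes
    Python 3 (the thread itself is from the Python 2 era), and the names
    seen_by_re, compiled_ok and pattern are made up for illustration. The
    first string spells out what the regex engine receives when the r
    prefix is missing; the second is the raw-string version.

```python
import re

# What the regex engine receives when the pattern is written WITHOUT the r
# prefix: Python turns "\\" into one backslash, so the regex sees "\(".
# (Written here as a raw string so the text is shown exactly.)
seen_by_re = r'[A-Za-z]:\([^/:\*\?"<>\|])*'

try:
    re.compile(seen_by_re)
    compiled_ok = True
except re.error as exc:
    # "\(" is a literal paren, so the later ")" has no group to close.
    compiled_ok = False
    print("compile failed:", exc)

# With the r prefix both backslashes reach the regex engine, so "\\" matches
# one literal backslash and "(" opens a real group.
pattern = re.compile(r'[A-Za-z]:\\([^/:\*\?"<>\|])*')
print(pattern.match(r'c:\test') is not None)
```

    In Python 3 the exception is re.error; the sre_constants.error shown
    in the thread is the older name for the same failure.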
  5. abcd

    abcd Guest

    well thanks for the quick replies, but now my regex doesn't work.

    Code:
    import re
    p = re.compile(r'[A-Za-z]:\\([^/:\*?"<>\|])*')
    
    x = p.match("c:\test")
    
    x is None

    any ideas why? i escape the back-slash, the asterisk *, and the PIPE |
    .....b/c they are regex special characters.
    abcd, Jul 28, 2006
    #5
  6. abcd enlightened us with:
    > well thanks for the quick replies, but now my regex doesn't work.
    >
    > import re
    > p = re.compile(r'[A-Za-z]:\\([^/:\*?"<>\|])*')
    >
    > x = p.match("c:\test")
    >
    > x is None
    >
    > any ideas why?


    Yes, because after the "c:" you expect a backslash, and not a tab
    character. Read the manual again about raw strings and character
    escaping, it'll do you good.

    Sybren
    --
    The problem with the world is stupidity. Not saying there should be a
    capital punishment for stupidity, but why don't we just take the
    safety labels off of everything and let the problem solve itself?
    Frank Zappa
    Sybren Stuvel, Jul 28, 2006
    #6
  7. abcd

    abcd Guest

    sorry i forgot to escape the question mark...

    > import re
    > p = re.compile(r'[A-Za-z]:\\([^/:\*?"<>\|])*')
    
    even when I escape that it still doesnt work as expected.
    
    p = re.compile(r'[A-Za-z]:\\([^/:\*\?"<>\|])*')
    
    p.match('c:\test')  still returns None.
    abcd, Jul 28, 2006
    #7
  8. abcd

    Tim Chase Guest

    > p = re.compile(r'[A-Za-z]:\\([^/:\*?"<>\|])*')
    >
    > x = p.match("c:\test")


    > any ideas why? i escape the back-slash, the asterisk *, and the PIPE |
    > ....b/c they are regex special characters.



    Same problem, only now in the other string:

    >>> s = "c:\test"
    >>> print s

    c: est

    Your "\t" is interpreted as a tab character. Thus, you want

    s = r"c:\test"

    or

    s = "c:\\test"

    which you'll find should now be successfully found with

    p.match(s)

    -tkc
    Tim Chase, Jul 28, 2006
    #8
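
    A quick way to see the difference between the three spellings of the
    test string is to compare their lengths and try each against the
    pattern. This is only a sketch, assuming Python 3; the variable names
    plain, raw and escaped are illustrative.

```python
import re

pattern = re.compile(r'[A-Za-z]:\\([^/:\*\?"<>\|])*')

plain = "c:\test"      # "\t" is a tab: c, :, TAB, e, s, t
raw = r"c:\test"       # backslash kept: c, :, \, t, e, s, t
escaped = "c:\\test"   # same text as the raw string

for label, s in [("plain", plain), ("raw", raw), ("escaped", escaped)]:
    # repr() makes the tab visible as \t and the backslash as \\
    print(label, repr(s), len(s), pattern.match(s) is not None)
```

    Only the raw and escaped strings contain a real backslash after the
    colon, so only those two can match.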
  9. abcd

    abcd Guest

    Sybren Stuvel wrote:
    > Yes, because after the "c:" you expect a backslash, and not a tab
    > character. Read the manual again about raw strings and character
    > escaping, it'll do you good.



    doh. i shall do that.

    thanks.
    abcd, Jul 28, 2006
    #9
  10. abcd

    abcd Guest

    not sure why this passes:


    >>> regex = r'[A-Za-z]:\\([^/:\*\?"<>\|])*'
    >>> p = re.compile(regex)
    >>> p.match('c:\\test')

    <_sre.SRE_Match object at 0x009D77E0>
    >>> p.match('c:\\test?:/')

    <_sre.SRE_Match object at 0x009D7720>
    >>>


    the last example shouldnt give a match
    abcd, Jul 28, 2006
    #10
  11. abcd

    Tim Chase Guest

    >>>> regex = r'[A-Za-z]:\\([^/:\*\?"<>\|])*'
    >>>> p = re.compile(regex)
    >>>> p.match('c:\\test')

    > <_sre.SRE_Match object at 0x009D77E0>
    >>>> p.match('c:\\test?:/')

    > <_sre.SRE_Match object at 0x009D7720>
    >
    > the last example shouldnt give a match


    Ah, but it should, because it *does* match.

    >>> m = p.match('c:\\test?:/')
    >>> m.group(0)

    'c:\\test'
    >>> # add a "$" at the end to anchor it
    >>> # to the end of the line
    >>> regex = r'[A-Za-z]:\\([^/:\*\?"<>\|])*$'
    >>> p = re.compile(regex)
    >>> m = p.match('c:\\test?:/')
    >>> m


    By adding the "$" to ensure that you're matching the whole string
    passed to match() and not just as much as possible given the
    regexp, you solve the problem you describe.

    -tkc
    Tim Chase, Jul 28, 2006
    #11
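
    The prefix-match behaviour described above can be reproduced with the
    sketch below, assuming Python 3. It also shows re.fullmatch (available
    since Python 3.4), which is not mentioned in the thread but gives the
    same whole-string check without editing the pattern. The names body,
    unanchored, anchored, bad and good are illustrative.

```python
import re

body = r'[A-Za-z]:\\([^/:\*\?"<>\|])*'
unanchored = re.compile(body)
anchored = re.compile(body + '$')   # "$" requires the match to reach the end

bad = 'c:\\test?:/'
good = 'c:\\test'

# match() only needs a match at the START of the string, so it stops
# before the first forbidden character and still succeeds.
prefix = unanchored.match(bad)
print(repr(prefix.group(0)))

# With the anchor, the trailing "?:/" makes the whole match fail.
print(anchored.match(bad))
print(anchored.match(good) is not None)

# fullmatch() checks the whole string using the unanchored pattern.
print(unanchored.fullmatch(bad))
```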
  12. abcd

    Rob Wolfe Guest

    abcd wrote:
    > not sure why this passes:
    >
    >
    > >>> regex = r'[A-Za-z]:\\([^/:\*\?"<>\|])*'
    > >>> p = re.compile(regex)
    > >>> p.match('c:\\test')

    > <_sre.SRE_Match object at 0x009D77E0>
    > >>> p.match('c:\\test?:/')

    > <_sre.SRE_Match object at 0x009D7720>
    > >>>

    >
    > the last example shouldnt give a match


    If you want to learn RE I suggest using the great tool redemo.py (a Tk app).
    Then you can play with regular expressions to find the result
    you are looking for.
    It can be found in Python 2.4 in Tools\Scripts.

    Regards,
    Rob
    Rob Wolfe, Jul 28, 2006
    #12
