Re: trouble with regex?

Discussion in 'Python' started by MRAB, Oct 8, 2009.

  1. MRAB

    MRAB Guest

    inhahe wrote:
    > Can someone tell me why this doesn't work?
    >
    > colorre = re.compile ('('
    > '^'
    > '|'
    > '(?:'
    > '\x0b(?:10|11|12|13|14|15|0\\d|\\d)'
    > '(?:'
    > ',(?:10|11|12|13|14|15|0\\d|\\d)'
    > ')?'
    > ')'
    > ')(.*?)')
    >
    > I'm trying to extract mirc color codes.
    >
    > this works:
    >
    > colorre = re.compile ('\x0b(?:10|11|12|13|14|15|0\\d|\\d)'
    > '(?:'
    > ',(?:10|11|12|13|14|15|0\\d|\\d)'
    > ')?'
    > )
    >
    > but I wanted to modify it so that it returns me groups of (color code,
    > text after the code), except for the first text at the beginning of the
    > string before any color code, for which it should return ('', text).
    > that's what the first paste above is trying to do, but it doesn't work.
    > here are some results:
    >
    > >>> colorre.findall('a\x0b1,1')
    > [('', ''), ('\x0b1,1', '')]
    > >>> colorre.findall('a\x0b1,1b')
    > [('', ''), ('\x0b1,1', '')]
    > >>> colorre.findall('ab')
    > [('', '')]
    > >>> colorre.findall('\x0b1,1')
    > [('', '')]
    > >>> colorre.findall('\x0b1,1a')
    > [('', '')]
    > >>>
    >
    > i can easily work with the string that does work and just use group
    > starting and ending positions, but i'm curious as to why i can't get it
    > working the way i want :/
    >

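    The problem with the regex is that .*? is a lazy repeat: it'll try to
    match as few characters as possible, and since nothing follows it in
    the pattern, that is always zero characters, which is why the second
    group is always ''. Try a greedy repeat instead, but matching only
    characters other than \x0b (the vertical tab that starts each code):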

    import re

    # Group 1: start of string or a colour code; group 2: the text after it.
    colorre = re.compile('('
                           '^'
                         '|'
                           '(?:'
                             '\x0b(?:10|11|12|13|14|15|0\\d|\\d)'
                             '(?:'
                               ',(?:10|11|12|13|14|15|0\\d|\\d)'
                             ')?'
                           ')'
                         ')([^\x0b]*)')
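
    To see the difference between the two repeats in isolation, here is a
    minimal sketch. The names NUM, CODE, lazy and greedy are just for
    illustration and are not from the original posts; the patterns are the
    same ones as above, written as raw strings so the regex engine handles
    the \x0b escape itself.

```python
import re

# One colour number, 0-15, optionally zero-padded (same alternatives as above).
NUM = r'(?:10|11|12|13|14|15|0\d|\d)'
# A colour code: \x0b, a foreground number, and an optional ",background".
CODE = r'\x0b' + NUM + r'(?:,' + NUM + r')?'

lazy = re.compile(r'(^|' + CODE + r')(.*?)')         # original attempt
greedy = re.compile(r'(^|' + CODE + r')([^\x0b]*)')  # suggested fix

# Nothing follows the lazy group, so zero characters already satisfy it.
print(repr(lazy.match('abc').group(2)))    # ''

# The greedy class runs up to the next \x0b (or the end of the string).
print(greedy.findall('ab'))                # [('', 'ab')]
print(greedy.findall('a\x0b1,1b'))         # [('', 'a'), ('\x0b1,1', 'b')]
```

    Strings that start with a colour code are left out on purpose: the
    first match there is empty, and how findall continues after an empty
    match changed in Python 3.7, so the result depends on the version.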
    MRAB, Oct 8, 2009
    #1

  2. MRAB

    Paul McGuire Guest

    On Oct 8, 11:42 am, MRAB wrote:
    > inhahe wrote:
    > > Can someone tell me why this doesn't work?
    >
    > > colorre = re.compile ('('
    > >                         '^'
    > >                        '|'
    > >                         '(?:'
    > >                            '\x0b(?:10|11|12|13|14|15|0\\d|\\d)'
    > >                            '(?:'
    > >                               ',(?:10|11|12|13|14|15|0\\d|\\d)'
    > >                            ')?'
    > >                         ')'
    > >                       ')(.*?)')
    >
    > > I'm trying to extract mirc color codes.
    >

    You might find this site interesting for generating REs for numeric
    ranges: http://utilitymill.com/utility/Regex_For_Range
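
    As a rough illustration of what a range regex buys you, the sketch
    below compares the long alternation from the thread with a shorter
    hand-written form for 0-15. SHORT is my own assumption, not output
    from the site above, and the helper agrees() is hypothetical.

```python
import re

# The alternation used in the thread for a colour number (0-15, optional leading zero).
LONG = re.compile(r'(?:10|11|12|13|14|15|0\d|\d)')
# Hand-written compact form: 10-15, or a single digit with an optional leading zero.
SHORT = re.compile(r'(?:1[0-5]|0?\d)')

def agrees(text):
    """True if both patterns accept, or both reject, `text` as a whole."""
    return bool(LONG.fullmatch(text)) == bool(SHORT.fullmatch(text))

# Check 0-99 plus the zero-padded forms 00-09.
candidates = [str(n) for n in range(100)] + ['%02d' % n for n in range(10)]
print(all(agrees(c) for c in candidates))  # True
```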

    -- Paul
    Paul McGuire, Oct 8, 2009
    #2