python regex "negative lookahead assertions" problems

Discussion in 'Python' started by Jelle Smet, Nov 22, 2009.

  1. Jelle Smet

    Jelle Smet Guest

    Hi List,

    I'm trying to match lines in python using the re module.
    The end goal is a regex that lets me skip lines which have ok and warning in them.
    But for some reason I can't get negative lookaheads to work the way they are explained in "http://docs.python.org/library/re.html".

    Consider this example:

    Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re
    >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
    >>> re.match('.*(?!warning)',line)

    <_sre.SRE_Match object at 0xb75b1598>

    I would expect that this would NOT match as it's a negative lookahead and warning is in the string.


    Thanks,


    --
    Jelle Smet
    http://www.smetj.net
     
    Jelle Smet, Nov 22, 2009
    #1

  2. On 11/22/09 14:58, Jelle Smet wrote:
    > Hi List,
    >
    > I'm trying to match lines in python using the re module.
    > The end goal is to have a regex which enables me to skip lines which have ok and warning in it.
    > But for some reason I can't get negative lookaheads working, the way it's explained in "http://docs.python.org/library/re.html".
    >
    > Consider this example:
    >
    > Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
    > [GCC 4.4.1] on linux2
    > Type "help", "copyright", "credits" or "license" for more information.
    > >>> import re
    > >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
    > >>> re.match('.*(?!warning)',line)
    > <_sre.SRE_Match object at 0xb75b1598>
    >
    > I would expect that this would NOT match as it's a negative lookahead and warning is in the string.
    >


    '.*' is greedy and eats all of line. The lookahead is then tested at the end of the line, where no 'warning' follows anymore, so it succeeds and the pattern matches.
    What are you trying to achieve?
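
    To make the point above concrete, here is a minimal sketch. The shortened sample line and the name no_warning are made up for illustration; the second pattern is one common way to write "no 'warning' anywhere in the line", not necessarily what the original poster needs.

```python
import re

# Shortened, hypothetical stand-in for the log line in the question.
line = "2009-11-22 12:15:441 dfh warning qlsfj"

# '.*' consumes the whole line first; the lookahead is then checked at the
# end of the string, where nothing (and so no 'warning') follows.
m = re.match(r'.*(?!warning)', line)
print(m is not None and m.group(0) == line)

# Putting '.*' inside the lookahead and testing it at the start asks
# "does 'warning' occur anywhere ahead?" before anything is consumed.
no_warning = re.compile(r'^(?!.*warning)')
print(no_warning.match(line))
print(no_warning.match("2009-11-22 all fine") is not None)
```

    With the lookahead placed before anything is consumed, the line containing 'warning' is rejected and the clean line is accepted.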

    If you just want to single out lines with 'ok' or 'warning' in them, why not just
    if re.search('(ok|warning)', line): call_skip
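
    As a rough sketch of that idea applied to several lines: the helper name keep_lines and the sample data are hypothetical, and re.search is passed the line to test as its second argument.

```python
import re

# Lines containing either word are skipped.
SKIP = re.compile(r'(ok|warning)')

def keep_lines(lines):
    """Return only the lines that contain neither 'ok' nor 'warning'."""
    return [ln for ln in lines if not SKIP.search(ln)]

sample = ["disk ok", "fan warning: slow", "cpu critical"]
print(keep_lines(sample))
```

    Testing for the unwanted words and inverting the result avoids lookaheads altogether.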

    Helmut.

    --
    Helmut Jarausch

    Lehrstuhl fuer Numerische Mathematik
    RWTH - Aachen University
    D 52056 Aachen, Germany
     
    Helmut Jarausch, Nov 22, 2009
    #2

  3. On 11/22/09 16:05, Helmut Jarausch wrote:
    > On 11/22/09 14:58, Jelle Smet wrote:
    >> Hi List,
    >>
    >> I'm trying to match lines in python using the re module.
    >> The end goal is to have a regex which enables me to skip lines which
    >> have ok and warning in it.
    >> But for some reason I can't get negative lookaheads working, the way
    >> it's explained in "http://docs.python.org/library/re.html".
    >>
    >> Consider this example:
    >>
    >> Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
    >> [GCC 4.4.1] on linux2
    >> Type "help", "copyright", "credits" or "license" for more information.
    >> >>> import re
    >> >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
    >> >>> re.match('.*(?!warning)',line)
    >> <_sre.SRE_Match object at 0xb75b1598>
    >>
    >> I would expect that this would NOT match as it's a negative lookahead
    >> and warning is in the string.
    >>

    >
    > '.*' eats all of line. Now, when at end of line, there is no 'warning'
    > anymore, so it matches.
    > What are you trying to achieve?
    >
    > If you just want to single out lines with 'ok' or 'warning' in them, why not
    > just
    > if re.search('(ok|warning)', line): call_skip
    >


    Probably you don't want words like 'joke' to match 'ok'.
    So, a better regex is

    if re.search(r'\b(ok|warning)\b', line): SKIP_ME
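
    A small sketch of the word-boundary version, with made-up sample strings. One thing to watch: in Python source the pattern should be a raw string, because in a normal string literal '\b' is a backspace character rather than the regex word boundary.

```python
import re

# '\b' in a normal literal is one character (backspace);
# r'\b' is the two characters the re module reads as a word boundary.
print(len('\b'), len(r'\b'))

word = re.compile(r'\b(ok|warning)\b')

# 'joke' contains 'ok', but not as a separate word, so nothing is found.
print(word.search("that was a joke"))

# A standalone 'ok' is found.
print(word.search("status: ok").group(1))
```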

    Helmut.



    --
    Helmut Jarausch

    Lehrstuhl fuer Numerische Mathematik
    RWTH - Aachen University
    D 52056 Aachen, Germany
     
    Helmut Jarausch, Nov 23, 2009
    #3
