python regex "negative lookahead assertions" problems

Discussion in 'Python' started by Jelle Smet, Nov 22, 2009.

  1. Jelle Smet

    Hi List,

    I'm trying to match lines in Python using the re module.
    The end goal is a regex that lets me skip lines which have ok and warning in them.
    But for some reason I can't get negative lookaheads to work the way they are explained in "http://docs.python.org/library/re.html".

    Consider this example:

    Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re
    >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
    >>> re.match('.*(?!warning)',line)

    <_sre.SRE_Match object at 0xb75b1598>

    I would expect that this would NOT match as it's a negative lookahead and warning is in the string.


    Thanks,


    --
    Jelle Smet
    http://www.smetj.net
     
    Jelle Smet, Nov 22, 2009
    #1

  2. On 11/22/09 14:58, Jelle Smet wrote:
    > Hi List,
    >
    > I'm trying to match lines in python using the re module.
    > The end goal is to have a regex which enables me to skip lines which have ok and warning in it.
    > But for some reason I can't get negative lookaheads working, the way it's explained in "http://docs.python.org/library/re.html".
    >
    > Consider this example:
    >
    > Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
    > [GCC 4.4.1] on linux2
    > Type "help", "copyright", "credits" or "license" for more information.
    > >>> import re
    > >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
    > >>> re.match('.*(?!warning)',line)

    > <_sre.SRE_Match object at 0xb75b1598>
    >
    > I would expect that this would NOT match as it's a negative lookahead and warning is in the string.
    >


    '.*' is greedy and eats the whole line. The lookahead is only tested after that, at the end of the line, where no 'warning' follows anymore, so the match succeeds.
    What are you trying to achieve?
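
    To make the point above concrete, here is a minimal sketch (the shortened sample line and the name no_warning are made up for illustration, and print() is written in the Python 3 style). It first shows that the match runs to the end of the line, then one possible way to place the lookahead so that it inspects the whole line:

```python
import re

# Shortened stand-in for the log line in the original post.
line = "2009-11-22 12:15:441 some text warning more text"

# '.*' consumes the whole line; the lookahead is then checked at the
# very end, where nothing (and so no 'warning') follows.
m = re.match(r'.*(?!warning)', line)
print(m.end() == len(line))   # True

# Lookahead first, with its own '.*' inside: it now scans the whole
# line from position 0 and rejects it if 'warning' occurs anywhere.
no_warning = re.compile(r'(?!.*warning)')
print(no_warning.match(line))                       # None
print(no_warning.match("all is fine") is not None)  # True
```

    Whether a lookahead is worth it here is a matter of taste; a plain re.search is usually easier to read.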

    If you just want to single out lines with 'ok' or 'warning' in them, why not just
    if re.search('(ok|warning)', line): call_skip

    Helmut.

    --
    Helmut Jarausch

    Lehrstuhl fuer Numerische Mathematik
    RWTH - Aachen University
    D 52056 Aachen, Germany
     
    Helmut Jarausch, Nov 22, 2009
    #2

  3. On 11/22/09 16:05, Helmut Jarausch wrote:
    > On 11/22/09 14:58, Jelle Smet wrote:
    >> Hi List,
    >>
    >> I'm trying to match lines in python using the re module.
    >> The end goal is to have a regex which enables me to skip lines which
    >> have ok and warning in it.
    >> But for some reason I can't get negative lookaheads working, the way
    >> it's explained in "http://docs.python.org/library/re.html".
    >>
    >> Consider this example:
    >>
    >> Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
    >> [GCC 4.4.1] on linux2
    >> Type "help", "copyright", "credits" or "license" for more information.
    >> >>> import re
    >> >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
    >> >>> re.match('.*(?!warning)',line)

    >> <_sre.SRE_Match object at 0xb75b1598>
    >>
    >> I would expect that this would NOT match as it's a negative lookahead
    >> and warning is in the string.
    >>

    >
    > '.*' eats all of line. Now, when at end of line, there is no 'warning'
    > anymore, so it matches.
    > What are you trying to achieve?
    >
    > If you just want to single out lines with 'ok' or warning in it, why not
    > just
    > if re.search('(ok|warning)') : call_skip
    >


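    Probably you don't want words like 'joke' to match 'ok'.
    So, a better regex is the following (note the raw string, so that \b means a word boundary and not a backspace character):

    if re.search(r'\b(ok|warning)\b', line): SKIP_ME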
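
    As a rough sketch of how this could be used to filter a batch of lines (the sample lines and the names skip and kept are invented for the example, not taken from the original log):

```python
import re

# Raw string, so \b reaches the regex engine as "word boundary"
# instead of being turned into a backspace character by Python.
skip = re.compile(r'\b(ok|warning)\b')

# Hypothetical sample lines, for illustration only.
lines = [
    "disk check ok",
    "this is a joke",
    "warning: low memory",
    "error: disk full",
]

# Keep only the lines that contain neither 'ok' nor 'warning' as a word.
kept = [l for l in lines if not skip.search(l)]
print(kept)  # ['this is a joke', 'error: disk full']
```

    'joke' survives because there is no word boundary between 'j' and 'ok'.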

    Helmut.



    --
    Helmut Jarausch

    Lehrstuhl fuer Numerische Mathematik
    RWTH - Aachen University
    D 52056 Aachen, Germany
     
    Helmut Jarausch, Nov 23, 2009
    #3
