REGEX Negation

Discussion in 'Perl Misc' started by Rusty Phillips, Jun 23, 2004.

  1. I know about negative lookahead and negated character classes,
    but I can't find any good way to do actual negation.

    One thing I'd like to use this for is to match quotes while
    guaranteeing that I'm not matching backslashed quotes (that is, if I
    find a backslash in the string, the quote in front of it should not
    be matched).
    This string:
    String = q{She said, "He said, \\"Welcome to the party\\""}

    Should match
    He said, \\"Welcome to the party\\"
    as the part within the quotes, and not
    He said, \\

    There are many more places where I'd like to use a negation
    technique - especially I'd like to match things of the form:
    "match the largest string that doesn't contain the character sequence
    'blah.'"

    Are there any ways to do either of these types of negation?
    Rusty Phillips, Jun 23, 2004
    #1

  2. Rusty Phillips writes:

    > I know about negative lookahead and negated character classes,
    > but I can't find any good way to do actual negation.


    There is in general no way to do negation in regex.

    > One thing I'd like to use this for is to match quotes while
    > guaranteeing that I'm not matching backslashed quotes (that is, if I
    > find a backslash in the string, the quote in front of it should not
    > be matched).


    You are talking about negative lookbehind. This is documented not far
    from where negative lookahead is documented.

    However, one usually looks for an even number of backslashes followed
    by a quote. (Note: zero is an even number.)

    /(?<!\\)(?:\\\\)*"/
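
A minimal sketch of how such a pattern behaves, assuming the backslash-escaping convention described above. The sample string and variable names are made up for illustration, and the capture group is added here only so the position of each unescaped quote can be reported.

```perl
use strict;
use warnings;

# Hypothetical sample: the inner quotes are escaped with one backslash.
my $text = q{She said, "He said, \"Welcome to the party\""};

# No backslash immediately before the run, then an even number of
# backslashes (possibly zero), then the quote itself.
my $unescaped_quote = qr/(?<!\\)(?:\\\\)*(")/;

my @offsets;
while ($text =~ /$unescaped_quote/g) {
    push @offsets, $-[1];    # offset of the captured quote character
}

print "unescaped quotes at offsets: @offsets\n";
```

Only the outer pair of quotes should be reported; the two quotes preceded by a single backslash are rejected by the lookbehind.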

    Another approach not using lookbehind is given in the answer to the
    FAQ "How can I split a [character] delimited string except when inside
    [character]? (Comma-separated files)"

    Not, of course, that you could have been expected to guess that,
    because yours is not really the same question but is in fact the
    _next_ question people usually ask after asking the one in the FAQ.
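
For comparison, a rough sketch of matching a whole quoted string without lookbehind, by consuming each backslash together with the character it escapes. This is written in the spirit of that FAQ answer rather than copied from it, and the sample string and variable names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input with escaped quotes inside the quoted part.
my $line = q{She said, "He said, \"Welcome to the party\""};

# Opening quote, then any mix of ordinary characters and backslash
# pairs (a backslash plus whatever it escapes), then the closing quote.
my ($inner) = $line =~ /"((?:[^"\\]|\\.)*)"/;

print "inside the quotes: $inner\n";
```

Because a backslash is never consumed on its own, an escaped quote can never be mistaken for the closing one.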

    > There are many more places where I'd like to use a negation
    > technique -


    Sorry, you have to refactor your question so that it does not involve
    negation.

    > especially I'd like to match things of the form:
    > "match the largest string that doesn't contain the character sequence
    > 'blah.'"


    A regex never searches for the globally longest match: it always
    finds the first (or occasionally the last). Among matches starting at
    the same position it can be made to favour long or short. So to get
    the globally longest match you need to find all such strings and sort.
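
A small illustration of that point, using made-up sample strings: the leftmost match wins even when a longer candidate sits further right, and greediness only decides among matches that begin at the same position.

```perl
use strict;
use warnings;

my $input = 'ab aaaa';    # made-up sample

# The leftmost match wins, although a longer run of 'a' comes later.
my ($first) = $input =~ /(a+)/;

# Greedy versus non-greedy only chooses among matches that share
# a starting position.
my ($greedy) = 'aaaa' =~ /(a+)/;
my ($lazy)   = 'aaaa' =~ /(a+?)/;

print "first=$first greedy=$greedy lazy=$lazy\n";
```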

    The candidates are exactly the shortest strings that start at the
    beginning of the input or at the 'l' of a 'blah', and that end at the
    end of the input or at the 'a' of a 'blah'.

    my @substrings = /(?=((?:^|(?<=b)lah).*?(?:$|bla(?=h))))/g;

    For example for $_='xxxxblablahwibbleblahfoo' this gives @substrings =
    ('xxxxblabla','lahwibblebla','lahfoo').

    You can then find the longest with sort() or List::Util::reduce().
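
As a sketch of that last step, assuming List::Util is available (it ships with Perl), the candidates from the example input can be reduced to the longest one. The variable names are illustrative only.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my $input = 'xxxxblablahwibbleblahfoo';

# Collect every candidate via a capturing lookahead, so that
# overlapping candidates are all found.
my @substrings = $input =~ /(?=((?:^|(?<=b)lah).*?(?:$|bla(?=h))))/g;

# Keep whichever of each pair is longer; on a tie the earlier one stays.
my $longest = reduce { length($b) > length($a) ? $b : $a } @substrings;

print "longest: $longest\n";
```

reduce() makes a single pass over the list, whereas sorting by length and taking the first element does more work than needed when only the maximum is wanted.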

    --
    \\ ( )
    . _\\__[oo
    .__/ \\ /\@
    . l___\\
    # ll l\\
    ###LL LL\\
    Brian McCauley, Jun 24, 2004
    #2

  3. That second technique is how I'm doing things now for the one
    "negation" I'm doing now (actually /(.*?)(?=blah|$)/, which I know
    will at least find the first match not containing the lookahead
    string, assuming that such a string is a token and should not be
    absorbed by the regex). I'd just hoped there was a more natural way.
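
For the record, a quick sketch of what that pattern captures on the sample input used earlier in the thread (the variable names are made up):

```perl
use strict;
use warnings;

my $input = 'xxxxblablahwibbleblahfoo';

# Lazily take characters until 'blah' (or the end of input) comes next;
# the lookahead leaves the 'blah' token itself unconsumed.
my ($head) = $input =~ /(.*?)(?=blah|$)/;

print "text before the first 'blah': $head\n";
```

This stops at 'xxxxbla', which is shorter than the 'xxxxblabla' candidate above, because the whole 'blah' is treated as a delimiter rather than as text that may partly belong to the match.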

    Didn't consider negative lookbehinds for doing quotes, though.
    Thanks for the help.
    Rusty Phillips, Jun 24, 2004
    #3