variable-width negative look-behind emulation

Discussion in 'Perl Misc' started by Xavier Noria, Sep 13, 2003.

  1. Xavier Noria

    Xavier Noria Guest

    I was playing with variable-width negative look-behind emulation and
    worked out this regexp that supports assertions about substrings of
    the pre-match:

    /^.*(??{ index($&,"foo") == -1 ? "" : "(?!)" })bar/

    That matches "bar" if it is not preceded at any point by "foo". I know
    that can be achieved by reverse + negative look-ahead but this has to
    be considered an exercise (that comes from a question in freenode#perl
    that required a single regexp to solve the problem).

    I tried a few tricks to extend that to patterns as

    /^.*(??{ $& =~ m,fo+, ? "(?!)" : "" })bar/

    or

    /^.*(?{ local $m = $& })(??{ $m =~ m,fo+, ? "(?!)" : "" })bar/

    and the like, but I got segmentation faults (maybe that hits some
    corner of the documented experimentalness of that stuff). Looks like
    the problem comes from nesting m//, but that's a guess.

    Any comments?

    -- fxn
     
    Xavier Noria, Sep 13, 2003
    #1
    1. Advertising

  2. On 13 Sep 2003, Xavier Noria wrote:

    >I was playing with variable-width negative look-behind emulation and
    >worked out this regexp that supports assertions about substrings of
    >the pre-match:
    >
    > /^.*(??{ index($&,"foo") == -1 ? "" : "(?!)" })bar/
    >
    >That matches "bar" if it is not preceded at any point by "foo". I know
    >that can be achieved by reverse + negative look-ahead but this has to
    >be considered an exercise (that comes from a question in freenode#perl
    >that required a single regexp to solve the problem).


    You could use

    /^(?:[^f]*|f+(?!f|oo))*bar/;

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
     
    Jeff 'japhy' Pinyan, Sep 14, 2003
    #2
    1. Advertising

  3. Xavier Noria

    Xavier Noria Guest

    Jeff 'japhy' Pinyan <> wrote in message news:<>...
    > On 13 Sep 2003, Xavier Noria wrote:
    >
    > >I was playing with variable-width negative look-behind emulation and
    > >worked out this regexp that supports assertions about substrings of
    > >the pre-match:
    > >
    > > /^.*(??{ index($&,"foo") == -1 ? "" : "(?!)" })bar/
    > >
    > >That matches "bar" if it is not preceded at any point by "foo". I know
    > >that can be achieved by reverse + negative look-ahead but this has to
    > >be considered an exercise (that comes from a question in freenode#perl
    > >that required a single regexp to solve the problem).

    >
    > You could use
    >
    > /^(?:[^f]*|f+(?!f|oo))*bar/;


    Ah, better, thank you.

    Maybe we could take benefit of atomic grouping here? Like this

    /^ (?> [^f] | f(?!oo) )*? bar/x

    or even better maybe this way

    /^(?> (?!foo). )*? bar/x

    which has been suggested by Iain Truskett right now on freenode#regex.

    -- fxn
     
    Xavier Noria, Sep 14, 2003
    #3
  4. On 14 Sep 2003, Xavier Noria wrote:

    >> > /^.*(??{ index($&,"foo") == -1 ? "" : "(?!)" })bar/

    >>
    >> /^(?:[^f]*|f+(?!f|oo))*bar/;

    >
    >Ah, better, thank you.
    >
    >Maybe we could take benefit of atomic grouping here? Like this
    >
    > /^ (?> [^f] | f(?!oo) )*? bar/x


    I'd rather see [^f]+. (That should have been [^f]+ in my regex.)

    >or even better maybe this way
    >
    > /^(?> (?!foo). )*? bar/x


    That crawls one character at a time, and the (?>) shouldn't be needed.

    --
    Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
     
    Jeff 'japhy' Pinyan, Sep 14, 2003
    #4
  5. Xavier Noria

    Xavier Noria Guest

    Jeff 'japhy' Pinyan <> wrote in message news:<>...
    > On 14 Sep 2003, Xavier Noria wrote:
    >
    > >> > /^.*(??{ index($&,"foo") == -1 ? "" : "(?!)" })bar/
    > >>
    > >> /^(?:[^f]*|f+(?!f|oo))*bar/;

    > >
    > >Ah, better, thank you.
    > >
    > >Maybe we could take benefit of atomic grouping here? Like this
    > >
    > > /^ (?> [^f] | f(?!oo) )*? bar/x

    >
    > I'd rather see [^f]+. (That should have been [^f]+ in my regex.)


    The problem there is that it seems we loose the states corresponding
    to [^f]+, which in spite of the outer *? eats too much and would need
    to backtrack:

    % perl -wle 'print 1 if "xbar" =~ /^(?>[^f]+|f(?!oo))*?bar/'
    % perl -wle 'print 1 if "xbar" =~ /^(?>[^f]|f(?!oo))*?bar/'
    1

    But [^f]+ is fine if we don't use atomic grouping:

    % perl -wle 'print 1 if "xbar" =~ /^(?:[^f]+|f(?!oo))*?bar/'
    1

    > >or even better maybe this way
    > >
    > > /^(?> (?!foo). )*? bar/x

    >
    > That crawls one character at a time, and the (?>) shouldn't be needed.


    Yeah, it's there just to indicate to the engine it can forget about
    the states. We either match in one shot or fail.

    -- fxn
     
    Xavier Noria, Sep 14, 2003
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Bhargava

    Negative look-behind

    Bhargava, Jun 1, 2004, in forum: Python
    Replies:
    4
    Views:
    510
    Paul McGuire
    Jun 7, 2004
  2. inhahe
    Replies:
    3
    Views:
    2,503
    Diez B. Roggisch
    Jan 28, 2005
  3. Ruby Nuby

    Regex negative look-behind bug?

    Ruby Nuby, Nov 23, 2010, in forum: Ruby
    Replies:
    12
    Views:
    283
    Ammar Ali
    Nov 29, 2010
  4. Replies:
    1
    Views:
    103
    Dr.Ruud
    Nov 7, 2007
  5. Replies:
    3
    Views:
    222
Loading...

Share This Page