RegEx: Is there such a thing as "non-greedy backwards"?

Discussion in 'Java' started by mrclean_ii@hotmail.com, Jan 19, 2005.

  1. Guest

    Let me explain: If I have a text like
    "...target1.....target2...target2..." the pattern
    "target1[\s\S]*target2" will match from "target1" to the LAST
    "target2". If we slightly change the pattern to
    "target1[\s\S]*?target2" the expression becomes non-greedy and the
    pattern will match from "target1" to the FIRST "target2".

    Now suppose the text is "...target1.....target1...target2..." and I
    want to match from the LAST "target1" to the "target2" (what I would
    call "non-greedy backwards"). Any help would be appreciated, thank you!
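
    For anyone who wants to reproduce this, here is a minimal Java sketch of
    the three cases described above. The class name, the firstMatch helper
    and the sample strings are invented for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Returns the first match of regex in text, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String twoEnds = "..target1..A..target2..B..target2..";
        // Greedy: runs to the LAST "target2".
        System.out.println(firstMatch("target1[\\s\\S]*target2", twoEnds));
        // prints target1..A..target2..B..target2

        // Reluctant: stops at the FIRST "target2".
        System.out.println(firstMatch("target1[\\s\\S]*?target2", twoEnds));
        // prints target1..A..target2

        String twoStarts = "..target1..A..target1..B..target2..";
        // Reluctant does not help here: the match still begins at the
        // FIRST "target1", because the engine reports the leftmost match.
        System.out.println(firstMatch("target1[\\s\\S]*?target2", twoStarts));
        // prints target1..A..target1..B..target2
    }
}
```

    The last case is the one the question is about: the "?" only changes
    where the match ends, not where it starts.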
     
    , Jan 19, 2005
    #1

  2. Alan Moore Guest

    There's no built-in mechanism that does this, but you can do it
    yourself like this:

    Pattern p = Pattern.compile("target1(?:[^t]++|t(?!arget1))*+target2");

    In other words, after matching the first token, you look for any
    character that's not the first letter of the token, OR that letter as
    long as it's not followed by the rest of the token. It's important to
    use the possessive quantifiers ("++" and "*+"), because the regex could
    be prohibitively slow without them.
     
    Alan Moore, Jan 23, 2005
    #2

  3. Guest

    Thanks!

    It seems to work on very small strings for me without the possessive
    quantifiers. I need to use it in VB(Script), where there is no Compile
    method and the possessive quantifiers cause runtime errors. I
    posted the question here because the regex syntax is pretty similar and I
    didn't find anything else for VBScript. Is there a workaround?
     
    , Jan 24, 2005
    #3
  4. Alan Moore Guest

    First, a correction. The regex that I posted wouldn't have worked
    anyway (the middle part would gobble up the second token and never give
    it back). That technique only works if the first token and the second
    token are the same or, with a small modification, if they start with
    the same letter. In the case of your example, with tokens of "target1"
    and "target2", the regex would be

    "target1(?:[^t]++|t(?!arget[12]))*+target2"
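
    As a quick check, this is roughly how that pattern could be tried out in
    Java. The class name, helper method and test strings are made up for the
    example; only the regex itself comes from the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastTarget1 {
    // After "target1", consume runs of non-'t' characters, or a single 't'
    // that does not begin either token, then require "target2".
    static final Pattern P =
        Pattern.compile("target1(?:[^t]++|t(?!arget[12]))*+target2");

    // Returns the first match in text, or null if there is none.
    static String firstMatch(String text) {
        Matcher m = P.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The attempt starting at the first "target1" fails when it reaches
        // the second one, so the match begins at the LAST "target1".
        System.out.println(firstMatch("..target1..A..target1..B..target2.."));
        // prints target1..B..target2

        // Other 't' characters between the tokens are still allowed.
        System.out.println(firstMatch("target1 to the target2"));
        // prints target1 to the target2
    }
}
```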

    For completely different tokens, a more elaborate regex is needed:

    "foo(?:[^fb]++|f(?!oo)|b(?!ar))*+bar"

    If you don't have possessive quantifiers, try this version:

    "foo(?:[^fb]|f(?!oo)|b(?!ar))*bar"

    It's much less efficient because "[^fb]" matches only one character
    each time through the loop, but it will probably be fast enough.
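
    A small Java sketch comparing the two foo/bar versions side by side. The
    class name, helper and sample input are assumptions for illustration; the
    second pattern avoids "++" and "*+" and so should also suit engines
    without possessive quantifiers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastFooToBar {
    // Version with possessive quantifiers (supported by java.util.regex).
    static final Pattern POSSESSIVE =
        Pattern.compile("foo(?:[^fb]++|f(?!oo)|b(?!ar))*+bar");

    // Version without possessive quantifiers: one character per iteration.
    static final Pattern PLAIN =
        Pattern.compile("foo(?:[^fb]|f(?!oo)|b(?!ar))*bar");

    // Returns the first match of p in text, or null if there is none.
    static String firstMatch(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "foo 1 foo 2 bar 3 bar";
        // Both versions start at the LAST "foo" before the first "bar".
        System.out.println(firstMatch(POSSESSIVE, text)); // prints foo 2 bar
        System.out.println(firstMatch(PLAIN, text));      // prints foo 2 bar
    }
}
```

    Both patterns accept the same strings; the difference is only in how
    much backtracking the engine may do along the way.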
     
    Alan Moore, Jan 25, 2005
    #4
  5. Guest

    Thanks Alan, it works like a charm. Even the less efficient one is fast
    on long strings. You guessed it, I really was looking for the foo/bar
    pattern. It looks pretty complicated and I think that the non-greedy
    specifier ? should work in both directions (see the first message in
    thread).

    Thanks again!
     
    , Jan 25, 2005
    #5
