Regex in Ruby question

Discussion in 'Ruby' started by J. Cooper, Feb 22, 2008.

  1. J. Cooper

    J. Cooper Guest

    Another newbie question:

    I've just started learning about regex "officially" (reading the
    O'Reilly book) and was trying out some of the examples (porting them to
    Ruby from Perl). From what I can tell, Ruby's regex engine doesn't
    support "lookbehind", and as I had only a smattering of knowledge about
    regex before and no prior experience with lookaround, I was curious if
    this was significant.

    i.e. is there any problem that can't be solved without lookbehind?

    (Note that I am not facing such a problem; I'm just curious if
    lookbehind is just an optional feature that makes certain problems
    easier, as opposed to being an essential thing.)

    Thanks!
    --
    Posted via http://www.ruby-forum.com/.
     
    J. Cooper, Feb 22, 2008
    #1

  2. ThoML

    ThoML Guest

    > From what I can tell, Ruby's regex engine doesn't
    > support "lookbehind"


    Ruby 1.9 has look-behind:

    (?<=subexp) look-behind
    (?<!subexp) negative look-behind

    With Ruby 1.8, you can install Oniguruma.
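
    A minimal sketch of the two forms listed above, assuming Ruby 1.9 or
    later (or 1.8 with Oniguruma available). The sample string and the
    variable names are only for illustration:

```ruby
# Assumes a regex engine with look-behind support (Ruby 1.9+).
s = 'foobar'

# Positive look-behind: "bar" only when "foo" comes right before it.
pos_match = s.match(/(?<=foo)bar/)
p pos_match[0]        # => "bar" (the "foo" prefix is not part of the match)
p pos_match.begin(0)  # => 3

# Negative look-behind: "bar" only when "foo" does NOT come right before it.
neg_match = s.match(/(?<!foo)bar/)
p neg_match           # => nil
```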

    Regards,
    Thomas.
     
    ThoML, Feb 22, 2008
    #2

  3. Sebastian Hungerecker

    J. Cooper wrote:
    > i.e. is there any problem that can't be solved without lookbehind?


    Yes, there are. For example: you want to match any occurrence of "bar" except
    if it is preceded by "foo". I.e. you'd want to match "blabar" or "oofbar",
    but not "foobar". You can't do that without negative lookbehind.
    It might be interesting to note though, that any such problem could also not
    be solved by a regular grammar, so "regular" expressions that need lookbehind
    aren't, as such, regular anymore.
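
    The "bar but not after foo" example can be tried directly with a
    negative look-behind. This is only a sketch, assuming Ruby 1.9 or later;
    the word list is taken from the example above:

```ruby
# Assumes a regex engine with look-behind support (Ruby 1.9+).
words = %w[blabar oofbar foobar]

# Keep only the words that contain a "bar" not directly preceded by "foo".
matched = words.select { |w| w =~ /(?<!foo)bar/ }
p matched  # => ["blabar", "oofbar"]
```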

    HTH,
    Sebastian
    --
    Jabber:
    ICQ: 205544826
     
    Sebastian Hungerecker, Feb 22, 2008
    #3
  4. ThoML

    ThoML Guest

    > Yes, there are. For example: you want to match any occurrence of "bar" except
    > if it is preceded by "foo". I.e. you'd want to match "blabar" or "oofbar",
    > but not "foobar".


    I think it's important to state that the look-behind matches with zero
    width, i.e. the text it matches isn't included in the overall match.

    If it's okay to include the prefix in the match (e.g., in a gsub, the
    prefix could then be referenced as a group), this could also be
    achieved without lookbehind:

    require 'strscan'
    #                      0         1         2         3
    #                      0123456789012345678901234567890123
    s = StringScanner.new('blabar oofbar foobar ofobar offbar')
    #                            ^      ^             ^      ^
    until s.eos?
      m = s.scan_until(/([^o]|[^o]o|[^f]oo)(bar)/)
      p s.pos
    end

    # =>
    6
    13
    27
    34

    Pos 20 (the end of "foobar") is missing, as intended.
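
    For the gsub case mentioned above, here is a rough sketch of putting the
    consumed prefix back via a group reference. The replacement text 'BAZ'
    is made up for illustration, and the look-behind line assumes Ruby 1.9
    or later:

```ruby
s = 'blabar oofbar foobar ofobar offbar'

# Without look-behind: the prefix is consumed by group 1, so put it back via \1.
with_group = s.gsub(/([^o]|[^o]o|[^f]oo)(bar)/, '\1BAZ')

# With look-behind (Ruby 1.9+): nothing in front of "bar" is consumed.
with_lookbehind = s.gsub(/(?<!foo)bar/, 'BAZ')

p with_group                     # => "blaBAZ oofBAZ foobar ofoBAZ offBAZ"
p with_group == with_lookbehind  # => true
```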

    There are of course situations when this isn't possible.

    Regards,
    Thomas.
     
    ThoML, Feb 22, 2008
    #4
  5. Mark Bush

    Mark Bush Guest

    ThoML wrote:
    > There are of course situations when this isn't possible.


    require 'strscan'
    #                      0         1         2         3
    #                      0123456789012345678901234567890123456789
    s = StringScanner.new('obar blabar oofbar foobar ofobar offbar')
    #                                 ^      ^             ^      ^
    until s.eos?
      m = s.scan_until(/([^o]|[^o]o|[^f]oo)(bar)/)
      p s.pos
    end

    # =>
    11
    18
    32
    39

    :-(
    However, with:
    m = s.scan_until(/(([^o]|^)|([^o]|^)o|([^f]|^)oo)(bar)/)

    =>
    4
    11
    18
    32
    39

    :)
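
    A small comparison of the two patterns above against a plain negative
    look-behind, as a sketch only. The sample words are made up for
    illustration, it uses String#=~ rather than StringScanner, and the
    look-behind pattern assumes Ruby 1.9 or later:

```ruby
without_anchor = /([^o]|[^o]o|[^f]oo)(bar)/
with_anchor    = /(([^o]|^)|([^o]|^)o|([^f]|^)oo)(bar)/
lookbehind     = /(?<!foo)bar/   # assumes Ruby 1.9+

# Words where "bar" sits at or near the start are the tricky cases.
samples = %w[bar obar oobar blabar foobar xfoobar]

# Collect the words each pattern accepts.
plain    = samples.select { |w| w =~ without_anchor }
anchored = samples.select { |w| w =~ with_anchor }
behind   = samples.select { |w| w =~ lookbehind }

p plain     # => ["blabar"]
p anchored  # => ["bar", "obar", "oobar", "blabar"]
p behind    # => ["bar", "obar", "oobar", "blabar"]
```

    Note that ^ in Ruby matches at the start of any line, not only at the
    start of the string, so multi-line input may behave differently.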
    --
    Posted via http://www.ruby-forum.com/.
     
    Mark Bush, Feb 22, 2008
    #5
  6. ThoML

    ThoML Guest

    > However, with:
    > m = s.scan_until(/(([^o]|^)|([^o]|^)o|([^f]|^)oo)(bar)/)


    Oh well. It's probably a good thing we have look-behind now. :)

    Regards,
    Thomas.
     
    ThoML, Feb 22, 2008
    #6
