Ruby Regexp vs Perl and C#

Discussion in 'Ruby' started by Chris Meyers, Oct 13, 2006.

  1. Chris Meyers

    Chris Meyers Guest

    I am having trouble with the Ruby regexp engine and don't know if it is
    my lack of experience with Ruby, or if it just isn't possible to do
    certain things with the Ruby engine. Basically I have the following
    code in perl and C#, simplified for the example:

    Perl:
    $line = "banana";
    while ( $line =~ /(an)*/g )
    {
        print $` . "<<" . $& . ">>" . $' . "\n";
    }

    C#:
    using System;
    using System.Text.RegularExpressions;

    public class MyClass
    {
        public static void Main()
        {
            string testString = "banana";
            MatchCollection matches = Regex.Matches(testString, "(an)*");
            foreach (Match match in matches)
            {
                Console.WriteLine(
                    testString.Substring(0, match.Index) +
                    "<<" + match.Value + ">>" +
                    testString.Substring(match.Index + match.Length));
            }
        }
    }

    Output from both:
    <<>>banana
    b<<anan>>a
    banan<<>>a
    banana<<>>

    While the C# way of getting the answer is a bit messier, it is still
    possible. I am wondering if/how to do this in Ruby. From what I have
    seen, all the Ruby options return a single match, which doesn't help me,
    especially with the given example, where Ruby's only match would be the
    empty one at the start of the string. Using String#scan in Ruby also
    doesn't help, as it gives a two-dimensional array like the following:
    Ruby:
    irb(main):001:0> line = "banana"
    => "banana"
    irb(main):002:0> line.scan(/(an)*/)
    => [[nil], ["an"], [nil], [nil]]

    So while Ruby finds that there should be 4 matches, it returns them in a
    difficult-to-use format, with no index information, and reports only
    "an" rather than "anan" as I want.

    So basically is there a way to do this in Ruby? Which can also be asked
    as, how do you do multiple matches in Ruby with full match information?

    Thanks,
    Chris

    --
    Posted via http://www.ruby-forum.com/.
     
    Chris Meyers, Oct 13, 2006
    #1

  2. Chris Meyers

    matt neuburg Guest

    matt neuburg, Oct 13, 2006
    #2

  3. Chris Meyers

    matt neuburg Guest

    matt neuburg, Oct 13, 2006
    #3
  4. On Oct 12, 2006, at 7:00 PM, Chris Meyers wrote:

    > I am having trouble with the Ruby regexp engine and don't know if
    > it is
    > my lack of experience with Ruby, or if it just isn't possible to do
    > certain things with the Ruby engine. Basically I have the following
    > code in perl and C#, simplified for the example:
    >
    > Perl:
    > $line = "banana";
    > while( $line =~ /(an)*/g )
    > {
    > print $`. "<<" . $& . ">>" . $' . "\n";
    > }


    scan() can also take a block:

    "banana".scan(/(an)*/) { puts "#{$`}<<#{$&}>>#{$'}" }

    James Edward Gray II
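
For the index information asked about in the original question, the block form can also hand back the MatchData object for each match. What follows is only a minimal sketch: the helper name `all_matches` is invented for this example, and it relies on `Regexp.last_match` being set inside the scan block.

```ruby
# Hypothetical helper (not from the thread): collects a MatchData object
# for every match, so offsets and captures stay available after the scan.
def all_matches(line, pattern)
  matches = []
  line.scan(pattern) { matches << Regexp.last_match }
  matches
end

all_matches("banana", /(an)*/).each do |m|
  # m.begin(0) is the start offset of the whole match, m[0] is its text.
  puts "#{m.begin(0)}: #{m.pre_match}<<#{m[0]}>>#{m.post_match}"
end
```

Each MatchData also answers `m.end(0)` and `m[1]`, which is roughly what Match.Index, Match.Length and the groups give you in the C# version.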
     
    James Edward Gray II, Oct 13, 2006
    #4
  5. Chris Meyers

    Verno Miller Guest

    > Chris Meyers wrote:
    > ...
    > ... and [Ruby] matches only "an" rather than "anan" as I want.
    >
    > So basically is there a way to do this in Ruby? Which can also be asked
    > as, how do you do multiple matches in Ruby with full match information?
    >
    > Thanks,
    > Chris



    It's a bit tricky but still doable!

    line = "banana"

    If your goal is to match just (an)* in "banana" use:

    line.scan(/((an)*)/) { |str|
      #puts $1.inspect
      puts $1 if $1 != ""
    }
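
The extra outer parentheses matter because scan reports the capture groups rather than the whole match whenever the pattern contains any, and a repeated group such as (an)* keeps only its last repetition. As a sketch of an alternative, assuming only the non-empty runs are wanted (the helper name `an_runs` is made up here), a non-capturing group lets scan return the whole matches directly:

```ruby
# Hypothetical helper (not from the thread): returns each non-empty run
# of "an" as a plain string. (?:...) groups without capturing, so scan
# yields whole matches instead of arrays of captures.
def an_runs(line)
  line.scan(/(?:an)*/).reject(&:empty?)
end

p "banana".scan(/(an)*/)    # => [[nil], ["an"], [nil], [nil]]
p "banana".scan(/((an)*)/)  # => [["", nil], ["anan", "an"], ["", nil], ["", nil]]
p an_runs("banana")         # => ["anan"]
```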


    To match surrounding characters as well use:

    line.scan(/([^a]|a(?=[^n]|$)|(an)*)/) { |str|
      puts $1.inspect unless $1.empty?
    }


    Additional characters can be inserted this way:

    line = "bananaxyzantuia"

    str = ""

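    line.scan(/([^a]|a(?=[^n]|$)|(an)*)/) {
      # Copy $1 first: the =~ tests below overwrite the match globals.
      match = $1

      # Show each piece as it is classified.
      puts match if match =~ /^[^a]$/
      puts match if match =~ /^a$/
      puts match if match =~ /^(an)+$/

      # Rebuild the string, wrapping only the runs of "an".
      if    match =~ /^[^a]$/   then str << match
      elsif match =~ /^a$/      then str << match
      elsif match =~ /^(an)+$/  then str << "<<" << match << ">>"
      end
    }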

    puts str #=> b<<anan>>axyz<<an>>tuia


    Instead of if-elsif-end you can also use case-when-end of course (cf.
    http://www.bigbold.com/snippets/posts/show/1313 ).
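
If the marked-up string is all that is needed, a shorter route may be String#gsub with a block. This is a sketch under the assumption that runs of "an" are the only thing to wrap; the method name `mark_an_runs` is invented for the example.

```ruby
# Hypothetical helper (not from the thread): wraps every run of "an"
# in << >> and leaves all other characters untouched.
def mark_an_runs(line)
  # (?:an)+ requires at least one "an", so empty matches never occur.
  line.gsub(/(?:an)+/) { |run| "<<#{run}>>" }
end

puts mark_an_runs("bananaxyzantuia")  #=> b<<anan>>axyz<<an>>tuia
```

Unlike the scan version, this does not classify the other characters, so it only fits when they need no special handling.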


    Cheers,
    verno


    Verno Miller, Oct 15, 2006
    #5
  6. Chris Meyers

    Chris Meyers Guest

    Verno Miller wrote:
    >
    > It's a bit tricky but still doable!
    >
    > line = "banana"
    >
    > If your goal is to match just (an)* in "banana" use:
    >
    > line.scan(/((an)*)/) { |str|
    > #puts $1.inspect
    > puts $1 if $1 != ""
    > }


    Thank you Verno, that is what I needed. By using the block syntax I am
    able to get the same output as Perl or C# and I can adapt that to my
    problem at hand. For anyone interested, to recreate the same output in
    Ruby as my Perl and C# example the following code works:

    line="banana"
    line.scan(/(an)*/) {
    puts $` + "<<" + $& + ">>" + $' + "\n"
    }

    Thanks again, Chris

    Chris Meyers, Oct 16, 2006
    #6
  7. On 10/16/06, Chris Meyers wrote:
    >
    > line="banana"
    > line.scan(/(an)*/) {
    > puts $` + "<<" + $& + ">>" + $' + "\n"
    > }


    Also, if you do a

    require 'English'

    you can write that as

    puts $PREMATCH + "<<" + $MATCH + ">>" + $POSTMATCH

    (you don't need the explicit \n if you're using puts)
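
Put together, that could look like the sketch below. The helper name `marked_lines` is made up for illustration; it returns the lines instead of printing them so the result can be reused.

```ruby
require 'English'

# Hypothetical helper (not from the thread): builds one marked-up line
# per match, using the readable aliases from the English library.
def marked_lines(line, pattern)
  lines = []
  line.scan(pattern) { lines << "#{$PREMATCH}<<#{$MATCH}>>#{$POSTMATCH}" }
  lines
end

puts marked_lines("banana", /(an)*/)
```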

    martin
     
    Martin DeMello, Oct 17, 2006
    #7