Regexp: Rubular vs. match. Why is the result different?

Discussion in 'Ruby' started by Ale Ds, Oct 28, 2009.

  1. Ale Ds

    Ale Ds Guest

    I have to capture, by means of a regexp, the content between '<' and '>'.

    For instance:

    str = 'anystring<hour>anystring<min>anystring<sec>anystring'
    I need the array ['hour', 'min', 'sec'].

    I have written the regexp /(<([^<>]+)>)+/
    and tested it on the rubular.com site (it works!).

    I have run it in irb:
    >> /(<([^<>]+)>)+/.match('anystring<hour>anystring<min>anystring<sec>anystring')

    => #<MatchData "<hour>" 1:"<hour>" 2:"hour">
    >>

    As you can see, the match method returns just the first match in the MatchData object.

    Do you know why?

    thank you,
    Alessandro
    --
    Posted via http://www.ruby-forum.com/.
     
    Ale Ds, Oct 28, 2009
    #1

  2. 2009/10/28 Ale Ds <>:
    > I have to capture, by means of a regexp, the content between '<' and '>'.
    >
    > For instance:
    >
    > str = 'anystring<hour>anystring<min>anystring<sec>anystring'
    > I need the array ['hour', 'min', 'sec'].
    >
    > I have written the regexp /(<([^<>]+)>)+/
    > and tested it on the rubular.com site (it works!).


    The "+" at the end is superfluous because this would match multiple
    concatenated sequences like <xx><yyy> which you want as separate
    items.
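
To make the point about the trailing "+" concrete, here is a minimal sketch. The sample string `'a<xx><yyy>b'` and the variable names are assumptions chosen for illustration, not taken from the original post. With the "+", one match swallows the whole run of adjacent tags and the repeated group keeps only its last iteration; without it, each tag is found separately.

```ruby
# Sketch: how the trailing "+" changes what a single match covers.
with_plus    = /(<([^<>]+)>)+/
without_plus = /<([^<>]+)>/

# Hypothetical input with two adjacent tags.
adjacent = 'a<xx><yyy>b'

m = with_plus.match(adjacent)
whole_match  = m[0]  # the whole run of adjacent tags
last_capture = m[2]  # a repeated group keeps only its last iteration

# Without the "+", scan finds each tag as a separate item.
separate = adjacent.scan(without_plus).flatten

p whole_match   # "<xx><yyy>"
p last_capture  # "yyy"
p separate      # ["xx", "yyy"]
```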

    > I have run it in irb:
    > >> /(<([^<>]+)>)+/.match('anystring<hour>anystring<min>anystring<sec>anystring')
    >
    > => #<MatchData "<hour>" 1:"<hour>" 2:"hour">
    > >>
    >
    > As you can see, the match method returns just the first match in the MatchData object.
    >
    > Do you know why?


    That's the difference between #match and #scan: #match stops at the first match, while #scan collects every match in the string. You want scan in your code.

    irb(main):001:0> str = 'anystring<hour>anystring<min>anystring<sec>anystring'
    => "anystring<hour>anystring<min>anystring<sec>anystring"
    irb(main):002:0> str.scan(/<([^>]+)>/)
    => [["hour"], ["min"], ["sec"]]
    irb(main):003:0> str.scan(/<([^>]+)>/) do |m| p m end
    ["hour"]
    ["min"]
    ["sec"]
    => "anystring<hour>anystring<min>anystring<sec>anystring"
    irb(main):004:0> str.scan(/<([^>]+)>/) do |m,| p m end
    "hour"
    "min"
    "sec"
    => "anystring<hour>anystring<min>anystring<sec>anystring"
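
The scan results above are nested (one inner array per match), while the original question asks for a flat array. A minimal sketch of two ways to get there, assuming the same `str`; the variable names are illustrative only. Either `flatten` the nested result, or drop the capture group and use lookbehind/lookahead so that scan returns the matched text directly.

```ruby
str = 'anystring<hour>anystring<min>anystring<sec>anystring'

# With one capture group, scan returns an array of one-element arrays.
nested = str.scan(/<([^<>]+)>/)

# flatten collapses it into the flat array asked for in the first post.
names = nested.flatten

# Without a capture group, scan returns the matched text itself;
# the lookbehind/lookahead keep the angle brackets out of the match.
names_lookaround = str.scan(/(?<=<)[^<>]+(?=>)/)

p nested            # [["hour"], ["min"], ["sec"]]
p names             # ["hour", "min", "sec"]
p names_lookaround  # ["hour", "min", "sec"]
```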

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Oct 28, 2009
    #2

  3. Chris Shea

    Chris Shea Guest

    On Oct 28, 10:12 am, Ale Ds <> wrote:
    > I have to capture, by means of a regexp, the content between '<' and '>'.
    >
    > For instance:
    >
    > str = 'anystring<hour>anystring<min>anystring<sec>anystring'
    > I need the array ['hour', 'min', 'sec'].
    >
    > I have written the regexp /(<([^<>]+)>)+/
    > and tested it on the rubular.com site (it works!).
    >
    > I have run it in irb:
    > >> /(<([^<>]+)>)+/.match('anystring<hour>anystring<min>anystring<sec>anystring')
    >
    > => #<MatchData "<hour>" 1:"<hour>" 2:"hour">
    >
    > As you can see, the match method returns just the first match in the MatchData object.
    >
    > Do you know why?
    >
    > thank you,
    > Alessandro
    > --
    > Posted via http://www.ruby-forum.com/.


    Alessandro,

    You'll want the String#scan method (http://www.ruby-doc.org/core/classes/String.html#M000812).

    015:0> regexp = /<([^<>]+)>/
    => /<([^<>]+)>/
    016:0> str = 'anystring<hour>anystring<min>anystring<sec>anystring'
    => "anystring<hour>anystring<min>anystring<sec>anystring"
    017:0> str.scan(regexp)
    => [["hour"], ["min"], ["sec"]]
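
For comparison, here is a rough sketch of what it takes to get every match out of Regexp#match alone, which shows why a single call returns only the first one. It relies on the optional start-position argument of Regexp#match; the loop and the variable names are illustrative assumptions, not something from the thread. One side benefit is that each result is a full MatchData object, so offsets are available as well.

```ruby
regexp = /<([^<>]+)>/
str = 'anystring<hour>anystring<min>anystring<sec>anystring'

matches = []
pos = 0
# Regexp#match(str, pos) starts searching at character offset pos
# and returns nil once nothing more is found.
while (m = regexp.match(str, pos))
  matches << m
  pos = m.end(0)  # resume just past the previous match
end

captures = matches.map { |md| md[1] }
offsets  = matches.map { |md| md.begin(0) }

p captures  # ["hour", "min", "sec"]
p offsets   # [9, 24, 38]
```

String#scan does this stepping internally, which is why it is the simpler choice here.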

    HTH,
    Chris
     
    Chris Shea, Oct 28, 2009
    #3
  4. Ale Ds

    Ale Ds Guest

    > The "+" at the end is superfluous because this would match multiple
    > concatenated sequences like <xx><yyy> which you want as separate
    > items.

    ...
    I agree with you

    > > I have run it in irb:
    > > >> /(<([^<>]+)>)+/.match('anystring<hour>anystring<min>anystring<sec>anystring')
    > >
    > > => #<MatchData "<hour>" 1:"<hour>" 2:"hour">
    > > >>
    > >
    > > As you can see, the match method returns just the first match in the MatchData object.
    > >
    > > Do you know why?
    >
    > That's the difference between #match and #scan. You want scan in your
    > code.

    ...

    Yes, scan works!
    Thanks a lot,
    Alessandro
    --
    Posted via http://www.ruby-forum.com/.
     
    Ale Ds, Oct 28, 2009
    #4