James Edward Gray II
I keep running into surprises with Ruby's Regexp engine today, and
this first one just looks plain wrong to me:
irb(main):001:0> html = "<p>one</p>\n\n<p>two</p>"
=> "<p>one</p>\n\n<p>two</p>"
irb(main):002:0> html.sub!(/<p>(.*?)<\/p>(.*)/) { $1.strip }
=> "one\n\n<p>two</p>"
irb(main):003:0> $2
=> ""
Can anyone explain to me how that isn't a bug?
Here's another surprise, for me:
irb(main):001:0> html = "<p>one</p>\n\n<p>two</p>"
=> "<p>one</p>\n\n<p>two</p>"
irb(main):002:0> html.sub!(/<p>(.*?)<\/p>(.*)\Z/) { $1.strip }
=> "<p>one</p>\n\ntwo"
Using an anchor there means that the left-most match doesn't win?
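In case it helps anyone reproduce this outside irb, here is a minimal script covering both cases. The /m variants are only my guess at what might matter (as far as I know, "." does not match a newline unless /m is given), and the first_para helper is just a name I made up for this sketch:

```ruby
# Reproduces the two irb sessions above, plus /m variants for comparison.
# Assumption: the surprise may come from "." not matching "\n" without /m.
# `first_para` is a hypothetical helper, not part of any library.
HTML = "<p>one</p>\n\n<p>two</p>"

# Returns [string after substitution, contents of the second capture group].
def first_para(pattern)
  match    = pattern.match(HTML)
  replaced = HTML.sub(pattern) { $1.strip }
  [replaced, match[2]]
end

no_anchor     = first_para(/<p>(.*?)<\/p>(.*)/)
with_anchor   = first_para(/<p>(.*?)<\/p>(.*)\Z/)
no_anchor_m   = first_para(/<p>(.*?)<\/p>(.*)/m)
with_anchor_m = first_para(/<p>(.*?)<\/p>(.*)\Z/m)

p no_anchor      # ["one\n\n<p>two</p>", ""]
p with_anchor    # ["<p>one</p>\n\ntwo", ""]
p no_anchor_m    # ["one", "\n\n<p>two</p>"]
p with_anchor_m  # ["one", "\n\n<p>two</p>"]
```

If that guess is right, then without /m the second group stops at the first newline, and the anchored pattern simply cannot match at the first paragraph at all, so the engine moves on to the next position where the whole pattern fits.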
Here's my Ruby version:
$ ruby -v
ruby 1.8.2 (2004-12-25) [powerpc-darwin7.7.0]
Thanks for any wisdom you can impart.
James Edward Gray II