#scan with or'd (`|`) subexpressions.

trans. (T. Onoma)

Does the new Ruby regexp engine do this?

irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
=> [["1", "2", nil, nil], [nil, nil, "3", "4"]]
irb(main):002:0>

Why would all the subexpressions be listed when there is an `|` (or) used? I
expected:

=> [["1", "2"], ["3", "4"]]

T.
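
For reference, a minimal sketch of the behaviour described above, plus one way to get the expected shape after the fact. It assumes you only want the groups that took part in each match and do not need to know which alternative matched; `raw` and `compacted` are illustrative names.

```ruby
# scan returns one slot per capture group in the whole pattern,
# so groups from the alternative that did not match show up as nil.
raw = '1234'.scan(/(1)(2)|(3)(4)/)
# raw => [["1", "2", nil, nil], [nil, nil, "3", "4"]]

# Dropping the nils gives the shape asked for in the question,
# at the cost of losing the information about which branch matched.
compacted = raw.map(&:compact)
# compacted => [["1", "2"], ["3", "4"]]
```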
 
Yukihiro Matsumoto

Hi,

In message "Re: #scan with or'd (`|`) subexpressions."

|Does the new Ruby regexp engine do this?
|
| irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
| => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
| irb(main):002:0>
|
|Why would all the subexpressions be listed when there is an `|` (or) used? I
|expected:
|
| => [["1", "2"], ["3", "4"]]

You will never know which subexpression is matched, if you get your
expected result. Is there any reason /(1|3)(2|4)/ is not sufficient?

matz.
 
Peter

| |Does the new Ruby regexp engine do this?
| |
| | irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
| | => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
| | irb(main):002:0>
| |
| |Why would all the subexpressions be listed when there is an `|` (or) used? I
| |expected:
| |
| | => [["1", "2"], ["3", "4"]]
|
| You will never know which subexpression is matched, if you get your
| expected result. Is there any reason /(1|3)(2|4)/ is not sufficient?

This matches 14 and 32 too. /(1(?=2)|3(?=4))(2|4)/ is better but more
complex and generally hard to do.

Peter
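
A small sketch comparing the two patterns discussed here. The lookahead pattern is written with balanced parentheses; the test string and the variable names are made up for illustration.

```ruby
loose  = /(1|3)(2|4)/             # the merged form suggested earlier in the thread
strict = /(1(?=2)|3(?=4))(2|4)/   # lookahead ties each first digit to its partner

text = '14 32 12 34'

loose_hits  = text.scan(loose)
# loose_hits  => [["1", "4"], ["3", "2"], ["1", "2"], ["3", "4"]]

strict_hits = text.scan(strict)
# strict_hits => [["1", "2"], ["3", "4"]]
```

The lookahead version rejects the mixed pairs, but each alternative has to repeat part of what follows it, which is why it gets unwieldy for larger patterns.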
 
trans. (T. Onoma)

Hi Matz,

On Thursday 11 November 2004 11:52 am, Yukihiro Matsumoto wrote:
| Hi,
|
| In message "Re: #scan with or'd (`|`) subexpressions."
|
| on Thu, 11 Nov 2004 23:29:58 +0900, "trans. (T. Onoma)"
| |Does the new Ruby regexp engine do this?
| |
| | irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
| | => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
| | irb(main):002:0>
| |
| |Why would all the subexpressions be listed when there is an `|` (or) used?
| | I expected:
| |
| | => [["1", "2"], ["3", "4"]]
|
| You will never know which subexpression is matched, if you get your
| expected result.

Actually, trying to figure out which subexpression is matched is _exactly_ my
problem. I have dozens of regexps in the form of:

(#{spre})(#{start})(#{spost})(.*?)(#{epre})(#{end})(#{epost})

All of these are in an array (r) and strung together:

re = Regexp.new( r.join('|') )

Then

m = []
str.scan( re ) { m << $~ }

How do I know which array index (r[?]) produced the match? How does the
current behavior allow me to figure out which match?
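
One possible answer, sketched under two assumptions: every sub-pattern in the array has the same number of capture groups (seven in the form above, two in this toy example), and at least one group takes part in every match. The position of the first non-nil capture then identifies the array index. `patterns`, `groups_per_pattern`, and `matches` are hypothetical names.

```ruby
# Toy stand-ins for the real sub-patterns; each has exactly two groups.
groups_per_pattern = 2
patterns = ['(1)(2)', '(3)(4)']
re = Regexp.new(patterns.join('|'))

matches = []
'1234'.scan(re) do
  m = $~
  # Captures are numbered left to right across all alternatives, so the
  # first non-nil slot falls inside the block belonging to the branch
  # that matched.
  first = m.captures.index { |c| !c.nil? }
  matches << [first / groups_per_pattern, m.captures.compact]
end
# matches => [[0, ["1", "2"]], [1, ["3", "4"]]]
```

This is the information the nil padding preserves: with compacted results the index could not be recovered.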

| Is there any reason /(1|3)(2|4)/ is not sufficient?

Hmm... well, with a good bit of refactoring I might be able to do it this way,
although some of my regexps have zero-width lookahead and I suspect they
might be a problem here.

Thanks,
T.
 
Mark Hubbart

| Hi,
|
| In message "Re: #scan with or'd (`|`) subexpressions."
|
| |Does the new Ruby regexp engine do this?
| |
| | irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
| | => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
| | irb(main):002:0>
| |
| |Why would all the subexpressions be listed when there is an `|` (or) used? I
| |expected:
| |
| | => [["1", "2"], ["3", "4"]]
|
| You will never know which subexpression is matched, if you get your
| expected result. Is there any reason /(1|3)(2|4)/ is not sufficient?

This reminds me... when will Ruby support named subexpressions?
Oniguruma fully supports them now; but there doesn't appear to be a
way to access this in the ruby code.

thanks,
Mark
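
For readers on later Ruby versions: Ruby 1.9 and newer accept the `(?<name>...)` syntax and expose the groups through `MatchData`. The sketch below assumes such a version; the group names `a1`, `a2`, `b1`, `b2` are invented for illustration.

```ruby
# Requires Ruby 1.9 or later; group names are hypothetical.
re = /(?<a1>1)(?<a2>2)|(?<b1>3)(?<b2>4)/

hits = []
'1234'.scan(re) do
  m = $~
  # The first named group with a value tells us which branch matched.
  branch = m.names.find { |name| m[name] }
  hits << [branch, m.captures.compact]
end
# hits => [["a1", ["1", "2"]], ["b1", ["3", "4"]]]
```

Naming the first group of each alternative distinctly gives a readable label for the branch instead of an index computed from capture positions.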
 
