Multiple matching with ()*

Alessandro Re · Jul 31, 2007

Hi there!
I'm Alessandro from Italy and I started using ruby some days ago,
so... Hello, Community!

Well, I was trying to match a pattern multiple times. I tried both
with normal match() and scan(), but I can't get the desired result.

The subject string is something like:
"1a2bend" or "beg1a2b3c4dend"
More generally, it should match /^beg(\d\w)*end$/ : always a beginning and
an ending pattern, and an unspecified number of central patterns.
The problem is that the central pattern must be extracted every
time it's encountered.
For example, trying with
"x1A2B3C4Dz".scan /^(x)(\d\w)*(z)$/
returns
[["x", "4D", "z"]]
while I need something like
[["x", "1A", "2B", "3C", "4D", "z"]]

Why does ()* match just the last one? How can I get all the ()* that it matches?

Probably I'm doing something wrong, but I can't understand where :\

Thanks!

Jano Svitok · Jul 31, 2007

Try:

if "x1A2B3C4Dz" =~ /^(x)((?:\d\w)*)(z)$/
  a, b = $1, $3 # save $1 and $3 before the nested scan overwrites them
  return [a] + $2.scan(/\d\w/).flatten + [b]
end

I don't know if it's possible to do it in one run though, maybe you
could use split as well...
Take care when doing nested searches as they will overwrite $1..9
(that's why I used a and b)

J.
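
The same two-step idea can be sketched without the $1..$9 globals by keeping the MatchData object around, which sidesteps the overwrite problem mentioned above. This is only a sketch: the helper name split_sections is hypothetical, and it assumes the literal "x" and "z" delimiters from the example.

```ruby
# Hypothetical helper (not from the thread): returns the outer delimiters and
# every inner pair, or nil when the whole string does not match.
def split_sections(str)
  m = /\A(x)((?:\d\w)*)(z)\z/.match(str)
  return nil unless m
  # m is a MatchData object, so the nested scan below cannot clobber its groups.
  [m[1], *m[2].scan(/\d\w/), m[3]]
end

p split_sections("x1A2B3C4Dz")  # => ["x", "1A", "2B", "3C", "4D", "z"]
p split_sections("lol1a2vasd")  # => nil
```

Returning nil on failure gives the "match the whole pattern or nothing" behaviour in one place.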

Harry Kakueki · Jul 31, 2007

For example, trying with
"x1A2B3C4Dz".scan /^(x)(\d\w)*(z)$/
returns
[["x", "4D", "z"]]
while I need something like
[["x", "1A", "2B", "3C", "4D", "z"]]

Hi,

Try this.

str = "x1A2B3C4Dz"
p str.scan(/\d?\w/) #>["x", "1A", "2B", "3C", "4D", "z"]

Harry

Alessandro Re · Jul 31, 2007

Thanks, but I need to either match the whole pattern or match nothing at all.
"lol1a2vasd".scan(/\d?\w/) => ["l", "o", "l", "1a", "2v", "a", "s", "d"]
whereas I need to be sure that the pattern begins with a regex "x" and
ends with "z"

(of course, x 1 a 2 b 3 c should be regexes, not just chars)

Thanks, your help is appreciated.

Alessandro Re · Jul 31, 2007

Mh well, to me it seems like normal regex processing (I mean, it *should*
require only one instruction, since this pattern can be read with just
one regex, even if Ruby doesn't allow it... but that would be really
bad).
Anyway, if I split it up there are different ways to do it - thanks
for your suggestion.
But if Ruby makes it possible with one call, I'd prefer to use that.

Thanks!

Harry Kakueki · Jul 31, 2007

Thanks, but I need to either match the whole pattern or match nothing at all.
"lol1a2vasd".scan(/\d?\w/) => ["l", "o", "l", "1a", "2v", "a", "s", "d"]
whereas I need to be sure that the pattern begins with a regex "x" and
ends with "z"

str = "lol1a2vasd"
p str.scan(/\d\w|\w{3}/)

Harry

Robert Klemme · Jul 31, 2007

2007/7/31 said:
Mh well, to me it seems like normal regex processing (I mean, it *should*
require only one instruction, since this pattern can be read with just
one regex, even if Ruby doesn't allow it... but that would be really
bad).
Anyway, if I split it up there are different ways to do it - thanks
for your suggestion.
But if Ruby makes it possible with one call, I'd prefer to use that.

irb(main):006:0> s="x1A2B3C4Dz"
=> "x1A2B3C4Dz"
irb(main):007:0> s.scan /x(\d\w)*z/
=> [["4D"]]
irb(main):008:0> s.scan /x((?:\d\w)*?)z/
=> [["1A2B3C4D"]]
irb(main):009:0> s.scan(/x((?:\d\w)*?)z/).map {|a| a[0].scan(/\d\w/)}
=> [["1A", "2B", "3C", "4D"]]

Kind regards

robert
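
On the original "why just the last one?" question: a regex has one capture slot per group in the pattern, not one per repetition, so each pass through a quantified group overwrites what the previous pass stored. A small sketch that makes this visible (the constant names are illustrative and not from the thread):

```ruby
# One capture slot per group: each repetition of (\d\w) overwrites the slot.
REPEATED = /\Ax(\d\w)*z\z/
# Capturing the whole repeated run instead keeps every pair in one string.
WRAPPED  = /\Ax((?:\d\w)*)z\z/

last_only = REPEATED.match("x1A2B3C4Dz").captures
whole_run = WRAPPED.match("x1A2B3C4Dz").captures

p last_only  # => ["4D"]
p whole_run  # => ["1A2B3C4D"]
```

This is why the solutions in this thread first capture the whole run and then scan it a second time.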

Alessandro Re · Jul 31, 2007

Thanks, this is an interesting solution!

botp · Jul 31, 2007

Mh well, to me it seems like normal regex processing (I mean, it *should*
require only one instruction, since this pattern can be read with just
one regex, even if Ruby doesn't allow it... but that would be really bad).

It seems like you have a pattern within a pattern.
It may be easier to unwrap the outer pattern first, then work on the inner
pattern. Something like:

irb(main):096:0> "lol1a2vasd".scan(/lol(.+)asd/).to_s.scan(/\d\w/)
=> ["1a", "2v"]
irb(main):097:0> "beg1a2vend".scan(/beg(.+)end/).to_s.scan(/\d\w/)
=> ["1a", "2v"]
irb(main):098:0> "beg1a2vendxbeg3c4dend".scan(/beg(.+)end/).to_s.scan(/\d\w/)
=> ["1a", "2v", "3c", "4d"]

is that ok?
kind regards -botp
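
One caveat about the transcript above: on Ruby 1.8, Array#to_s joined the elements, while on 1.9 and later it returns the inspect form (brackets and quotes included), so the .to_s step only works incidentally there. A sketch of the same outer-then-inner idea that avoids to_s, using String#[] with a capture index (the helper name inner_pairs is hypothetical):

```ruby
# Hypothetical helper: unwrap the outer beg...end pattern, then scan the inside.
def inner_pairs(str)
  body = str[/beg(.+)end/, 1]   # first capture group, or nil when nothing matches
  body ? body.scan(/\d\w/) : []
end

p inner_pairs("beg1a2vend")   # => ["1a", "2v"]
p inner_pairs("no markers")   # => []
```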

Wolfgang Nádasi-donner · Jul 31, 2007

Alessandro said:
For example, trying with
"x1A2B3C4Dz".scan /^(x)(\d\w)*(z)$/
returns
[["x", "4D", "z"]]
while I need something like
[["x", "1A", "2B", "3C", "4D", "z"]]

Does this go more in the direction you wanted?

irb(main):001:0> "x1A2B3C4Dz".scan /(?:^(?:x)|\G)(\d\w)(?=(?:\d\w)*(?:z)$)/
=> [["1A"], ["2B"], ["3C"], ["4D"]]

???

Wolfgang Nádasi-Donner

Harry Kakueki · Aug 1, 2007

while I need to be sure that the pattern begins with a regex "x" and
ends with "z"

(of course, x 1 a 2 b 3 c should be regexes not just chars)

Sorry, I misunderstood what you wanted.
Is this more like it?

str = "lol1a2vasd"
m = /^(\w{3})(.*)(\w{3})$/.match(str).captures
m[1] = m[1].scan(/\d\w/)
p m.flatten #> ["lol","1a","2v","asd"]

Harry

Robert Klemme · Aug 1, 2007

Pay special attention to my use of the reluctant quantifier (*?), which is
mandatory if your input contains multiple begin/end pairs.

Kind regards

robert

PS: please do not top post.
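
The effect of the reluctant quantifier is easiest to see with a looser inner pattern such as the `.+` used earlier in the thread. A sketch, using an input with two begin/end sections (the constant and helper names are illustrative only):

```ruby
INPUT = "beg1a2vendxbeg3c4dend"

# Greedy: .+ runs to the last "end", so both sections collapse into one capture.
p INPUT.scan(/beg(.+)end/)   # => [["1a2vendxbeg3c4d"]]

# Reluctant: .+? stops at the first "end", giving one capture per section.
p INPUT.scan(/beg(.+?)end/)  # => [["1a2v"], ["3c4d"]]

# Hypothetical helper: one array of inner pairs per begin/end section.
def pairs_per_section(str)
  str.scan(/beg(.+?)end/).map { |captures| captures.first.scan(/\d\w/) }
end

p pairs_per_section(INPUT)   # => [["1a", "2v"], ["3c", "4d"]]
```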

Alessandro Re · Aug 2, 2007

Harry said:
Is this more like it?

str = "lol1a2vasd"
m = /^(\w{3})(.*)(\w{3})$/.match(str).captures
m[1] = m[1].scan(/\d\w/)
p m.flatten #> ["lol","1a","2v","asd"]

Yep, it's like this.
I solved it using 2 instructions as you did: first matching the outer words,
then the middle ones, but I still think that one regex would have been
nicer

Thanks guys

Wolfgang Nádasi-donner · Aug 2, 2007

Alessandro said:
...but i still think that one regex would have been nicer

I don't think that this will be "nice"...

irb(main):001:0> "x1A2B3C4Dz".scan(/(?:\G|^(?:x))(x|\d\w|z)(?=(?:\d\w)*(?:z|)$)/)
=> [["x"], ["1A"], ["2B"], ["3C"], ["4D"], ["z"]]

..., and I didn't test it against wrong lines, but after a "flatten" it
ends up with the required result.

Wolfgang Nádasi-Donner
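
Since the regex above was not tested against wrong lines, here is a quick sketch of such a check. The constant ONE_PASS and the helper one_pass are names introduced only for this illustration. Because \G also matches at the very start of the scan and the lookahead allows an empty alternative for "z", the pattern appears to accept inputs with a missing delimiter:

```ruby
ONE_PASS = /(?:\G|^(?:x))(x|\d\w|z)(?=(?:\d\w)*(?:z|)$)/

# Hypothetical wrapper: run the one-pass regex and flatten the captures.
def one_pass(str)
  str.scan(ONE_PASS).flatten
end

p one_pass("x1A2B3C4Dz")  # => ["x", "1A", "2B", "3C", "4D", "z"]
p one_pass("1A2Bz")       # => ["1A", "2B", "z"]   (leading "x" missing, still accepted)
p one_pass("x1A2B")       # => ["x", "1A", "2B"]   (trailing "z" missing, still accepted)
```

So if the delimiters must be enforced, a separate anchored check of the whole string would still be needed before trusting the result.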

Alessandro Re · Aug 4, 2007

On 8/2/07, Wolfgang Nádasi-donner wrote:
> irb(main):001:0>
> "x1A2B3C4Dz".scan(/(?:\G|^(?:x))(x|\d\w|z)(?=(?:\d\w)*(?:z|)$)/)
> => [["x"], ["1A"], ["2B"], ["3C"], ["4D"], ["z"]]

Wonderful :)
Thanks!

--
~Ale

Robert Klemme · Aug 6, 2007

2007/8/4 said:
irb(main):001:0> "x1A2B3C4Dz".scan(/(?:\G|^(?:x))(x|\d\w|z)(?=(?:\d\w)*(?:z|)$)/)
=> [["x"], ["1A"], ["2B"], ["3C"], ["4D"], ["z"]]

Wonderful
Thanks!

But this does not seem to work with strings that contain multiple sections:

irb(main):002:0> "x1A2B3C4Dz1a".scan(/(?:\G|^(?:x))(x|\d\w|z)(?=(?:\d\w)*(?:z|)$)/)
=> []

So it's not suited for a one-RX approach and you still need two levels of
RX. If that's the case, then we have seen simpler solutions for that.
(Btw, one reason why it's so awkward is that there is no lookbehind in
Ruby 1.8 - but this will change.)

Kind regards

robert

Wolfgang Nádasi-donner · Aug 6, 2007

Robert said:
(Btw, one reason why it's so awkward is that there is no lookbehind in
Ruby 1.8 - but this will change.)

I am waiting for this Christmas gift too...

Wolfgang Nádasi-Donner
