returning text between two markers without markers included


Adam Akhtar

Hi, if I have

str = "ruby is ((Great))"

how do I use a regex to find the text between the start marker (( and the
end marker ))?

I'm new to regex, but I tried this:

\(\(([^((|^))]*)\)\)

but it includes the markers in the capture, i.e. it gives ((Great)) when
I just want Great.

I've tried searching the board and googling but couldn't find anything
suitable.
 

Xavier Noria

The fact that the parens are delimiters obscures the regexp a bit.
Assuming "))" ends the text to extract unconditionally, you can simply
use .*? like this:

irb(main):001:0> "ruby is ((Great))".match(/ \(\( (.*?) \)\) /mx)[1]
=> "Great"

-- fxn
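
To see why the ? after .* matters, here is a small sketch comparing the greedy and non-greedy forms. The sample string with two marked sections is made up for illustration and does not come from the post above.

```ruby
# Hypothetical sample string with two marked sections.
str = "ruby is ((Great)) and ((fun))"

# Greedy: .* runs on to the LAST "))" in the string.
greedy = str[/\(\((.*)\)\)/, 1]

# Non-greedy: .*? stops at the FIRST "))" it can reach.
lazy = str[/\(\((.*?)\)\)/, 1]

puts greedy  # Great)) and ((fun
puts lazy    # Great
```

With only one marked section in the string both forms give the same answer, so the difference only shows up once a second "))" appears later on.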
 

Adam Akhtar

Thanks Xavier,

I haven't seen the /mx term before. Is that responsible for not including
the markers themselves?
 

Adam Akhtar

Ahh, I just noticed the inclusion of the ? to make it restricted. I
understand your use of groups there with [1], but I'm not too sure of that
syntax. I was thinking of using string.slice or string.scan to return
the text, but I'm not sure how I would do so with groups. How would I go
about doing this?
 

Xavier Noria

> I haven't seen the /mx term before, is that responsible for not including
> the markers themselves?

Those are two regexp modifiers stacked together:

* With /m the dot matches newlines. I couldn't assume the text to
extract doesn't contain newlines so I added it just in case.

* With /x, literal whitespace in the regexp is ignored. Since the
regexp uses so many backslashes, that gives it some air for readability.

As for the other mail, String#match returns a MatchData object. Those
objects support indexing with [], and the first capture is at index 1.
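
A short sketch of the points above may help: index 0 of the MatchData is the whole match (markers included), index 1 is the first capture, and /m decides whether the dot may cross a newline. The string with an embedded newline is a made-up input, not one from the thread.

```ruby
str = "ruby is ((Great))"

# /x lets the pattern be spaced out; that whitespace is not matched.
m = str.match(/ \(\( (.*?) \)\) /x)

whole   = m[0]  # the entire match, markers included
capture = m[1]  # only what the first group captured

# Hypothetical input where the marked text contains a newline.
multi = "ruby is ((Gre\nat))"
without_m = multi.match(/\(\((.*?)\)\)/)   # the dot stops at the newline
with_m    = multi.match(/\(\((.*?)\)\)/m)  # the dot may cross the newline

puts whole     # ((Great))
puts capture   # Great
p without_m    # nil
p with_m[1]    # "Gre\nat"
```

So the markers are left out of the result because of the capture group and the [1] index, not because of the /mx modifiers.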
 

Adam Akhtar

Ahhh, that's great. I like the readability one. I'll be using that a lot
from now on, and I'll use match instead of scan and slice for this
particular problem. Thanks, Xavier.
 

Adam Akhtar

I just wondered, if I had multiple marked sections in a string, how would
I capture all of them?

So if my sentence was

"a bannana is a type of ((fruit)) and a dog is a type of ((animal))" how
could I store fruit and animal for later use via regex?
 

Dave Thomas

irb(main):004:0> str = "a bannana is a type of ((fruit)) and a dog is a type of ((animal))"
=> "a bannana is a type of ((fruit)) and a dog is a type of ((animal))"

irb(main):005:0> str.scan(/\(\((.*?)\)\)/)
=> [["fruit"], ["animal"]]
 

Todd Benson

There's #scan...

(str.scan /\({2} (.*?) \){2}/x).flatten
=> ["fruit", "animal"]

Todd
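
As a rough illustration of why the flatten is there: when the pattern has a capture group, String#scan returns one small array per match. The lookaround variant at the end is an alternative that nobody in the thread proposed; it is only included to show what scan returns when there is no group.

```ruby
str = "a bannana is a type of ((fruit)) and a dog is a type of ((animal))"

# With a capture group, scan yields one array per match.
nested = str.scan(/\({2}(.*?)\){2}/)

# flatten collapses those into a plain list of strings.
flat = nested.flatten

# Lookbehind and lookahead keep the markers out of the match itself,
# so there is no group and scan returns the strings directly.
direct = str.scan(/(?<=\(\().*?(?=\)\))/)

p nested  # [["fruit"], ["animal"]]
p flat    # ["fruit", "animal"]
p direct  # ["fruit", "animal"]
```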
 

Adam Akhtar

Great, closer to solving my problem. Though I realised that this regex
wouldn't work if the marked text was split by a newline, so I went away
and modified it so that if it were split it would still be picked up. I
did it with this:

\({2}(?s)(.*?)(?s)\){2}

I'm wondering if there's a neater way of saying "ignore any newlines that
split the marked text up".

Is there an operator that tells it to ignore newlines, and is the above
robust?

Thanks so much for the help so far.
 

Todd Benson

I don't know about robustness, but throwing an 'm' after the regex
like Xavier did might do the trick...

(str.scan /\({2} (.*?) \){2}/mx).flatten

...which could also be written as...

str.scan(/\({2} (.*?) \){2}/mx).flatten

Notice the m and x after the last forward slash. If you are concerned
about there being spaces on either side of the string between (( and
)), then...

str.scan(/\({2} \s*(.*?)\s* \){2}/mx).flatten

Todd
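
Here is a small sketch of the /mx version at work. The input string is made up: one section is padded with spaces and the other is split across two lines. The inline (?m:...) form is shown as one way to switch on the dot-matches-newline behaviour for just part of a pattern in Ruby.

```ruby
# Hypothetical input: one padded section, one split by a newline.
str = "a ((  fruit  )) and an ((ani\nmal))"

# /m lets the dot cross the newline, /x ignores the spaces in the pattern,
# and \s* on each side keeps the padding out of the capture.
found = str.scan(/\({2} \s*(.*?)\s* \){2}/mx).flatten

# The same dot behaviour switched on inline, for the group only.
inline = str.scan(/\({2}\s*(?m:(.*?))\s*\){2}/).flatten

p found   # ["fruit", "ani\nmal"]
p inline  # ["fruit", "ani\nmal"]
```

Note that the newline is still part of the captured text. If it should be dropped as well, something like found.map { |s| s.delete("\n") } afterwards would be one option.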
 

David A. Black

Hi --

> Those are two regexp modifiers stacked together:
>
> * With /m the dot matches newlines. I couldn't assume the text to extract
> doesn't contain newlines so I added it just in case.
>
> * With /x literal whitespace in the regexp are ignored. Since the regexp uses
> so many backslashes that gives some air for readability.

I know I'm in the minority, but I'll just mention that I find most
regexes that make use of /x very hard to read. The reason is that I've
trained my brain how to read a pattern, so if I encounter this:

/ string (.*) another ? string /x

it's a considerable effort to "not see" the spaces. I think it's
better to stick to the basic pattern language, which after all is the
only set of rules that we all learn and all share.

I would recommend saving /x for cases where you want to break a regex
out into multiple lines and include comments:

re = /
\( # opening paren
\d{3} # area code
\) # closing paren

etc. (Not a great example of an obscure pattern that's made more clear
by /x but you get the idea.)


David
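
For comparison, a complete multi-line /x pattern with comments might look like the sketch below, applied to the (( )) markers from earlier in the thread rather than to the phone-number example. The variable name marked is just an illustrative choice.

```ruby
# A commented pattern in extended mode; whitespace and comments are ignored.
marked = /
  \(\(    # opening marker
  (.*?)   # the marked text, as short as possible
  \)\)    # closing marker
/mx

str = "a bannana is a type of ((fruit)) and a dog is a type of ((animal))"
words = str.scan(marked).flatten

p words  # ["fruit", "animal"]
```

One thing to watch with this style is that a forward slash inside a comment would end the regexp literal early, so comments are best kept to plain words.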
 
