regex @a = m / | /g and captures?

Bill · Oct 17, 2003

Hello, I've got a regex question.

In the following, the use of () in an 'or' type regex causes @a to
hold both captures, so for each pass through the regex, one capture
and one undef is stored.

Can this be prevented and still use () captures and '|' in the regex?

my $s = '1 2 {3, 3, 3} 4';

my @a = $s =~ m/\{[^\}]+\}|\d/g;

print "\nWithout captures:\n", join "\n", @a;

@a = $s =~ m/(\{[^\}]+\})|(\d)/g;
foreach (@a) { $_ = 'undef' unless defined $_; }
print "\n\nNow with captures:\n", join "\n", @a;

<<<<<<<<<<<

Steve Grazzini · Oct 17, 2003

Bill said:
In the following, the use of () in an 'or' type regex causes @a to
hold both captures, so for each pass through the regex, one capture
and one undef is stored.

Can this be prevented and still use () captures and '|' in the regex?

Put the parentheses around the entire expression.

@a = $s =~ m/(\{[^\}]+\})|(\d)/g;

/( { [^}]+ } | \d )/xg;

But (as you already know) you don't need the parens at all in
this case.
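A minimal sketch of the difference, using the sample string from the original post (the variable names @two and @one are only for illustration, and the braces are escaped here to be safe on newer perls):

```perl
use strict;
use warnings;

my $s = '1 2 {3, 3, 3} 4';

# Two capture groups: every match contributes two list elements,
# one defined and one undef.
my @two = $s =~ m/(\{[^}]+\})|(\d)/g;

# One capture group around the whole alternation: one element per match.
my @one = $s =~ m/(\{[^}]+\}|\d)/g;

print scalar(@two), " elements with two groups\n";
print scalar(@one), " elements with one group\n";
print join("\n", @one), "\n";
```

In list context m//g returns every capture group for every match, which is why the two-group version comes back twice as long.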

Tad McClellan · Oct 17, 2003

Bill said:
In the following, the use of () in an 'or' type regex causes @a to
hold both captures, so for each pass through the regex, one capture
and one undef is stored.

Can this be prevented and still use () captures and '|' in the regex?

@a = $s =~ m/(\{[^\}]+\})|(\d)/g;

grep() is handy when you need to filter a list:

my @a = grep defined, $s =~ m/(\{[^\}]+\})|(\d)/g;
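A short sketch of why the filter should test definedness and not truth. The input string with a 0 in it is an assumption made up for this example; it is not from the original post:

```perl
use strict;
use warnings;

# Hypothetical input containing a 0, to show why the test is "defined".
my $s = '0 1 {2, 2} 3';

my @all     = $s =~ m/(\{[^}]+\})|(\d)/g;
my @defined = grep { defined $_ } @all;   # keeps '0'
my @true    = grep { $_ } @all;           # drops '0' along with the undefs

print "defined: @defined\n";
print "true:    @true\n";
```

The same caveat applies to any loop that replaces empty slots with `unless $_`: a captured '0' is false but still defined.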

Bill · Oct 18, 2003

Steve Grazzini said:
Put the parentheses around the entire expression.

@a = $s =~ m/(\{[^\}]+\})|(\d)/g;

/( { [^}]+ } | \d )/xg;

Oh yes, of course! Cool.
But I think that I simplified the code I was revising too far.

What about this (we want the numbers not the separators):
my $s = '1; 2; {3, 3, 3}; 4;';

my @a = $s =~ m/\{[^\}]+\};|\d;/g;

print "\nWithout captures:\n", join "\n", @a;

@a = $s =~ m/(\{[^\}]+\});|(\d);/g;
foreach (@a) { $_ = 'undef' unless defined $_; }
print "\n\nNow with captures:\n", join "\n", @a;

<<<<<<<<<<<

It seems that either I have to chop the answers here or filter undefs,
as Tad suggests?

Quantum Mechanic · Oct 18, 2003

/( { [^}]+ } | \d )/xg;

Oh yes, of course! Cool.
But I think that I simplified the code I was revising too far.

What about this (we want the numbers not the separators):
my $s = '1; 2; {3, 3, 3}; 4;';

my @a = $s =~ m/\{[^\}]+\};|\d;/g;

Then move the common elements (semi-colon) out of the alternation. In
this case, they can be moved out of the capture as well:

/( { [^}]+ } | \d );/xg;

But you haven't stated whether the semi-colons are always there, or
meaningful. If they have no meaning, you can go with the previous
version:

/( { [^}]+ } | \d )/xg;
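
As a runnable sketch of the first variant, assuming every item is followed by a semi-colon as in the sample string (braces escaped here to be safe on newer perls):

```perl
use strict;
use warnings;

my $s = '1; 2; {3, 3, 3}; 4;';

# The semi-colon must follow each item but sits outside the capture group,
# so it is required for a match yet never returned.
my @a = $s =~ m/( \{ [^}]+ \} | \d );/xg;

print join("\n", @a), "\n";
```

With /x the spaces in the pattern are ignored, so the layout is only for readability.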

-QM

Bill · Oct 19, 2003

my $s = '1; 2; {3, 3, 3}; 4;';

my @a = $s =~ m/\{[^\}]+\};|\d;/g;

Then move the common elements (semi-colon) out of the alternation. In
this case, they can be moved out of the capture as well:

/( { [^}]+ } | \d );/xg;

But you haven't stated whether the semi-colons are always there, or
meaningful. If they have no meaning, you can go with the previous
version:

/( { [^}]+ } | \d )/xg;

-QM

So, I guess the answer in general is just to find a way to rewrite the
regex so that there is only one capture. It's good that regexes are so
flexible. Thanks
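
For cases where the alternatives cannot easily be folded into one group, Perl 5.10 and later (released after this thread) offer the branch-reset construct (?|...), which numbers the groups in each alternative from the same starting point. A minimal sketch, assuming a perl new enough to support it:

```perl
use strict;
use warnings;
use 5.010;   # branch reset (?|...) needs Perl 5.10 or later

my $s = '1; 2; {3, 3, 3}; 4;';

# Inside (?|...), the group in each alternative is numbered 1, so the
# pattern has a single capture slot even with parentheses in both branches.
my @a = $s =~ m/(?| (\{ [^}]+ \}) ; | (\d) ; )/xg;

print join("\n", @a), "\n";
```

This keeps separate parentheses on each side of the '|' and still returns one element per match, with no undefs to filter.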
