Help with nested pattern.

somedeveloper · Apr 14, 2007

Hi,

Would appreciate some hints on a 'smart' / 'nifty' solution to this
problem.

The problem:
I need to extract a block of text lying between -- let's say -- a
pair of brackets.
There can be an arbitrary # of such [] blocks nested one inside the
other.
I know how to mark my first '[' to start the matching process.

Example:
abc [ def .*
[ .* ]
[ .*
[ .* ]
]
uvw ] xyz

Desired output: [ def .* uvw ]

1. Now, I don't know if this is something Perl regexps can handle. I
read somewhere (possibly incorrectly) that nested patterns are in
general constructs that are handled via grammars (flex/bison combo)
and not regexps.

2. But since Perl provides features like match-time-code-evaluation in
regexps, I thought incrementing a count variable on each '[',
decrementing it on each ']', and printing the current pattern when the
count goes to zero would do the job... but I'm not so sure how.

3. If there's really no solution via regexps and grammars, I would
have to use the brute-force approach of processing each character in a
loop looking for ['s and ]'s. (yuck!)
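
For reference, the counting idea from point 2 and the loop from point 3 can be combined without visiting every character: let a regexp find only the brackets and keep a depth counter. This is a minimal sketch, not code from the thread; the helper name first_balanced_block is made up for illustration, and the one-line input is a simplified stand-in for the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the first balanced
# [...] block in $text, or undef if there is none or it never closes.
sub first_balanced_block {
    my ($text) = @_;
    my ($depth, $start) = (0, undef);

    # Visit only the bracket characters; pos() reports where we are.
    while ($text =~ /([\[\]])/g) {
        my $end = pos($text);                  # offset just past the bracket
        if ($1 eq '[') {
            $start = $end - 1 if $depth == 0;  # remember the outermost '['
            $depth++;
        }
        elsif ($depth > 0) {
            $depth--;
            return substr($text, $start, $end - $start) if $depth == 0;
        }
        # a stray ']' at depth 0 is simply skipped
    }
    return;    # unbalanced, or no brackets at all
}

my $in = 'abc [ def [ x ] [ y [ z ] ] uvw ] xyz';
my $block = first_balanced_block($in);
print defined $block ? "$block\n" : "no balanced block\n";
```

The sketch returns the whole outer block including the nested ones; stripping the inner blocks would be a separate step.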

Regards...

Brian McCauley · Apr 14, 2007

somedeveloper wrote:
> The problem:
> I need to extract a block of text lying between -- let's say --
> a pair of brackets.
> There can be an arbitrary # of such [] blocks nested one inside
> the other.

This is a FAQ: "How do I find matching/nesting anything?" (see `perldoc -q nesting`).
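
One module-based route for balanced delimiters is Text::Balanced, which has shipped with Perl since 5.8. The following is a small sketch, not code from the thread, and the one-line input is a simplified stand-in for the original example. It assumes that everything before the first '[' can be skipped as a prefix.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $in = 'abc [ def [ x ] [ y [ z ] ] uvw ] xyz';

# The third argument is a prefix pattern to skip before the opening
# bracket; here it skips everything that is not a '['.
my ($block, $rest) = extract_bracketed($in, '[]', qr/[^\[]*/);

print defined $block ? "$block\n" : "no balanced block found\n";
```

In list context the function leaves $in untouched and returns the extracted block followed by the remaining text.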

Brian McCauley · Apr 14, 2007

> This is a FAQ: "How do I find matching/nesting anything?"

Applying the suggestions given there:

use strict;
use warnings;

my $in = ' abc [ def .*
[ .* ]
[ .*
[ .* ]
]
uvw ] xyz';

local our $re;

# Taken from "perldoc perlre" section dealing with (??{ })
$re = qr{
    \[
    (?:
        (?> [^\[\]]+ )    # run of non-bracket characters, no backtracking
      |
        (??{ $re })       # nested bracketed section, matched recursively
    )*
    \]
}x;

# Find first top-level bracketed section
my ($out) = $in =~ /($re)/;

# Remove sub-brackets
$out =~ s/(?<!\A)$re//g;

# Normalize whitespace
$out =~ s/\s+/ /g;

print "$out\n";

__END__
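
On Perl 5.10 or later the same idea can be written with a recursive named group instead of (??{ }), which removes the need for a package variable. This is a sketch of that alternative, not part of the original post; the names $balanced and block are chosen here for illustration.

```perl
use strict;
use warnings;
use 5.010;    # recursive patterns need Perl 5.10 or later

my $in = ' abc [ def .*
[ .* ]
[ .*
[ .* ]
]
uvw ] xyz';

# (?&block) recurses into the named group, so no package variable
# or code block is needed.
my $balanced = qr{
    (?<block>
        \[
        (?: [^\[\]]++      # a run of non-bracket characters (possessive)
          | (?&block)      # or a nested block
        )*
        \]
    )
}x;

# First top-level block, or an empty string if nothing matches
my $out = $in =~ $balanced ? $+{block} : '';

$out =~ s/(?!^)$balanced//g;   # drop nested blocks, keep the outer pair
$out =~ s/\s+/ /g;             # normalize whitespace

say $out;
```

A named group is used rather than (?1) so that the pattern still recurses into the right group when it is interpolated into a larger regexp.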

somedeveloper · Apr 14, 2007

Can't thank you enough! It was (really){2,}\.\.\. dumb of me not to
check the FAQ first!

Mirco Wahab · Apr 14, 2007

> The problem:
> I need to extract a block of text lying between -- let's say -- a
> pair of brackets.
> There can be an arbitrary # of such [] blocks nested one inside the
> other.
> I know how to mark my first '[' to start the matching process.
> Example:
> abc [ def .*
> [ .* ]
> [ .*
> [ .* ]
> ]
> uvw ] xyz
>
> Desired output: [ def .* uvw ]

If the problem stays as simple as your example,
meaning you know in advance that you only need
the outermost part, you could simply model that
outer part as a regexp and ignore the inner
structure (if you don't need it).

Example (you know you need only the "outer pair"):

use strict;
use warnings;

my $text = '
abc [ def .*
[ .* ]
[ .*
[ .* ]
]
uvw ] xyz ';

my $reg = qr/ \A                      # start of string
    .+? (\[ \s+ \w+) \s+ (\S+)        # model the opening: abc [ def .*
    .*                                # be greedy: swallow the nested blocks
    \b (\w+ \s+ \]) \s+ \w+ \s+       # model the closing: uvw ] xyz
    \z
/xs;

if ( $text =~ $reg ) {
    print "$1 $2 $3\n";
}

If your real problem is more complicated,
then you should go with Brian's solution, IMHO.

Regards

Mirco

Brian McCauley · Apr 19, 2007

> # Remove sub-brackets
> $out =~ s/(?<!\A)$re//g;

\A is zero-width (so a look-behind and a look-ahead on it are
interchangeable), and without the /m modifier it is equivalent to ^,
so the above is more neatly written as:

$out =~ s/(?!^)$re//g;
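
A quick way to convince yourself that the two forms behave the same is to run both on one string and compare. The sketch below is an illustration, not code from the thread; $flat is a hypothetical non-nested stand-in for $re so that the example stays self-contained.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $re: matches one non-nested [...] block.
my $flat = qr/\[[^\[\]]*\]/;
my $text = '[a] x [b] y [c]';

# Both substitutions refuse to match at offset 0, so the leading
# block survives and the later ones are removed.
(my $lookbehind = $text) =~ s/(?<!\A)$flat//g;
(my $lookahead  = $text) =~ s/(?!^)$flat//g;

print $lookbehind eq $lookahead ? "same result\n" : "different results\n";
```

With the /m modifier the two forms would no longer be interchangeable, because ^ would then also match after each newline.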
