How do I use a variable-width positive look-behind assertion?

jl_post

Hi,

I have some code that splits text right after a line containing
only a single dot. To do this, I used a positive look-behind
assertion, like in this sample script:


#!/usr/bin/perl

use strict;
use warnings;

my $text = <<"END_OF_TEXT";
Line 1
.
Line 2
.
Line 3
END_OF_TEXT

# Use a positive look-behind assertion
# to split $text right after a dot on
# a line by itself:
my @elements = split m/(?<=\n\.\n)/, $text;

use Data::Dumper;
print Dumper @elements;

__END__


Running this program gives the output:

$VAR1 = 'Line 1
.
';
$VAR2 = 'Line 2
.
';
$VAR3 = 'Line 3
';

Basically, $text was split into three elements, with each element
(except for the last) ending with a dot on a line by itself.

This positive look-behind assertion works great if the dot is truly
on a line by itself. But if there is leading and/or trailing
whitespace around the dot (still on a line by itself), the regular
expression m/(?<=\n\.\n)/ won't split after that line.
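
For illustration, here is a minimal sketch of why simply widening the look-behind is not an option. As far as I know, Perl rejects unbounded quantifiers such as * inside a look-behind: older versions require a fixed width, and the variable-length support added in 5.30 is capped at 255 characters. The pattern and the variable names below are made up for this sketch; the pattern is kept in a string so that it is compiled at run time and the failure can be caught with eval.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pattern: the fixed-width look-behind from above, widened
# to allow blanks around the dot. Kept in a single-quoted string so it
# is compiled at run time, where eval can trap the error.
my $pattern = '(?<=\n[[:blank:]]*\.[[:blank:]]*\n)';

my $regex = eval { qr/$pattern/ };

if (defined $regex) {
    print "Pattern compiled\n";
}
else {
    # $@ holds the regex compiler's complaint about the look-behind.
    print "Pattern rejected: $@";
}
```

The exact wording of the error differs between Perl versions, so the sketch only reports whatever message the regex compiler produced.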

What I'm looking for is to do something like this:


#!/usr/bin/perl

use strict;
use warnings;

my $text = <<"END_OF_TEXT";
Line 1
 
Dr.Ruud

(e-mail address removed) wrote:

You take far too many words and lines to explain a simple problem.
Also, the subject line doesn't describe the problem at hand, only an
assumed way of solving it.
> I want to split right after a line that
> contains exactly one dot and an arbitrary amount of whitespace, but I
> don't think I can do it with split() using a simple regular
> expression. Or can I?

If you can afford to lose the separating lines:

split /^[[:blank:]]*\.[[:blank:]]*\n/m, $text;
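
A small runnable sketch of this first variant, using made-up sample text (one separator line padded with spaces, one with a trailing tab). The sample string and variable names are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample: the dot lines carry extra blanks (spaces, a tab).
my $text = "Line 1\n . \nLine 2\n.\t\nLine 3\n";

# /m lets ^ match at the start of every line; the whole separator
# line, including its newline, is consumed and thrown away.
my @elements = split /^[[:blank:]]*\.[[:blank:]]*\n/m, $text;

print "[$_]" for @elements;
```

Each element is one block of text; the dot lines themselves are gone.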


If you want to keep the separating lines too:

split /(^ [[:blank:]]* \. [[:blank:]]* \n )/mx, $text;
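
Note that with the capturing parentheses, split() returns each separator line as an element of its own, sitting between the blocks it separated. If every block should end with its dot line, as in the original look-behind version, the pieces can be glued back together. A minimal sketch, with made-up sample text and variable names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample: the dot lines carry extra blanks (spaces, a tab).
my $text = "Line 1\n . \nLine 2\n.\t\nLine 3\n";

# Capturing group: the list alternates block, separator, block, ...
my @parts = split /(^ [[:blank:]]* \. [[:blank:]]* \n )/mx, $text;

# Re-attach each separator line to the block that precedes it.
my @elements;
while (@parts) {
    my $block     = shift @parts;
    my $separator = @parts ? shift @parts : '';
    push @elements, $block . $separator;
}

print "[$_]" for @elements;
```

The last block has no separator after it, so it is pushed unchanged.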



But why use split() at all? Alternative that keeps the blocks together:

#!/usr/bin/perl
use strict;
use warnings;

my $text = join "", <DATA>;

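# In list context with /g, the match returns every block in turn:
# as little text as possible, up to and including a separator line
# (or up to the end of the string for the last block).
my @elements =
  $text =~
    m{ .*?                  # shortest run of text, newlines included (/s)
       (?:
           ^                # start of a line (/m)
           [[:blank:]]*
           [.]              # the single dot
           [[:blank:]]*
           \n
         |
           \z               # or the very end of the text
       )
     }msxg;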

print "[$_]\n" for grep $_, @elements;


__DATA__
Line 1
 
