Multiple Line Pattern Match

Chris L. · Apr 9, 2006

Can someone please provide some assistance with a multi-line matching
problem? I have a datafile that looks like this:

***************DATAFILE************************
START
foo
START
foo
START
foo
bar
foo
bar
foo
bar
START

I am trying to capture the contents between each pair of START
delimiters, but only when there is more than one line between them.
Specifically, I want to capture the entries with 6 lines between
START and START, and leave out the entries that have only 1 line
between START and START.

Below is what I have so far. However, it captures everything
between START and START. Again, I'm trying to catch only the 6-line
stretches between START and START, not the 1-line stretches.
----------------------------------------------------------------------------------------------------------

open(FH,"foobar.txt")|| die "Cannot open FHandle: $!";
local $/ = "START\n";
while ( <FH> )
{
s/.*START\n//;
print;
}
close FH;
-------------------------------------------------------------------------------------------------------------

Is there a way to specify the number of lines?
Thank you very much for your time.
Chris L.

Xicheng Jia · Apr 9, 2006

Chris said:
Can someone please provide some assistance with a multi-line matching
problem? I have a datafile that looks like this:

***************DATAFILE************************
START
foo
START
foo
START
foo
bar
foo
bar
foo
bar
START

I am trying to capture the contents between each pair of START
delimiters, but only when there is more than one line between them.
Specifically, I want to capture the entries with 6 lines between
START and START, and leave out the entries that have only 1 line
between START and START.

Below is what I have so far. However, it captures everything
between START and START. Again, I'm trying to catch only the 6-line
stretches between START and START, not the 1-line stretches.
----------------------------------------------------------------------------------------------------------

open(FH,"foobar.txt")|| die "Cannot open FHandle: $!";
local $/ = "START\n";
while ( <FH> )
{

> s/.*START\n//;
This line is useless because START is already in $/, so $_ does not contain
the string "START". If you want to remove the last START, which is not
followed by a newline, then you may want to use:

s/.*START//;

print;
}
close FH;
-------------------------------------------------------------------------------------------------------------

> Is there a way to specify the number of lines?

Just count the number of newlines in $/, like

my $number_of_lines = tr/\n//;
print "$_\n\n" if $number_of_lines == 6;

Xicheng

Tad McClellan · Apr 10, 2006

Xicheng Jia said:
Chris L. wrote:

> s/.*START\n//;
This line is useless because START is already in $/, so $_ does not contain
the string "START",

Yes it does (if the file has the $/ value anywhere in it).

When $/ = "\n", do you get a newline in $_?

Sure you do. Same here.

if you want to remove the last START which is not
followed by a newline, then you may want to use:

s/.*START//;

What is it that keeps the character after the START from
being a newline again?

perl -le 'print "matched" if "START\n" =~ /.*START/'

Just count the number of newlines in $/, like

my $number_of_lines = tr/\n//;

That counts the number of newlines in $_, not in $/
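
To make the two corrections above concrete, here is a minimal sketch. The in-memory filehandle and the variable names are illustrative assumptions, not part of the thread. It shows that each record read with $/ set to "START\n" still ends with the separator, and that chomp removes exactly the current value of $/, after which tr/\n// counts the lines left in the record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data from the original post, held in a string for illustration.
my $data = "START\nfoo\nSTART\nfoo\nSTART\n"
         . "foo\nbar\nfoo\nbar\nfoo\nbar\nSTART\n";

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my @records;    # one [ had_separator, line_count ] pair per record
{
    local $/ = "START\n";    # read up to and including each START line
    while ( my $record = <$fh> ) {
        # The record as read still ends with the separator.
        my $had_separator = $record =~ /START\n\z/ ? 1 : 0;

        chomp $record;       # chomp strips the current value of $/

        # tr/\n// with an empty replacement list only counts newlines.
        my $line_count = ( $record =~ tr/\n// );

        push @records, [ $had_separator, $line_count ];
        print "separator present: $had_separator, ",
              "lines after chomp: $line_count\n";
    }
}
close $fh;
```

Counting before the chomp would include the newline that belongs to the START line itself, so a 6-line entry would report 7.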

Xicheng Jia · Apr 10, 2006

Yes it does (if the file has the $/ value anywhere in it).

When $/ = "\n", do you get a newline in $_?

Sure you do. Same here.

Yeah, you are right. I always use the -l option on my command line, which
actually chomps off $/, so that's why I thought there was no $/ in
$_. Anyway, the s/// expression there is about the same as chomp.

What is it that keeps the character after the START from
being a newline again?

perl -le 'print "matched" if "START\n" =~ /.*START/'

> That counts the number of newlines in $_, not in $/

My typo, and thanks for the correction.

Regards,
Xicheng

Anno Siegel · Apr 10, 2006

Chris L. said:
Can someone please provide some assistance with a multi-line matching
problem? I have a datafile that looks like this:

***************DATAFILE************************
START
foo
START
foo
START
foo
bar
foo
bar
foo
bar
START

I am trying to capture the contents between each pair of START
delimiters, but only when there is more than one line between them.
Specifically, I want to capture the entries with 6 lines between
START and START, and leave out the entries that have only 1 line
between START and START.

my @big_chunks = do {
local $/ = "START\n";
grep tr/\n// > 2, <DATA>;
};

Anno
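
The threshold of 2 in the grep above follows from the fact that each chunk keeps its trailing "START\n". A minimal sketch of the newline counts per chunk for the sample data, assuming an in-memory filehandle in place of DATA (the variable names are illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data from the original post, held in a string for illustration.
my $data = "START\nfoo\nSTART\nfoo\nSTART\n"
         . "foo\nbar\nfoo\nbar\nfoo\nbar\nSTART\n";

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

# Slurp all chunks at once; each one ends with the separator "START\n".
my @chunks = do {
    local $/ = "START\n";
    <$fh>;
};
close $fh;

my ( @counts, @big_chunks );
for my $chunk (@chunks) {
    # Count newlines without modifying the chunk.
    my $newlines = ( $chunk =~ tr/\n// );
    push @counts, $newlines;

    # A 1-line entry has 2 newlines (its own line plus the START line),
    # so "more than 2" keeps only the multi-line entries.
    push @big_chunks, $chunk if $newlines > 2;
}

print "newlines per chunk: @counts\n";
print "chunks kept: ", scalar(@big_chunks), "\n";
```

The kept chunks still end with "START\n", so a chomp with $/ set to "START\n" would be needed before printing them without the delimiter.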

Tad McClellan · Apr 10, 2006

Chris L. said:
Specifically, I want to capture the entries with 6 lines between
START and START, and leave out the entries that have only 1 line
between START and START.

Why not capture all the chunks, and then filter them based
on how many lines they contain?

----------------------
#!/usr/bin/perl
use warnings;
use strict;

local $/ = "START\n";

while ( <DATA> ) {
chomp;
my @lines = split /\n/;
next unless @lines == 6;

print "found a 6-line chunk\n";
}

__DATA__
START
foo
START
foo
START
foo
bar
foo
bar
foo
bar
START
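
The same capture-then-filter idea can take the line count as a parameter, which addresses the original question of how to specify the number of lines. This is only a sketch: the sub name chunks_with_line_count and the in-memory filehandle are hypothetical and not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the chunks between START lines that
# contain exactly $wanted lines, with the START delimiter removed.
sub chunks_with_line_count {
    my ( $fh, $wanted ) = @_;
    local $/ = "START\n";    # restored automatically when the sub returns
    my @matches;
    while ( my $chunk = <$fh> ) {
        chomp $chunk;                       # strip the trailing "START\n"
        my @lines = split /\n/, $chunk;     # trailing empty fields are dropped
        push @matches, $chunk if @lines == $wanted;
    }
    return @matches;
}

# Sample data from the original post, held in a string for illustration.
my $data = "START\nfoo\nSTART\nfoo\nSTART\n"
         . "foo\nbar\nfoo\nbar\nfoo\nbar\nSTART\n";

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my @six = chunks_with_line_count( $fh, 6 );
close $fh;

print "found ", scalar(@six), " chunk(s) with 6 lines:\n";
print for @six;
```

Calling the helper with 1 instead of 6 on a freshly opened handle would return the two single-line entries.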
