Need RegExp

Indigo5

I need a way of parsing a variable format. Basically the section starts off
with a Reference keyword and ends with a Comments keyword.

Reference:
1. document1
2 document2
Comments:

I was doing this as follows:

    if (/^Reference/ .. /^Comments/) {
        if (/^[0-9]/) {
            chomp;
            $_ =~ s/^[0-9].\s+//g;
            push @references, $_;
        }
    }

However, I just received a document where Reference 1 wrapped onto another
line so that the format looks like this:

Reference:
1. document 1
more of document 1 here
2. document 2
Comments:

I cannot change the format of the input I receive, so I have to find a way
to parse this the way I receive it. Any help would be highly appreciated.
Thank you.
 
Tad McClellan

Indigo5 said:
I need a way of parsing a variable format. Basically the section starts off
with a Reference keyword and ends with a Comments keyword.


Have you considered setting $/ (the input record separator) instead?

Maybe $/ = "Comments:\n"; # ??
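A minimal sketch of how the $/ suggestion might be used, assuming the section always ends with a line that is exactly "Comments:". The sample text and the variable names ($text, $record, $body) are illustrative and not from the original posts. Each read returns everything up to the Comments keyword, and the numbered items are then split out of that single record.

```perl
use strict;
use warnings;

# Illustrative sample input; real data would come from a file on disk.
my $text = <<'END';
Header line
Reference:
1. document 1
more of document 1 here
2. document 2
Comments:
Some comment
END

my @references;

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
{
    local $/ = "Comments:\n";    # a "line" now ends at the Comments keyword
    my $record = <$fh>;          # everything up to and including "Comments:\n"
    if ( defined $record ) {
        chomp $record;           # chomp removes the current $/ string

        # Keep only the text that follows the Reference line.
        if ( $record =~ /^Reference:[^\n]*\n(.*)\z/ms ) {
            my $body = $1;

            # Split just before every line that starts with "digits. ".
            for my $item ( split /^(?=\d+\.\s)/m, $body ) {
                next unless $item =~ s/^\d+\.\s+//;    # drop the item number
                $item =~ s/\s*\n\s*/ /g;               # join wrapped lines
                $item =~ s/\s+\z//;                    # trim trailing space
                push @references, $item;
            }
        }
    }
}
close $fh;

print "$_\n" for @references;
```

Note that this separator also matches a "Comments:" that happens to end some other line, so it is only safe when the keyword appears on a line of its own.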

Reference:
1. document1
2 document2
Comments:

I was doing this as follows:

    if (/^Reference/ .. /^Comments/) {
        if (/^[0-9]/) {
            chomp;
            $_ =~ s/^[0-9].\s+//g;
            push @references, $_;
        }
    }

However, I just received a document where Reference 1 wrapped onto another
line so that the format looks like this:

Reference:
1. document 1
more of document 1 here


So just tack it onto the end of the previous one then.

What's the problem?

    if ( s/^\s+// ) {                # continuation line: strip its leading whitespace
        chomp;
        $references[-1] .= " $_";    # append it to the previous reference
    }
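Put together with the original range-operator loop, the idea might look like the sketch below. It assumes a continuation line is any non-blank, non-numbered line inside the section, whether or not it is indented, since the sample in the question shows no leading spaces. The in-memory input and the $input name are illustrative only.

```perl
use strict;
use warnings;

# Illustrative input modelled on the sample in the question.
my $input = <<'END';
Title: example
Reference:
1. document 1
more of document 1 here
2. document 2
Comments:
1. not a reference
END

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my @references;
while (<$fh>) {
    # The range operator is true from the Reference line
    # through the Comments line, inclusive.
    next unless /^Reference/ .. /^Comments/;
    next if /^Reference/ or /^Comments/;    # skip the two keyword lines

    chomp;
    if ( s/^[0-9]+\.\s+// ) {
        push @references, $_;        # numbered line starts a new reference
    }
    elsif ( @references and /\S/ ) {
        s/^\s+//;                    # tolerate indented continuation lines
        $references[-1] .= " $_";    # wrapped line extends the last reference
    }
}
close $fh;

print "$_\n" for @references;
```

The `@references` check in the `elsif` avoids modifying `$references[-1]` on an empty array, which is a fatal error in Perl.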
 
John W. Krahn


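    # Skip ahead to the Reference line (the loop also ends at end of file).
    while ( <FILE> ) { last if /^Reference/ }

    while ( <FILE> ) {
        last if /^Comments/;
        chomp;
        # A numbered line starts a new reference; any other line continues the last one.
        s/^\d+\.\s+// ? push( @references, $_ ) : ( $references[ -1 ] .= " $_" );
    }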
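For anyone who wants to try this approach without a file on disk, here is a self-contained sketch of the same two-loop idea. The parse_references helper, the lexical filehandle, and the sample string are assumptions added for illustration and are not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the references read from an open filehandle.
sub parse_references {
    my ($fh) = @_;
    my @references;

    # Skip everything up to and including the Reference keyword line.
    while ( my $line = <$fh> ) {
        last if $line =~ /^Reference/;
    }

    # Collect entries until the Comments keyword line.
    while ( my $line = <$fh> ) {
        last if $line =~ /^Comments/;
        chomp $line;
        if ( $line =~ s/^\d+\.\s+// ) {
            push @references, $line;         # numbered line: new reference
        }
        elsif (@references) {
            $references[-1] .= " $line";     # wrapped line: extend the last one
        }
    }
    return @references;
}

# Illustrative input; an in-memory filehandle stands in for the real file.
my $sample = "Reference:\n1. document 1\nmore of document 1 here\n"
           . "2. document 2\nComments:\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @references = parse_references($fh);
close $fh;

print "$_\n" for @references;
```

Passing a filehandle rather than using a global bareword handle keeps the helper reusable for real files, STDIN, or test strings.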



John
 
