text parsing

Shalini Joshi

Hi!

I am relatively new to Perl and am looking to parse a non-delimited
text file. The file consists of records that always begin with 'FPR'
and can span multiple lines, and I would like to extract only some
relevant records from it.

The criterion is the number denoted by characters 7 through 18. I have
a vague idea how to go about parsing this, but with so much
information (on the website and in the other postings on the group) it's
kind of confusing what the best way to do it is. Initially I am
interested in just getting the script to work.

I would just like to extract this info and create a new file where I
would store it in the same format. Is there any way I can do that? Or
would I necessarily have to parse the info into an array or something
before I dump it into the new file?

Thanks and looking forward to any kind of help and tips.

--Shalini
 
Jeff 'japhy' Pinyan

[posted & mailed]

Shalini said:
I am relatively new to Perl and am looking to parse a non-delimited
text file. What I would like to do is, out of this file of records
which always begin with 'FPR' and could span multiple lines, extract
only some relevant records.

The criterion is the number denoted by characters 7 through 18. I have
a vague idea how to go about parsing this, but with so much
information (on the website and the other postings on the group) it's
kind of confusing what the best way to do it is. I am initially
interested in just getting the script to work.

It would be helpful if you showed us what code you're already trying.

The first thing I would do is read one 'record' from the file at a time.
You could use $/ to do this, but in your case it would be a little
trickier than usual, so I'll avoid that approach. Here, I'm just reading
until I get to a line that starts with "FPR".

my @records;

open RECORDS, "< file.txt" or die "can't read file.txt: $!";

# get the FIRST line of the record
local $_ = <RECORDS>;
{
    # put the first line into $rec
    my $rec = $_;

    # get all subsequent lines of the record
    while (<RECORDS>) {
        # stop if we encounter an FPR line
        last if /^FPR/;

        # tack this line onto the $rec variable
        $rec .= $_;
    }

    # add this record to the array
    push @records, $rec;

    # go back to the top of the block
    # NOTE: at this point, $_ is the
    # first line of the NEXT record
    redo;
}
close RECORDS;

Now you have an array, @records, that contains the FPR-marked records in
your file. What you do with that array is up to you.
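
One thing to do with the array is the filtering step from the original question. The following is only a sketch under some assumptions: each record still starts with 'FPR', so characters 7 through 18 are substr($rec, 6, 12); the two records are borrowed from the sample data in this thread; and the %wanted key, the record_key helper, and the selected.txt file name are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for the @records array built above (two sample records).
my @records = (
    "FPR001AET001A20AET81082004052601863063601863063600000000000ING VP Growth and Income M&E 1.40%\n",
    "FPR001AET001A70AETN2062004052601384691901384691900000000000ING VP Growth and Income M&E 1.40%\n",
);

# Hypothetical criterion: the 12-character keys we want to keep.
my %wanted = ('AET001A20AET' => 1);

# Characters 7 through 18 (counting from 1) are offset 6, length 12.
# Records too short to hold the field get an empty key.
sub record_key {
    my ($rec) = @_;
    return length($rec) >= 18 ? substr($rec, 6, 12) : '';
}

my @selected = grep { $wanted{ record_key($_) } } @records;

# Print the selected records unchanged, so the new file keeps the format.
open my $out, '>', 'selected.txt' or die "can't write selected.txt: $!";
print {$out} @selected;
close $out or die "can't close selected.txt: $!";
```

Because the records are written back exactly as they were read, nothing has to be parsed beyond the one substr call.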
 
Gunnar Hjalmarsson

Shalini said:
I am relatively new to Perl and am looking to parse a non-delimited
text file. What I would like to do is, out of this file of records
which always begin with 'FPR' and could span multiple lines,
extract only some relevant records.

The criterion is the number denoted by characters 7 through 18.

Not easy to suggest anything without sample data, but how about
something like this:

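open OLDFILE, $oldfile or die $!;
open NEWFILE, "> $newfile" or die $!;
{
    # read up to and including each 'FPR' marker
    local $/ = 'FPR';

    # copy everything before the first record, plus the first marker
    print NEWFILE scalar <OLDFILE>;

    while (<OLDFILE>) {
        # the leading 'FPR' was consumed by the previous read, so
        # characters 7 through 18 of the record start at offset 3
        my $num = substr($_, 3, 12);
        if ( ... some tests of $num ... ) {
            print NEWFILE $_;
        }
    }
}
close NEWFILE;
close OLDFILE;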
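
To see what setting $/ to 'FPR' does to the reads, here is a small sketch that uses an in-memory filehandle. The data is shortened and made up (a header line, two FPR records, a trailer line); only the record layout follows the description in this thread.

```perl
use strict;
use warnings;

# Made-up data: each FPR record is 'FPR', 3 digits, a 12-character key, text.
my $data = "RHR001HEADER\n"
         . "FPR001" . ('A' x 12) . "rest of record one\n"
         . "FPR001" . ('B' x 12) . "rest of record two\n"
         . "RTR001TRAILER\n";

open my $in, '<', \$data or die "can't open in-memory file: $!";

my @chunks;
{
    local $/ = 'FPR';    # each read ends just after the next 'FPR'
    @chunks = <$in>;
}
close $in;

# $chunks[0] is the header plus the first 'FPR' marker. Every later chunk
# begins right after its own marker, so the 12-character key is at offset 3.
my @keys = map { substr($_, 3, 12) } @chunks[1 .. $#chunks];

print "first chunk: $chunks[0]\n";
print "key: $_\n" for @keys;
```

Note that in this sketch the last chunk also carries the trailer line, because no further 'FPR' follows it; a real script would probably want to split that off before deciding whether to keep the last record.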
 
Shalini Joshi

Sorry for not including all the information in one post. I will keep
this in mind for all my future ones.

Here is some sample data from the data file:

RHR001PRICE REFRESHER2004052620040526*AETNA*V (Header)
FPR001AET001A20AET81082004052601863063601863063600000000000ING VP
Growth and Income M&E 1.40% (Data)
FPR001AET001A70AETN2062004052601384691901384691900000000000ING VP
Growth and Income M&E 1.40% (Data)
RTR001PRICE REFRESHER000000569 (Trailer)


Of course the data doesn't wrap in the file; each record is one single
line (funnily, when I printed it on paper it came out on multiple lines).

I am not sure why the code in Jeff's reply doesn't work. When I run it,
it just sits in an infinite loop, and I have to kill it to get back to my
prompt.

Thanks for all the helpful posts.

Regards

Shalini
 
Shalini Joshi

Hi, this is regarding my earlier post.

I got it working. Apparently because of the redo, it was going into an
infinite loop. Here's what I did, and it works when I print out the
array elements now.

#!/usr/bin/perl
use strict;
use warnings;

my @records;
open RECORDS, "< AETNA17.R3617.txt" or die "Can't read file: $!";

# Get the first line of the file (the RHR header)
local $_ = <RECORDS>;

# Test $_ here rather than reading a new line, so the FPR line that
# ended the inner loop is not thrown away
while (defined $_) {
    # Put the first line into $rec
    my $rec = $_;

    # Get all subsequent lines of the record
    while (<RECORDS>) {
        # Stop if we encounter an FPR line or the RTR trailer
        last if /^(?:FPR|RTR)/;

        # Tack this line onto the $rec variable
        $rec .= $_;
    }

    # Keep only the FPR records (this skips the header and the trailer)
    push @records, $rec if $rec =~ /^FPR/;

    # At this point, $_ is the first line of the NEXT record,
    # or undef once the end of the file has been reached
}
close RECORDS;

foreach my $i (@records) {
    print $i;
}


I am now working on dealing with the array elements.

Thanks a bunch for the help.

Regards,

Shalini
 
