donaldjones
I have a large text file with many records that I'd like to parse and
extract data from. I'm trying to figure out, conceptually, how to pull
out what I need and put it in a CSV file. Here is a snippet of what 2
records look like. The first line of each record always contains
"TEST1:" and "NN:":
-------------------snip----------------------
TEST1: DTP:07/17/06 SSZ4 NN:007-74 REC:01 UNZZ PG: 001+
CCTL FUN:007-74 CFL:L0E MT:09/11/05-R FSS:L0E MN:L00
PT:2005 2004
DOE, JOHN PZY:C01 TMRV,DI-03/02 UII ZZL:07/10/06 SEQU:2
RRRU NONE
ZMTH ZMTH 1 2 3 5 DDAA YXA XXA XXS
ZMTH 11/02/02 2 0102 156 156 11
ZMTH 11/02/02 2 0202 156 156 11
ZMTH 11/02/02 2 0302 156 156 22
ZMTH 11/02/02 2 0402 96 96 11
TEST1: DTP:07/17/06 SSZ4 NN:745-88 REC:01 UNZZ PG: 001+
CCTL FUN:745-88 CFL:L0E MT:09/11/05-R FSS:L0E MN:L00
PT:2005 2004
DOE, JOHN PZY:C01 TMRV,DI-03/02 UII ZZL:07/10/06 SEQU:2
RRRU NONE
ZMTH ZMTH 1 2 3 5 DDAA YXA XXA XXS
ZMTH 11/02/02 2 0102 156 156 11
ZMTH 11/02/02 2 0202 156 156 11
ZMTH 11/02/02 2 0302 156 156 22
ZMTH 11/02/02 2 0402 96 96 11
-------------------snip----------------------
Here is an example of what I'd like to end up with in the CSV file. The
first line is a header naming each piece of data I'm trying to grab:
NN,CFL,Name,UI
007-74,L0E,JOHN DOE,DI
So as you can see, I want to be able to pull out the value after each
##: (for particular ##:'s, or all of them if that's easy), where ##
stands for the short label, followed by a colon, that identifies each
piece of data in the text file snippet above. Note also that the name
has no label or delimiter, but it always comes at the beginning of the
4th line of each record.
I'm not looking for someone to write the program for me, but I am
looking for some ideas on how to go about grabbing this data and
putting it into another file. I use Perl mainly as a system
administrator to get tasks done as needed, and I can figure out where to
go if pointed in the right direction. I'm used to parsing and splitting
lines that all have the same predictable fields, but not records whose
values are spread across multiple lines.
I've googled for this a few times, but I haven't found quite what I'm
looking for.
Any ideas? Any help would be greatly appreciated.
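To make the question concrete, here is a rough Perl sketch of one way the extraction described above might be structured. It is only a sketch, and it rests on assumptions taken from the two sample records: a record starts at every line beginning with "TEST1:", each label is a run of uppercase letters or digits with its value immediately after the colon (so "PG: 001+" is skipped), and the UI value is the code between the comma and the dash just before "UII" (a guess based on the single example row). The sub names parse_records and to_csv are made up for illustration, and the output is not quoted or escaped; a CPAN module such as Text::CSV would be safer for real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read records from a filehandle and return one hash per record.
# Assumption: every record starts at a line beginning with "TEST1:".
sub parse_records {
    my ($fh) = @_;
    my (@records, $rec, $line_no);

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^TEST1:/) {
            $rec = {};               # start a new record
            push @records, $rec;
            $line_no = 0;
        }
        next unless $rec;            # ignore anything before the first record
        $line_no++;

        # Collect every LABEL:value pair; the value must follow the colon directly.
        while ($line =~ /\b([A-Z0-9]+):(\S+)/g) {
            $rec->{$1} = $2 unless exists $rec->{$1};   # keep the first value seen
        }

        if ($line_no == 4) {
            # The name is everything before the first LABEL:value token.
            if ($line =~ /^\s*(.+?)\s+[A-Z0-9]+:\S/) {
                my $raw = $1;
                my ($last, $first) = split /\s*,\s*/, $raw, 2;
                $rec->{Name} = defined $first ? "$first $last" : $last;
            }
            # Guess: UI is the code between the comma and the dash before "UII".
            $rec->{UI} = $1 if $line =~ /,(\w+)-\S*\s+UII\b/;
        }
    }
    return @records;
}

# Build CSV lines (header first) for the chosen fields. No quoting is done.
sub to_csv {
    my ($fields, @records) = @_;
    my @out = (join ',', @$fields);
    for my $rec (@records) {
        push @out, join ',', map { defined $rec->{$_} ? $rec->{$_} : '' } @$fields;
    }
    return @out;
}

# Runs only when a file name is given: perl extract.pl input.txt > output.csv
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print "$_\n" for to_csv([qw(NN CFL Name UI)], parse_records($fh));
    close $fh;
}
```

Because every LABEL:value pair ends up in the record hash, changing the list passed to to_csv (for example adding FUN or ZZL) is enough to pull out other fields. The line counter is what handles values that sit on a specific line of the record, such as the unlabeled name on line 4.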