Reading from fixed-length text file

nun

I need to read lines from an ASCII flat file in the following format:

1 to 4 - code (length= 4)
5 to 24 - number (length=20)
25 to 54 - description (length=30)
55 to 62 - p1 (length= 8)
71 to 78 - p2 (length= 8)
104 to 123 - New number (length=20)
124 to 124 - flag (length= 1)


Here's an example line of the file which will no doubt wrap in this post:

PQ AMERICAN SERIES CATFISH 0.000
0.000 L11115 2


Now in another script, I was reading in comma-separated values from a
file like this:

#################################
# reading data in from file
my (@AoA);
while ( <> ) {
    chomp;
    push @AoA, [ split /,/ ];
}
#################################

and I want to do the same thing with this fixed-length data. My reading
online suggests that I could accomplish this using unpack, or
Text::FixedLength but I'm not sure which is best. Can anyone provide
guidance or an example?

DB
 
Jürgen Exner

nun said:
I need to read lines from an ASCII flat file in the following format:

1 to 4 - code (length= 4)
5 to 24 - number (length=20)
25 to 54 - description (length=30) [...]

My reading online suggests that I could accomplish this using unpack, or
Text::FixedLength

A third option is substr(), which may actually be better than unpack()
because, AFAIK, unpack() works on a sequence of bytes while substr() works
on a sequence of characters.
Depending on whether, e.g., the 'description' is 30 characters long or
30 bytes long, you may get surprising results otherwise.

jue
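
For readers following along, the substr() approach could be sketched roughly as below. The field names, the @fields table, the parse_line helper and the sample record are all made up for illustration; only the column positions come from the question. Note that the question counts columns from 1 while substr() offsets start at 0.

```perl
use strict;
use warnings;

# Field layout from the question: [name, first column (1-based), length].
# The names are illustrative assumptions.
my @fields = (
    [ code        =>   1,  4 ],
    [ number      =>   5, 20 ],
    [ description =>  25, 30 ],
    [ p1          =>  55,  8 ],
    [ p2          =>  71,  8 ],
    [ new_number  => 104, 20 ],
    [ flag        => 124,  1 ],
);

# Split one record into a list of field values.
# substr() offsets are 0-based, so subtract 1 from each column number.
sub parse_line {
    my ($line) = @_;
    return map { substr($line, $_->[1] - 1, $_->[2]) } @fields;
}

# Made-up sample record, built to be exactly 124 characters wide
# (columns 63-70 and 79-103 are unused filler).
my $line = sprintf '%-4s%-20s%-30s%8s%8s%8s%25s%-20s%1s',
    'PQ', 'L11115', 'AMERICAN SERIES CATFISH', '0.000', '', '0.000', '',
    'L11115', '2';

my @record = parse_line($line);
print join('|', @record), "\n";
```

Keeping the layout in one table means a changed column only has to be edited in one place.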
 
John W. Krahn

nun said:
I need to read lines from an ASCII flat file in the following format:

1 to 4 - code (length= 4)
5 to 24 - number (length=20)
25 to 54 - description (length=30)
55 to 62 - p1 (length= 8)
71 to 78 - p2 (length= 8)
104 to 123 - New number (length=20)
124 to 124 - flag (length= 1)

[...]
My reading online suggests that I could accomplish this using unpack, or
Text::FixedLength but I'm not sure which is best. Can anyone provide
guidance or an example?

Using unpack it could be as simple as this (the x8 and x25 skip the
unused columns 63 to 70 and 79 to 103):

push @AoA, [ unpack 'A4 A20 A30 A8 x8 A8 x25 A20 A', $_ ];




John
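
A slightly fuller sketch of the unpack route, for anyone trying it: the layout in the question leaves columns 63 to 70 and 79 to 103 unused, so the template below skips them with x8 and x25. It also pads each record to 124 characters first, on the assumption that some records may be shorter than the full width. The parse_line helper and the sample records are made up for illustration.

```perl
use strict;
use warnings;

# A4 A20 A30 A8 cover columns 1-62, x8 skips 63-70, A8 covers 71-78,
# x25 skips 79-103, A20 covers 104-123 and the final A is column 124.
my $template = 'A4 A20 A30 A8 x8 A8 x25 A20 A';

sub parse_line {
    my ($line) = @_;
    # Pad short records to the full width: unpack dies if an 'x' skip
    # runs past the end of the string.
    my $padded = sprintf '%-124s', $line;
    return unpack $template, $padded;
}

# Made-up records: one full width, one that stops at column 78.
my $full = sprintf '%-4s%-20s%-30s%8s%8s%8s%25s%-20s%1s',
    'PQ', 'L11115', 'AMERICAN SERIES CATFISH', '0.000', '', '0.000', '',
    'L11115', '2';
my $short = substr $full, 0, 78;

my @AoA = map { [ parse_line($_) ] } $full, $short;
print join('|', @$_), "\n" for @AoA;
```

The A format strips trailing spaces from each field but keeps leading ones, so the right-aligned numeric columns may still need trimming.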
 
DB

Ok I've made some progress, I think I just need some syntax help. I
think I need to start this thread fresh though to avoid confusion.

I need to read lines from an ASCII flat file in the following format:

0 to 3 - code (length= 4)
4 to 23 - number (length=20)
24 to 53 - description (length=30)
54 to 61 - p1 (length= 8)
70 to 77 - p2 (length= 8)
103 to 122 - New number (length=20)
123 to 123 - flag (length= 1)


Some of the lines however do not have the last two fields and end at
position 77. Here's an example line of the file which will no doubt wrap
in this post:

PQ AMERICAN SERIES CATFISH 0.000
0.000 L11115 2

Now in another script, I was reading in comma-separated values from a
file like this:

#################################
# reading data in from file
my (@AoA);
while ( <> ) {
    chomp;
    push @AoA, [ split /,/ ];
}
#################################

and I want to do the same thing with this fixed-length data. Jurgen was
correct that unpack() was not a good solution.

Here is what I'm trying, which fails. I'm guessing it is a stupid syntax
problem. Can someone assist?

#################################
# reading data in from file
my (@AoA);
while ( <> ) {
    chomp;
    my $line_length=length($_);

    if ($line_length=124) {
        push @AoA, [(
            substr($_, 0, 4),
            substr($_, 4, 20),
            substr($_, 24, 30),
            substr($_, 54, 8),
            substr($_, 70, 8),
            substr($_, 103, 20),
            substr($_, 123, 1),
        )];
    }

    if ($line_length=78) {
        push @AoA, [(
            substr($_, 0, 4),
            substr($_, 4, 20),
            substr($_, 24, 30),
            substr($_, 54, 8),
            substr($_, 70, 8),
            "                    ", # add twenty spaces
            " ", # add one space
        )];
    }

}
#############################
 
J. Gleixner

DB said:
Here is what I'm trying, which fails. I'm guessing it is a stupid syntax
problem. Can someone assist?

First, you should state what, exactly, fails.

DB said:
my (@AoA);

The parentheses are not needed:

my @AoA;

DB said:
if ($line_length=124) {

Use == here. '=' does the assignment, so the test is always true. The
same goes for the second test:

if ($line_length=78) {

You could also ignore the length and, if there's data there,
use it, otherwise have a default value.

...
substr($_, 70, 8),
substr($_, 103, 20) || ' ' x 20,
substr($_, 123, 1) || ' ',
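
One caveat with the || default, sketched below: || tests for truth, so a field holding the single character '0' would also be replaced by the default. Checking the line length before calling substr() avoids that. The field_or_blank helper and the sample data are hypothetical, just to show the idea.

```perl
use strict;
use warnings;

# Hypothetical helper: return the field, or $len spaces when the line
# ends at or before $offset. Testing length() first means substr() is
# never asked to start past the end of the string, and a field that
# contains '0' is kept as it is.
sub field_or_blank {
    my ($line, $offset, $len) = @_;
    return length($line) > $offset
        ? substr($line, $offset, $len)
        : ' ' x $len;
}

# Made-up 124-character record whose flag column holds '0'.
my $line = ('x' x 123) . '0';

my $with_or = substr($line, 123, 1) || ' ';      # '0' is false, so this is ' '
my $checked = field_or_blank($line, 123, 1);     # keeps the '0'
my $missing = field_or_blank('short', 103, 20);  # twenty spaces

print "with ||: [$with_or] checked: [$checked] missing: [$missing]\n";
```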
 
nun

J. Gleixner said:
First, you should state what, exactly, fails.

my @AoA;

use ==

'=' does the assignment.

Yes, right after I posted I noticed that and now the script works.

[...]
You could also ignore the length and if there's data there,
use it, otherwise have a default value.

...
substr($_, 70, 8),
substr($_, 103, 20) || ' ' x 20,
substr($_, 123, 1) || ' ',


I tried that but it now gives me a warning such as this for each line in
the input data:

substr outside of string at HM_parse.pl line 47, <> line 17763.

But that's ok, I'm happy to have a working script even if it is not as
elegant as can be. Thanks!

DB
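
On the warning: substr() returns undef and warns "substr outside of string" when the start offset lies past the end of the line, which is what happens for the records that stop at position 77. One way around it, sketched here under the assumption that missing trailing fields should simply be blank, is to pad every line to the full 124 characters before slicing it. The sample @lines stand in for the real input, which would come from while ( <> ) and chomp as in the original script.

```perl
use strict;
use warnings;

my @AoA;

# Made-up input: one full-width record and one that stops at position 77.
my @lines = ( 'A' x 124, 'B' x 78 );

for my $line (@lines) {
    # Pad to 124 characters so every substr() starts inside the string.
    my $padded = sprintf '%-124s', $line;

    push @AoA, [
        substr($padded,   0,  4),
        substr($padded,   4, 20),
        substr($padded,  24, 30),
        substr($padded,  54,  8),
        substr($padded,  70,  8),
        substr($padded, 103, 20),   # twenty spaces for a short record
        substr($padded, 123,  1),   # one space for a short record
    ];
}

print scalar(@AoA), " records read\n";
```

With the padding in place there is a single code path, so neither the length test nor the || defaults are needed.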
 
