&read_file in File::Slurp terminates unexpectedly on file

  • Thread starter Charles R. Thompson

Charles R. Thompson

I'm working through the conversion of some fixed-length record files with
extended ASCII data, and a series of characters in some of the files
appears to be causing read_file to assume it's at the end of the file. These
won't translate in the various readers so I'll notate. My hex editor says
the last characters where it terminates are:

00 C7 07 CA 1A 29 00

I see everything up to the 1A (Decimal 26), meaning I can see the CA as the
end. According to an ASCII chart I found online 1A is the 'substitute'
character.

Is there a method in Perl I can use to ensure an entire file is read so I
can read every character without incident?

Charles
 
Ben Morrow

Charles R. Thompson said:
I'm working through the conversion of some fixed-length record files with
extended ASCII data, and a series of characters in some of the files
appears to be causing read_file to assume it's at the end of the file. These
won't translate in the various readers so I'll notate. My hex editor says
the last characters where it terminates are:

00 C7 07 CA 1A 29 00

I see everything up to the 1A (Decimal 26), meaning I can see the CA as the
end. According to an ASCII chart I found online 1A is the 'substitute'
character.

Is there a method in Perl I can use to ensure an entire file is read so I
can read every character without incident?

Have you called binmode() on the filehandle concerned?

Ben
 
Jay Tilton

: My hex editor says
: the last characters where it terminates are:
:
: 00 C7 07 CA 1A 29 00
:
: I see everything up to the 1A (Decimal 26), meaning I can see the CA as the
: end. According to an ASCII chart I found online 1A is the 'substitute'
: character.

On DOS-ish filesystems, character 0x1A marks the end-of-file when
reading the file as text.

: Is there a method in Perl I can use to ensure an entire file is read so I
: can read every character without incident?

binmode() the filehandle. This will screw up the normal CRLF
translation, but that's easily remedied.
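A minimal sketch of that advice using only core Perl, in case it helps. The helper name slurp_raw and the demo file name are made up for illustration; the demo writes the byte sequence from the original post and reads it back.

```perl
use strict;
use warnings;

# Read a whole file as raw bytes. binmode() turns off text-mode
# processing, so on Windows a 0x1A byte is no longer treated as
# end-of-file and CRLF pairs are left untouched.
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    local $/;                 # undef record separator: read everything
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Demo with a hypothetical file name: write the bytes from the
# original post, then read them back.
my $demo = 'slurp_raw_demo.dat';
open my $out, '>', $demo or die "Cannot write $demo: $!";
binmode $out;
print {$out} "\x00\xC7\x07\xCA\x1A\x29\x00";
close $out;

my $bytes = slurp_raw($demo);
printf "read %d bytes\n", length $bytes;
unlink $demo;
```

With binmode in place, the 1A byte and everything after it should come back as ordinary data.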
 
Trent Curry

Jay said:
On DOS-ish filesystems, character 0x1A marks the end-of-file when
reading the file as text.

Yes, on my WinXP Pro system, if I insert 0x1A (Ctrl + Z) in the middle
of a file and read it without binmode(), it gets cut off there.

Just FYI, the same is not true in a Unix/Linux-based environment. 0x04
(Ctrl + D) and 0x03 (Ctrl + C) characters inserted into the file do not
prevent reading to the end. It is my understanding that this is a Win32
quirk (at least NT-based; I have no Win9x/ME systems to check with).

Jay said:
binmode() the filehandle. This will screw up the normal CRLF
translation, but that's easily remedied.

It still reads just fine, but if you want the end result to be just \n
(LF) instead of \r\n (CRLF), a simple

$line =~ s!\r\n!\n!g;

for each line oughtta do it.

Or one better:

$line =~ s!\r\n|\r!\n!g;

(Or if you don't mind reading the whole file into memory:)

local $/ = undef;
(my $file = <SOMEFILE>) =~ s!\r\n|\r!\n!g;

Though if you know the file will be large, going line by line is best
suited, and is usually the way to go in most cases.
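Here is a rough, self-contained sketch of the line-by-line approach. It reads from an in-memory filehandle so it can run without a real file; the sample string is invented for illustration and stands in for a file opened with binmode().

```perl
use strict;
use warnings;

# Sample data with mixed line endings (CRLF, bare CR, LF).
my $sample = "one\r\ntwo\rthree\n";

# An in-memory filehandle stands in for a real file here.
open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";
binmode $in;

my @lines;
while (my $line = <$in>) {
    # Note the binding operator =~ (a plain = would assign the
    # substitution count to $line instead of editing it).
    $line =~ s!\r\n|\r!\n!g;
    push @lines, $line;
}
close $in;

print @lines;
```

Because records are split on \n, a bare \r does not end a record, so "two\rthree\n" arrives as one chunk and only becomes two lines after the substitution.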

--
Trent Curry

perl -e
'($s=qq/e29716770256864702379602c6275605/)=~s!([0-9a-f]{2})!pack("h2",$1
)!eg;print(reverse("$s")."\n");'
 
Uri Guttman

BM> Have you called binmode() on the filehandle concerned?

and you can enable binmode when using File::Slurp (a recent
version). the older module couldn't do binmode, nor could it take an
already open handle that had binmode called on it.
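a short sketch of what that might look like, assuming an installed File::Slurp recent enough to accept the binmode option. the file name is hypothetical, and the bytes are the ones from the original post.

```perl
use strict;
use warnings;
use File::Slurp qw(read_file write_file);

my $path = 'slurp_binmode_demo.dat';   # hypothetical file name

# Write the byte sequence from the original post in raw mode.
write_file($path, { binmode => ':raw' }, "\x00\xC7\x07\xCA\x1A\x29\x00");

# binmode => ':raw' asks read_file to treat the file as binary.
my $data = read_file($path, { binmode => ':raw' });

printf "read %d bytes\n", length $data;
unlink $path;
```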

uri
 
Charles R. Thompson

Uri Guttman said:
and you can enable binmode when using File::Slurp (a recent
version). the older module couldn't do binmode nor an already open
handle that had binmode called on it.

I am using an older version, you are correct. I found the binmode answer
earlier after searching more on "1A" and Perl. I have to say that in the
course of that searching I found an alarming number of posts describing my
same problem. I had previously gone to the FAQs first and tried all the
examples under "How can I read in an entire file all at once?" hoping one of
them would provide a clue, but no dice.

Even though this appears to be Windows-specific, I think including a note on
binmode in that particular FAQ section would be very beneficial. Not a cop
out... I fully realize now that searching a bit more with some specifics
would have gotten me my answer, but I also wouldn't have spent my time and
others' here if it were in the FAQs.

Just a thought.

Charles
 
Ben Liddicott

You need to use the binmode function, or the three-argument open with the ':raw' layer (O_BINARY is the corresponding flag for sysopen).
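
A minimal sketch of the three-argument open form, assuming a Perl with I/O layers (5.8 or later). The file name is made up for the demo, which compares the size on disk with the number of bytes actually read.

```perl
use strict;
use warnings;

my $path = 'raw_layer_demo.dat';   # hypothetical file name

# Write 7 bytes, including an embedded 0x1A and a CRLF pair.
open my $out, '>:raw', $path or die "Cannot write $path: $!";
print {$out} "AB\x1ACD\r\n";
close $out;

# '<:raw' applies binary mode at open time, so no separate
# binmode() call is needed.
open my $in, '<:raw', $path or die "Cannot open $path: $!";
my $data = do { local $/; <$in> };
close $in;

my $size_on_disk = -s $path;
printf "size on disk: %d, bytes read: %d\n", $size_on_disk, length $data;
unlink $path;
```

If the two numbers differ, something is still reading the file in text mode.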