Read on closed filehandle

Russ

I have a very simple script to read an input file, skip everything
between the strings "BEGIN" and "END", and store the results in an
output file.

----------------------------

3  open(IN,"input.txt");
4  open(OUT,">output.txt");
5
6  while(<IN>) {
7      chomp;
8      if (/BEGIN/) {
9          until (/END/) {
10             $_ = <IN>;
11         }
12     } else {
13         print OUT "$_\n";
14     }
15 }
16
17 close IN;
18 close OUT;

----------------------------------

When I run this on a small file, about 28 kb, it works fine. When I
try it on a large file, about 3 gigs, I get the error:

"Read on closed filehandle <IN> at trim.pl line 6."

The file structure is essentially the same; the only difference is the
size. I bump into limits in many text editors when I try to access the
large file, and I assume that this is a similar limitation in Perl. Is
there any way around this?

I also wouldn't mind some constructive criticism on the script itself.
It seems like there should be a more elegant way to accomplish this
task which might avoid this problem. Any suggestions?

Thanks,
Russ
 
Mark Clements

Russ said:
I have a very simple script to read an input file, skip everything
between the strings "BEGIN" and "END", and store the results in an
output file.

----------------------------

3  open(IN,"input.txt");
4  open(OUT,">output.txt");
5
6  while(<IN>) {
7      chomp;
8      if (/BEGIN/) {
9          until (/END/) {
10             $_ = <IN>;
11         }
12     } else {
13         print OUT "$_\n";
14     }
15 }
16
17 close IN;
18 close OUT;

----------------------------------

When I run this on a small file, about 28 kb, it works fine. When I
try it on a large file, about 3 gigs, I get the error:

"Read on closed filehandle <IN> at trim.pl line 6."

It's best if you post code without the line numbers: it makes it easier
for people to run it. You also need to run with

use strict;
use warnings;

and check the return value of your system calls (you need to do this
*always*), eg

open (my $infh,"<","input.txt") or die $!;


That should give you some pointers. perldoc -q open

Also check what

perl -V

reports for uselargefiles.

Mark
 
jgraber

Russ said:
I have a very simple script to read an input file, skip everything
between the strings "BEGIN" and "END", and store the results in an
output file.
I also wouldn't mind some constructive criticism on the script itself.
It seems like there should be a more elegant way to accomplish this
task which might avoid this problem. Any suggestions?

Size of input file is not the problem.
What did you want to happen if there is a BEGIN but no END?

# stop printing when see BEGIN, restart when see END
perl -n -e 'print unless /^BEGIN/ .. /^END/' < infile > outfile
 
Russ

use strict;
use warnings;

and check the return value of your system calls (you need to do this
*always*), eg

open (my $infh,"<","input.txt") or die $!;

That should give you some pointers. perldoc -q open

Also check what

perl -V

reports for uselargefiles.

I made the changes you suggested. The program dies at the open()
statement with error:

"Value too large for defined data type at trim.pl line 5."

Also, 'perl -V' does not reference uselargefiles. Is that something
that is set when perl is initially compiled on the system? Does perl
need to be recompiled to set this value?

Thanks,
Russ
 
jgraber

Russ said:
I have a very simple script to read an input file, skip everything
between the strings "BEGIN" and "END", and store the results in an
output file.

3  open(IN,"input.txt");
4  open(OUT,">output.txt");
5
6  while(<IN>) {
7      chomp;
8      if (/BEGIN/) {
9          until (/END/) {
10             $_ = <IN>;
11         }
12     } else {
13         print OUT "$_\n";
14     }
15 }
16
17 close IN;
18 close OUT;
I also wouldn't mind some constructive criticism on the script
itself.

additional comments to my previous reply
Be sure to use strict; and use warnings;
No use in chomping if you intend to add \n afterwards anyway.
What about anchoring BEGIN and END to the front or end of the line, or
to the whole line?
What about lines like "In every BEGINNING there is also an ENDING"?
Check for errors when opening and closing files, since printing to a
full disk won't be noticed until the close.
close OUT || die "can't close '$outfile' because of error: $!\n";

Add the following lines to the end of your program, run it as its own
input file, and then notice that it never exits.
# BEGIN
# eND

What did you want to do in the above case;
omit from BEGIN to eof,
or keep it?
 
Russ

Please check the return value of open.

No need to chomp if you're going to append "\n" before printing.

What happens if your file matches /BEGIN/ but never matches /END/?
You keep reading, even after you reach the end of the file.
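A guarded read avoids that runaway loop: the readline operator returns undef at end of file, so testing defined() ends the inner loop cleanly. A minimal sketch (the sub name `trim`, the lexical filehandles, and the in-memory test data are illustrative, not from the original script):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy $in to $out, dropping everything from a BEGIN line through the
# next END line (inclusive).  The defined() test stops the inner loop
# at EOF, so a BEGIN with no matching END cannot spin forever.
sub trim {
    my ($in, $out) = @_;
    while (<$in>) {
        if (/BEGIN/) {
            while (defined($_ = <$in>)) {   # undef at EOF ends the loop
                last if /END/;
            }
        } else {
            print {$out} $_;                # no chomp: keep "\n" as-is
        }
    }
}

# exercise it on an in-memory handle: BEGIN with no END
my $data = "line1\nBEGIN\nline2\n";
open(my $in,  '<', \$data)    or die "open: $!";
open(my $out, '>', \my $kept) or die "open: $!";
trim($in, $out);
print $kept;    # only "line1" survives
```

In-memory filehandles (open on a scalar reference) make this easy to exercise without touching the disk; on the real files, the three-argument open with an `or die` check, as suggested earlier in the thread, applies unchanged.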

More concise:

while (<IN>) {
    next if /BEGIN/ .. /END/;
    print OUT $_;
}

Steven,

Thanks for the suggestions. It works fine, but I'm not familiar with
the use of '..' in the next statement. I have used the 'next if'
construct with single regular expressions; is the above code just
representing the entire group of characters between the "BEGIN" and
"END" strings?

If that's the case, and I don't have an "END", will the code print
everything between "BEGIN" and the end of the file?

Thanks,
Russ D.
 
Tad McClellan

Russ said:
read an input file, skip everything
between the strings "BEGIN" and "END",
            ^^^^^^^


It seems like there should
be a more elegant way to accomplish this task which might avoid this
problem.


Your Question is Asked Frequently.

Any suggestions?


perldoc -q between

How can I pull out lines between two patterns that are themselves on
different lines?
 
Dan Mercer

: I have a very simple script to read an input file, skip everything
: between the strings "BEGIN" and "END", and store the results in an
: output file.
:
: ----------------------------
:
: 3  open(IN,"input.txt");
: 4  open(OUT,">output.txt");
: 5
: 6  while(<IN>) {
: 7      chomp;
: 8      if (/BEGIN/) {
: 9          until (/END/) {
: 10             $_ = <IN>;

You aren't checking to see if <IN> hits EOF. Undoubtedly, the last
line in the file contains the word "END".

Dan Mercer


: 11         }
: 12     } else {
: 13         print OUT "$_\n";
: 14     }
: 15 }
: 16
: 17 close IN;
: 18 close OUT;
:
: ----------------------------------
:
: When I run this on a small file, about 28 kb, it works fine. When I
: try it on a large file, about 3 gigs , I get the error:
:
: "Read on closed filehandle <IN> at trim.pl line 6."
:
: The file structure is essentially the same; the only difference is
: the size. I bump into limits in many text editors when I try to
: access the large file, and I assume that this is a similar
: limitation in Perl. Is there any way around this?
:
: I also wouldn't mind some constructive criticism on the script
: itself. It seems like there should
: be a more elegant way to accomplish this task which might avoid this
: problem.
: Any suggestions?
:
: Thanks,
: Russ
:
 
Peter J. Holzer

Also, 'perl -V' does not reference uselargefiles. Is that something
that is set when perl is initially compiled on the system? Does perl
need to be recompiled to set this value?

Yes.
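One way to check without rebuilding is to query the Config module for the flag perl was compiled with; a minimal sketch (the `uselargefiles` key is the same one `perl -V:uselargefiles` reports on builds that define it):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;    # exposes perl's compile-time options as %Config

# 'define' means large-file support was compiled in; an empty or
# missing value means files past 2 GB will fail on 32-bit builds.
my $flag = $Config{uselargefiles};
print "uselargefiles=", defined $flag ? $flag : 'undef', "\n";
```

On a build without large-file support the only fix is recompiling, typically via Configure's -Duselargefiles (the default in modern perls).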

hp
 
Joe Smith

Size of input file is not the problem.

No, the size of the input file _is_ the problem.

On a version of perl compiled on a 32-bit machine without
"uselargefiles=define", things like a segfault or a "read on closed
filehandle" error will occur after reading 2,147,483,647 bytes (the
largest signed 32-bit offset) from the input file.

-Joe
 
Peter J. Holzer

No, the size of the input file _is_ the problem.
Yes.


On a version of perl compiled on a 32-bit machine without "uselargefiles=define",
things like segfault or "read on closed filehandle" will occur after reading
2,147,483,647 bytes from the input file.

Not on the systems I know. You can't even open the file, and the "read
on closed filehandle" message occurs on the first read (not after 2GB)
and happens because the open failed and the OP failed to check for that.

hp
 
Joe Smith

Peter said:
Not on the systems I know. You can't even open the file, and the "read
on closed filehandle" message occurs on the first read (not after 2GB)
and happens because the open failed and the OP failed to check for that.

Other ways of getting to 2GB are reading from a pipe or socket, or
reading from redirected STDIN. I've run into a case where the shell
was able to open a 5GB file and set up redirection, but the program
barfed after 2GB.

-Joe
 
jgraber

Jim Gibson said:
It is the range operator in scalar context, which acts like a
flip-flop. See 'perldoc perlop' and search for 'Range Operators'. The
operator will be false until /BEGIN/ is true, true until /END/ is true,
then false again until the next /BEGIN/, etc.

OK so far.
If you do not have an 'END',
the program will print from the final BEGIN to the end of the file.

Not true, since $_ is read one line at a time, and is lost if not printed.
This case is demonstrated below.

% cat omit_begin_end.pl
#!/usr/local/bin/perl
use strict; use warnings;
while (<DATA>) {
    next if (/^BEGIN/ .. /^END/);
    print;
} # wend
__DATA__
line1
BEGIN
line2
#END
line3
% perl omit_begin_end.pl
line1

If #END is changed to END,
then output also includes line3.

If it were important to retain lines between BEGIN and eof when there
is no END, then the pending lines need to be stored until it is known
whether they are needed, as shown in this tested example.

#!/usr/local/bin/perl
use strict; use warnings;
my @buffer;
while (<DATA>) {
    if (/^BEGIN/ .. /^END/) {
        push @buffer, $_;
        next;
    }
    @buffer = ();    # clear buffer
    print;
} # wend
print @buffer;
__DATA__
line1
BEGIN
line2
#END
line3

Or if you don't mind slurping the whole file, this tested partial
example prints from the final BEGIN to the end of the file if there is
no matching END. This does not use the range operator, though.

{
    local $/ = undef;         # undef the line-break
    $_ = <DATA>;              # slurp the whole thing
    s/^BEGIN.*^END\n//gms;    # s/// on the whole thing
    print;                    # print the whole thing
}
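One caveat with the slurp-and-substitute approach: because `.*` is greedy, a file containing several BEGIN..END blocks loses everything from the first BEGIN through the last END, including the text between blocks. A non-greedy `.*?` handles that case; a sketch (the sub name `strip_blocks` is made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove each BEGIN..END block separately; the non-greedy .*? stops
# at the *nearest* END line, so lines between two blocks are kept.
sub strip_blocks {
    my ($text) = @_;
    $text =~ s/^BEGIN.*?^END\n//gms;
    return $text;
}

print strip_blocks("keep1\nBEGIN\ndrop\nEND\nkeep2\n");   # prints keep1, keep2
```

As with the original, an unmatched BEGIN (no END before eof) is left in place, since the substitution requires both anchors to match.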
 
