Read on closed filehandle

Discussion in 'Perl Misc' started by Russ, Feb 1, 2007.

  1. Russ

    Russ Guest

    I have a very simple script to read an input file, skip everything
    between the strings "BEGIN" and "END", and store the results in an
    output file.

    ----------------------------

    3 open(IN,"input.txt");
    4 open(OUT,">output.txt");
    5
    6 while(<IN>) {
    7 chomp;
    8 if (/BEGIN/) {
    9 until (/END/) {
    10 $_ = <IN>;
    11 }
    12 } else {
    13 print OUT "$_\n";
    14 }
    15 }
    16
    17 close IN;
    18 close OUT;

    ----------------------------------

    When I run this on a small file, about 28 kb, it works fine. When I
    try it on a large file, about 3 gigs, I get the error:

    "Read on closed filehandle <IN> at trim.pl line 6."

    The file structure is essentially the same; the only difference is the
    size. I bump into limits in many text editors when I try to access the
    large file, and I assume that this is a similar limitation in Perl. Is
    there any way around this?

    I also wouldn't mind some constructive criticism on the script
    itself. It seems like there should be a more elegant way to accomplish
    this task which might avoid this problem. Any suggestions?

    Thanks,
    Russ
     
    Russ, Feb 1, 2007
    #1

  2. Russ wrote:
    > I have a very simple script to read an input file, skip everything
    > between the strings "BEGIN" and "END", and store the results in an
    > output
    > file.
    >
    > ----------------------------
    >
    > 3 open(IN,"input.txt");
    > 4 open(OUT,">output.txt");
    > 5
    > 6 while(<IN>) {
    > 7 chomp;
    > 8 if (/BEGIN/) {
    > 9 until (/END/) {
    > 10 $_ = <IN>;
    > 11 }
    > 12 } else {
    > 13 print OUT "$_\n";
    > 14 }
    > 15 }
    > 16
    > 17 close IN;
    > 18 close OUT;
    >
    > ----------------------------------
    >
    > When I run this on a small file, about 28 kb, it works fine. When I
    > try it on a large file, about 3 gigs , I get the error:
    >
    > "Read on closed filehandle <IN> at trim.pl line 6."


    It's best if you post code without the line numbers: makes it easier for
    people to run it. You also need to run with

    use strict;
    use warnings;

    and check the return value of your system calls (you need to do this
    *always*), eg

    open (my $infh,"<","input.txt") or die $!;


    That should give you some pointers. perldoc -q open
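
    A minimal sketch of the checked open described above. The helper name
    open_input() and the file name no-such-file.txt are assumptions made for
    illustration and are not part of the original script. Putting the path
    and $! in the message means a failed open reports its cause at once
    instead of surfacing later as a read error.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: open a file for reading or die with the reason.
    sub open_input {
        my ($path) = @_;
        open(my $fh, '<', $path)
            or die "Cannot open '$path' for reading: $!\n";
        return $fh;
    }

    # Exercise the failure path inside eval so the sketch keeps running.
    my $fh = eval { open_input('no-such-file.txt') };
    print "open failed: $@" unless $fh;
    ```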

    Also check what

    perl -V

    reports for uselargefiles.
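
    The same build setting can also be read from inside a script through the
    core Config module, or from the shell with perl -V:uselargefiles. This is
    only a sketch: what the values look like on a perl built without large
    file support may differ between platforms.

    ```perl
    use strict;
    use warnings;
    use Config;    # core module exposing the build-time configuration

    # 'define' here means perl was built with large file support.
    my $large = $Config{uselargefiles};
    print "uselargefiles: ", (defined $large ? $large : 'undef'), "\n";

    # lseeksize is the size in bytes of a file offset; 8 means 64-bit offsets.
    my $seek = $Config{lseeksize};
    print "lseeksize: ", (defined $seek ? $seek : 'undef'), "\n";
    ```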

    Mark
     
    Mark Clements, Feb 1, 2007
    #2

  3. Russ

    Guest

    "Russ" <> writes:
    > I have a very simple script to read an input file, skip everything
    > between the strings "BEGIN" and "END", and store the results in an
    > output file.


    > I also wouldn't mind some constructive criticism on the script
    > itself. It seems like there should
    > be a more elegant way to accomplish this task which might avoid this
    > problem.
    > Any suggestions?


    Size of input file is not the problem.
    What did you want to happen if there is a BEGIN but no END?

    # stop printing when see BEGIN, restart when see END
    perl -n -e 'print unless /^BEGIN/ .. /^END/' < infile > outfile
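
    For comparison, a sketch of the same filter written as a script. The
    sample data and the in-memory filehandles are assumptions so that it runs
    without input.txt; with real files, the two open calls would name the
    input and output paths instead.

    ```perl
    use strict;
    use warnings;

    # Sample input held in a string so the sketch needs no files.
    my $input  = join '', map { "$_\n" } qw(keep1 BEGIN drop1 END keep2);
    my $output = '';

    open(my $in,  '<', \$input)  or die "Cannot read sample input: $!\n";
    open(my $out, '>', \$output) or die "Cannot write sample output: $!\n";

    while (my $line = <$in>) {
        # The flip-flop turns on at a line starting with BEGIN and stays on
        # through the next line starting with END, so both markers are skipped.
        next if $line =~ /^BEGIN/ .. $line =~ /^END/;
        print {$out} $line;
    }

    close($in);
    close($out) or die "Cannot finish writing: $!\n";
    print $output;    # keep1 and keep2, one per line
    ```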

    --
    Joel
     
    , Feb 1, 2007
    #3
  4. Russ

    Russ Guest

    On Feb 1, 5:40 pm, Mark Clements <>
    wrote:

    > use strict;
    > use warnings;
    >
    > and check the return value of your system calls (you need to do this
    > *always*), eg
    >
    > open (my $infh,"<","input.txt") or die $!;
    >
    > That should give you some pointers. perldoc -q open
    >
    > Also check what
    >
    > perl -V
    >
    > reports for uselargefiles.


    I made the changes you suggested. The program dies at the open()
    statement with error:

    "Value too large for defined data type at trim.pl line 5."

    Also, 'perl -V' does not reference uselargefiles. Is that something
    that is set when perl is initially compiled on the system? Does perl
    need to be recompiled to set this value?

    Thanks,
    Russ
     
    Russ, Feb 1, 2007
    #4
  5. Russ

    Guest

    "Russ" <> writes:

    > I have a very simple script to read an input file, skip everything
    > between the strings "BEGIN" and "END", and store the results in an
    > output file.
    >
    > 3 open(IN,"input.txt");
    > 4 open(OUT,">output.txt");
    > 5
    > 6 while(<IN>) {
    > 7 chomp;
    > 8 if (/BEGIN/) {
    > 9 until (/END/) {
    > 10 $_ = <IN>;
    > 11 }
    > 12 } else {
    > 13 print OUT "$_\n";
    > 14 }
    > 15 }
    > 16
    > 17 close IN;
    > 18 close OUT;


    > I also wouldn't mind some constructive criticism on the script
    > itself.


    Additional comments to my previous reply:
    Be sure to use strict; and use warnings;
    No use in chomping if you intend to add \n afterwards anyway.
    What about anchoring BEGIN and END to front or end or whole line?
    What about lines like "In every BEGINNING there is also an ENDING"?
    Check for errors when opening and closing files,
    since printing to a full disk won't be noticed until the close.
    close OUT || die "can't close '$outfile' because of error: $!\n";
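
    A sketch of checking both the write and the close, using a temporary file
    from the core File::Temp module as a stand-in for output.txt (an
    assumption for illustration). It uses the low-precedence 'or' with
    parentheses, a habit that also stays safe with list operators such as
    open, where '||' would bind to the last argument instead of the call.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Temporary stand-in for the real output file; removed at program exit.
    my ($out, $outfile) = tempfile(UNLINK => 1);

    print {$out} "some line\n"
        or die "Cannot write to '$outfile': $!\n";

    # Buffered data is flushed here, so a full disk may only show up now.
    close($out)
        or die "Cannot close '$outfile': $!\n";

    print "wrote ", (-s $outfile), " bytes to $outfile\n";
    ```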

    Add the following lines to the end of your program
    and run it as its own input file and then notice that it never exits.
    # BEGIN
    # eND

    What did you want to do in the above case;
    omit from BEGIN to eof,
    or keep it?
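
    One way to keep the original nested-loop structure without looping
    forever is to stop the inner loop when the read returns undef at end of
    file. This is a sketch under assumptions: the sample data is made up, and
    it chooses to drop everything from an unclosed BEGIN to the end of the
    file, which is only one of the two possible answers to the question
    above.

    ```perl
    use strict;
    use warnings;

    # Sample input with a BEGIN that is never closed by an END.
    my $input = join '', map { "$_\n" } qw(keep1 BEGIN drop1 drop2);
    open(my $in, '<', \$input) or die "Cannot read sample input: $!\n";

    my @kept;
    while (defined(my $line = <$in>)) {
        if ($line =~ /BEGIN/) {
            # Skip forward to END, but give up when readline returns undef.
            while (defined($line = <$in>)) {
                last if $line =~ /END/;
            }
            next;
        }
        push @kept, $line;
    }
    close($in);

    print @kept;    # only keep1: BEGIN through end of file is dropped
    ```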

    --
    Joel
     
    , Feb 1, 2007
    #5
  6. Russ

    Russ Guest

    On Feb 1, 6:10 pm, ""
    <> wrote:
    > On Feb 1, 2:34 pm, "Russ" <> wrote:
    >
    > > I have a very simple script to read an input file, skip everything
    > > between the strings "BEGIN" and "END", and store the results in an
    > > output
    > > file.

    >
    > > ----------------------------

    >
    > > 3 open(IN,"input.txt");

    >
    > Please check the return value of open.
    >
    > > 4 open(OUT,">output.txt");
    > > 5
    > > 6 while(<IN>) {
    > > 7 chomp;

    >
    > No need to chomp if you're going to append
    > "\n" before printing
    >
    > > 8 if (/BEGIN/) {
    > > 9 until (/END/) {
    > > 10 $_ = <IN>;
    > > 11 }

    >
    > What happens if your file
    > matches /BEGIN/ but never
    > matches /END/?
    > You keep reading, even
    > after you reach the end
    > of the file.
    >
    >
    >
    > > 12 } else {
    > > 13 print OUT "$_\n";
    > > 14 }
    > > 15 }
    > > 16
    > > 17 close IN;
    > > 18 close OUT;

    >
    > > ----------------------------------

    >
    > > When I run this on a small file, about 28 kb, it works fine. When I
    > > try it on a large file, about 3 gigs , I get the error:

    >
    > > "Read on closed filehandle <IN> at trim.pl line 6."

    >
    > > The file structure is essentially the same, the only difference is the
    > > size.
    > > I bump into limits in many text editors when I try to access the large
    > > file, and
    > > I assume that this is a similar limitation in PERL. Is there any way
    > > around this?

    >
    > > I also wouldn't mind some constructive criticism on the script
    > > itself. It seems like there should
    > > be a more elegant way to accomplish this task which might avoid this
    > > problem.
    > > Any suggestions?

    >
    > More concise:
    >
    > while (<IN>) {
    > next if /BEGIN/ .. /END/;
    > print OUT $_;
    >
    > }
    >
    > --
    > Hope this helps,
    > Steven


    Steven,

    Thanks for the suggestions. It works fine, but I'm not familiar with
    the use of '..' in the next statement. I have used the 'next if'
    construct with single regular expressions; is the above code just
    representing the entire group of characters between the "BEGIN" and
    "END" strings?

    If that's the case, and I don't have an "END", will the code print
    everything between "BEGIN"
    and the end of the file?

    Thanks,
    Russ D.
     
    Russ, Feb 2, 2007
    #6
  7. Russ <> wrote:

    > read an input file, skip everything
    > between the strings "BEGIN" and "END",

    ^^^^^^^


    > It seems like there should
    > be a more elegant way to accomplish this task which might avoid this
    > problem.



    Your Question is Asked Frequently.


    > Any suggestions?



    perldoc -q between

    How can I pull out lines between two patterns that are themselves on
    different lines?
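
    A sketch in the spirit of that FAQ entry (not quoted from it): it pulls
    out the lines strictly between the two markers by looking at the value
    the range operator returns. The sample lines are assumptions for
    illustration.

    ```perl
    use strict;
    use warnings;

    my @lines = qw(line1 BEGIN line2 line3 END line4);

    my @between;
    for (@lines) {
        # While the range is on, '..' returns a sequence number. Drop the
        # first line (number 1) and the last (number ending in "E0").
        if (my $n = /^BEGIN/ .. /^END/) {
            push @between, $_ unless $n == 1 or $n =~ /E0$/;
        }
    }
    print "$_\n" for @between;    # line2 and line3
    ```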


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Feb 2, 2007
    #7
  8. Russ

    Dan Mercer Guest

    "Russ" <> wrote in message news:...
    : I have a very simple script to read an input file, skip everything
    : between the strings "BEGIN" and "END", and store the results in an
    : output
    : file.
    :
    : ----------------------------
    :
    : 3 open(IN,"input.txt");
    : 4 open(OUT,">output.txt");
    : 5
    : 6 while(<IN>) {
    : 7 chomp;
    : 8 if (/BEGIN/) {
    : 9 until (/END/) {
    : 10 $_ = <IN>;

    You aren't checking to see if <IN> hits EOF. Undoubtedly,
    the last line in the file contains the word "END"

    Dan Mercer


    : 11 }
    : 12 } else {
    : 13 print OUT "$_\n";
    : 14 }
    : 15 }
    : 16
    : 17 close IN;
    : 18 close OUT;
    :
    : ----------------------------------
    :
    : When I run this on a small file, about 28 kb, it works fine. When I
    : try it on a large file, about 3 gigs , I get the error:
    :
    : "Read on closed filehandle <IN> at trim.pl line 6."
    :
    : The file structure is essentially the same, the only difference is the
    : size.
    : I bump into limits in many text editors when I try to access the large
    : file, and
    : I assume that this is a similar limitation in PERL. Is there any way
    : around this?
    :
    : I also wouldn't mind some constructive criticism on the script
    : itself. It seems like there should
    : be a more elegant way to accomplish this task which might avoid this
    : problem.
    : Any suggestions?
    :
    : Thanks,
    : Russ
    :
     
    Dan Mercer, Feb 2, 2007
    #8
  9. On 2007-02-01 23:03, Russ <> wrote:
    > Also, 'perl -V' does not reference uselargefiles. Is that something
    > that is set when perl is initially compiled on the system? Does perl
    > need to be recompiled to set this value?


    Yes.

    hp


    --
    _ | Peter J. Holzer | Es ist ganz einfach ihn zu verstehen, wenn
    |_|_) | Sysadmin WSR | man nur alle wichtigen Worte im Satz durch
    | | | | andere ersetzt.
    __/ | http://www.hjp.at/ | -- Nils Ketelsen in danr
     
    Peter J. Holzer, Feb 2, 2007
    #9
  10. Russ

    Joe Smith Guest

    wrote:
    > "Russ" <> writes:
    >> I have a very simple script to read an input file, skip everything
    >> between the strings "BEGIN" and "END", and store the results in an
    >> output file.

    >
    >> When I run this on a small file, about 28 kb, it works fine. When I
    >> try it on a large file, about 3 gigs , I get the error:

    >
    > Size of input file is not the problem.


    No, the size of the input file _is_ the problem.

    On a version of perl compiled on a 32-bit machine without "uselargefiles=define",
    things like segfault or "read on closed filehandle" will occur after reading
    2,147,483,647 bytes from the input file.
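
    The figure is the largest value a signed 32-bit offset can hold, 2**31 - 1.
    A small sketch of the arithmetic, assuming "about 3 gigs" means
    3 * 1024**3 bytes:

    ```perl
    use strict;
    use warnings;

    # Largest offset a signed 32-bit integer can hold.
    my $limit = 2**31 - 1;

    # Assumption for illustration: "about 3 gigs" taken as 3 * 1024**3 bytes.
    my $size = 3 * 1024**3;

    # %.0f avoids integer overflow in the formatting on 32-bit builds.
    printf "limit: %.0f bytes\n", $limit;
    printf "file:  %.0f bytes, %s the limit\n",
        $size, ($size > $limit ? 'above' : 'within');
    ```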

    -Joe
     
    Joe Smith, Feb 3, 2007
    #10
  11. On 2007-02-03 09:46, Joe Smith <> wrote:
    > wrote:
    >> "Russ" <> writes:
    >>> When I run this on a small file, about 28 kb, it works fine. When I
    >>> try it on a large file, about 3 gigs , I get the error:

    >>
    >> Size of input file is not the problem.

    >
    > No, the size of the input file _is_ the problem.


    Yes.


    > On a version of perl compiled on a 32-bit machine without "uselargefiles=define",
    > things like segfault or "read on closed filehandle" will occur after reading
    > 2,147,483,647 bytes from the input file.


    Not on the systems I know. You can't even open the file, and the "read
    on closed filehandle" message occurs on the first read (not after 2GB)
    and happens because the open failed and the OP failed to check for that.
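
    That sequence can be reproduced with any open that fails. The sketch
    below leaves the open unchecked on purpose and collects the warning that
    the first read produces. The file name is assumed not to exist, and the
    exact wording of the warning varies between perl versions.

    ```perl
    use strict;
    use warnings;

    # Collect warnings instead of printing them, so they can be inspected.
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    # Deliberately unchecked open of a file assumed not to exist.
    open(my $in, '<', 'no-such-file.txt');
    my $line = <$in>;    # the read fails because the open never succeeded

    print "read returned: ", (defined $line ? $line : 'undef'), "\n";
    print "warning: $_" for @warnings;
    ```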

    hp

    --
    _ | Peter J. Holzer | Es ist ganz einfach ihn zu verstehen, wenn
    |_|_) | Sysadmin WSR | man nur alle wichtigen Worte im Satz durch
    | | | | andere ersetzt.
    __/ | http://www.hjp.at/ | -- Nils Ketelsen in danr
     
    Peter J. Holzer, Feb 4, 2007
    #11
  12. Russ

    Joe Smith Guest

    Peter J. Holzer wrote:

    >> On a version of perl compiled on a 32-bit machine without "uselargefiles=define",
    >> things like segfault or "read on closed filehandle" will occur after reading
    >> 2,147,483,647 bytes from the input file.

    >
    > Not on the systems I know. You can't even open the file, and the "read
    > on closed filehandle" message occurs on the first read (not after 2GB)
    > and happens because the open failed and the OP failed to check for that.


    Other ways of getting to 2GB is reading from a pipe or socket or reading from
    redirected STDIN. I've run into a case where the shell was able to
    open a 5GB file and set up redirection, but the program barfed after 2GB.

    -Joe
     
    Joe Smith, Feb 5, 2007
    #12
  13. Russ

    Guest

    Jim Gibson <> writes:

    > In article <>,
    > Russ <> wrote:
    >
    > > On Feb 1, 6:10 pm, ""
    > > <> wrote:
    > > > while (<IN>) {
    > > > next if /BEGIN/ .. /END/;
    > > > print OUT $_;
    > > > }

    >
    > > Thanks for the suggestions. It works fine but I'm not familiar with the
    > > use of '..' in the next statement.
    > > I have used the 'next if' construct with single regular expressions,
    > > is the above code just representing
    > > the entire group of characters between the "BEGIN" and "END" strings?
    > >
    > > If that's the case, and I don't have an "END", will the code print
    > > everything between "BEGIN" and the end of the file?

    >
    > It is the range operator in scalar context, which acts like a
    > flip-flop. See 'perldoc perlop' and search for 'Range Operators'. The
    > operator will be false until /BEGIN/ is true, true until /END/ is true,
    > then false again until the next /BEGIN/, etc.


    OK so far.
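
    To watch the flip-flop change state, here is a sketch that records what
    the operator returns for each line. The sample lines are assumptions; the
    last BEGIN is left without an END on purpose.

    ```perl
    use strict;
    use warnings;

    my @lines = qw(line1 BEGIN line2 END line3 BEGIN line4);

    my @states;
    for (@lines) {
        # In scalar context '..' returns '' while off and a sequence number
        # while on; the last line of a range gets "E0" appended.
        my $state = /^BEGIN/ .. /^END/;
        push @states, $state eq '' ? 'off' : $state;
        printf "%-6s %s\n", $_, $states[-1];
    }
    ```

    The states come out as off, 1, 2, 3E0, off, 1, 2: the operator is still
    on when the input runs out, so the lines after the last BEGIN are skipped
    by a 'next if' just like the earlier block.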

    > If you do not have an 'END',
    > the program will print from the final BEGIN to the end of the file.


    Not true, since $_ is read one line at a time, and is lost if not printed.
    This case is demonstrated below.

    % cat omit_begin_end.pl
    #!/usr/local/bin/perl
    use strict; use warnings;
    while (<DATA>) {
        next if (/^BEGIN/ .. /^END/);
        print;
    } # wend
    __DATA__
    line1
    BEGIN
    line2
    #END
    line3
    % perl omit_begin_end.pl
    line1

    If #END is changed to END,
    then output also includes line3.

    If it were important to retain lines between BEGIN and eof with no END,
    then the pending lines need to be stored until it is known if they are needed
    or not, as shown in this tested example.

    #!/usr/local/bin/perl
    use strict; use warnings;
    my @buffer;
    while (<DATA>) {
        if (/^BEGIN/ .. /^END/) {
            push @buffer, $_;
            next;
        }
        @buffer = (); # clear buffer
        print;
    } # wend
    print @buffer;
    __DATA__
    line1
    BEGIN
    line2
    #END
    line3

    Or if you don't mind slurping the whole file,
    this partial example code
    prints from the final BEGIN to the end of the file if no matching END.
    This does not use the range operator though.
    { local $/ = undef;        # undef the line-break
      $_ = <DATA>;             # slurp the whole thing
      s/^BEGIN.*?^END\n//gms;  # non-greedy: remove each BEGIN..END block separately
      print;                   # print the whole thing
    }

    --
    Joel
     
    , Feb 5, 2007
    #13