Script "terminates" when processing large numbers of files

Discussion in 'Perl Misc' started by Scott Stark, Aug 2, 2003.

  1. Scott Stark (Guest)

    Hi, I'm running a script that reads through large numbers of html
    files (1500-2000 or so) in each of about 20 directories, searching for
    strings in the files.

    For some reason the script quits midway through, and I get a
    "Terminated" message. It quits while checking a batch of files at a
    different point in the file system every time, so I know it's not a
    code error. In fact if I limit the total number of files processed to
    a couple of hundred, the script runs fine.

    Is this some kind of memory problem or other resource problem? I've
    tried breaking up each directory pass into separate subroutine calls,
    and even broken up the individual directory lists so that they process
    in smaller batches of 300 each, thinking that might free up resources.
    Something like this:

    foreach $d (@dirs){
        my @files = glob("$basedir/$d/*.html $basedir/$d/*.htm");
        if(scalar(@files) > 300){
            ... # make smaller lists called my(@shortList) of 300 each
            search_files(@shortList);
        }
    }

    sub search_files {
        my @files = @_;
        ... # search through each file
    }
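
    For what it's worth, the batching part (the "..." above) could be as
    simple as splicing off 300 filenames at a time. A rough sketch of what
    I mean, not the exact code:

    foreach $d (@dirs){
        my @files = glob("$basedir/$d/*.html $basedir/$d/*.htm");
        # take up to 300 files at a time and hand each batch to search_files()
        while(@files){
            my @shortList = splice(@files, 0, 300);
            search_files(@shortList);
        }
    }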

    I've tried running the script under perl -d and with #!/usr/bin/perl -w;
    neither reports any errors, and I get the same result each time, though
    the script terminates at a different point in the file system.

    Any thoughts? If it's a memory problem, is there some way to free up
    memory?

    thanks,
    Scott
    Scott Stark, Aug 2, 2003
    #1

  2. Scott Stark (Guest)

    Tim Heaney <> wrote in message news:<>...
    > Perhaps the glob is hitting the expansion limit. Try reading the
    > directory yourself...something like


    Hi Tim, well that didn't work either. I've done some further testing
    and discovered that the "termination" is happening not in the glob (or
    read) but in the search_files() subroutine, always (as far as I can
    gather) after it's closed one file in the @files list and before it
    opens the next.
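
    (For reference, the directory read I tried was along these lines -- just
    a sketch of the usual opendir/readdir replacement for glob, not
    necessarily verbatim:)

    opendir(DIR, "$basedir/$d") || on_error("Can't open dir $basedir/$d");
    # keep only .htm/.html entries and build full paths
    my @files = map  { "$basedir/$d/$_" }
                grep { /\.html?$/i }
                readdir(DIR);
    closedir(DIR);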

    Here's an abbreviated version of the search_files subroutine that's
    called for each directory:

    sub search_files {
        my(@files) = @_;
        my(@searchStrings) = split(/\s+/, param('terms'));
        foreach $f (@files){
            open(F, "$f") || on_error("Can't open file $f for reading");
            while($line = <F>){
                for($s = 0; $s < scalar(@searchStrings); $s++){
                    $line =~ s/($searchStrings[$s])/<font color="blue"><b>$1<\/b><\/font>/gi
                        and $found{$searchStrings[$s]} = $line
                        and next if($line =~ /$searchStrings[$s]/i);
                }
            }
            close(F);
        }
    }

    Not much unusual going on here -- perhaps Gregory is correct and there's
    a time limit? The whole thing never takes more than a couple of minutes,
    though, and where it stops varies every time.
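
    One thing that might help pin it down: a SIGTERM handler that logs which
    file was open when the script gets told to stop -- a rough sketch (the
    log path is just an example):

    # Log where the script was when something sent it a SIGTERM.
    # Assumes $f holds the file currently being processed, as in the loop above.
    $SIG{TERM} = sub {
        open(LOG, ">>/tmp/search_killed.log") or die "Can't write log: $!";
        print LOG "Got SIGTERM while processing $f at ", scalar(localtime), "\n";
        close(LOG);
        exit 1;
    };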

    thanks
    Scott
    Scott Stark, Aug 3, 2003
    #2
