Efficiently searching multiple files

Discussion in 'Perl Misc' started by Jorge, May 20, 2010.

  1. Jorge

    Jorge Guest

    I have a Linux directory full of shell scripts (~100 scripts with
    average
    size of ~60 lines) with many of them calling other scripts as
    depencencies.

    Not being the author of the scripts and with the immediate need to
    become familiar with
    the overall scope of this script directory, I needed to create a cross-
    reference chart
    that shows what script depends on what other script.

    My Perl script (below) works in terms of doing what I need, however,
    IMO, it appears to run a bit slow. I timed it at ~3.5 sec. Possibly
    that is it's max performance but something tells me (from other Perl
    script experiences) this one just isn't up to speed.

    I would appreciate any ideas or comments.

    Thanks, Jorge

    #!/usr/bin/perl -l

    use strict;
    use warnings;

    &scripts_dir('/some/dir/that/holds/scripts');

    sub scripts_dir {

    # set some vars
    my $dir = shift;
    my(@paths, $paths, @scripts, $scripts, $string);
    my($lines, $leaf, $header, $header2);
    local($_);

    # check dir can be opened for read
    unless (opendir(DIR, $dir)) {
    die "can't open $dir $!\n";
    closedir(DIR);
    return;
    }

    #
    # read dir, skip over system files and bak files
    # build array of script names
    # build array of full-path script names
    #
    foreach (readdir(DIR)) {
    next if $_ eq '.' || $_ eq '..' || $_ =~ /\.bak$/;
    push(@scripts, $_);
    $paths = $dir."/".$_;
    push(@paths, $paths);
    }
    closedir(DIR);

    # open ouput file, format and print headers
    open OUT, ">output" or die "cannot open output for writing $! ...
    \n";
    $header = sprintf("%-25s%-25s%s", "SCRIPT NAME", "FOUND IN",
    "USAGE");
    $header2 = sprintf("%-25s%-25s%s", "===========", "========",
    "=====");
    print OUT $header;
    print OUT $header2;

    # loop through each script name
    foreach $scripts(@scripts){

    # loop through each script in directory
    foreach my $paths(@paths){

    # get last leaf of script being searched --
    # if it matches itself; skip
    $leaf = get_leaf($paths);
    if($scripts eq $leaf) { next;}

    # open each script for searching
    open F, "$paths" or die "cannot open $paths for reading
    $! ...\n";

    while(my $lines = <F>) {

    # -l switch in place
    chomp($lines);

    # search for matches to the commonly-used command
    syntax
    if($lines =~ /\$\($scripts / || $lines =~ /\`
    $scripts /){

    # format to line up with headers
    $string = sprintf("%-25s%-25s%s", $scripts, $leaf,
    $lines);

    # print to file
    print OUT $string;
    }
    }
    }
    }
    # close I/O streams
    close(F);
    close(OUT);
    }

    #======================
    # subroutine get_leaf
    #======================
    sub get_leaf
    {
    # get arg(s)
    my($pathname) = @_;

    # split on leafs
    my @split_pathname = split( /[\\\/]/, $pathname);

    # grab last leaf
    my $leaf = pop( @split_pathname );

    # return bare script name (leaf)
    return( $leaf );
    }

    __END__
    Jorge, May 20, 2010
    #1
    1. Advertising

  2. Jorge

    Uri Guttman Guest

    >>>>> "J" == Jorge <> writes:

    J> #!/usr/bin/perl -l

    J> use strict;
    J> use warnings;

    good

    J> &scripts_dir('/some/dir/that/holds/scripts');

    don't call subs with & as it isn't needed with () and it is perl4
    style. there are other subtle issues that can bite you.

    J> sub scripts_dir {

    J> # set some vars
    J> my $dir = shift;
    J> my(@paths, $paths, @scripts, $scripts, $string);
    J> my($lines, $leaf, $header, $header2);

    don't declare vars before you need them. in many cases you can declare
    then when first used.

    J> local($_);

    why? actually i recommend against using $_ as much as possible (it does
    have its places) so you can used named vars which are easier to read and
    make the code better

    J> # check dir can be opened for read
    J> unless (opendir(DIR, $dir)) {

    use a lexical handle

    unless (opendir(my $dirh, $dir)) {

    J> die "can't open $dir $!\n";
    J> closedir(DIR);

    why close the handle if it never opened?

    J> return;

    you can just exit here instead.

    J> }

    better yet, use File::Slurp's read_dir.


    J> #
    J> # read dir, skip over system files and bak files
    J> # build array of script names
    J> # build array of full-path script names
    J> #
    J> foreach (readdir(DIR)) {
    J> next if $_ eq '.' || $_ eq '..' || $_ =~ /\.bak$/;

    read_dir skips . and .. for you.


    J> push(@scripts, $_);
    J> $paths = $dir."/".$_;
    J> push(@paths, $paths);
    J> }
    J> closedir(DIR);

    that can all be done so much simpler like this (untested):

    my @paths = map "$dir/$_", grep !/\.bak$/, read_dir $dir ;

    J> # open ouput file, format and print headers
    J> open OUT, ">output" or die "cannot open output for writing $!
    J> ...

    ditto for lexical handles

    J> \n";
    J> $header = sprintf("%-25s%-25s%s", "SCRIPT NAME", "FOUND IN",
    J> "USAGE");
    J> $header2 = sprintf("%-25s%-25s%s", "===========", "========",
    J> "=====");

    you use the same sprintf format three times. put that into a variable
    above so you can change it in one place.

    J> print OUT $header;
    J> print OUT $header2;

    no need for two print calls.

    print OUT $header, $header2 ;


    J> # loop through each script name
    J> foreach $scripts(@scripts){

    foreach my $scripts (@scripts) {

    that is how you can declare vars locally. and use more horizontal
    whitespace. others need to read your code so think about them when you
    write it. and YOU are an other too! :)

    i noticed this below but the point is true here. don't use plural names
    for singular things. $scripts has a single script name

    J> # loop through each script in directory
    J> foreach my $paths(@paths){

    you used my there. why not in the previous loop?

    J> # get last leaf of script being searched --
    J> # if it matches itself; skip
    J> $leaf = get_leaf($paths);
    J> if($scripts eq $leaf) { next;}

    slightly faster as you don't need a block entry on next. also a better
    style for simple flow control like this.

    next if $scripts eq $leaf ;

    J> # open each script for searching
    J> open F, "$paths" or die "cannot open $paths for reading

    don't quote scalar vars as it can lead to subtle bugs. it is not needed here.

    J> $! ...\n";

    J> while(my $lines = <F>) {

    $lines is a single line so make that singular. using a plural name
    implies an array or array ref

    J> # -l switch in place

    that only affects one liners using -p or -n. no effect on regular code

    J> chomp($lines);

    J> # search for matches to the commonly-used command
    J> syntax
    J> if($lines =~ /\$\($scripts / || $lines =~ /\`
    J> $scripts /){

    this doesn't make sense. are you checking if the current script refers
    to itself??

    J> # format to line up with headers
    J> $string = sprintf("%-25s%-25s%s", $scripts, $leaf,
    J> $lines);

    J> # print to file
    J> print OUT $string;

    since you just print the string, you can use printf directly.

    J> }
    J> }
    J> }
    J> }
    J> # close I/O streams
    J> close(F);
    J> close(OUT);
    J> }

    J> #======================
    J> # subroutine get_leaf
    J> #======================
    J> sub get_leaf
    J> {
    J> # get arg(s)
    J> my($pathname) = @_;

    J> # split on leafs
    J> my @split_pathname = split( /[\\\/]/, $pathname);

    gack!!

    File::Basename does this for you and simpler.

    J> # grab last leaf
    J> my $leaf = pop( @split_pathname );

    you could just return the popped value directly. hell, you can do that
    in the split line too:

    return( (split( /[\\\/]/, $pathname)[-1] );

    as for speedups, i can't help much since i don't have your input
    data. nothing seems oddly slow looking but your core loops are deep and
    can be done better. slurping in the entire file (File::Slurp) and
    scanning for those lines in one regex would be noticeably faster. and
    your core logic is suspect as it doesn't seem to check for calling other
    scripts from a given one.

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
    Uri Guttman, May 20, 2010
    #2
    1. Advertising

  3. Jorge

    J. Gleixner Guest

    Jorge wrote:
    > I have a Linux directory full of shell scripts (~100 scripts with
    > average
    > size of ~60 lines) with many of them calling other scripts as
    > depencencies.
    >
    > Not being the author of the scripts and with the immediate need to
    > become familiar with
    > the overall scope of this script directory, I needed to create a cross-
    > reference chart
    > that shows what script depends on what other script.
    >
    > My Perl script (below) works in terms of doing what I need, however,
    > IMO, it appears to run a bit slow. I timed it at ~3.5 sec. Possibly
    > that is it's max performance but something tells me (from other Perl
    > script experiences) this one just isn't up to speed.


    3 seconds isn't fast enough??.. just write it, run it, and move on..
    sheesh..

    >
    > I would appreciate any ideas or comments.
    >
    > Thanks, Jorge
    >
    > #!/usr/bin/perl -l
    >
    > use strict;
    > use warnings;
    >
    > &scripts_dir('/some/dir/that/holds/scripts');

    Remove the '&' there.
    >
    > sub scripts_dir {
    >
    > # set some vars
    > my $dir = shift;
    > my(@paths, $paths, @scripts, $scripts, $string);
    > my($lines, $leaf, $header, $header2);
    > local($_);


    Declare your vars in the smallest possible scope.
    >
    > # check dir can be opened for read
    > unless (opendir(DIR, $dir)) {
    > die "can't open $dir $!\n";


    After die is called, nothing after this point is called..
    > closedir(DIR);
    > return;
    > }
    >
    > #
    > # read dir, skip over system files and bak files
    > # build array of script names
    > # build array of full-path script names
    > #
    > foreach (readdir(DIR)) {
    > next if $_ eq '.' || $_ eq '..' || $_ =~ /\.bak$/;
    > push(@scripts, $_);
    > $paths = $dir."/".$_;
    > push(@paths, $paths);
    > }
    > closedir(DIR);
    >
    > # open ouput file, format and print headers
    > open OUT, ">output" or die "cannot open output for writing $! ...
    > \n";

    use 3 argument open.

    open( my $out, '>', 'output' ) || die "can't open output: $!";
    > $header = sprintf("%-25s%-25s%s", "SCRIPT NAME", "FOUND IN",
    > "USAGE");
    > $header2 = sprintf("%-25s%-25s%s", "===========", "========",
    > "=====");
    > print OUT $header;
    > print OUT $header2;


    Just use printf.

    printf OUT "%-25s%-25s%s\n", "SCRIPT NAME", "FOUND IN", "USAGE";

    or printf $out...
    >
    > # loop through each script name
    > foreach $scripts(@scripts){

    for my $script( @scripts )
    >
    > # loop through each script in directory
    > foreach my $paths(@paths){

    $path seems better than $paths
    >
    > # get last leaf of script being searched --
    > # if it matches itself; skip
    > $leaf = get_leaf($paths);

    my $leaf = ( split( /\//, $paths ) )[-1];

    > if($scripts eq $leaf) { next;}

    next if $scripts eq $leaf;
    >
    > # open each script for searching
    > open F, "$paths" or die "cannot open $paths for reading
    > $! ...\n";

    open( my $file, '<', $path ) || die "can't open $path: $!;
    >
    > while(my $lines = <F>) {

    my $line instead of my $lines.
    >
    > # -l switch in place
    > chomp($lines);
    >
    > # search for matches to the commonly-used command
    > syntax
    > if($lines =~ /\$\($scripts / || $lines =~ /\`
    > $scripts /){
    >
    > # format to line up with headers
    > $string = sprintf("%-25s%-25s%s", $scripts, $leaf,
    > $lines);
    >
    > # print to file
    > print OUT $string;

    again printf
    > }
    > }


    Should close F or $file here.

    > }
    > }
    > # close I/O streams
    > close(F);

    ^^^ should be just after the end of the while.. above.
    > close(OUT);
    > }
    >
    > #======================
    > # subroutine get_leaf
    > #======================
    > sub get_leaf
    > {
    > # get arg(s)
    > my($pathname) = @_;
    >
    > # split on leafs
    > my @split_pathname = split( /[\\\/]/, $pathname);
    >
    > # grab last leaf
    > my $leaf = pop( @split_pathname );
    >
    > # return bare script name (leaf)
    > return( $leaf );
    > }
    >
    > __END__
    >
    >
    J. Gleixner, May 20, 2010
    #3
  4. Jorge

    Jorge Guest

    I wasn't expecting a full critique but I'll take it -- learning is
    good so thanks for your effort and time.

    Most of your points are obvious (once read) and some I will have to
    digest.

    Thanks again



    On May 20, 12:47 pm, "Uri Guttman" <> wrote:
    > >>>>> "J" == Jorge  <> writes:

    >
    >   J> #!/usr/bin/perl -l
    >
    >   J> use strict;
    >   J> use warnings;
    >
    > good
    >
    >   J> &scripts_dir('/some/dir/that/holds/scripts');
    >
    > don't call subs with & as it isn't needed with () and it is perl4
    > style. there are other subtle issues that can bite you.
    >
    >   J> sub scripts_dir {
    >
    >   J>     # set some vars
    >   J>     my $dir = shift;
    >   J>     my(@paths, $paths, @scripts, $scripts, $string);
    >   J>     my($lines, $leaf, $header, $header2);
    >
    > don't declare vars before you need them. in many cases you can declare
    > then when first used.
    >
    >   J>     local($_);
    >
    > why? actually i recommend against using $_ as much as possible (it does
    > have its places) so you can used named vars which are easier to read and
    > make the code better
    >
    >   J>     # check dir can be opened for read
    >   J>     unless (opendir(DIR, $dir)) {
    >
    > use a lexical handle
    >
    > unless (opendir(my $dirh, $dir)) {
    >
    >   J>         die "can't open $dir $!\n";
    >   J>         closedir(DIR);
    >
    > why close the handle if it never opened?
    >
    >   J>         return;
    >
    > you can just exit here instead.
    >
    >   J>     }
    >
    > better yet, use File::Slurp's read_dir.
    >
    >   J>     #
    >   J>     # read dir, skip over system files and bak files
    >   J>     # build array of script names
    >   J>     # build array of full-path script names
    >   J>     #
    >   J>     foreach (readdir(DIR)) {
    >   J>         next if $_ eq '.' || $_ eq '..' || $_ =~ /\.bak$/;
    >
    > read_dir skips . and .. for you.
    >
    >   J>         push(@scripts, $_);
    >   J>         $paths = $dir."/".$_;
    >   J>         push(@paths, $paths);
    >   J>     }
    >   J>     closedir(DIR);
    >
    > that can all be done so much simpler like this (untested):
    >
    > my @paths = map "$dir/$_", grep !/\.bak$/, read_dir $dir ;
    >
    >   J>     # open ouput file, format and print headers
    >   J>     open OUT, ">output" or die "cannot open output for writing$!
    >   J>     ...
    >
    > ditto for lexical handles
    >
    >   J> \n";
    >   J>     $header = sprintf("%-25s%-25s%s", "SCRIPT NAME", "FOUND IN",
    >   J> "USAGE");
    >   J>     $header2 = sprintf("%-25s%-25s%s", "===========", "========",
    >   J> "=====");
    >
    > you use the same sprintf format three times. put that into a variable
    > above so you can change it in one place.
    >
    >   J>     print OUT $header;
    >   J>     print OUT $header2;
    >
    > no need for two print calls.
    >
    > print OUT $header, $header2 ;
    >
    >   J>     # loop through each script name
    >   J>     foreach $scripts(@scripts){
    >
    >         foreach my $scripts (@scripts) {
    >
    > that is how you can declare vars locally. and use more horizontal
    > whitespace. others need to read your code so think about them when you
    > write it. and YOU are an other too! :)
    >
    > i noticed this below but the point is true here. don't use plural names
    > for singular things. $scripts has a single script name
    >
    >   J>         # loop through each script in directory
    >   J>         foreach my $paths(@paths){
    >
    > you used my there. why not in the previous loop?
    >
    >   J>             # get last leaf of script being searched --
    >   J>             # if it matches itself; skip
    >   J>             $leaf = get_leaf($paths);
    >   J>             if($scripts eq $leaf) { next;}
    >
    > slightly faster as you don't need a block entry on next. also a better
    > style for simple flow control like this.
    >
    >         next if $scripts eq $leaf ;
    >
    >   J>             # open each script for searching
    >   J>             open F, "$paths" or die "cannot open $paths for reading
    >
    > don't quote scalar vars as it can lead to subtle bugs. it is not needed here.
    >
    >   J> $! ...\n";
    >
    >   J>             while(my $lines = <F>) {
    >
    > $lines is a single line so make that singular. using a plural name
    > implies an array or array ref
    >
    >   J>                 # -l switch in place
    >
    > that only affects one liners using -p or -n. no effect on regular code
    >
    >   J>                 chomp($lines);
    >
    >   J>                 # search for matches to the commonly-used command
    >   J> syntax
    >   J>                 if($lines =~ /\$\($scripts / || $lines =~ /\`
    >   J> $scripts /){
    >
    > this doesn't make sense. are you checking if the current script refers
    > to itself??
    >
    >   J>                     # format to line up with headers
    >   J>                     $string = sprintf("%-25s%-25s%s", $scripts, $leaf,
    >   J> $lines);
    >
    >   J>                     # print to file
    >   J>                     print OUT $string;
    >
    > since you just print the string, you can use printf directly.
    >
    >   J>                 }
    >   J>             }
    >   J>         }
    >   J>     }
    >   J>     # close I/O streams
    >   J>     close(F);
    >   J>     close(OUT);
    >   J> }
    >
    >   J> #======================
    >   J> # subroutine get_leaf
    >   J> #======================
    >   J> sub get_leaf
    >   J> {
    >   J>         # get arg(s)
    >   J>         my($pathname) = @_;
    >
    >   J>         # split on leafs
    >   J>         my @split_pathname = split( /[\\\/]/, $pathname);
    >
    > gack!!
    >
    > File::Basename does this for you and simpler.
    >
    >   J>         # grab last leaf
    >   J>         my $leaf = pop( @split_pathname );
    >
    > you could just return the popped value directly. hell, you can do that
    > in the split line too:
    >
    >         return( (split( /[\\\/]/, $pathname)[-1] );
    >
    > as for speedups, i can't help much since i don't have your input
    > data. nothing seems oddly slow looking but your core loops are deep and
    > can be done better. slurping in the entire file (File::Slurp) and
    > scanning for those lines in one regex would be noticeably faster. and
    > your core logic is suspect as it doesn't seem to check for calling other
    > scripts from a given one.
    >
    > uri
    >
    > --
    > Uri Guttman  ------    --------  http://www.sysarch.com--
    > -----  Perl Code Review , Architecture, Development, Training, Support ------
    > ---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com---------
    Jorge, May 20, 2010
    #4
  5. Jorge

    Jim Gibson Guest

    In article
    <>,
    Jorge <> wrote:

    > I have a Linux directory full of shell scripts (~100 scripts with
    > average
    > size of ~60 lines) with many of them calling other scripts as
    > depencencies.
    >
    > Not being the author of the scripts and with the immediate need to
    > become familiar with
    > the overall scope of this script directory, I needed to create a cross-
    > reference chart
    > that shows what script depends on what other script.
    >
    > My Perl script (below) works in terms of doing what I need, however,
    > IMO, it appears to run a bit slow. I timed it at ~3.5 sec. Possibly
    > that is it's max performance but something tells me (from other Perl
    > script experiences) this one just isn't up to speed.
    >
    > I would appreciate any ideas or comments.
    >
    > Thanks, Jorge
    >


    Uri and J. have given you thorough critiques of your posted program. I
    can only add a few additional suggestions.

    Concerning speedup, you have implemented an O(N**2) algorithm. You are
    reading each of N files N-1 times, looking for a different script name
    each time you read each file. You should read each file only once, and
    look for all of your script names in each line. You shouldn't exclude
    the script your are reading from this search, since scripts can call
    themselves. I would also not look in comments, since it is very common
    to put script names in comments, and you don't want those.

    For how to search for many things at once, see the advice in 'perldoc
    -q many' "How do I efficiently match many regular expressions at once?"

    You are concatenating the directory name and file name into a path,
    then taking many steps to strip the directory name from the path to
    retrieve the file name. Don't bother, since you are only working in one
    directory and the $dir variable only contains one value. Add $dir to
    the open statement or a temp variable:

    my $filename = "$dir/$scripts";
    open( my $f, '<', $filename ) ...

    Then you don't need the @paths array at all.

    [program snipped]

    --
    Jim Gibson
    Jim Gibson, May 20, 2010
    #5
  6. Jorge

    Jorge Guest

    Yeah Uri and J gave the program a good going-over so I have plenty to
    look at there and now I can add your pointers to the list.

    You mention one very good point that should have been obvious yet it
    completely got past me ... that beinging I am searching every file for
    each single pattern where I should be looking at each file only once
    for all patterns.

    Ihanks again, Jorge


    On May 20, 2:30 pm, Jim Gibson <> wrote:
    > In article
    > <>,
    >
    >
    >
    >
    >
    > Jorge <> wrote:
    > > I have a Linux directory full of shell scripts (~100 scripts with
    > > average
    > > size of ~60 lines) with many of them calling other scripts as
    > > depencencies.

    >
    > > Not being the author of the scripts and with the immediate need to
    > > become familiar with
    > > the overall scope of this script directory, I needed to create a cross-
    > > reference chart
    > > that shows what script depends on what other script.

    >
    > > My Perl script (below) works in terms of doing what I need, however,
    > > IMO, it appears to run a bit slow. I timed it at ~3.5 sec. Possibly
    > > that is it's max performance but something tells me (from other Perl
    > > script experiences) this one just isn't up to speed.

    >
    > > I would appreciate any ideas or comments.

    >
    > > Thanks, Jorge

    >
    > Uri and J. have given you thorough critiques of your posted program. I
    > can only add a few additional suggestions.
    >
    > Concerning speedup, you have implemented an O(N**2) algorithm. You are
    > reading each of N files N-1 times, looking for a different script name
    > each time you read each file. You should read each file only once, and
    > look for all of your script names in each line. You shouldn't exclude
    > the script your are reading from this search, since scripts can call
    > themselves. I would also not look in comments, since it is very common
    > to put script names in comments, and you don't want those.
    >
    > For how to search for many things at once, see the advice in 'perldoc
    > -q many' "How do I efficiently match many regular expressions at once?"
    >
    > You are concatenating the directory name and file name into a path,
    > then taking many steps to strip the directory name from the path to
    > retrieve the file name. Don't bother, since you are only working in one
    > directory and the $dir variable only contains one value. Add $dir to
    > the open statement or a temp variable:
    >
    >   my $filename = "$dir/$scripts";
    >   open( my $f, '<', $filename ) ...
    >
    > Then you don't need the @paths array at all.
    >
    > [program snipped]
    >
    > --
    > Jim Gibson- Hide quoted text -
    >
    > - Show quoted text -
    Jorge, May 20, 2010
    #6
  7. Jorge

    Jorge Guest

    On May 20, 6:11 pm, Tad McClellan <> wrote:
    > Jorge <> wrote:
    > > Yeah Uri and J gave the program a good going-over so I have plenty to
    > > look at there and now I can add your pointers to the list.

    >
    > [ snip TOFU. Please stop doing that! ]
    >
    > TOFU is "Top Over Fullquote Under" also known as "top posting".
    >
    > You were asked nicely (twice!) to stop doing that years ago,
    > yet you continue.
    >
    > I hereby banish you to perpetual invisibility!
    >
    > *plonk*
    >
    > --
    > Tad McClellan
    > email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
    > The above message is a Usenet post.
    > I don't recall having given anyone permission to use it on a Web site.


    Yes I was warned (and IMO not in a nice way) but it has been 3 years
    since I posted and simply had forgotten about that incident. Sometimes
    sh*t happens.

    I will make every effort not to do it again.
    Jorge, May 21, 2010
    #7
  8. Jorge

    Jorge Guest

    On May 21, 1:01 am, bugbear <bugbear@trim_papermule.co.uk_trim> wrote:
    > Jorge wrote:
    > > I have a Linux directory full of shell scripts (~100 scripts with
    > > average
    > > size of ~60 lines) with many of them calling other scripts as
    > > depencencies.

    >
    > > Not being the author of the scripts and with the immediate need to
    > > become familiar with
    > > the overall scope of this script directory, I needed to create a cross-
    > > reference chart
    > > that shows what script depends on what other script.

    >
    > > My Perl script (below) works in terms of doing what I need, however,
    > > IMO, it appears to run a bit slow. I timed it at ~3.5 sec. Possibly
    > > that is it's max performance but something tells me (from other Perl
    > > script experiences) this one just isn't up to speed.

    >
    > > I would appreciate any ideas or comments.

    >
    > This is a run-once script. Any time spent optimising
    > it (e.g. 10 minutes) is a net loss to you of ten minutes,
    > bar what ever time your optimisation saves from 3.5 seconds.
    >
    > 10 minutes then.
    >
    >     BugBear- Hide quoted text -
    >
    > - Show quoted text -


    Point taken. I probably should have pointed out that the focus of my
    interest in the program execution speed is not for just this one-time
    script. Certainly you (and others) are spot on that 3 seconds is of
    little consequence. However, for future use, I.E., when the day comes
    that I have to deal with data 100x the size of this data I want to be
    using the most efficient method(s).

    Thanks, Jorge
    Jorge, May 21, 2010
    #8
  9. Jorge

    Jorge Guest

    On May 21, 12:22 pm, Tad McClellan <> wrote:
    > [ Thank you for not top-posting ]
    >
    >
    >
    >
    >
    > Jorge <> wrote:
    > > On May 20, 6:11 pm, Tad McClellan <> wrote:
    > >> Jorge <> wrote:
    > >> > Yeah Uri and J gave the program a good going-over so I have plenty to
    > >> > look at there and now I can add your pointers to the list.

    >
    > >> [ snip TOFU. Please stop doing that! ]

    >
    > >> TOFU is "Top Over Fullquote Under" also known as "top posting".

    >
    > >> You were asked nicely (twice!) to stop doing that years ago,
    > >> yet you continue.

    >
    > >> I hereby banish you to perpetual invisibility!

    >
    > >> *plonk*

    >
    > >> --
    > >> Tad McClellan
    > >> email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
    > >> The above message is a Usenet post.
    > >> I don't recall having given anyone permission to use it on a Web site.

    >
    > It is bad manners to quote .sigs too...
    >
    > You really should learn how to post to Usenet if you plan
    > to post to Usenet...
    >
    >    http://web.presby.edu/~nnqadmin/nnq/nquote.html
    >
    > Have you seen the Posting Guidelines that are posted here frequently?
    >
    > Their purpose is to help avoid this sort of thing...
    >
    > > Yes I was warned (and IMO not in a nice way)

    >
    > Here's what I said:
    >
    >     [ snip TOFU. Please stop doing that! ]
    >
    >     [ snip yet more TOFU. Please stop doing that before it is too late! ]
    >
    > What was not nice about it?
    >
    > I was trying to save you from getting killfiled.
    >
    > I failed.
    >
    > > but it has been 3 years
    > > since I posted and simply had forgotten about that incident.

    >
    > Not top-posting is standard netiquette, the incident would never
    > have happened if you just deigned to post in the accepted manner
    > from the get-go.
    >
    > > Sometimes
    > > sh*t happens.

    >
    > Sometimes people are banished to eternal invisibility.
    >
    > --
    > Tad McClellan
    > email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
    > The above message is a Usenet post.
    > I don't recall having given anyone permission to use it on a Web site.- Hide quoted text -
    >
    > - Show quoted text -


    I'm not so thinned-skinned that I can't take criticism. On the other
    hand, the one comment years ago "before it's too late", came across as
    a threat and an ambiguous one at that. If it was intended to
    communicate the fact that I had violated the TOS (more than once) and
    I was in jeopardy of being tombstoned ... then it should have been
    worded as such and not left as an open-ended statement that left me
    guessing what "or else" means.

    Fast forward to today ... I have pled guilty and right here and now,
    apologize to the forum for the disruption and promise to not top-post
    again. There's not much more I can say or do.

    Before banishing me to eternal invisibility, I should at least have
    the chance to face the wheel.
    Jorge, May 21, 2010
    #9
  10. Jorge

    Guest

    On Fri, 21 May 2010 12:50:52 -0700 (PDT), Jorge <> wrote:

    >Before banishing me to eternal invisibility, I should at least have
    >the chance to face the wheel.


    The Pit, The Pendulem
    , May 21, 2010
    #10
  11. Jorge

    John Bokma Guest

    Tad McClellan <> writes:

    > Here's what I said:
    >
    > [ snip TOFU. Please stop doing that! ]
    >
    > [ snip yet more TOFU. Please stop doing that before it is too late! ]
    >
    > What was not nice about it?


    Maybe you should stop addressing people like they are 3 year old? I've
    mentioned this before but you really DO NOT HAVE TO POST.

    Ah, well, there was a time I was a bit like you. I have grown up since,
    so there is hope.

    --
    John Bokma j3b

    Hacking & Hiking in Mexico - http://johnbokma.com/
    http://castleamber.com/ - Perl & Python Development
    John Bokma, May 22, 2010
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. darrel
    Replies:
    2
    Views:
    314
    darrel
    Aug 9, 2005
  2. Replies:
    4
    Views:
    951
    M.E.Farmer
    Feb 13, 2005
  3. Livin
    Replies:
    5
    Views:
    322
    Ben Sizer
    Jan 18, 2006
  4. sasuke
    Replies:
    5
    Views:
    2,008
    Tom Anderson
    Jul 6, 2008
  5. Brad
    Replies:
    7
    Views:
    1,284
    Jorgen Grahn
    Jun 25, 2010
Loading...

Share This Page