Sorting and Writing Effecient Code

Discussion in 'Perl Misc' started by banker123, Nov 2, 2006.

  1. banker123

    banker123 Guest

    I have the code below that opens a directory, determines the current
    date and extracts the file names that start with the current date, then
    loops through the contents of those files to extract data in a scalar
    context.

    Question
    1. How do I sort the variables in a scalar context not an array?
    Should I be reading the variables I have extracted into an array?

    2. This program executes every 15 seconds to output the contents of
    the directory. The output is used to control work flow and meet crucial
    deadlines. Is this effeciently written? Any suggestions?

    My first "production" perl program, after reading and learning perl for
    about a month.

    #Open directory
    my $dir="G:/Formware/files/";
    opendir DH, $dir or die "Cannot open$!";

    #Get system date in the format mmddyy
    @months = qw(01 02 03 04 05 06 07 08 09 10 11 12);
    @days = qw(01 02 03 04 05 06 07 08 09 10..31);
    ($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek,
    $dayOfYear, $daylightSavings) = localtime();
    $dayOfMonth=sprintf("%02d", $dayOfMonth);
    $date = "$months[$month]$dayOfMonth06";

    #Read each file beginning with system date from directory
    @files=grep(/^$date/,readdir(DH));
    closedir (DH);


    foreach $file (@files) {
    open (data, "$dir$file") or die "Cannot open file: $!";
    while ( <data> ) {
    if ( /JOBNAME/ ) {
    $job = substr($_,8,8);
    chomp($job);
    }
    elsif ( /\.TIF\s{12}/ ) {
    my $box = substr($_,73,6);
    printf "$box $job $file\n";
    last;
    }
    }}
    banker123, Nov 2, 2006
    #1
    1. Advertising

  2. banker123

    Guest

    "banker123" <> wrote:
    > I have the code below that opens a directory, determines the current
    > date and extracts the file names that start with the current date, then
    > loops through the contents of those files to extract data in a scalar
    > context.


    You seem to be mis-using the term "scalar context", in a way that
    renders your question meaningless.

    >
    > Question
    > 1. How do I sort the variables in a scalar context not an array?
    > Should I be reading the variables I have extracted into an array?


    If you want to sort them, then you need to store them. Either an array,
    or maybe a hash.

    > 2. This program executes every 15 seconds to output the contents of
    > the directory. The output is used to control work flow and meet crucial
    > deadlines. Is this effeciently written? Any suggestions?


    You don't seem to be using strict. You should. As long as the program
    does whatever it needs to do fast enough so that it is not the bottleneck,
    it is hard to see how its efficiency would matter.

    >
    > My first "production" perl program, after reading and learning perl for
    > about a month.
    >
    > #Open directory
    > my $dir="G:/Formware/files/";
    > opendir DH, $dir or die "Cannot open$!";


    You should probably use a lexical dir handle.

    opendir my $dh, $dir or die $!;

    > #Read each file beginning with system date from directory
    > @files=grep(/^$date/,readdir(DH));


    If the list can be big, you may not want to read everything into memory
    at once.

    while (defined (my $file=readdir($dh))) {
    next unless $file =~ /^$date/;

    Every time you execute this script, it reprocesses the same files it
    already processed the last time. That is the fundamental inefficiency,
    but how to fix it depends on external factors.

    By the way, there was no indication anywhere in your code about what it is
    you want to sort, or how you want it sorted.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Nov 2, 2006
    #2
    1. Advertising

  3. banker123

    banker123 Guest

    > > Question
    > > 1. How do I sort the variables in a scalar context not an array?
    > > Should I be reading the variables I have extracted into an array?

    >
    > If you want to sort them, then you need to store them. Either an array,
    > or maybe a hash.


    How do I store the variables into an array, push or pop command, does
    this have much overhead? I woul like the saort to point to a sort
    order file that containas priorites. Example:

    Variables
    Box 1
    Box 2
    Box 3

    Prioroties table
    Box 3
    Box 1
    Box 2

    Sorted Array
    Box 3
    Box 1
    Box 2


    > Every time you execute this script, it reprocesses the same files it
    > already processed the last time. That is the fundamental inefficiency,
    > but how to fix it depends on external factors.


    Understood this is running real-time therefore the re-processing is
    acceptable.
    banker123, Nov 2, 2006
    #3
  4. banker123 wrote:
    > I have the code below that opens a directory, determines the current
    > date and extracts the file names that start with the current date, then
    > loops through the contents of those files to extract data in a scalar
    > context.
    >
    > Question
    > 1. How do I sort the variables in a scalar context not an array?
    > Should I be reading the variables I have extracted into an array?
    >
    > 2. This program executes every 15 seconds to output the contents of
    > the directory. The output is used to control work flow and meet crucial
    > deadlines. Is this effeciently written? Any suggestions?
    >
    > My first "production" perl program, after reading and learning perl for
    > about a month.
    >
    > #Open directory
    > my $dir="G:/Formware/files/";
    > opendir DH, $dir or die "Cannot open$!";
    >
    > #Get system date in the format mmddyy
    > @months = qw(01 02 03 04 05 06 07 08 09 10 11 12);
    > @days = qw(01 02 03 04 05 06 07 08 09 10..31);


    You don't need those two arrays.

    > ($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek,
    > $dayOfYear, $daylightSavings) = localtime();
    > $dayOfMonth=sprintf("%02d", $dayOfMonth);
    > $date = "$months[$month]$dayOfMonth06";


    That won't work because the variable $dayOfMonth06 doesn't contain any data.
    You probably want:

    my ( $day, $month, $year ) = ( localtime )[ 3, 4, 5 ];
    my $date = sprintf '%02d%02d%02d', $month + 1, $day, $year % 100;


    > #Read each file beginning with system date from directory
    > @files=grep(/^$date/,readdir(DH));
    > closedir (DH);
    >
    >
    > foreach $file (@files) {
    > open (data, "$dir$file") or die "Cannot open file: $!";
    > while ( <data> ) {
    > if ( /JOBNAME/ ) {
    > $job = substr($_,8,8);
    > chomp($job);


    Do you really need to use chomp?

    > }
    > elsif ( /\.TIF\s{12}/ ) {
    > my $box = substr($_,73,6);
    > printf "$box $job $file\n";
    > last;
    > }
    > }}



    John
    --
    Perl isn't a toolbox, but a small machine shop where you can special-order
    certain sorts of tools at low cost and in short order. -- Larry Wall
    John W. Krahn, Nov 2, 2006
    #4
  5. Mirco Wahab wrote:
    > Thus spoke banker123 (on 2006-11-02 20:36):
    >
    >>2. This program executes every 15 seconds to output the contents of
    >>the directory. The output is used to control work flow and meet crucial
    >>deadlines. Is this effeciently written? Any suggestions?
    >>
    >>My first "production" perl program, after reading and learning perl for
    >>about a month.

    >
    > This is really good for a "first time program" ;-)
    >
    > I don't know your files, but I straightened the
    > program a bit (all programmers do so with other
    > peoples code ;-), maybe it won't work (in the third
    > part) because I had to guess what your data is like.
    >
    > ...
    > # date stuff
    > my ($d, $m, $y) = (localtime)[3..5];
    > my $date = sprintf "%02d"x3, $m+1, $d, $y-100;

    ^^^^^^
    Subtracting 100 from $y will only work for 21st century dates.

    $ perl -le'print( ( 1999 - 1900 ) - 100 )'
    -1



    John
    --
    Perl isn't a toolbox, but a small machine shop where you can special-order
    certain sorts of tools at low cost and in short order. -- Larry Wall
    John W. Krahn, Nov 2, 2006
    #5
  6. banker123

    Guest

    "banker123" <> wrote:
    > > > Question
    > > > 1. How do I sort the variables in a scalar context not an array?
    > > > Should I be reading the variables I have extracted into an array?

    > >
    > > If you want to sort them, then you need to store them. Either an
    > > array, or maybe a hash.

    >
    > How do I store the variables into an array, push or pop command, does
    > this have much overhead?


    Perl's "push" has about as low overhead as you can have and still be using
    Perl. If you need lower overhead than push, you are probably using the
    wrong language in the first place.

    > I woul like the saort to point to a sort
    > order file that containas priorites. Example:
    >
    > Variables
    > Box 1
    > Box 2
    > Box 3
    >
    > Prioroties table
    > Box 3
    > Box 1
    > Box 2
    >
    > Sorted Array
    > Box 3
    > Box 1
    > Box 2


    From this example it isn't all that clear what you mean. "Sorted Array"
    happens to be an exact copy of "Prioroties table" in this case, but
    presumably that is just an accident? I'm guessing you want to keep the
    things in priority table's order, but include them only if they also appear
    in "Variables". In that case, you would store the "Variables" in a hash.
    Then you would loop through priority table, printing out only those things
    which appear in the hash.

    > > Every time you execute this script, it reprocesses the same files it
    > > already processed the last time. That is the fundamental inefficiency,
    > > but how to fix it depends on external factors.

    >
    > Understood this is running real-time therefore the re-processing is
    > acceptable.


    I'm not sure I follow that. The more real-time your process is, the less
    acceptable it is to not be efficient. Do you mean that some other process
    is deleting files, so that large numbers of them won't accumulate?

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Nov 2, 2006
    #6
  7. banker123 <> wrote:

    > Question
    > 1. How do I sort the variables in a scalar context not an array?



    That question makes no sense.

    "scalar" means "a single thing".

    You cannot sort a single thing, you need more than one thing
    before you can impost an order (sorting) on the things.


    > Should I be reading the variables I have extracted into an array?



    Probably.


    > 2. This program executes every 15 seconds to output the contents of
    > the directory. The output is used to control work flow and meet crucial
    > deadlines. Is this effeciently written?



    "premature optimization is the root of all evil"

    (google it)


    > Any suggestions?



    Don't concern yourself much with efficiency until _after_ it has
    been demonstrated that what you already have is too slow.

    Is it taking close to 15 seconds to execute?

    If not, then you are done without having to spend any time tweeking stuff.


    > My first "production" perl program, after reading and learning perl for
    > about a month.
    >
    > #Open directory
    > my $dir="G:/Formware/files/";



    I would suggest leaving the trailing slash out of there.

    It looks confusing when you use it later.


    > opendir DH, $dir or die "Cannot open$!";



    It would be nice to know the name of the directory that you were
    attempting to open, so you should include that info (along with
    some delimiters) in your diagnostic message:

    opendir DH, $dir or die "Cannot open '$dir' $!";


    > #Get system date in the format mmddyy
    > @months = qw(01 02 03 04 05 06 07 08 09 10 11 12);



    You should always put these:

    use warnings;
    use strict;

    at the top of every Perl program. They will catch many common
    mistakes for you.


    > @days = qw(01 02 03 04 05 06 07 08 09 10..31);



    You never use the @days array in your subsequent code, so why is it there?


    > ($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek,
    > $dayOfYear, $daylightSavings) = localtime();



    If you use a "Slice" (see that section in perldata.pod) you can get
    just the ones you want without a bunch of dummy variables that you
    never use:

    my($day, $month, $year) = (localtime)[3,4,5];

    > $dayOfMonth=sprintf("%02d", $dayOfMonth);



    $day = sprintf '%02d', $day;
    $month = sprintf '%02d', $month + 1;


    > $date = "$months[$month]$dayOfMonth06";

    ^^
    ^^

    Your program will break in about two months...


    $year = sprintf '%02d', ($year + 1900) % 100;
    my $date = "$month$day$year";


    > #Read each file beginning with system date from directory
    > @files=grep(/^$date/,readdir(DH));



    Whitespace is not a scarce resource. Feel free to use as much of it
    as you like to make your code easier to read and understand:

    @files = grep(/^$date/, readdir(DH));

    Even better, eliminate unnecessary parenthesis:

    @files = grep /^$date/, readdir DH;
    or
    @files = grep { /^$date/ } readdir DH;


    > closedir (DH);
    >
    >
    > foreach $file (@files) {



    Here you read things into an array when you do not need to. Below
    you do not read things into an array when you do need to. Get
    the logic right first, then worry about performance.

    foreach my $file ( grep /^$date/, readdir DH ) { # no @files temp array


    > open (data, "$dir$file") or die "Cannot open file: $!";



    Your working program will become a non-working program when you update
    your perl and it includes a new function named data().

    It looks like you are missing a directory separator, but you're not.
    It would be better if it didn't look like a mistake. Leave the
    trailing slash out of $dir, and put it in explicitly when it is needed.

    You should report the filename in the diagnostic message again.

    The 3-argument form of open() is much much better, so you should get
    used to using that form all of the time.

    open my $data, '<', "$dir/$file" or die "Cannot open '$dir/$file' $!";


    > while ( <data> ) {



    while ( <$data> ) {


    > if ( /JOBNAME/ ) {
    > $job = substr($_,8,8);
    > chomp($job);



    Why do you give substr args that capture a newline only to remove the newline?

    Just grab one less character, then you won't need a chomp:

    my $job = substr($_, 8, 7);


    > }
    > elsif ( /\.TIF\s{12}/ ) {
    > my $box = substr($_,73,6);
    > printf "$box $job $file\n";
    > last;
    > }



    You should line up your closing brace with the keyword that goes with
    the opening brace, and indent the code to show its structure:

    elsif ( /\.TIF\s{12}/ ) {
    my $box = substr($_,73,6);
    printf "$box $job $file\n";
    last;
    }


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Nov 2, 2006
    #7
  8. banker123 <> wrote:
    >> > Question
    >> > 1. How do I sort the variables in a scalar context not an array?
    >> > Should I be reading the variables I have extracted into an array?

    >>
    >> If you want to sort them, then you need to store them. Either an array,
    >> or maybe a hash.

    >
    > How do I store the variables into an array, push or pop command,



    Either one is fine, since you are going to rearrange their order
    later when you get to the sorting part anyway.


    > does
    > this have much overhead?



    It is a difference of a few microseconds, and below you say wasting
    thousands of microseconds is acceptable, so asking this is stepping
    over dollars in order to pick up pennies.


    > I woul like the saort to point to a sort
    > order file that containas priorites. Example:
    >
    > Variables
    > Box 1
    > Box 2
    > Box 3



    Those are not variables.

    Perhaps you meant values instead?

    If you show it in Perl code, it will be much less ambiguous.

    Have you seen the Posting Guidelines that are posted here frequently?


    >> Every time you execute this script, it reprocesses the same files it
    >> already processed the last time. That is the fundamental inefficiency,
    >> but how to fix it depends on external factors.

    >
    > Understood this is running real-time therefore the re-processing is
    > acceptable.



    Huh?

    If you have time constraints, then wasting time is UNacceptable.

    If you do not have time constraints, why do you keep harping on efficieny?


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Nov 2, 2006
    #8
  9. Mirco Wahab <-halle.de> wrote:
    > Thus spoke banker123 (on 2006-11-02 20:36):
    >
    >> 2. This program executes every 15 seconds to output the contents of
    >> the directory. The output is used to control work flow and meet crucial
    >> deadlines. Is this effeciently written? Any suggestions?
    >>
    >> My first "production" perl program, after reading and learning perl for
    >> about a month.

    >
    > This is really good for a "first time program" ;-)
    >
    > I don't know your files, but I straightened the
    > program a bit (all programmers do so with other
    > peoples code ;-), maybe it won't work (in the third
    > part) because I had to guess what your data is like.
    >
    > ...
    > # date stuff
    > my ($d, $m, $y) = (localtime)[3..5];
    > my $date = sprintf "%02d"x3, $m+1, $d, $y-100;
    >
    > # directory stuff
    > my $dir = shift or die "no dir given";
    > opendir(DH, $dir) || die "$dir: $!";
    > my @files = grep /^$date/, readdir(DH);
    > closedir DH;
    >
    > # file stuff
    > foreach my $fname (@files) {
    > open(my $fh, "$dir$fname") or die "$fname: $!";
    > my ($job, $box);
    > while( <$fh> ) {
    > # extract something
    > ($job) = /.{8}(.{8})/ if /JOBNAME/;
    > # extract something else
    > ($box) = /.{73}(.{6})/
    > and printf "$box $job %s $fname\n", ' 'x20
    > and last if /\.TIF\s{12}/
    > }
    > close $fh;
    > }
    >
    >
    >
    > Don't forget to close your files, otherwise
    > your program will fail sensational ...



    How will it fail?

    Perl closes any open filehandles when it exits anyway.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Nov 2, 2006
    #9
  10. banker123

    Guest

    wrote:
    > Thus spoke Tad McClellan (on 2006-11-03 00:21):
    >
    > >> # file stuff
    > >> foreach my $fname (@files) {
    > >> open(my $fh, "$dir$fname") or die "$fname: $!";
    > >> ...
    > >> ...
    > >> close $fh;
    > >> }
    > >>
    > >> Don't forget to close your files, otherwise
    > >> your program will fail sensational ...

    > >
    > >
    > > How will it fail?
    > > Perl closes any open filehandles when it exits anyway.

    >
    > If you have several thousand file names in @files,
    > the O.P.'s Windows OS should bark after the first
    > hundred opened - but not closed ...


    lexical file handles are automatically closed when they go out of
    scope.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Nov 3, 2006
    #10
  11. banker123

    Mirco Wahab Guest

    Thus spoke (on 2006-11-03 16:14):
    > wrote:
    >> Thus spoke Tad McClellan (on 2006-11-03 00:21):
    >> > How will it fail?
    >> > Perl closes any open filehandles when it exits anyway.

    >> If you have several thousand file names in @files,
    >> the O.P.'s Windows OS should bark after the first
    >> hundred opened - but not closed ...

    >
    > lexical file handles are automatically closed
    > when they go out of scope.


    Not bad!

    I didn't know that.

    Thanks & Regards!

    Mirco
    Mirco Wahab, Nov 3, 2006
    #11
  12. banker123

    Ben Morrow Guest

    Quoth :
    > wrote:
    > > Thus spoke Tad McClellan (on 2006-11-03 00:21):
    > >
    > > >> # file stuff
    > > >> foreach my $fname (@files) {
    > > >> open(my $fh, "$dir$fname") or die "$fname: $!";
    > > >> ...
    > > >> ...
    > > >> close $fh;
    > > >> }
    > > >>
    > > >> Don't forget to close your files, otherwise
    > > >> your program will fail sensational ...
    > > >
    > > >
    > > > How will it fail?
    > > > Perl closes any open filehandles when it exits anyway.

    > >
    > > If you have several thousand file names in @files,
    > > the O.P.'s Windows OS should bark after the first
    > > hundred opened - but not closed ...

    >
    > lexical file handles are automatically closed when they go out of
    > scope.


    However, if you are writing to the filehandle, you should close them
    yourself *and check for an error*, as a write error (disk full, network
    down, or whatever) may not be reported until the file is closed.

    Ben

    --
    I touch the fire and it freezes me, []
    I look into it and it's black.
    Why can't I feel? My skin should crack and peel---
    I want the fire back... Buffy, 'Once More With Feeling'
    Ben Morrow, Nov 4, 2006
    #12
  13. On Fri, 03 Nov 2006 06:08:50 +0100, Mirco Wahab wrote:

    > If you have several thousand file names in @files,
    > the O.P.'s Windows OS should bark after the first
    > hundred opened - but not closed ...
    >
    > Maybe I'm wrong and it's a modern OS these days ...


    Yes, you're wrong. Windows supported open files limited only by memory
    when many unixes didn't.

    M4
    -= Not a Windows fan at all =-

    --
    Redundancy is a great way to introduce more single points of failure.
    Martijn Lievaart, Nov 4, 2006
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. DKode

    effecient template system

    DKode, Jun 7, 2005, in forum: ASP .Net
    Replies:
    2
    Views:
    325
    DKode
    Jun 8, 2005
  2. Amit

    Which is more effecient ?

    Amit, Jul 14, 2003, in forum: C Programming
    Replies:
    10
    Views:
    563
    Chris Dollin
    Jul 18, 2003
  3. Tung Chau
    Replies:
    0
    Views:
    367
    Tung Chau
    Aug 6, 2004
  4. Arne Vajhøj

    Re: Effecient way of creating XML

    Arne Vajhøj, Sep 28, 2010, in forum: Java
    Replies:
    4
    Views:
    263
    Tom Anderson
    Sep 29, 2010
  5. Pitch
    Replies:
    27
    Views:
    629
    RedGrittyBrick
    Oct 6, 2010
Loading...

Share This Page