Sorting and Writing Effecient Code

B

banker123

I have the code below that opens a directory, determines the current
date and extracts the file names that start with the current date, then
loops through the contents of those files to extract data in a scalar
context.

Question
1. How do I sort the variables in a scalar context not an array?
Should I be reading the variables I have extracted into an array?

2. This program executes every 15 seconds to output the contents of
the directory. The output is used to control work flow and meet crucial
deadlines. Is this effeciently written? Any suggestions?

My first "production" perl program, after reading and learning perl for
about a month.

#Open directory
my $dir="G:/Formware/files/";
opendir DH, $dir or die "Cannot open$!";

#Get system date in the format mmddyy
@months = qw(01 02 03 04 05 06 07 08 09 10 11 12);
@days = qw(01 02 03 04 05 06 07 08 09 10..31);
($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek,
$dayOfYear, $daylightSavings) = localtime();
$dayOfMonth=sprintf("%02d", $dayOfMonth);
$date = "$months[$month]$dayOfMonth06";

#Read each file beginning with system date from directory
@files=grep(/^$date/,readdir(DH));
closedir (DH);


foreach $file (@files) {
open (data, "$dir$file") or die "Cannot open file: $!";
while ( <data> ) {
if ( /JOBNAME/ ) {
$job = substr($_,8,8);
chomp($job);
}
elsif ( /\.TIF\s{12}/ ) {
my $box = substr($_,73,6);
printf "$box $job $file\n";
last;
}
}}
 
X

xhoster

banker123 said:
I have the code below that opens a directory, determines the current
date and extracts the file names that start with the current date, then
loops through the contents of those files to extract data in a scalar
context.

You seem to be mis-using the term "scalar context", in a way that
renders your question meaningless.
Question
1. How do I sort the variables in a scalar context not an array?
Should I be reading the variables I have extracted into an array?

If you want to sort them, then you need to store them. Either an array,
or maybe a hash.
2. This program executes every 15 seconds to output the contents of
the directory. The output is used to control work flow and meet crucial
deadlines. Is this effeciently written? Any suggestions?

You don't seem to be using strict. You should. As long as the program
does whatever it needs to do fast enough so that it is not the bottleneck,
it is hard to see how its efficiency would matter.
My first "production" perl program, after reading and learning perl for
about a month.

#Open directory
my $dir="G:/Formware/files/";
opendir DH, $dir or die "Cannot open$!";

You should probably use a lexical dir handle.

opendir my $dh, $dir or die $!;
#Read each file beginning with system date from directory
@files=grep(/^$date/,readdir(DH));

If the list can be big, you may not want to read everything into memory
at once.

while (defined (my $file=readdir($dh))) {
next unless $file =~ /^$date/;

Every time you execute this script, it reprocesses the same files it
already processed the last time. That is the fundamental inefficiency,
but how to fix it depends on external factors.

By the way, there was no indication anywhere in your code about what it is
you want to sort, or how you want it sorted.

Xho
 
B

banker123

Question
If you want to sort them, then you need to store them. Either an array,
or maybe a hash.

How do I store the variables into an array, push or pop command, does
this have much overhead? I woul like the saort to point to a sort
order file that containas priorites. Example:

Variables
Box 1
Box 2
Box 3

Prioroties table
Box 3
Box 1
Box 2

Sorted Array
Box 3
Box 1
Box 2

Every time you execute this script, it reprocesses the same files it
already processed the last time. That is the fundamental inefficiency,
but how to fix it depends on external factors.

Understood this is running real-time therefore the re-processing is
acceptable.
 
J

John W. Krahn

banker123 said:
I have the code below that opens a directory, determines the current
date and extracts the file names that start with the current date, then
loops through the contents of those files to extract data in a scalar
context.

Question
1. How do I sort the variables in a scalar context not an array?
Should I be reading the variables I have extracted into an array?

2. This program executes every 15 seconds to output the contents of
the directory. The output is used to control work flow and meet crucial
deadlines. Is this effeciently written? Any suggestions?

My first "production" perl program, after reading and learning perl for
about a month.

#Open directory
my $dir="G:/Formware/files/";
opendir DH, $dir or die "Cannot open$!";

#Get system date in the format mmddyy
@months = qw(01 02 03 04 05 06 07 08 09 10 11 12);
@days = qw(01 02 03 04 05 06 07 08 09 10..31);

You don't need those two arrays.
($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek,
$dayOfYear, $daylightSavings) = localtime();
$dayOfMonth=sprintf("%02d", $dayOfMonth);
$date = "$months[$month]$dayOfMonth06";

That won't work because the variable $dayOfMonth06 doesn't contain any data.
You probably want:

my ( $day, $month, $year ) = ( localtime )[ 3, 4, 5 ];
my $date = sprintf '%02d%02d%02d', $month + 1, $day, $year % 100;

#Read each file beginning with system date from directory
@files=grep(/^$date/,readdir(DH));
closedir (DH);


foreach $file (@files) {
open (data, "$dir$file") or die "Cannot open file: $!";
while ( <data> ) {
if ( /JOBNAME/ ) {
$job = substr($_,8,8);
chomp($job);

Do you really need to use chomp?
}
elsif ( /\.TIF\s{12}/ ) {
my $box = substr($_,73,6);
printf "$box $job $file\n";
last;
}
}}


John
 
J

John W. Krahn

Mirco said:
Thus spoke banker123 (on 2006-11-02 20:36):
2. This program executes every 15 seconds to output the contents of
the directory. The output is used to control work flow and meet crucial
deadlines. Is this effeciently written? Any suggestions?

My first "production" perl program, after reading and learning perl for
about a month.

This is really good for a "first time program" ;-)

I don't know your files, but I straightened the
program a bit (all programmers do so with other
peoples code ;-), maybe it won't work (in the third
part) because I had to guess what your data is like.

...
# date stuff
my ($d, $m, $y) = (localtime)[3..5];
my $date = sprintf "%02d"x3, $m+1, $d, $y-100;
^^^^^^
Subtracting 100 from $y will only work for 21st century dates.

$ perl -le'print( ( 1999 - 1900 ) - 100 )'
-1



John
 
X

xhoster

banker123 said:
How do I store the variables into an array, push or pop command, does
this have much overhead?

Perl's "push" has about as low overhead as you can have and still be using
Perl. If you need lower overhead than push, you are probably using the
wrong language in the first place.
I woul like the saort to point to a sort
order file that containas priorites. Example:

Variables
Box 1
Box 2
Box 3

Prioroties table
Box 3
Box 1
Box 2

Sorted Array
Box 3
Box 1
Box 2

From this example it isn't all that clear what you mean. "Sorted Array"
happens to be an exact copy of "Prioroties table" in this case, but
presumably that is just an accident? I'm guessing you want to keep the
things in priority table's order, but include them only if they also appear
in "Variables". In that case, you would store the "Variables" in a hash.
Then you would loop through priority table, printing out only those things
which appear in the hash.
Understood this is running real-time therefore the re-processing is
acceptable.

I'm not sure I follow that. The more real-time your process is, the less
acceptable it is to not be efficient. Do you mean that some other process
is deleting files, so that large numbers of them won't accumulate?

Xho
 
T

Tad McClellan

banker123 said:
Question
1. How do I sort the variables in a scalar context not an array?


That question makes no sense.

"scalar" means "a single thing".

You cannot sort a single thing, you need more than one thing
before you can impost an order (sorting) on the things.

Should I be reading the variables I have extracted into an array?

Probably.


2. This program executes every 15 seconds to output the contents of
the directory. The output is used to control work flow and meet crucial
deadlines. Is this effeciently written?


"premature optimization is the root of all evil"

(google it)

Any suggestions?


Don't concern yourself much with efficiency until _after_ it has
been demonstrated that what you already have is too slow.

Is it taking close to 15 seconds to execute?

If not, then you are done without having to spend any time tweeking stuff.

My first "production" perl program, after reading and learning perl for
about a month.

#Open directory
my $dir="G:/Formware/files/";


I would suggest leaving the trailing slash out of there.

It looks confusing when you use it later.

opendir DH, $dir or die "Cannot open$!";


It would be nice to know the name of the directory that you were
attempting to open, so you should include that info (along with
some delimiters) in your diagnostic message:

opendir DH, $dir or die "Cannot open '$dir' $!";

#Get system date in the format mmddyy
@months = qw(01 02 03 04 05 06 07 08 09 10 11 12);


You should always put these:

use warnings;
use strict;

at the top of every Perl program. They will catch many common
mistakes for you.

@days = qw(01 02 03 04 05 06 07 08 09 10..31);


You never use the @days array in your subsequent code, so why is it there?

($second, $minute, $hour, $dayOfMonth, $month, $yearOffset, $dayOfWeek,
$dayOfYear, $daylightSavings) = localtime();


If you use a "Slice" (see that section in perldata.pod) you can get
just the ones you want without a bunch of dummy variables that you
never use:

my($day, $month, $year) = (localtime)[3,4,5];
$dayOfMonth=sprintf("%02d", $dayOfMonth);


$day = sprintf '%02d', $day;
$month = sprintf '%02d', $month + 1;

$date = "$months[$month]$dayOfMonth06";
^^
^^

Your program will break in about two months...


$year = sprintf '%02d', ($year + 1900) % 100;
my $date = "$month$day$year";

#Read each file beginning with system date from directory
@files=grep(/^$date/,readdir(DH));


Whitespace is not a scarce resource. Feel free to use as much of it
as you like to make your code easier to read and understand:

@files = grep(/^$date/, readdir(DH));

Even better, eliminate unnecessary parenthesis:

@files = grep /^$date/, readdir DH;
or
@files = grep { /^$date/ } readdir DH;

closedir (DH);


foreach $file (@files) {


Here you read things into an array when you do not need to. Below
you do not read things into an array when you do need to. Get
the logic right first, then worry about performance.

foreach my $file ( grep /^$date/, readdir DH ) { # no @files temp array

open (data, "$dir$file") or die "Cannot open file: $!";


Your working program will become a non-working program when you update
your perl and it includes a new function named data().

It looks like you are missing a directory separator, but you're not.
It would be better if it didn't look like a mistake. Leave the
trailing slash out of $dir, and put it in explicitly when it is needed.

You should report the filename in the diagnostic message again.

The 3-argument form of open() is much much better, so you should get
used to using that form all of the time.

open my $data, '<', "$dir/$file" or die "Cannot open '$dir/$file' $!";

while ( <data> ) {


if ( /JOBNAME/ ) {
$job = substr($_,8,8);
chomp($job);


Why do you give substr args that capture a newline only to remove the newline?

Just grab one less character, then you won't need a chomp:

my $job = substr($_, 8, 7);

}
elsif ( /\.TIF\s{12}/ ) {
my $box = substr($_,73,6);
printf "$box $job $file\n";
last;
}


You should line up your closing brace with the keyword that goes with
the opening brace, and indent the code to show its structure:

elsif ( /\.TIF\s{12}/ ) {
my $box = substr($_,73,6);
printf "$box $job $file\n";
last;
}
 
T

Tad McClellan

banker123 said:
How do I store the variables into an array, push or pop command,


Either one is fine, since you are going to rearrange their order
later when you get to the sorting part anyway.

does
this have much overhead?


It is a difference of a few microseconds, and below you say wasting
thousands of microseconds is acceptable, so asking this is stepping
over dollars in order to pick up pennies.

I woul like the saort to point to a sort
order file that containas priorites. Example:

Variables
Box 1
Box 2
Box 3


Those are not variables.

Perhaps you meant values instead?

If you show it in Perl code, it will be much less ambiguous.

Have you seen the Posting Guidelines that are posted here frequently?

Understood this is running real-time therefore the re-processing is
acceptable.


Huh?

If you have time constraints, then wasting time is UNacceptable.

If you do not have time constraints, why do you keep harping on efficieny?
 
T

Tad McClellan

Mirco Wahab said:
Thus spoke banker123 (on 2006-11-02 20:36):
2. This program executes every 15 seconds to output the contents of
the directory. The output is used to control work flow and meet crucial
deadlines. Is this effeciently written? Any suggestions?

My first "production" perl program, after reading and learning perl for
about a month.

This is really good for a "first time program" ;-)

I don't know your files, but I straightened the
program a bit (all programmers do so with other
peoples code ;-), maybe it won't work (in the third
part) because I had to guess what your data is like.

...
# date stuff
my ($d, $m, $y) = (localtime)[3..5];
my $date = sprintf "%02d"x3, $m+1, $d, $y-100;

# directory stuff
my $dir = shift or die "no dir given";
opendir(DH, $dir) || die "$dir: $!";
my @files = grep /^$date/, readdir(DH);
closedir DH;

# file stuff
foreach my $fname (@files) {
open(my $fh, "$dir$fname") or die "$fname: $!";
my ($job, $box);
while( <$fh> ) {
# extract something
($job) = /.{8}(.{8})/ if /JOBNAME/;
# extract something else
($box) = /.{73}(.{6})/
and printf "$box $job %s $fname\n", ' 'x20
and last if /\.TIF\s{12}/
}
close $fh;
}



Don't forget to close your files, otherwise
your program will fail sensational ...


How will it fail?

Perl closes any open filehandles when it exits anyway.
 
X

xhoster

Thus spoke Tad McClellan (on 2006-11-03 00:21):


If you have several thousand file names in @files,
the O.P.'s Windows OS should bark after the first
hundred opened - but not closed ...

lexical file handles are automatically closed when they go out of
scope.

Xho
 
M

Mirco Wahab

Thus spoke (e-mail address removed) (on 2006-11-03 16:14):
lexical file handles are automatically closed
when they go out of scope.

Not bad!

I didn't know that.

Thanks & Regards!

Mirco
 
B

Ben Morrow

Quoth (e-mail address removed):
lexical file handles are automatically closed when they go out of
scope.

However, if you are writing to the filehandle, you should close them
yourself *and check for an error*, as a write error (disk full, network
down, or whatever) may not be reported until the file is closed.

Ben
 
M

Martijn Lievaart

If you have several thousand file names in @files,
the O.P.'s Windows OS should bark after the first
hundred opened - but not closed ...

Maybe I'm wrong and it's a modern OS these days ...

Yes, you're wrong. Windows supported open files limited only by memory
when many unixes didn't.

M4
-= Not a Windows fan at all =-
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top