Steve Allan
One of my colleagues who is new to Perl wrote a script that used
recursion to do what File::Find is designed to do (he didn't know
about File::Find). He sent it to me, and I sent him back a new
version using File::Find. He later wrote back that my version was
SLOWER!
Skeptical, I reduced both scripts to versions that simply search for
files and put them into an array. I then benchmarked the two and he
was right. Here's what cmpthese() reported using 20 iterations:
             s/iter File::Find  Recursion
File::Find     5.19         --       -70%
Recursion      1.58       229%         --
Is poor performance a known problem with File::Find, or am I using it
improperly? It's hard to believe it could be slower than recursion.
Any insights on why I'm seeing such a discrepancy?
I'm using ActiveState Perl, version 5.8.0, on Windows XP.
Below is the benchmark script I used. The variable $rootdir is set to
a directory on my machine that contains over 12000 files and numerous
subdirectories. You should only have to modify that one line to run
the script - if you're so inclined :^)
Thanks.
--
-- Steve
#!/usr/bin/perl -w
use strict;
use Benchmark qw(:all);
my $rootdir = 'd:/projects/vhi/griffin/dev';
chdir $rootdir or die "Can't change to $rootdir: $!";
cmpthese (20, {
'File::Find' => \&findfiles,
'Recursion' => \&recurse,
}
);
#findfiles();
#recurse();
#======================================================================
# New version uses File::Find
#======================================================================
sub findfiles {
use File::Find;
my @flist;
# search for files not in have list
find (sub { push @flist => $File::Find::name if -f }, '.');
print "In Findfiles: found ", scalar @flist, " files\n";
}
#======================================================================
# Old version uses recursion
#======================================================================
my @globallist;
sub recurse {
use Cwd;
@globallist = ();
# recurse through sub directories checking for known files
opendir CURRENTDIR, "." or die "Cannot open directory ", cwd(), ": $!";
my @files = grep !/^\.\.?$/, readdir CURRENTDIR;
closedir CURRENTDIR;
for (@files) {
checkFile($_, '.');
}
print "In Recurse: found ", scalar @globallist, " files\n";
}
sub checkFile {
my $fileToCheck = $_[0];
my $currentP4Dir = $_[1];
# first see if it is a directory
if ( chdir $fileToCheck == 1 ) {
$currentP4Dir = $currentP4Dir."/".$fileToCheck;
# go through the new directory
opendir CURRENTDIR, "." or die "Cannot open directory ".cwd().": $!";
my @files = grep !/^\.\.?$/, readdir CURRENTDIR;
closedir CURRENTDIR;
for (@files) {
checkFile($_, $currentP4Dir);
}
# after finishing, go up a directory
chdir "..";
$currentP4Dir =~ s/^(.*)\/.*$/$1/;
}
else {
#print "$fileToCheck\n";
push @globallist => $fileToCheck;
}
}
#=================================== EOF ======================================