Return only the directories from readdir

Michael-John Tavantzis · Mar 7, 2013

Reading the contents of a directory that contains a large number of files is slow. Since I am only interested in sub-directory names, is there a way to return only the directory names and skip the files? Or at least to quickly determine whether a sub-directory exists?

Here is the code I'm starting from. I would have liked to be able to put a "-d" in front of readdir.

opendir(my $directory_handle, $directory) || die("Cannot open directory");
my @searchSet= readdir($directory_handle);

Rainer Weikusat · Mar 7, 2013

Michael-John Tavantzis said:
There is a slowdown associated with reading the contents of a
directory that contains large amounts of files. Since I am only
interested in sub-directory names is there a way to return the
directory names only and skip returning the files? Or quickly
determine if there exists a sub-directory at least?

Here is the code i'm starting from... would have liked to be able to
put a "-d" in front of readdir.

No. You actually need to go through the list of returned names and
test them one-by-one in order to determine which are _presently_[*]
directories.

[*] This doesn't mean they will still be directories by the time some
'other code' uses them, cf

http://cwe.mitre.org/data/definitions/367.html
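
A minimal sketch of the one-by-one test described above, using only core Perl. The helper name subdirs_of is hypothetical and not from the thread. Note that -d needs the full path rather than the bare name that readdir returns, and that the result is only a snapshot of the directory at the time of the test.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of the immediate subdirectories
# of $dir, without "." and "..".
sub subdirs_of {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # readdir returns bare names, so -d must be given "$dir/$name".
    # -d follows symbolic links, so a link to a directory also matches.
    my @subdirs = grep { $_ ne '.' && $_ ne '..' && -d "$dir/$_" } readdir($dh);
    closedir($dh);
    return @subdirs;
}

my @names = subdirs_of('.');
print "$_\n" for sort @names;
```

This still reads every entry and stats each one, so it does not avoid the slowdown; it only filters the result.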

Willem · Mar 7, 2013

Michael-John Tavantzis wrote:
) There is a slowdown associated with reading the contents of a directory that contains large amounts of files. Since I am only interested in sub-directory names is there a way to return the directory names only and skip returning the files? Or quickly determine if there exists a sub-directory at least?
)
) Here is the code i'm starting from... would have liked to be able to put a "-d" in front of readdir.
)
) opendir(my $directory_handle, $directory) || die("Cannot open directory");
) my @searchSet= readdir($directory_handle);

This seems to work (note the slash at the end):

my @searchSet = glob("$directory/*/");

But I have no idea of its efficiency.

SaSW, Willem

Rainer Weikusat · Mar 7, 2013

Willem said:
Michael-John Tavantzis wrote:
) There is a slowdown associated with reading the contents of a directory that contains large amounts of files. Since I am only interested in sub-directory names is there a way to return the directory names only and skip returning the files? Or quickly determine if there exists a sub-directory at least?
)
) Here is the code i'm starting from... would have liked to be able to put a "-d" in front of readdir.
)
) opendir(my $directory_handle, $directory) || die("Cannot open directory");
) my @searchSet= readdir($directory_handle);

This seems to work (note the slash at the end):

my @searchSet = glob("$directory/*/");

But I have no idea of its efficiency.

It still needs to read the complete directory.
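
For completeness, a hedged sketch of the glob variant that copes with whitespace in the directory name. It assumes the core File::Glob module and its bsd_glob function, which does not split the pattern on spaces the way the built-in glob does. The helper name glob_subdirs is hypothetical. As noted above, this still has to read the whole directory.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: return the paths of the immediate subdirectories
# of $dir using the trailing-slash glob pattern.
sub glob_subdirs {
    my ($dir) = @_;
    # The trailing slash restricts matches to directories.
    # "*" does not match names that start with a dot.
    # Glob metacharacters inside $dir itself are still interpreted.
    my @paths = bsd_glob("$dir/*/");
    s{/\z}{} for @paths;    # drop the trailing slash from each match
    return @paths;
}
```

Calling glob_subdirs("/some/dir") returns full paths such as "/some/dir/sub" rather than bare names.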

Michael-John Tavantzis · Mar 7, 2013

I can see your need but, thinking about how the OS is doing this, I doubt there's anything to be done. Directories (in Linux and DOS/Windows at any rate) are just files, with a bit set in the file descriptor to say that they're a directory. No sub-directory contains any information as to whether any other file in the same directory is also a sub-directory, or for that matter whether any of the files in the containing directory are themselves directories.

So I'd doubt very much if the OS has any way of finding sub-directories other than reading every entry and deciding there and then whether or not it's a sub-directory.

I'm open to contradiction, though: this is entirely theoretical. And some less-common OS's (IBM MVS (z/OS) or VM (z/VM), maybe) did their disk management in different ways; it's possible that they maintained a separate index of directories and sub-directories, which would allow Perl to enumerate them directly.

Yes, that's what I was hoping for, or at least for the OS to have the folders and files sorted, so there was a way of doing this. Thank you for the replies.

Alan Curry · Mar 7, 2013

There is a slowdown associated with reading the contents of a directory
that contains large amounts of files. Since I am only interested in
sub-directory names is there a way to return the directory names only
and skip returning the files? Or quickly determine if there exists a
sub-directory at least?

The last question does have an answer for unix filesystems: the link count of
a directory is 2 plus the number of subdirectories. The first 2 links are the
directory's name in its parent and its own ".", the rest are the ".." links
from the children.

perl -e '
opendir(my $d, "/tmp") or die $!;
my $subdirs = (stat $d)[3] - 2;
print "/tmp has $subdirs subdirs\n"'

You don't need to look for subdirectories if (stat $dirhandle)[3]-2==0, if
you can be sure that the filesystem is proper unix, not FAT or anything else
weird.
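
A sketch of how the link-count shortcut might be combined with an early-exit scan to answer only the "is there at least one sub-directory" question. The helper name has_subdir is hypothetical. It assumes that a link count of exactly 2 reliably means "no subdirectories" on the filesystem in question; any other value is treated as "unknown" and triggers a scan, since some filesystems do not follow the traditional convention.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if $dir contains at least one real
# subdirectory (symbolic links are not counted), 0 otherwise.
sub has_subdir {
    my ($dir) = @_;
    my $nlink = (stat $dir)[3];
    die "Cannot stat $dir: $!" unless defined $nlink;

    # Assumption: on a traditional Unix filesystem a link count of 2
    # means there are no ".." entries pointing back from children.
    return 0 if $nlink == 2;

    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # Read one entry at a time and stop at the first directory found.
    while (defined(my $name = readdir $dh)) {
        next if $name eq '.' || $name eq '..';
        # lstat does not follow symbolic links, matching the link count.
        if (lstat("$dir/$name") && -d _) {
            closedir $dh;
            return 1;
        }
    }
    closedir $dh;
    return 0;
}
```

In the worst case (many files, no subdirectories, unhelpful link count) the loop still visits every entry, but it avoids building the full list in memory and stops as soon as the answer is known.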

Some systems provide a d_type field in struct dirent, which allows you to
find out which returned items are directories without stat'ing them. This
would speed up your search because it means you don't have to read the
inodes. And it has a clean failure mode on unsupported filesystems, so you
can fall back to stat and not worry about breakage when that unexpected FAT
filesystem shows up.

I see there's an IO::Dirent module which provides access to the d_type
feature in perl.
