Return only the directories from readdir

Discussion in 'Perl Misc' started by Michael-John Tavantzis, Mar 7, 2013.

  1. There is a slowdown associated with reading the contents of a directory that contains a large number of files. Since I am only interested in sub-directory names, is there a way to return only the directory names and skip the files? Or at least to quickly determine whether a sub-directory exists?

    Here is the code I'm starting from... I would have liked to be able to put a "-d" in front of readdir.

    opendir(my $directory_handle, $directory) || die("Cannot open directory");
    my @searchSet= readdir($directory_handle);
     
    Michael-John Tavantzis, Mar 7, 2013
    #1

  2. Michael-John Tavantzis <> writes:
    > There is a slowdown associated with reading the contents of a
    > directory that contains large amounts of files. Since I am only
    > interested in sub-directory names is there a way to return the
    > directory names only and skip returning the files? Or quickly
    > determine if there exists a sub-directory at least?
    >
    > Here is the code i'm starting from... would have liked to be able to
    > put a "-d" in front of readdir.


    No. You actually need to go through the list of returned names and
    test them one-by-one in order to determine which are _presently_[*]
    directories.

    [*] This doesn't mean they will still be directories by the time some
    'other code' uses them, cf

    http://cwe.mitre.org/data/definitions/367.html
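
    A minimal sketch of the one-by-one test described above, assuming a hypothetical helper named list_subdirs (not from the thread). It still reads every entry and checks each one, so it does not avoid the slowdown, and the result only reflects the state at the moment of each check.

```perl
use strict;
use warnings;

# Hypothetical helper: names of the entries in $dir that are directories
# at the moment each one is tested. "." and ".." are skipped.
sub list_subdirs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # readdir returns bare names, so prepend $dir before applying -d.
    my @subdirs = grep { $_ ne '.' && $_ ne '..' && -d "$dir/$_" } readdir($dh);
    closedir($dh);
    return @subdirs;
}

# Usage: list the sub-directories of the first argument (default: current directory).
my @found = list_subdirs(@ARGV ? $ARGV[0] : '.');
print "$_\n" for sort @found;
```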
     
    Rainer Weikusat, Mar 7, 2013
    #2

  3. Willem writes:

    Michael-John Tavantzis wrote:
    ) There is a slowdown associated with reading the contents of a directory that contains large amounts of files. Since I am only interested in sub-directory names is there a way to return the directory names only and skip returning the files? Or quickly determine if there exists a sub-directory at least?
    )
    ) Here is the code i'm starting from... would have liked to be able to put a "-d" in front of readdir.
    )
    ) opendir(my $directory_handle, $directory) || die("Cannot open directory");
    ) my @searchSet= readdir($directory_handle);

    This seems to work (note the slash at the end):

    my @searchSet = glob("$directory/*/");

    But I have no idea of its efficiency.
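
    A small sketch of the glob approach, assuming a hypothetical helper named glob_subdirs. It uses bsd_glob from the core File::Glob module instead of plain glob, because plain glob splits its pattern on whitespace and would break on directory names containing spaces. Two things to keep in mind: the returned paths include the leading directory and the trailing slash, and "*" does not match names that start with a dot, so hidden directories are skipped. Glob metacharacters inside $dir itself would also be interpreted as part of the pattern.

```perl
use strict;
use warnings;
use File::Glob ':bsd_glob';

# Hypothetical helper: bare names of the non-hidden sub-directories of $dir.
sub glob_subdirs {
    my ($dir) = @_;
    # The trailing slash restricts the matches to directories.
    my @paths = bsd_glob("$dir/*/");
    # Strip the leading path and the trailing slash to get bare names.
    return map { m{([^/]+)/\z} ? $1 : () } @paths;
}
```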


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
     
    Willem, Mar 7, 2013
    #3
  4. Willem <> writes:
    > Michael-John Tavantzis wrote:
    > ) There is a slowdown associated with reading the contents of a directory that contains large amounts of files. Since I am only interested in sub-directory names is there a way to return the directory names only and skip returning the files? Or quickly determine if there exists a sub-directory at least?
    > )
    > ) Here is the code i'm starting from... would have liked to be able to put a "-d" in front of readdir.
    > )
    > ) opendir(my $directory_handle, $directory) || die("Cannot open directory");
    > ) my @searchSet= readdir($directory_handle);
    >
    > This seems to work (note the slash at the end):
    >
    > my @searchSet = glob("$directory/*/");
    >
    > But I have no idea of its efficiency.


    It still needs to read the complete directory.
     
    Rainer Weikusat, Mar 7, 2013
    #4
  5. On Thursday, March 7, 2013 1:25:15 PM UTC-5, Henry Law wrote:
    > On 07/03/13 17:46, Michael-John Tavantzis wrote:
    >
    > > There is a slowdown associated with reading the contents of a directory that contains large amounts of files. Since I am only interested in sub-directory names is there a way to return the directory names only and skip returning the files? Or quickly determine if there exists a sub-directory at least?
    > >
    > > Here is the code i'm starting from... would have liked to be able to put a "-d" in front of readdir.
    >
    > I can see your need but, thinking about how the OS is doing this, I
    > doubt there's anything to be done. Directories (in Linux and
    > DOS/Windows at any rate) are just files, with a bit set in the file
    > descriptor to say that they're a directory. No sub-directory contains
    > any information as to whether any other file in the same directory is
    > also a sub-directory, or for that matter whether any of the files in the
    > containing directory are themselves directories.
    >
    > So I'd doubt very much if the OS has any way of finding sub-directories
    > other than reading every entry and deciding there and then whether or
    > not it's a sub-directory.
    >
    > I'm open to contradiction, though: this is entirely theoretical. And
    > some less-common OS's (IBM MVS (z/OS) or VM (z/VM), maybe) did their
    > disk management in different ways; it's possible that they maintained a
    > separate index of directories and sub-directories, which would allow
    > Perl to enumerate them directly.

    Yes, that's what I was hoping for, or at least for the OS to have the folders and files sorted, so there was a way of doing this. Thank you for the replies.

    > --
    > Henry Law Manchester, England
     
    Michael-John Tavantzis, Mar 7, 2013
    #5
  6. Alan Curry writes:

    In article <>,
    Michael-John Tavantzis <> wrote:
    >There is a slowdown associated with reading the contents of a directory
    >that contains large amounts of files. Since I am only interested in
    >sub-directory names is there a way to return the directory names only
    >and skip returning the files? Or quickly determine if there exists a
    >sub-directory at least?


    The last question does have an answer for unix filesystems: the link count of
    a directory is 2 plus the number of subdirectories. The first 2 links are the
    directory's name in its parent and its own ".", the rest are the ".." links
    from the children.

    perl -e '
    opendir(my $d, "/tmp") or die $!;
    my $subdirs = (stat $d)[3] - 2;
    print "/tmp has $subdirs subdirs\n"'

    You don't need to look for subdirectories if (stat $dirhandle)[3]-2==0, if
    you can be sure that the filesystem is proper unix, not FAT or anything else
    weird.
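
    A hedged sketch of how that check could be wrapped up, assuming a hypothetical helper named subdir_hint. It only gives a meaningful answer on filesystems that follow the traditional unix link-count convention. Some filesystems report a link count of 1 for every directory, so the sketch treats anything below 2 as "unknown" rather than guessing.

```perl
use strict;
use warnings;

# Hypothetical helper. Returns 1 if $dir appears to have sub-directories,
# 0 if it appears to have none, and undef if the link count cannot be
# trusted (it is below 2, so the "2 plus sub-directories" rule is not in use).
sub subdir_hint {
    my ($dir) = @_;
    my $nlink = (stat $dir)[3];
    die "Cannot stat $dir: $!" unless defined $nlink;
    return undef if $nlink < 2;
    return $nlink > 2 ? 1 : 0;
}
```

    Only when the hint is 1 or undef is there any need to fall back to scanning the directory entries.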

    Some systems provide a d_type field in struct dirent, which allows you to
    find out which returned items are directories without stat'ing them. This
    would speed up your search because it means you don't have to read the
    inodes. And it has a clean failure mode on unsupported filesystems, so you
    can fall back to stat and not worry about breakage when that unexpected FAT
    filesystem shows up.

    I see there's an IO::Dirent module which provides access to the d_type
    feature in perl.

    --
    Alan Curry
     
    Alan Curry, Mar 7, 2013
    #6