How do I get more-detailed directory info?

Discussion in 'Perl Misc' started by Robbie Hatley, Sep 26, 2005.

  1. Greetings, group. This is the first time I've posted here,
    or in any Perl-related forum for that matter. I'm a rank
    beginner at Perl. I'm trying to teach myself the language
    from reading books (incl. one with a camel on the cover, and
    one with a llama; not that anyone here would recognize those,
    of course), trying my own Perl hacks, and struggling
    with compiler error messages.

    The training program I'm currently trying to write is a
    duplicate-file finding/removing program. A rather elaborate
    program when written in C++. An engineer friend told me it
    would be much simpler in Perl. I had to tell him, "Cool, but
    I don't know Perl". He seemed offended and said "Your loss.".

    So... pasted below is my first (incomplete) attempt at writing
    a real Perl program.

    I have hundreds of questions about Perl, but for now I'll
    ask this group just two questions:

    1:
    How do I get a more-detailed directory listing than is offered
    by the readdir function? Is there any way to goad that
    function into coughing up file-type (file or directory or link),
    size in bytes, mod-time, mod-date, attributes, etc.? Or do I
    have to use some other approach to get that data?

    2:
    Is there a better way to emulate the C++ concept of a "list of
    structs" than what I'm doing below? (I'm using an array of refs
    to hashes.)



    Here's my program at its current state of development:



    ##########################################################################################
    # dedup3.perl
    # Duplicate file finding/erasing program.
    # Written by Robbie Hatley, as a "learn Perl" exercise.
    # Plan: Recursively descend directory tree starting from current working directory,
    # and make a master list of all files encountered on this branch. Order the list by size.
    # Within each size group, compare each file, from left to right, to all the files to
    # its right. If a duplicate pair is found, alert user and get user input. Give user
    # these choices:
    # 1. Erase left file
    # 2. Erase right file
    # 3. Ignore this pair of duplicate files and move to next
    # 4. Quit
    # If user elects to delete a file, delete it, then move to next duplicate file pair.
    ##########################################################################################

    use strict;
    use warnings;

    use Cwd;

    # Not valid Perl; how do I do this???
    # struct FileRecord
    # {
    # std::string Date;
    # std::string Time;
    # std::string Type;
    # long int Size;
    # std::string Attr;
    # std::string Name;
    # };
    #
    # std::list<rhdir::FileRecord> FileList;
    #
    # TODO: How do I extract size, mod-time, mod-date, type (file or dir),
    # attributes, etc. and store in an array of structs (or Perl equiv)???
    #
    # Try an array of hashes?

    my $CurDir = getcwd();
    print "CWD = ", $CurDir, "\n";
    opendir(Dot, ".") or die "Can\'t open the directory!!!";

    my @LocalFiles;
    my $FileName;
    my $FileRecord;

    while ($FileName=readdir(Dot))
    {
    $FileRecord=
    {
    "Date" => "Unknown",
    "Time" => "Unknown",
    "Type" => "Unknown",
    "Size" => 42,
    "Attr" => "Unknown",
    "Name" => $FileName
    };
    push @LocalFiles, $FileRecord;
    }

    closedir(Dot);

    foreach $FileRecord (@LocalFiles)
    {
    print($$FileRecord{"Name"}, "\n");
    }




    --
    Cheers,
    Robbie Hatley
    Tustin, CA, USA
    email: lonewolfintj at pacbell dot net
    web: home dot pacbell dot net slant earnur slant
    Robbie Hatley, Sep 26, 2005
    #1

  2. Robbie Hatley wrote:
    > Greetings, group. This is the first time I've posted here,
    > or in any Perl-related forum for that matter. I'm a rank
    > beginner at Perl. I'm trying to teach myself the language
    > from reading books (incl. one with a camel on the cover, and
    > one with a llama; not that anyone here would recognize those,
    > of course), trying my own Perl hacks, and struggling
    > with compiler error messages.
    >
    > The training program I'm currently trying to write is a
    > duplicate-file finding/removing program. A rather elaborate
    > program when written in C++. An engineer friend told me it
    > would be much simpler in Perl. I had to tell him, "Cool, but
    > I don't know Perl". He seemed offended and said "Your loss.".
    >
    > So... pasted below is my first (incomplete) attempt at writing
    > a real Perl program.
    >
    > I have hundreds of questions about Perl, but for now I'll
    > ask this group just two questions:
    >
    > 1:
    > How do I get a more-detailed directory listing than is offered
    > by the readdir function? Is there any way to goad that
    > function into coughing up file-type (file or directory or link),
    > size in bytes, mod-time, mod-date, attributes, etc.? Or do I
    > have to use some other approach to get that data?
    >
    > 2:
    > Is there a better way to emulate the C++ concept of a "list of
    > structs" than what I'm doing below? (I'm using an array of refs
    > to hashes.)
    >
    >


    [ code snipped ]

    Here is an example:
    Where to look:
    perldoc -f stat (more detailed information about a file)
    perldoc perldsc, Perl Data Structures Cookbook (ARRAYS OF HASHES)


    --- code---
    use Cwd;
    use strict;

    my $CurDir = getcwd();
    print "CWD = ", $CurDir, "\n";
    opendir(DOT, ".") or die "Can\'t open the directory!!!";

    my @LocalFiles;
    my $FileName;
    my @files = readdir(DOT);
    my $Rec;
    foreach (@files)
    {
    my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
    $atime,$mtime,$ctime,$blksize,$blocks) = stat($_);
    my $FileRecord = {};
    $FileRecord->{Date} = "Unknown";
    $FileRecord->{Time} = $mtime;
    $FileRecord->{Type} = $mode;
    $FileRecord->{Size} = $size;
    $FileRecord->{Attr} = "Unknown";
    $FileRecord->{Name} = $_;

    push(@LocalFiles, $FileRecord);
    }

    closedir(DOT);


    my $role;
    foreach $Rec (@LocalFiles)
    {
    print "{ ";
    for $role (keys %$Rec)
    {
    print "$role=" . $Rec->{$role} . " ";
    }
    print " }\n";
    }

    --- code ---
    --
    regards,
    Reinhard
    Reinhard Pagitsch, Sep 26, 2005
    #2

  3. Robbie Hatley wrote:
    > 1:
    > How do I get a more-detailed directory listing than is offered
    > by the readdir function? Is there any way to goad that
    > function into coughing up file-type (file or directory or link),
    > size in bytes, mod-time, mod-date, attributes, etc.? Or do I
    > have to use some other approach to get that data?


    That data is information about individual files. See stat() and/or the
    file test operators (e.g. -M, -f, ...)

    > 2:
    > Is there a better way to emulate the C++ concept of a "list of
    > structs" than what I'm doing below? (I'm using an array of refs
    > to hashes.)


    I think that's the most perlish way.
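
    A minimal sketch of the array-of-hash-refs approach applied to the "order the list by size" step of the plan. The sample records and the variable names (@by_size, %group) are made up for illustration; in the real program the fields would come from stat().

```perl
use strict;
use warnings;

# Hypothetical sample records; a real program would fill these from stat().
my @files = (
    { Name => 'a.txt', Size => 120 },
    { Name => 'b.txt', Size => 42  },
    { Name => 'c.txt', Size => 120 },
);

# Sort the array of hash refs by Size, then by Name for a predictable order.
my @by_size = sort { $a->{Size} <=> $b->{Size} or $a->{Name} cmp $b->{Name} } @files;

# Group records by size: each hash value is a reference to an array of records.
my %group;
push @{ $group{ $_->{Size} } }, $_ for @by_size;

# Only sizes shared by two or more files can contain duplicates.
for my $size ( sort { $a <=> $b } keys %group ) {
    my @names = map { $_->{Name} } @{ $group{$size} };
    print "$size bytes: @names\n" if @names > 1;    # prints "120 bytes: a.txt c.txt"
}
```

    Each record is an anonymous hash, so fields are read with $rec->{Size}, much like rec.Size on a C++ struct.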

    jue
    Jürgen Exner, Sep 26, 2005
    #3
  4. Robbie Hatley wrote:

    > and struggling
    > with compiler error messages.



    It may help to look up the messages in

    perldoc perldiag


    > 1:
    > How do I get a more-detailed directory listing than is offered
    > by the readdir function? Is there any way to goad that
    > function into coughing up file-type (file or directory or link),
    > size in bytes, mod-time, mod-date, attributes, etc.?



    perldoc -f stat

    perldoc -f -X


    Be sure to pay close attention to

    perldoc -f readdir

    particularly the part that starts with

    If you're planning to filetest the return values out of a "readdir"...


    > # Plan: Recursively descend directory tree starting from current working directory,



    perldoc File::Find
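
    A minimal sketch of how File::Find could drive the recursive descent, assuming only plain files are of interest. The helper name collect_files and the record fields are hypothetical, not from the original program.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return one hash ref per plain file found below $top.
sub collect_files {
    my ($top) = @_;
    my @records;
    find(
        sub {
            # Inside the callback, $_ is the entry name within the directory
            # File::Find has changed into; $File::Find::name is the full path.
            my @st = lstat $_ or return;    # skip entries that cannot be examined
            return unless -f _;             # plain files only; symlinks are skipped
            push @records,
              { Name => $File::Find::name, Size => $st[7], Mtime => $st[9] };
        },
        $top
    );
    return @records;
}

# Usage: pass a starting directory on the command line, e.g. "perl script.pl ."
if (@ARGV) {
    printf "%10d  %s\n", $_->{Size}, $_->{Name} for collect_files( $ARGV[0] );
}
```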


    > opendir(Dot, ".") or die "Can\'t open the directory!!!";



    You should include the $! variable in your die() message.

    There is no need to backslash a single quote in a double quoted string.
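
    A short sketch of both points, assuming a hypothetical helper named list_dir. The eval is there only so the message can be shown without ending the program.

```perl
use strict;
use warnings;

# Hypothetical helper: return the entry names in a directory, or die with the reason.
sub list_dir {
    my ($dirname) = @_;
    # $! holds the operating system's error text for the call that just failed.
    # Note the plain apostrophe in "Can't": no backslash needed in double quotes.
    opendir my $dh, $dirname or die "Can't open directory $dirname: $!";
    my @names = grep { $_ ne '.' and $_ ne '..' } readdir $dh;
    closedir $dh;
    return @names;
}

# eval traps the die; the message ends up in $@.
my @names = eval { list_dir('/no/such/directory') };
print "Failed: $@" if $@;
```

    The exact wording after the colon depends on the operating system.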


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Sep 26, 2005
    #4
  5. Reinhard Pagitsch wrote:
    > --- code---
    > use Cwd;
    > use strict;
    >
    > my $CurDir = getcwd();
    > print "CWD = ", $CurDir, "\n";
    > opendir(DOT, ".") or die "Can\'t open the directory!!!";


    Yuk. Lexical filehandles have been available for years.

    opendir my $dir, "." or die ...
    >
    > my @LocalFiles;
    > my $FileName;
    > my @files = readdir(DOT);
    > my $Rec;
    > foreach (@files)


    If the only thing you're going to do with @files is loop over
    them like this, why bother slurping them into an array in the first
    place?
    It's better and more scalable to use something like
    while( $FileName = readdir( $dir ) ) {

    > {
    > my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
    > $atime,$mtime,$ctime,$blksize,$blocks) = stat($_);


    No need to create all those variables and then ignore them. Just get
    the things you need from stat():

    my ( $mode, $size, $mtime ) = (stat)[ 2, 7, 9 ];


    > my $FileRecord = {};
    > $FileRecord->{Date} = "Unknown";
    > $FileRecord->{Time} = $mtime;
    > $FileRecord->{Type} = $mode;
    > $FileRecord->{Size} = $size;
    > $FileRecord->{Attr} = "Unknown";
    > $FileRecord->{Name} = $_;
    >
    > push(@LocalFiles, $FileRecord);


    I would write this as:

    push @LocalFiles, {
    Time => $mtime,
    Type => $mode,
    Size => $size,
    Name => $FileName,
    ... etc ...
    };


    My code to output a simple ls-like listing would be as follows:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $dirname = "/tmp";

    opendir my $dir, $dirname or die $!;
    while ( my $filename = readdir $dir ) {
    my $pathname = "$dirname/$filename";

    my ( $size, $mtime ) = (stat $pathname)[ 7, 9 ];

    my $type = "other";
    # See "perldoc -f -X" for a complete list of tests
    $type = "dir" if -d $pathname;
    $type = "link" if -l $pathname;
    $type = "file" if -f $pathname;

    printf "%5s %20s %10d %s\n",
    $type,
    $filename,
    $size,
    scalar localtime $mtime;
    }
    closedir $dir;
    Dave Weaver, Sep 26, 2005
    #5
  6. Dave Weaver wrote:
    > while ( my $filename = readdir $dir ) {
    > my $pathname = "$dirname/$filename";
    >
    > my ( $size, $mtime ) = (stat $pathname)[ 7, 9 ];


    Why are you doing an explicit call to stat() here, but using the file
    test operators below? Why not be consistent?

    my ($size, $mtime) = (-s $pathname, -M _);

    >
    > my $type = "other";
    > # See "perldoc -f -X" for a complete list of tests
    > $type = "dir" if -d $pathname;
    > $type = "link" if -l $pathname;
    > $type = "file" if -f $pathname;


    You've already done one call to stat() in this loop. No need to
    duplicate that call 3 more times.

    my $type = "other";
    $type = 'dir' if -d _;
    $type = 'link' if -l _;
    $type = 'file' if -f _;

    or, my preferred way of writing this...
    my $type = -d _ ? 'dir'
    : -l _ ? 'link'
    : -f _ ? 'file'
    : 'other';


    Paul Lalli
    Paul Lalli, Sep 26, 2005
    #6
  7. Paul Lalli wrote:
    > Dave Weaver wrote:
    > > while ( my $filename = readdir $dir ) {
    > > my $pathname = "$dirname/$filename";
    > >
    > > my ( $size, $mtime ) = (stat $pathname)[ 7, 9 ];

    >
    > Why are you doing an explicit call to stat() here, but using the file
    > test operators below? Why not be consistent?
    >
    > my ($size, $mtime) = (-s $pathname, -M _);
    >
    > >
    > > my $type = "other";
    > > # See "perldoc -f -X" for a complete list of tests
    > > $type = "dir" if -d $pathname;
    > > $type = "link" if -l $pathname;
    > > $type = "file" if -f $pathname;


    Actually, this code has another error that I didn't recognize before my
    first reply. If $pathname is a link, the -f here will actually be
    testing the entry that $pathname points to, rather than $pathname
    itself. In other words, $type will only be set to "link" if the entry
    that $pathname points to happens to not be a file.

    > You've already done one call to stat() in this loop. No need to
    > duplicate that call 3 more times.
    >
    > my $type = "other";
    > $type = 'dir' if -d _;
    > $type = 'link' if -l _;
    > $type = 'file' if -f _;


    This also has the corollary problem that this code of mine will not
    function correctly, as it will give a "The stat preceding -l _ wasn't
    an lstat" error.

    If we really want to find out exactly what $pathname is, I think we
    have to do an lstat first, and then continue testing the results of
    that call.

    my ($size) = (lstat $pathname)[7];
    my $mtime = -M _;

    my $type = -d _ ? 'dir'
    : -l _ ? 'link'
    : -f _ ? 'file'
    : 'other';
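
    A runnable sketch of the lstat-then-test idea, assuming a hypothetical helper named describe. Two choices here differ from the snippet above and are mine: the link test comes first, and the modification time is taken from element 9 of the lstat list, because -M returns the file's age in days relative to script start rather than a timestamp that localtime can format.

```perl
use strict;
use warnings;

# Hypothetical helper: return (type, size, mtime) for one path, or nothing on failure.
sub describe {
    my ($pathname) = @_;
    # lstat reports on a symlink itself rather than its target, and fills the
    # cached stat buffer that the special filehandle _ refers to.
    my ( $size, $mtime ) = ( lstat $pathname )[ 7, 9 ];
    return unless defined $size;
    # These tests reuse the cached lstat result; no further system calls are made.
    my $type = -l _ ? 'link'
             : -d _ ? 'dir'
             : -f _ ? 'file'
             :        'other';
    return ( $type, $size, $mtime );
}

my ( $type, $size, $mtime ) = describe('.');
printf "%-5s %10d %s\n", $type, $size, scalar localtime $mtime;
```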

    Paul Lalli
    Paul Lalli, Sep 26, 2005
    #7
  8. For your reference, you may have a look at NoClone, a duplicate files
    finder shareware, to see what features could be included in your program.
    http://noclone.net

    , Sep 27, 2005
    #8
  9. Paul Lalli wrote:
    > > > my $type = "other";
    > > > # See "perldoc -f -X" for a complete list of tests
    > > > $type = "dir" if -d $pathname;
    > > > $type = "link" if -l $pathname;
    > > > $type = "file" if -f $pathname;

    >
    > Actually, this code has another error that I didn't recognize before my
    > first reply. If $pathname is a link, the -f here will actually be
    > testing the entry that $pathname points to, rather than $pathname
    > itself. In other words, $type will only be set to "link" if the entry
    > that $pathname points to happens to not be a file.


    Thank you for the info; I stand corrected and educated! :)
    Dave Weaver, Sep 27, 2005
    #9
  10. Dave Weaver:
    > Paul Lalli:


    >>>> my $type = "other";
    >>>> # See "perldoc -f -X" for a complete list of tests
    >>>> $type = "dir" if -d $pathname;
    >>>> $type = "link" if -l $pathname;
    >>>> $type = "file" if -f $pathname;

    >>
    >> Actually, this code has another error that I didn't recognize
    >> before my first reply. If $pathname is a link, the -f here will
    >> actually be testing the entry that $pathname points to, rather than
    >> $pathname itself. In other words, $type will only be set to "link"
    >> if the entry that $pathname points to happens to not be a file.

    >
    > Thank you for the info; I stand corrected and educated! :)


    You have to test for 'link' first, a dir can be a symlink too.

    my ($size) = (lstat $pathname)[7];
    my $mtime = -M _;

    my $type = '';
    $type .= 'l' if -l _;
    $type .= 'd' if -d _;
    $type .= 'f' if -f _;

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, Sep 27, 2005
    #10
  11. Reinhard Pagitsch wrote:

    > my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
    > $atime,$mtime,$ctime,$blksize,$blocks) = stat($_);
    > my $FileRecord = {};
    > $FileRecord->{Date} = "Unknown";
    > $FileRecord->{Time} = $mtime;
    > $FileRecord->{Type} = $mode;
    > $FileRecord->{Size} = $size;
    > $FileRecord->{Attr} = "Unknown";
    > $FileRecord->{Name} = $_;
    >


    Your Type is wrong: the low-order 12 bits are Mode, not Type.

    ... = lstat($_); # Must use lstat() for -l() to work
    $FileRecord->{AccessMode} = $mode & 0x0fff; # 07777 octal
    my $type_code = ($mode & 0xf000) >> 12;
    $FileRecord->{Type} =
    -l _ ? 'Symlink' :
    -d _ ? 'Directory' :
    -f _ ? 'File' :
    -p _ ? 'NamedPipe' :
    -S _ ? 'UnixSocket' :
    -b _ ? 'BlockDev' :
    -c _ ? 'CharDev' :
    $type_code == 0x0d ? 'Door' :
    "Unknown($type_code)";

    Perl does not know that Solaris has doors:
    solaris8# ls -ldF /var/run/*_door /etc/sysevent/*_door
    Dr--r--r-- 1 root 0 Apr 4 2002 /etc/sysevent/piclevent_door>
    Drw------- 1 root 0 Apr 3 2002 /etc/sysevent/sysevent_door>
    Dr--r--r-- 1 root 0 Feb 21 2005 /var/run/picld_door>
    drwxrwxrwt 2 root 69 Feb 22 2005 /var/run/rpc_door/
    Drw-r--r-- 1 root 0 Feb 22 2005 /var/run/syslog_door>
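
    The same split between permission bits and file type can also be sketched with the helpers exported by the core Fcntl module, assuming a POSIX-like system. The helper name split_mode is hypothetical, and this version covers only three types.

```perl
use strict;
use warnings;
use Fcntl ':mode';    # exports S_IMODE, S_ISDIR, S_ISREG, S_ISLNK, ...

# Hypothetical helper: split a stat mode into permission bits and a type name.
sub split_mode {
    my ($mode) = @_;
    my $perms = S_IMODE($mode);    # permission bits, the same as $mode & 07777
    my $type  = S_ISLNK($mode) ? 'Symlink'
              : S_ISDIR($mode) ? 'Directory'
              : S_ISREG($mode) ? 'File'
              :                  'Other';
    return ( $perms, $type );
}

# Element 2 of the lstat list is the mode.
my ( $perms, $type ) = split_mode( ( lstat '.' )[2] );
printf "%s, permissions %04o\n", $type, $perms;
```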

    -Joe
    Joe Smith, Sep 27, 2005
    #11
  12. Wow, I got more (and more-detailed) responses to this thread
    than I had anticipated! Thanks to all who responded. It will
    take me a while to digest all the ideas presented.

    I haven't had time to follow up on this thread the last couple
    days; been busy at work (disentangling 650000 lines of bad
    C/C++/Win32api code from a departed (ousted) chief programmer;
    an ongoing chore of large magnitude; not as fun as Perl).

    A few brief comments:

    Jürgen Exner wrote, regarding my use of arrays of hash refs to
    emulate C++ list of structs:

    > I think that's the most perlish way.


    Isn't that supposed to be "Perlescent"? :)

    Dave Weaver wrote:

    > If the only thing you're going to do with @files is loop over
    > them like this, why bother slurping them into an array in the
    > first place? It's better and more scalable to use something
    > like: while( $FileName = readdir( $dir ) ) {


    The info is going to be used more than once, though.
    I generally prefer to slurp oft-used data from HD to RAM and
    massage it there, rather than hammer the HD with repeated
    reads of the same data.

    > my ( $mode, $size, $mtime ) = (stat)[ 2, 7, 9 ];


    Now that's truly efficient looking. I think I'll put
    something like that in my program.

    Paul Lalli wrote:

    > my $type = -d _ ? 'dir'
    > : -l _ ? 'link'
    > : -f _ ? 'file'
    > : 'other';


    Ok, so that's 3 nested ?: operators; but what's this "_" thing?
    Some kind of way of feeding "last referenced file" into the
    file-test operators?

    > ...include the $! variable in your die()...


    What's "$!"? Some sort of "last error" thingy?

    (Those last two questions are probably idiotic; but I'm
    away from my perl books as I write this or I'd just look
    them up.)

    I can see I have lots of reading and hacking to do this
    weekend. Again, thanks to all who replied to this thread!

    Cheers,
    Robbie Hatley
    Robbie Hatley, Sep 30, 2005
    #12
  13. Robbie Hatley wrote:
    > Paul Lalli wrote:
    >
    > > my $type = -d _ ? 'dir'
    > > : -l _ ? 'link'
    > > : -f _ ? 'file'
    > > : 'other';

    >
    > Ok, so that's 3 nested ?: operators; but what's this "_" thing?
    > Some kind of way of feeding "last referenced file" into the
    > file-test operators?


    Yes. The special _ filehandle says to test the already existing data
    retrieved by the previous stat() call, rather than calling stat on the
    same file three additional separate times.

    > > ...include the $! variable in your die()...

    >
    > What's "$!"? Some sort of "last error" thingy?


    Yes. It is the last error returned by the system.

    > (Those last two questions are probably idiotic; but I'm
    > away from my perl books as I write this or I'd just look
    > them up.)


    You were clearly at a computer and connected to the internet at the
    time you wrote this. No reason you couldn't just have used the
    built-in documentation...

    perldoc -f -X
    perldoc perlvar
    http://perldoc.perl.org/functions/-X.html
    http://perldoc.perl.org/perlvar.html#$!

    Paul Lalli
    Paul Lalli, Oct 3, 2005
    #13