How do I get more-detailed directory info?


Robbie Hatley

Greetings, group. This is the first time I've posted here,
or in any Perl-related forum for that matter. I'm a rank
beginner at Perl. I'm trying to teach myself the language
from reading books (incl. one with a camel on the cover, and
one with a llama; not that anyone here would recognize those,
of course), trying my own Perl hacks, and struggling
with compiler error messages.

The training program I'm currently trying to write is a
duplicate-file finding/removing program. A rather elaborate
program when written in C++. An engineer friend told me it
would be much simpler in Perl. I had to tell him, "Cool, but
I don't know Perl". He seemed offended and said "Your loss.".

So... pasted below is my first (incomplete) attempt at writing
a real Perl program.

I have hundreds of questions about Perl, but for now I'll
ask this group just two questions:

1:
How do I get a more-detailed directory listing than is offered
by the readdir function? Is there any way to goad that
function into coughing-up file-type (file or directory or link),
size in bytes, mod-time, mod-date, attributes, etc.? Or do I
have to use some other approach to get that data?

2:
Is there a better way to emulate the C++ concept of a "list of
structs" than what I'm doing below? (I'm using an array of refs
to hashes.)



Here's my program in its current state of development:



################################################################################
# dedup3.perl
# Duplicate file finding/erasing program.
# Written by Robbie Hatley, as a "learn Perl" exercise.
# Plan: Recursively descend directory tree starting from current working
# directory, and make a master list of all files encountered on this branch.
# Order the list by size. Within each size group, compare each file, from left
# to right, to all the files to its right. If a duplicate pair is found, alert
# user and get user input. Give user these choices:
# 1. Erase left file
# 2. Erase right file
# 3. Ignore this pair of duplicate files and move to next
# 4. Quit
# If user elects to delete a file, delete it, then move to next duplicate file
# pair.
################################################################################

use strict;
use warnings;

use Cwd;

# Not valid Perl; how do I do this???
# struct FileRecord
# {
# std::string Date;
# std::string Time;
# std::string Type;
# long int Size;
# std::string Attr;
# std::string Name;
# };
#
# std::list<rhdir::FileRecord> FileList;
#
# TODO: How do I extract size, mod-time, mod-date, type (file or dir),
# attributes, etc. and store in an array of structs (or Perl equiv)???
#
# Try an array of hashes?

my $CurDir = getcwd();
print "CWD = ", $CurDir, "\n";
opendir(Dot, ".") or die "Can\'t open the directory!!!";

my @LocalFiles;
my $FileName;
my $FileRecord;

while ($FileName=readdir(Dot))
{
   $FileRecord=
   {
      "Date" => "Unknown",
      "Time" => "Unknown",
      "Type" => "Unknown",
      "Size" => 42,
      "Attr" => "Unknown",
      "Name" => $FileName
   };
   push @LocalFiles, $FileRecord;
}

closedir(Dot);

foreach $FileRecord (@LocalFiles)
{
   print($$FileRecord{"Name"}, "\n");
}




--
Cheers,
Robbie Hatley
Tustin, CA, USA
email: lonewolfintj at pacbell dot net
web: home dot pacbell dot net slant earnur slant
 

Reinhard Pagitsch

Robbie said:
1:
How do I get a more-detailed directory listing than is offered
by the readdir function? Is there any way to goad that
function into coughing-up file-type (file or directory or link),
size in bytes, mod-time, mod-date, attributes, etc.? Or do I
have to use some other approach to get that data?

2:
Is there a better way to emulate the C++ concept of a "list of
structs" than what I'm doing below? (I'm using an array of refs
to hashes.)

[ code snipped ]

Where to look:
perldoc -f stat (more detailed information about a file)
perldoc perldsc, the Perl Data Structures Cookbook (ARRAYS OF HASHES)

Here is an example:


--- code ---
use Cwd;
use strict;

my $CurDir = getcwd();
print "CWD = ", $CurDir, "\n";
opendir(DOT, ".") or die "Can\'t open the directory!!!";

my @LocalFiles;
my $FileName;
my @files = readdir(DOT);
my $Rec;
foreach (@files)
{
   my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
       $atime,$mtime,$ctime,$blksize,$blocks) = stat($_);
   my $FileRecord = {};
   $FileRecord->{Date} = "Unknown";
   $FileRecord->{Time} = $mtime;
   $FileRecord->{Type} = $mode;
   $FileRecord->{Size} = $size;
   $FileRecord->{Attr} = "Unknown";
   $FileRecord->{Name} = $_;

   push(@LocalFiles, $FileRecord);
}

closedir(DOT);


my $role;
foreach $Rec (@LocalFiles)
{
   print "{ ";
   for $role (keys %$Rec)
   {
      print "$role=" . $Rec->{$role} . " ";
   }
   print " }\n";
}

--- code ---
 

Jürgen Exner

Robbie said:
1:
How do I get a more-detailed directory listing than is offered
by the readdir function? Is there any way to goad that
function into coughing-up file-type (file or directory or link),
size in bytes, mod-time, mod-date, attributes, etc.? Or do I
have to use some other approach to get that data?

That data is information about individual files. See stat() and/or the
file test operators (e.g. -M, -f, ...).

2:
Is there a better way to emulate the C++ concept of a "list of
structs" than what I'm doing below? (I'm using an array of refs
to hashes.)

I think that's the most perlish way.

jue
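
As a rough illustration of the two answers above, here is a minimal sketch that combines stat() with an array of hash refs. The helper name read_records and the record keys are made up for this example and are not from anyone's post. It assumes the directory name should be prefixed to each entry before calling stat(), which matters whenever the directory is not the current one.

```perl
use strict;
use warnings;

# Hypothetical helper: return one hash ref ("struct") per directory entry.
sub read_records {
    my ($dirname) = @_;
    opendir my $dh, $dirname or die "Can't open '$dirname': $!";
    my @records;
    while ( defined( my $name = readdir $dh ) ) {
        next if $name eq '.' or $name eq '..';
        my $path = "$dirname/$name";
        # Slots 7 and 9 of the stat list are size in bytes and mtime.
        my ( $size, $mtime ) = ( stat $path )[ 7, 9 ];
        next unless defined $size;    # stat failed, e.g. a dangling symlink
        push @records, {
            Name  => $name,
            Size  => $size,
            Mtime => $mtime,
            # "_" reuses the data fetched by the stat call above.
            Type  => -d _ ? 'dir' : -f _ ? 'file' : 'other',
        };
    }
    closedir $dh;
    return @records;
}

# List the current directory, smallest entries first.
for my $rec ( sort { $a->{Size} <=> $b->{Size} } read_records('.') ) {
    printf "%-5s %10d %s %s\n",
        $rec->{Type}, $rec->{Size},
        scalar localtime( $rec->{Mtime} ), $rec->{Name};
}
```

Each record is an ordinary hash, so fields can be added later without touching the code that only reads Name or Size.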
 

Tad McClellan

Robbie Hatley said:
and struggling
with compiler error messages.


It may help to look up the messages in

perldoc perldiag

1:
How do I get a more-detailed directory listing than is offered
by the readdir function? Is there any way to goad that
function into coughing-up file-type (file or directory or link),
size in bytes, mod-time, mod-date, attributes, etc.?


perldoc -f stat

perldoc -f -X


Be sure to pay close attention to

perldoc -f readdir

particularly the part that starts with

If you're planning to filetest the return values out of a "readdir"...

# Plan: Recursively descend directory tree starting from current working
# directory,


perldoc File::Find

opendir(Dot, ".") or die "Can\'t open the directory!!!";


You should include the $! variable in your die() message.

There is no need to backslash a single quote in a double quoted string.
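
For the recursive part of the plan, a minimal sketch with File::Find might look like the following. The helper name files_by_size is invented for this example. It assumes that only plain files matter and that symlinks should not be followed, and it groups full path names by size, since only files of equal size can be duplicates.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: map each file size to the list of plain files
# of that size found anywhere under $top.
sub files_by_size {
    my ($top) = @_;
    my %by_size;
    find(
        sub {
            # Inside the callback, $_ is the entry's base name and the
            # current directory is the one that contains it.
            my $size = ( lstat $_ )[7];
            return unless defined $size && -f _;    # plain files only
            push @{ $by_size{$size} }, $File::Find::name;
        },
        $top
    );
    return \%by_size;
}

# Report only the sizes shared by two or more files.
my $groups = files_by_size( @ARGV ? $ARGV[0] : '.' );
for my $size ( sort { $a <=> $b } keys %$groups ) {
    my @names = @{ $groups->{$size} };
    next if @names < 2;    # a unique size cannot have a duplicate
    print "$size bytes: @names\n";
}
```

The files inside each group would still need a byte-by-byte comparison before anything is deleted.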
 

Dave Weaver

Reinhard Pagitsch said:
--- code---
use Cwd;
use strict;

my $CurDir = getcwd();
print "CWD = ", $CurDir, "\n";
opendir(DOT, ".") or die "Can\'t open the directory!!!";

Yuk. Lexical filehandles have been available for years.

opendir my $dir, "." or die ...
my @LocalFiles;
my $FileName;
my @files = readdir(DOT);
my $Rec;
foreach (@files)

If the only thing you're going to do with @files is loop over
them like this, why bother slurping them into an array in the first
place?
It's better and more scalable to use something like
while ( $FileName = readdir( $dir ) ) {
{
my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks) = stat($_);

No need to create all those variables and then ignore them. Just get
the things you need from stat():

my ( $mode, $size, $mtime ) = (stat)[ 2, 7, 9 ];

my $FileRecord = {};
$FileRecord->{Date} = "Unknown";
$FileRecord->{Time} = $mtime;
$FileRecord->{Type} = $mode;
$FileRecord->{Size} = $size;
$FileRecord->{Attr} = "Unknown";
$FileRecord->{Name} = $_;

push(@LocalFiles, $FileRecord);

I would write this as:

push @LocalFiles, {
Time => $mtime,
Type => $mode,
Size => $size,
Name => $FileName,
... etc ...
};


My code to output a simple ls-like listing would be as follows:

#!/usr/bin/perl
use warnings;
use strict;

my $dirname = "/tmp";

opendir my $dir, $dirname or die $!;
while ( my $filename = readdir $dir ) {
    my $pathname = "$dirname/$filename";

    my ( $size, $mtime ) = (stat $pathname)[ 7, 9 ];

    my $type = "other";
    # See "perldoc -f -X" for a complete list of tests
    $type = "dir" if -d $pathname;
    $type = "link" if -l $pathname;
    $type = "file" if -f $pathname;

    printf "%5s %20s %10d %s\n",
        $type,
        $filename,
        $size,
        scalar localtime $mtime;
}
closedir $dir;
 

Paul Lalli

Dave said:
while ( my $filename = readdir $dir ) {
my $pathname = "$dirname/$filename";

my ( $size, $mtime ) = (stat $pathname)[ 7, 9 ];

Why are you doing an explicit call to stat() here, but using the file
test operators below? Why not be consistent?

my ($size, $mtime) = (-s $pathname, -M _);
my $type = "other";
# See "perldoc -f -X" for a complete list of tests
$type = "dir" if -d $pathname;
$type = "link" if -l $pathname;
$type = "file" if -f $pathname;

You've already done one call to stat() in this loop. No need to
duplicate that call 3 more times.

my $type = "other";
$type = 'dir' if -d _;
$type = 'link' if -l _;
$type = 'file' if -f _;

or, my preferred way of writing this...
my $type = -d _ ? 'dir'
: -l _ ? 'link'
: -f _ ? 'file'
: 'other';


Paul Lalli
 

Paul Lalli

Paul said:
Dave said:
while ( my $filename = readdir $dir ) {
my $pathname = "$dirname/$filename";

my ( $size, $mtime ) = (stat $pathname)[ 7, 9 ];

Why are you doing an explicit call to stat() here, but using the file
test operators below? Why not be consistent?

my ($size, $mtime) = (-s $pathname, -M _);
my $type = "other";
# See "perldoc -f -X" for a complete list of tests
$type = "dir" if -d $pathname;
$type = "link" if -l $pathname;
$type = "file" if -f $pathname;

Actually, this code has another error that I didn't recognize before my
first reply. If $pathname is a link, the -f here will actually be
testing the entry that $pathname points to, rather than $pathname
itself. In other words, $type will only be set to "link" if the entry
that $pathname points to happens to not be a file.
You've already done one call to stat() in this loop. No need to
duplicate that call 3 more times.

my $type = "other";
$type = 'dir' if -d _;
$type = 'link' if -l _;
$type = 'file' if -f _;

This also has the corollary problem that this code of mine will not
function correctly, as it will give a "The stat preceding -l _ wasn't
an lstat" error.

If we really want to find out exactly what $pathname is, I think we
have to do an lstat first, and then continue testing the results of
that call.

my ($size) = (lstat $pathname)[7];
my $mtime = -M _;

my $type = -d _ ? 'dir'
: -l _ ? 'link'
: -f _ ? 'file'
: 'other';

Paul Lalli
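
Pulling the lstat approach together, a self-contained sketch could look like this. The helper name describe and the record keys are invented for the example. One caveat on the snippets above: -M returns the file's age in days relative to script start, not an epoch timestamp, so this sketch takes the modification time from slot 9 of the lstat list and keeps -M as a separate field.

```perl
use strict;
use warnings;

# Hypothetical helper: classify one path without following symlinks.
# Returns a hash ref, or undef if lstat fails (the reason is then in $!).
sub describe {
    my ($pathname) = @_;
    # One lstat call fills the "_" buffer; slots 7 and 9 are size and mtime.
    my ( $size, $mtime ) = ( lstat $pathname )[ 7, 9 ];
    return undef unless defined $size;
    my $type = -l _ ? 'link'
             : -d _ ? 'dir'
             : -f _ ? 'file'
             :        'other';
    my $age_days = -M _;    # days since modification, not a timestamp
    my %info = (
        Type    => $type,
        Size    => $size,
        Mtime   => $mtime,
        AgeDays => $age_days,
    );
    return \%info;
}

for my $path ( @ARGV ? @ARGV : ('.') ) {
    my $info = describe($path);
    unless ($info) {
        warn "Can't lstat '$path': $!\n";
        next;
    }
    printf "%-5s %10d %s %s\n",
        $info->{Type}, $info->{Size},
        scalar localtime( $info->{Mtime} ), $path;
}
```

After an lstat the three tests are mutually exclusive, because a symlink's own mode says "link" and nothing else. To learn what a link points at, a separate stat on the same path is needed.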
 

Dave Weaver

Paul Lalli said:
Actually, this code has another error that I didn't recognize before my
first reply. If $pathname is a link, the -f here will actually be
testing the entry that $pathname points to, rather than $pathname
itself. In other words, $type will only be set to "link" if the entry
that $pathname points to happens to not be a file.

Thank you for the info; I stand corrected and educated! :)
 

Dr.Ruud

Dave Weaver:
Paul Lalli:

Thank you for the info; I stand corrected and educated! :)

You have to test for 'link' first, a dir can be a symlink too.

my ($size) = (lstat $pathname)[7];
my $mtime = -M _;

my $type = '';
$type .= 'l' if -l _;
$type .= 'd' if -d _;
$type .= 'f' if -f _;
 

Joe Smith

Reinhard said:
my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks) = stat($_);
my $FileRecord = {};
$FileRecord->{Date} = "Unknown";
$FileRecord->{Time} = $mtime;
$FileRecord->{Type} = $mode;
$FileRecord->{Size} = $size;
$FileRecord->{Attr} = "Unknown";
$FileRecord->{Name} = $_;

Your Type is wrong: the low-order 12 bits are Mode, not Type.

... = lstat($_); # Must use lstat() for -l() to work
$FileRecord->{AccessMode} = $mode & 0x0fff; # 07777 octal
my $type_code = ($mode & 0xf000) >> 12;
$FileRecord->{Type} =
-l _ ? 'Symlink' :
-d _ ? 'Directory' :
-f _ ? 'File' :
-p _ ? 'NamedPipe' :
-S _ ? 'UnixSocket' :
-b _ ? 'BlockDev' :
-c _ ? 'CharDev' :
$type_code == 0x0d ? 'Door' :
"Unknown($type_code)";

Perl does not know that Solaris has doors:
solaris8# ls -ldF /var/run/*_door /etc/sysevent/*_door
Dr--r--r-- 1 root 0 Apr 4 2002 /etc/sysevent/piclevent_door>
Drw------- 1 root 0 Apr 3 2002 /etc/sysevent/sysevent_door>
Dr--r--r-- 1 root 0 Feb 21 2005 /var/run/picld_door>
drwxrwxrwt 2 root 69 Feb 22 2005 /var/run/rpc_door/
Drw-r--r-- 1 root 0 Feb 22 2005 /var/run/syslog_door>

-Joe
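
The same split can also be written with the helpers exported by the core Fcntl module instead of hand-written masks. This is only a sketch: the function name split_mode is made up here, and it cannot name platform-specific types such as Solaris doors, which fall through to the Unknown branch.

```perl
use strict;
use warnings;
use Fcntl ':mode';    # S_IMODE, S_IFMT and the S_IS* test functions

# Hypothetical helper: split a raw mode value (slot 2 of stat/lstat)
# into a type name and the permission bits.
sub split_mode {
    my ($mode) = @_;
    my $perms = S_IMODE($mode);    # the low-order 12 bits
    my $type  = S_ISLNK($mode)  ? 'Symlink'
              : S_ISDIR($mode)  ? 'Directory'
              : S_ISREG($mode)  ? 'File'
              : S_ISFIFO($mode) ? 'NamedPipe'
              : S_ISSOCK($mode) ? 'UnixSocket'
              : S_ISBLK($mode)  ? 'BlockDev'
              : S_ISCHR($mode)  ? 'CharDev'
              :                   sprintf 'Unknown(0%o)', S_IFMT($mode);
    return ( $type, $perms );
}

# lstat, so that a symlink is reported as a symlink.
my ( $type, $perms ) = split_mode( ( lstat '.' )[2] );
printf "%s %04o\n", $type, $perms;
```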
 

Robbie Hatley

Wow, I got more (and more-detailed) responses to this thread
than I had anticipated! Thanks to all who responded. It will
take me a while to digest all the ideas presented.

I haven't had time to follow up on this thread the last couple
days; been busy at work (disentangling 650000 lines of bad
C/C++/Win32api code from a departed (ousted) chief programmer;
an ongoing chore of large magnitude; not as fun as Perl).

A few brief comments:

Jürgen Exner wrote, regarding my use of arrays of hash refs to
emulate C++ list of structs:
I think that's the most perlish way.

Isn't that supposed to be "Perlescent"? :)

Dave said:
If the only thing you're going to do with @files is loop over
them like this, why bother slurping them into an array in the
first place? It's better and more scalable to use something
like: while ( $FileName = readdir( $dir ) ) {

The info is going to be used more than once, though.
I generally prefer to slurp oft-used data from HD to RAM and
massage it there, rather than hammer the HD with repeated
reads of the same data.
my ( $mode, $size, $mtime ) = (stat)[ 2, 7, 9 ];

Now that's truly efficient-looking. I think I'll put
something like that in my program.

Paul said:
my $type = -d _ ? 'dir'
: -l _ ? 'link'
: -f _ ? 'file'
: 'other';

Ok, so that's 3 nested ?: operators; but what's this "_" thing?
Some kind of way of feeding "last referenced file" into the
file-test operators?
...include the $! variable in your die()...

What's "$!"? Some sort of "last error" thingy?

(Those last two questions are probably idiotic; but I'm
away from my perl books as I write this or I'd just look
them up.)

I can see I have lots of reading and hacking to do this
weekend. Again, thanks to all who replied to this thread!

Cheers,
Robbie Hatley
 

Paul Lalli

Robbie said:
Ok, so that's 3 nested ?: operators; but what's this "_" thing?
Some kind of way of feeding "last referenced file" into the
file-test operators?

Yes. The special _ filehandle says to test the already existing data
retrieved by the previous stat() call, rather than calling stat on the
same file three additional separate times.
What's "$!"? Some sort of "last error" thingy?

Yes. It is the last error returned by the system.
(Those last two questions are probably idiotic; but I'm
away from my perl books as I write this or I'd just look
them up.)

You were clearly at a computer and connected to the internet at the
time you wrote this. No reason you couldn't just have used the
built-in documentation...

perldoc -f -X
perldoc perlvar
http://perldoc.perl.org/functions/-X.html
http://perldoc.perl.org/perlvar.html#$!

Paul Lalli
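
To make both answers concrete, here is a small sketch. The helper name try_opendir and the nonexistent directory name are invented for the example. It shows that $! yields the errno number in numeric context and the matching message in string context, and that the tests on _ reuse the data from a single stat call.

```perl
use strict;
use warnings;

# Hypothetical helper: try to open a directory and report why it failed.
sub try_opendir {
    my ($dirname) = @_;
    if ( opendir my $dh, $dirname ) {
        closedir $dh;
        return 'ok';
    }
    # $! is only meaningful immediately after a failed system call.
    return sprintf 'errno %d: %s', $! + 0, "$!";
}

print try_opendir('.'), "\n";
print try_opendir('no-such-directory-for-this-example'), "\n";

# One stat call fills the "_" buffer; both tests below reuse it.
if ( stat '.' ) {
    print "'.' is a directory\n" if -d _;
    print "'.' is readable\n"    if -r _;
}
```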
 
