translating an OS directory recursively into a tree object

Mathematisch

Hello,

Would someone please show how to create a tree data structure using
an OS directory as input? The nodes should be the names of the
directories and the leaves should represent the files: each directory
should turn into a node, and its files and subdirectories should become
its children.

I have downloaded the Tree::Simple module from CPAN,
and I know that Perl's File::Find module traverses a directory
recursively. I am stuck because I do not know how to create a tree
object from the results of File::Find.

Below is the code that I've tried on a directory with multiple
subdirectories and files. It does not create the desired tree object:

#start of the failed code
use strict;
use warnings;
use File::Find;
use Tree::Simple;
use Data::Dumper;

my $tree = Tree::Simple->new("0", Tree::Simple->ROOT); #create the root node
find(\&processCurrent, ('C:\PROJECTS\MyDir') ); #process all directories and files
print Dumper $tree; #the desired tree is not created!

sub processCurrent {
    if (-d) { #this is a directory
        $tree = Tree::Simple->new($_, $tree); #create a new subtree, take the last one as parent
    }
    elsif (-f) { #this is a file
        $tree->addChild(Tree::Simple->new($_)); #add this file to the current tree
    }
}

#end of the failed code
 
Mathematisch

Quoth Mathematisch <[email protected]>:


find(\&processCurrent, ('C:\PROJECTS\MyDir') ); #process all directories and files

In general it is better to use forward slashes for paths under Win32.
The only exceptions are paths that will be passed to cmd.exe (via system,
or otherwise) and paths that will be passed to native Win32 APIs like
CreateProcess.
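
A small sketch of that advice, for paths that arrive with backslashes. The helper name to_forward_slashes is made up for illustration and is not part of File::Find or any other module:

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the path with "\" replaced by "/".
sub to_forward_slashes {
    my ($path) = @_;
    (my $copy = $path) =~ tr{\\}{/};
    return $copy;
}

my $dir = to_forward_slashes('C:\PROJECTS\MyDir');
print "$dir\n";    # prints C:/PROJECTS/MyDir
```

The converted string can then be handed to find() in place of the backslash form.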
print Dumper $tree; #the desired tree is not created!
sub processCurrent {
    if (-d) {                       #this is a directory
       $tree = Tree::Simple->new($_, $tree);       #create a new subtree, take the last one as parent

This is your problem. You are setting $tree to the new subtree, but
you've then lost your handle on the parent tree. This means that when
File::Find moves up a level in the hierarchy the new nodes will be added
in the wrong place. One solution is to keep a stack of 'parents',
something like this:

    my @parent = Tree::Simple->new("0");

    find {
        wanted => sub {
            my $node = Tree::Simple->new($_);
            $parent[-1]->addChild($node);
            -d and push @parent, $node;
        },
        postprocess => sub {
            pop @parent;
        },
    }, "C:/PROJECTS/MyDir";

This will leave you with the desired tree as the only remaining element
in @parent.
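
For anyone who wants to try the parent-stack idea without installing Tree::Simple, here is a minimal sketch that swaps in plain hashes for the tree nodes. The temporary directory, the file names and the show() helper are made up for the demo; only the wanted/postprocess logic mirrors the snippet above. It also declines to push symlinks, on the assumption that File::Find is not following them and so would never call postprocess for them.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Throwaway directory with made-up names, so the sketch is self-contained.
my $top = tempdir(CLEANUP => 1);
make_path("$top/a/b");
for my $file ("$top/x.txt", "$top/a/y.txt", "$top/a/b/z.txt") {
    open my $fh, '>', $file or die "Cannot create $file: $!";
    close $fh;
}

# Plain hashes stand in for Tree::Simple nodes (an assumption of this sketch).
my $root   = { name => '0', children => [] };
my @parent = ($root);

find({
    wanted => sub {
        my $node = { name => $_, children => [] };
        push @{ $parent[-1]{children} }, $node;
        # Only real directories are descended into, so only they are pushed.
        push @parent, $node if -d $_ && !-l $_;
    },
    # Runs once File::Find has finished a directory: step back up one level.
    postprocess => sub { pop @parent },
}, $top);

# Print the tree, two spaces per level, children sorted by name.
sub show {
    my ($node, $depth) = @_;
    print '  ' x $depth, $node->{name}, "\n";
    show($_, $depth + 1)
        for sort { $a->{name} cmp $b->{name} } @{ $node->{children} };
}
show($root, 0);
```

The top directory shows up as a node named "." because that is what File::Find puts in $_ for the starting directory.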

Ben

Ben, thanks a lot for the clarification.

After some further searching, I found a module extension which
loads a directory into a Tree::Simple object:
http://search.cpan.org/~stevan/Tree.../lib/Tree/Simple/Visitor/LoadDirectoryTree.pm

It might be useful for other people having this issue.

Best regards
 
David Combs

Mathematisch said:
Would someone please show how to create a tree data structure by using
an OS directory as input.


This probably won't help you solve your problem, but whenever the
subject of parsing directory trees comes up, I just must share it.

Got this off the 'net over a decade ago...

find . -print | sed -e 's,[^/]*/\([^/]*\)$,`--\1,' -e 's,[^/]*/,| ,g'

Simply amazing.


If you have the spare time -- uh -- you could save a whole bunch
of us the (considerable?) effort of trying to figure out exactly how it
works.

Nor am I so sure that I *could* do it. :-(

(No demand, there, of course -- you're ALREADY giving
enormous amounts of time and brain-power to this group!)


Thanks,

David
 
Peter J. Holzer

It isn't actually that hard to understand (especially if you try it),
but it's pretty cool. I like it.

find . -print

Prints file (and directory) names like this:

./a/b/c

sed -e 's,[^/]*/\([^/]*\)$,`--\1,'

This replaces the penultimate path component (including the trailing
slash) with "`--" but preserves everything before and after. So
./a/b/c
becomes
./a/`--c

-e 's,[^/]*/,| ,g'

This replaces every remaining path component (with a trailing slash)
with "| ". So now we get

| | `--c

Doesn't look very impressive for only one filename, does it? But if you
do it for a whole tree, the vertical lines and backticks line up nicely
and you get a tree (with a few extra lines, but avoiding them would make
it a lot more complicated).
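
To watch the two steps work on more than one name, here is a small sketch that applies the same substitutions in Perl to a hand-written list of paths. The helper name tree_line and the sample paths are made up for illustration; they stand in for what "find . -print" might produce.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the two sed substitutions to one path.
sub tree_line {
    my ($path) = @_;
    # Step 1: the parent component and its slash become "`--";
    # the last name is kept.
    $path =~ s{[^/]*/([^/]*)$}{`--$1};
    # Step 2: every remaining "component/" becomes "| ".
    $path =~ s{[^/]*/}{| }g;
    return $path;
}

my @paths = ('.', './a', './a/b', './a/b/c', './d');
print tree_line($_), "\n" for @paths;

# Output:
# .
# `--a
# | `--b
# | | `--c
# `--d
```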

Your mission, should you choose to accept it, is to turn this into a
Perl one-liner ;-).

hp
 
Jochen Lehmeier

Your mission, should you choose to accept it, is to turn this into a
Perl one-liner ;-).

You did not specify line length. This fits in one of mine:

perl -we 'sub p{$d=$_[0];$_=$d;s,[^/]*/([^/]*)$,`--$1,;s,[^/]*/,| ,g;print"$_\n";for(<$d/*>,<$d/.??*>){p($_)}};p(".")'

Anyone who finds the obvious bug can keep it.
 
Steve C

Peter said:
Your mission, should you choose to accept it, is to turn this into a
Perl one-liner ;-).

The sed part is easy enough:

find . -print |perl -lpe 's,[^/]*/([^/]*)$,`--$1,;s,[^/]*/,| ,g'
 
Martijn Lievaart

Your mission, should you choose to accept it, is to turn this into a
Perl one-liner ;-).

Well, only if we can improve on it. This one sorts the files, which the original doesn't.

perl -MFile::Find -e 'find({preprocess=>sub{sort@_},wanted=>sub{$_=$File::Find::name;s,[^/]*/([^/]*)$,`--$1,;s,[^/]*/,| ,g;print"$_\n"}},".")'

Any shorter?

M4
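
Spelled out as a script, Martijn's one-liner above might look roughly like this. It is only a sketch: the temporary directory and file names are made up so that it can run anywhere, and the start directory's prefix is rewritten to "." to imitate running find in the current directory.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Throwaway directory with made-up names, so the sketch is self-contained.
my $top = tempdir(CLEANUP => 1);
make_path("$top/a/b");
for my $file ("$top/a/b/c", "$top/a/b/d") {
    open my $fh, '>', $file or die "Cannot create $file: $!";
    close $fh;
}

my @lines;
find({
    # Sort each directory's entries before File::Find walks them.
    preprocess => sub { sort @_ },
    wanted     => sub {
        # Show paths relative to the start directory, as "find ." would.
        (my $line = $File::Find::name) =~ s{^\Q$top\E}{.};
        $line =~ s{[^/]*/([^/]*)$}{`--$1};    # parent component becomes "`--"
        $line =~ s{[^/]*/}{| }g;              # remaining components become "| "
        push @lines, $line;
    },
}, $top);

print "$_\n" for @lines;
```

The preprocess callback receives the names found in one directory and returns them in the order File::Find should use, which is where the sorting happens.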
 
sln

Your mission, should you choose to accept it, is to turn this into a
Perl one-liner ;-).

hp

s{ [^/]*/ }{| }xg,
s{ \|\ ([^|]*)$ }{`--$1}x,
print "$_\n" for @{[qw{
.
./a
./a/b
./a/b/c
}]};
 
sln

Your mission, should you choose to accept it, is to turn this into a
Perl one-liner ;-).

hp

s{ [^/]*/ }{| }xg,
s{ \|\ ([^|]*)$ }{`--$1}x,
print "$_\n" for @{[qw{
.
./a
./a/b
./a/b/c
}]};

Forgot that extra space.

s{ [^/]*/ }{|  }xg,
s{ \|\ \ ([^|]*)$ }{`--$1}x,
print "$_\n" for @{[qw{
.
./a
./a/b
./a/b/c
}]};
 
