Find.find sorting: files before directories

Adam Boyle

I'm looking for a simple way to use Find.find to produce a list of
files/directories for NSIS. Ideally the list will sort files before
directories so as to minimize the number of times SetOutPath is used
(an NSIS function).

The script so far is as follows:
-----------snip-----------------------
dirs = ["jruby-1.0.3"]
excludes = []
for dir in dirs
  folder = ''
  Find.find(dir) do |path|
    if FileTest.directory?(path)
      if excludes.include?(File.basename(path))
        Find.prune # Don't look any further into this directory.
      else
        next
      end
    else
      if folder != File.dirname(path)
        folder = File.dirname(path)
        puts 'SetOutPath "' + folder + '"'
      end
      puts 'File "' + path + '"'
    end
  end
end

-----------------------end snip-------------------

Simple directory traversal, no problem there. The issue is that
Find.find doesn't allow any sort of ordering to be specified; namely
that directories are mixed in with files. The script produces the
following output:

---------------output---------------------------------
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/README.rails"
File "jruby-1.0.3/docs/README.coverage"
File "jruby-1.0.3/docs/Readline-HOWTO.txt"
SetOutPath "jruby-1.0.3/docs/rbyaml"
File "jruby-1.0.3/docs/rbyaml/README"
File "jruby-1.0.3/docs/rbyaml/LICENSE"
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/LICENSE.bouncycastle"
File "jruby-1.0.3/docs/LICENSE.ant"
File "jruby-1.0.3/docs/LICENCE.bsf"
SetOutPath "jruby-1.0.3/docs/jvyaml"
File "jruby-1.0.3/docs/jvyaml/README"
File "jruby-1.0.3/docs/jvyaml/LICENSE"
File "jruby-1.0.3/docs/jvyaml/CREDITS"
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/Glossary.txt"
File "jruby-1.0.3/docs/getting_involved.html"
-----------------------end output-------------------------

The desired output would sort sub-folders before files when traversing
a given directory, thus eliminating the duplicate entries for
SetOutPath "jruby-1.0.3/docs" as seen above.

------------desired output--------------------------------
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/README.rails"
File "jruby-1.0.3/docs/README.coverage"
File "jruby-1.0.3/docs/Readline-HOWTO.txt"
File "jruby-1.0.3/docs/LICENSE.bouncycastle"
File "jruby-1.0.3/docs/LICENSE.ant"
File "jruby-1.0.3/docs/LICENCE.bsf"
File "jruby-1.0.3/docs/Glossary.txt"
File "jruby-1.0.3/docs/getting_involved.html"
SetOutPath "jruby-1.0.3/docs/rbyaml"
File "jruby-1.0.3/docs/rbyaml/README"
File "jruby-1.0.3/docs/rbyaml/LICENSE"
SetOutPath "jruby-1.0.3/docs/jvyaml"
File "jruby-1.0.3/docs/jvyaml/README"
File "jruby-1.0.3/docs/jvyaml/LICENSE"
File "jruby-1.0.3/docs/jvyaml/CREDITS"
------------------------------end output--------------------------

Any ideas?
 
Adam Boyle

I forgot to include the require:
------------snip---------------------
require 'find' # oops!
dirs = ["jruby-1.0.3"]
excludes = []
for dir in dirs
  folder = ''
  Find.find(dir) do |path|
    if FileTest.directory?(path)
      if excludes.include?(File.basename(path))
        Find.prune # Don't look any further into this directory.
      else
        next
      end
    else
      if folder != File.dirname(path)
        folder = File.dirname(path)
        puts 'SetOutPath "' + folder + '"'
      end
      puts 'File "' + path + '"'
    end
  end
end
-----------------------end snip-------------------
 
Robert Klemme

IIRC Find.find does sort sub folders before files. To me your code
seems pretty complex and I believe the major problem is that you check
the same directory over and over again for exclusion because you use
File.basename. Can't you just do this?

Find.find base do |path|
  if File.directory? path
    if excludes.include? path
      Find.prune
    else
      puts "SetOutPath #{path}"
    end
  else
    puts "File #{path}"
  end
end

Kind regards

robert
 
Adam Boyle

I mixed up sub-folders and files. I meant to say that I want to sort
files before sub-folders (so that the folder that a file resides in
comes right before the file in the list).

I tried the posted code, but I couldn't get it to run. It fails with a
String comparison error:
---------output---------------
SetOutPath jruby-1.0.3
SetOutPath Djruby-1.0.3/..
C:/ruby/jruby-1.0.3/lib/ruby/1.8/find.rb:45:in `find': comparison of String with String failed (ArgumentError)
        from :1:in `catch'
        from C:/ruby/jruby-1.0.3/lib/ruby/1.8/find.rb:38:in `find'
        from :1
--------------end output-------
 
Robert Klemme

Why do you need that? I mean, if a folder is excluded then you want to
exclude all files and subfolders, don't you?

I have no idea what's wrong there. Did you try with the Ruby
interpreter (instead of JRuby)?

robert
 
Adam Boyle

Robert said:
Why do you need that? I mean, if a folder is excluded then you want to
exclude all files and subfolders, don't you?

Yes, an excluded folder would also exclude its children files and
folders.

I'm thinking that I haven't exactly made it clear what my goal is...

Using Find.find, I want to traverse through a directory structure and
make an NSIS-style list of files and their paths for use in an NSIS
installer script. The list is best organized when the files directly
inside a directory are listed before its sub-folders.

Example:
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/README.rails"
File "jruby-1.0.3/docs/README.coverage"
File "jruby-1.0.3/docs/Readline-HOWTO.txt"
File "jruby-1.0.3/docs/LICENSE.bouncycastle"
File "jruby-1.0.3/docs/LICENSE.ant"
File "jruby-1.0.3/docs/LICENCE.bsf"
File "jruby-1.0.3/docs/Glossary.txt"
File "jruby-1.0.3/docs/getting_involved.html"
SetOutPath "jruby-1.0.3/docs/rbyaml"
File "jruby-1.0.3/docs/rbyaml/README"
File "jruby-1.0.3/docs/rbyaml/LICENSE"
SetOutPath "jruby-1.0.3/docs/jvyaml"
File "jruby-1.0.3/docs/jvyaml/README"
File "jruby-1.0.3/docs/jvyaml/LICENSE"
File "jruby-1.0.3/docs/jvyaml/CREDITS"
....


The "SetOutPath" lines are the directories, the "File" lines are the
files.

The code you gave produces results like this (once I used Ruby
instead of JRuby :)...):
SetOutPath jruby-1.0.3
SetOutPath jruby-1.0.3/samples
File jruby-1.0.3/samples/xslt.rb
File jruby-1.0.3/samples/thread.rb
File jruby-1.0.3/samples/swing2.rb
File jruby-1.0.3/samples/scripting.rb
File jruby-1.0.3/samples/javascript.rb
File jruby-1.0.3/samples/java2.rb
File jruby-1.0.3/samples/error.rb
File jruby-1.0.3/samples/dom-applet.html
File jruby-1.0.3/samples/applet.html
File jruby-1.0.3/README
SetOutPath jruby-1.0.3/lib
SetOutPath jruby-1.0.3/lib/ruby
SetOutPath jruby-1.0.3/lib/ruby/site_ruby
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8
File jruby-1.0.3/lib/ruby/site_ruby/1.8/ubygems.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/securerandom.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems.rb
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/version.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/validator.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/user_interaction.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/timer.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/specification.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/source_info_cache_entry.rb
....


The issue is that the SetOutPath call for any particular file must come
before it in the list, with no other SetOutPath in between (i.e.,
SetOutPath jruby-1.0.3 for the line File jruby-1.0.3/README). That
way the output path is set correctly when the file is extracted from
the installer executable.

The selected lines from the previous example would ideally be listed
this way:
SetOutPath jruby-1.0.3
File jruby-1.0.3/README
SetOutPath jruby-1.0.3/samples
File jruby-1.0.3/samples/xslt.rb
File jruby-1.0.3/samples/thread.rb
File jruby-1.0.3/samples/swing2.rb
File jruby-1.0.3/samples/scripting.rb
File jruby-1.0.3/samples/javascript.rb
File jruby-1.0.3/samples/java2.rb
File jruby-1.0.3/samples/error.rb
File jruby-1.0.3/samples/dom-applet.html
File jruby-1.0.3/samples/applet.html
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8
File jruby-1.0.3/lib/ruby/site_ruby/1.8/ubygems.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/securerandom.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems.rb
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/version.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/validator.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/user_interaction.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/timer.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/specification.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/source_info_cache_entry.rb
....


It's beginning to seem to me that Find.find just doesn't have an easy
way of sorting the elements being traversed. Any additional help is
greatly appreciated.
 
Paul Mckibbin

Adam said:
The desired output would sort sub-folders before files when traversing

If output is the only requirement for sort order, why not just have two
arrays, one for files and one for directories? You can then .sort.uniq
the directory listing and the files separately and output them at the
end of the traversal.

Mac
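
For reference, a minimal sketch of the two-array idea, not code anyone in the thread posted. The method name nsis_lines_two_arrays is hypothetical, and matching excludes against directory base names is an assumption carried over from Adam's script. To keep each SetOutPath ahead of its own files, the sketch also groups the collected files by their containing directory, which goes a little beyond the plain .sort.uniq suggestion.

```ruby
require 'find'

# Hypothetical helper: collect files and directories into two arrays during
# the traversal, then build the listing once the traversal is finished.
def nsis_lines_two_arrays(base, excludes = [])
  files = []
  dirs  = []

  Find.find(base) do |path|
    if File.directory?(path)
      # Assumption: excludes holds directory base names, as in Adam's script.
      Find.prune if excludes.include?(File.basename(path))
      dirs << path
    else
      files << path
    end
  end

  # Map each directory to the files that live directly inside it.
  by_dir = files.group_by { |f| File.dirname(f) }

  lines = []
  dirs.sort.uniq.each do |dir|
    own = by_dir[dir]
    next if own.nil?              # no files of its own, so no SetOutPath needed
    lines << %(SetOutPath "#{dir}")
    own.sort.each { |f| lines << %(File "#{f}") }
  end
  lines
end
```

Calling puts nsis_lines_two_arrays("jruby-1.0.3") would print the listing. Each directory appears once, so the repeated SetOutPath lines go away, at the cost of holding the whole file list in memory until the end.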
 
Robert Klemme

Thanks for clarifying. You could do something like this:

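require 'find'

base = '.'
excludes = []
# Each unseen directory key starts out with an empty file list.
dirs = Hash.new {|h,p| h[p] = []}

Find.find base do |path|
  if File.directory? path
    # Skip an excluded directory and everything below it.
    Find.prune if excludes.include? path
  else
    # Group each file under the directory that contains it.
    dirs[File.dirname(path)] << path
  end
end

# One SetOutPath per directory, followed by that directory's files.
dirs.sort.each do |dir,files|
  puts "SetOutPath #{dir}"
  files.each {|f| puts "File #{f}"}
end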

If you want to do printing while traversing then the code becomes more
complicated (either using Find or manual traversal via Dir[]).

Kind regards

robert
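
The print-while-traversing variant mentioned above could look roughly like the following. This is only a sketch under stated assumptions, not code from the thread: the method name emit_nsis and its out parameter are hypothetical, excludes is assumed to hold directory base names as in Adam's script, and it walks the tree by hand with Dir.entries rather than Find. It does not guard against symlink cycles.

```ruby
# Hypothetical helper: walk the tree by hand so that a directory's files are
# printed before any of its sub-folders are visited.
def emit_nsis(dir, excludes = [], out = $stdout)
  # Dir.entries also returns "." and "..", which must be skipped.
  entries = (Dir.entries(dir) - %w[. ..]).sort.map { |name| File.join(dir, name) }
  subdirs, files = entries.partition { |path| File.directory?(path) }

  # Print this directory's own files first, under a single SetOutPath.
  unless files.empty?
    out.puts %(SetOutPath "#{dir}")
    files.each { |f| out.puts %(File "#{f}") }
  end

  # Only then descend into the sub-folders.
  subdirs.each do |sub|
    # Assumption: excludes holds directory base names, as in Adam's script.
    next if excludes.include?(File.basename(sub))
    emit_nsis(sub, excludes, out)
  end
end
```

With this in place, emit_nsis("jruby-1.0.3") prints the listing as it goes. Directories that contain only sub-folders produce no SetOutPath line, which matches the desired output posted earlier.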
 
