A directory "grep" in RUBY?

Peter Bailey

Hi,
Can someone point me to a quick "grep" like function in RUBY? I use glob
all the time, but, I need the power of regular expressions when finding
particular files in directories. I see "grep" in the Pickaxe manual,
but, only as it relates to ".enum," which, sorry to say, I don't
understand.

I need something like this, even though, I know this doesn't work:
Dir.glob(/^f[0-9]{7}\.eps/)

Thanks!
Peter
 
WATANABE Hirofumi

Hi,

Peter Bailey said:
I need something like this, even though, I know this doesn't work:
Dir.glob(/^f[0-9]{7}\.eps/)

% ls
etc1 f1234500.eps f1234502.eps f1234504.eps f1234567.eps
etc2 f1234501.eps f1234503.eps f1234505.eps
% ruby -e 'puts Dir.entries(".").grep(/f\d{7}\.eps/)'
f1234567.eps
f1234500.eps
f1234503.eps
f1234502.eps
f1234504.eps
f1234501.eps
f1234505.eps
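
Two things are worth noting about the one-liner above: the pattern is unanchored, so it also matches names that merely contain "f", seven digits and ".eps" somewhere inside them, and Dir.entries returns names in whatever order the filesystem hands them back. A minimal sketch of an anchored, sorted variant, assuming a made-up helper name (grep_dir) and throwaway files in a temporary directory:

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of Ruby): names in +dir+ matching +pattern+, sorted.
def grep_dir(dir, pattern)
  Dir.entries(dir).grep(pattern).sort
end

Dir.mktmpdir do |dir|
  # Illustrative file names only.
  %w[etc1 f1234500.eps f1234567.eps xf1234567.eps.bak].each do |name|
    FileUtils.touch(File.join(dir, name))
  end

  puts grep_dir(dir, /f\d{7}\.eps/)      # unanchored: the .bak file is included too
  puts grep_dir(dir, /\Af\d{7}\.eps\z/)  # anchored: whole-name matches only
end
```

\A and \z pin the match to the start and end of the name, which is usually what is wanted when filtering file names.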
 
Robert Klemme

Peter Bailey said:
Hi,
Can someone point me to a quick "grep" like function in RUBY? I use glob
all the time, but, I need the power of regular expressions when finding
particular files in directories. I see "grep" in the Pickaxe manual,
but, only as it relates to ".enum," which, sorry to say, I don't
understand.

I need something like this, even though, I know this doesn't work:
Dir.glob(/^f[0-9]{7}\.eps/)

Dir.glob('f*.eps').grep(/^f[0-9]{7}\.eps$/)

There's plenty more options around.

robert
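
One caveat with grepping a glob: as soon as the glob pattern includes a directory, Dir.glob returns paths rather than bare names, and a regex anchored with ^ no longer lines up with the file name. A sketch of one way around that, assuming a hypothetical helper called eps_figures that tests only the base name:

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: the glob narrows the candidates cheaply,
# the regex makes the final decision on the bare file name.
def eps_figures(dir)
  Dir.glob(File.join(dir, "f*.eps"))
     .select { |path| File.basename(path).match?(/\Af[0-9]{7}\.eps\z/) }
     .sort
end

Dir.mktmpdir do |dir|
  # Illustrative file names only.
  %w[f1234567.eps foo.eps f12.eps].each { |n| FileUtils.touch(File.join(dir, n)) }
  puts eps_figures(dir).map { |path| File.basename(path) }
end
```

String#match? needs Ruby 2.4 or later; on older versions =~ does the same job here.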
 
Peter Bailey

WATANABE said:
Hi,

Peter Bailey said:
I need something like this, even though, I know this doesn't work:
Dir.glob(/^f[0-9]{7}\.eps/)

% ls
etc1 f1234500.eps f1234502.eps f1234504.eps f1234567.eps
etc2 f1234501.eps f1234503.eps f1234505.eps
% ruby -e 'puts Dir.entries(".").grep(/f\d{7}\.eps/)'
f1234567.eps
f1234500.eps
f1234503.eps
f1234502.eps
f1234504.eps
f1234501.eps
f1234505.eps

Thank you! I've never seen "Dir.entries" before. Very cool.

I love this forum!

-Peter
 
Hugh Sasse

Peter Bailey said:
Hi,
Can someone point me to a quick "grep" like function in RUBY? I use glob
all the time, but, I need the power of regular expressions when finding
particular files in directories. I see "grep" in the Pickaxe manual,
but, only as it relates to ".enum," which, sorry to say, I don't
understand.

Not the answer you want [below], but look at glark
http://www.incava.org/projects/glark/
It is a ruby grep, "interbred" with find...

Dir["*"].grep(/^c/) # all entries matching * (glob) that begin with a lowercase c.

enum is enumerable -- `ri Enumerable` gives:
<quote>
------------------------------------------------------ Class: Enumerable
The +Enumerable+ mixin provides collection classes with several
traversal and searching methods, and with the ability to sort. The
class must provide a method +each+, which yields successive members
of the collection. If +Enumerable#max+, +#min+, or +#sort+ is used,
the objects in the collection must also implement a meaningful
+<=>+ operator, as these methods rely on an ordering between
members of the collection.

------------------------------------------------------------------------


Instance methods:
-----------------
all?, any?, collect, detect, each_cons, each_slice,
each_with_index, entries, enum_cons, enum_slice, enum_with_index,
find, find_all, grep, include?, inject, inject, map, max, member?,
min, partition, reject, select, sort, sort_by, to_a, to_set, zip

I need something like this, even though, I know this doesn't work:
Dir.glob(/^f[0-9]{7}\.eps/)

Thanks!
Peter

HTH
Hugh
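
To make the Enumerable point concrete: grep is available on anything that mixes in Enumerable (arrays included), and it selects the elements for which pattern === element is true, so the pattern does not have to be a regex. A small sketch on plain arrays, with made-up sample data:

```ruby
names = %w[cat.txt Cow.txt car.eps f1234567.eps]

lowercase_c = names.grep(/^c/)                       # regex, case-sensitive
any_case_c  = names.grep(/^c/i)                      # regex, case-insensitive
shouted_eps = names.grep(/\.eps\z/) { |n| n.upcase } # a block transforms each match
in_range    = [1, 5, 12, 20].grep(5..12)             # Range#=== tests membership
only_strs   = [1, "two", :three, "four"].grep(String) # Class#=== tests the type

p lowercase_c, any_case_c, shouted_eps, in_range, only_strs
```

Dir.entries and Dir.glob both return arrays, which is why grep can be chained straight onto them.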
 
Louis J Scoras

Peter Bailey said:
I need something like this, even though, I know this doesn't work:
Dir.glob(/^f[0-9]{7}\.eps/)

You might not want to do this--see the other suggestions in the
thread--but something along these lines would make the above snippet
work as expected:

class Dir
  class << self
    alias_method :__original_glob, :glob
  end

  def self.glob(query, *flags, &blk)
    return __original_glob(query, *flags, &blk) unless query.is_a? Regexp

    files = []

    Dir.new('.').each do |f|
      next unless query =~ f
      if blk.nil?
        files << f
      else
        blk.call(f)
      end
    end

    blk.nil? ? files : nil
  end
end

# Testing it out:

if $0 == __FILE__
  Dir.glob(/\A\..*/) do |f|
    puts "Dot-file: #{f}"
  end

  backups_regex = Dir.glob(/~\z/)
  backups_string = Dir.glob('.?*~')

  p backups_regex
  p backups_string

  p backups_regex == backups_string # => true only if both calls found the same names in the same order
end
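
If reopening Dir feels too invasive, roughly the same behaviour can live in an ordinary method. This is only a sketch: regex_glob is a made-up name (not part of Ruby), and it assumes Ruby 2.5 or later for Dir.children, which, unlike Dir.new('.').each, leaves out "." and "..".

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: returns the sorted matching names, or yields each
# one and returns nil when a block is given (mirroring Dir.glob).
def regex_glob(pattern, dir = ".")
  matches = Dir.children(dir).grep(pattern).sort
  return matches unless block_given?

  matches.each { |name| yield name }
  nil
end

Dir.mktmpdir do |dir|
  # Illustrative file names only.
  %w[notes.txt notes.txt~ .profile~].each { |n| FileUtils.touch(File.join(dir, n)) }

  p regex_glob(/~\z/, dir)
  regex_glob(/\A\./, dir) { |name| puts "Dot-file: #{name}" }
end
```

Keeping it as a separate method means Dir.glob itself still behaves exactly as documented for every other library that calls it.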
 
Peter Bailey

Thanks, Hugh. I went to that web site and downloaded the .gz file. I need
to find a .gz unzipper for my Windows environment to get at it, but
I'll do that. It looks very interesting, and it's all written in RUBY.
And thanks for the enum explanation. I think I get it now.
 
Peter Bailey

Wow! Thanks, Lou. This looks interesting, but, a lot of it seems beyond
me at this point. All I want is particular files in a directory.
 
Louis J Scoras

Peter;
Wow! Thanks, Lou. This looks interesting, but, a lot of it seems beyond
me at this point. All I want is particular files in a directory.

No problem at all. If you just need it to work, you can copy that
into a file, require it when you need to and do your thing.

If you're interested in how it works, keep reading. If not, my
feelings won't be hurt =)

class Dir

  ##
  # We want to replace the old Dir.glob function with one that also takes a
  # Regexp object. Now, just to come clean from the beginning, this might not
  # be the best of ideas, since running a shell glob and filtering on
  # regexes aren't quite the same thing semantically.
  #
  # That being said, pragmatically it might be useful, so here we go. The
  # first thing that needs to be done is to move the old version of the
  # function out of the way. We need to do this because we're still going to
  # use it when the user passes in a string value representing a glob. The
  # funky class << self notation is because glob is a class method on Dir,
  # not an instance method:

  class << self
    alias_method :__original_glob, :glob
  end

  ##
  # Now we're free to redefine Dir.glob. Since we're in class Dir, self.glob
  # is really the same thing. I know the method signature looks a little
  # funky, but it needs to match up with the original glob function.
  #
  # If you look at the rdoc for Dir.glob, you'll see that it can take a bunch
  # of flags, which we won't handle here; however, if the original is to keep
  # working, this information will need to be passed on. It's the same with
  # the blk parameter. The & takes the specified block and stuffs it into a
  # variable. This is done rather than just using yield because the block
  # also needs to be sent to the original method.

  def self.glob(query, *flags, &blk)

    ##
    # First off is the easy case. If the parameter passed in is not a regex,
    # then we don't do anything. Just pass it off to the original function.

    return __original_glob(query, *flags, &blk) unless query.is_a? Regexp

    ##
    # Now, if there isn't a block to yield to, we're going to be building up
    # an array with all of the matching files, so that's initialized before
    # the iteration starts.

    files = []

    ##
    # Based on the code you posted above, I assumed that you wanted the regex
    # just to match things in the current directory, so we'll use '.' as the
    # one to iterate over. If you want all the files recursively, take a look
    # at the Find library.

    Dir.new('.').each do |f|

      ##
      # Here's the check against the regex. If it doesn't match, skip to the
      # next file in the directory. Otherwise, what happens next depends on
      # whether a block was passed or not...

      next unless query =~ f

      ##
      # If we did not get a block, just stick the matching file into the
      # array.

      if blk.nil?
        files << f

      ##
      # Otherwise, yield it to the block.

      else
        blk.call(f)
      end

    end

    ##
    # Now to make it match up with the original method, nil is returned if the
    # block was called. Otherwise, return the array of files.

    blk.nil? ? files : nil
  end
end

##
# And that's about it. This is just showing how to use the new method, but it
# won't be called if you're requiring it from another script.

if $0 == __FILE__

  ##
  # This is the block way of calling it. It works just like an iterator.

  Dir.glob(/\A\..*/) do |f|
    puts "Dot-file: #{f}"
  end

  ##
  # Here's without the block. It returns an array. We also do the same
  # search with a string just to show that the original functionality is
  # preserved.

  backups_regex = Dir.glob(/~\z/)
  backups_string = Dir.glob('.?*~')

  p backups_regex
  p backups_string

  p backups_regex == backups_string # => true only if both calls found the same names in the same order
end


Hope that helps.
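
For the recursive case mentioned in the comments, here is a rough sketch using the standard Find library. The helper name find_matching and the sample directory layout are made up for illustration:

```ruby
require "find"
require "tmpdir"
require "fileutils"

# Hypothetical helper: every regular file under +root+ whose base name
# matches +pattern+, returned as sorted paths.
def find_matching(root, pattern)
  found = []
  Find.find(root) do |path|
    found << path if File.file?(path) && File.basename(path) =~ pattern
  end
  found.sort
end

Dir.mktmpdir do |root|
  # Illustrative layout only.
  FileUtils.mkdir_p(File.join(root, "figures", "old"))
  %w[f1234567.eps figures/f7654321.eps figures/old/f0000001.eps figures/notes.txt].each do |rel|
    FileUtils.touch(File.join(root, rel))
  end

  puts find_matching(root, /\Af\d{7}\.eps\z/)
end
```

Find.find walks the whole tree under the starting directory, so the regex is applied to File.basename rather than to the full path.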
 
Peter Bailey

Robert said:
Hi,
Can someone point me to a quick "grep" like function in RUBY? I use glob
all the time, but, I need the power of regular expressions when finding
particular files in directories. I see "grep" in the Pickaxe manual,
but, only as it relates to ".enum," which, sorry to say, I don't
understand.

I need something like this, even though, I know this doesn't work:
Dir.glob(/^f[0-9]{7}\.eps/)

Dir.glob('f*.eps').grep(/^f[0-9]{7}\.eps$/)

There's plenty more options around.

robert

Grepping on a glob. That makes sense. I knew that grep could be used
somehow. Thanks, Robert.
 
