key search in files

Qnmt Mndy

I am trying to find a set of keys within specific files under a specific
directory. I read the keys from a file and iterate through them, opening
and searching all the files under the specified directory. However, only
the last key seems to be found in the files.

srcFiles = Dir.glob(File.join("**", "*.txt"))
keys = File.readlines("sp.txt")

keys.each { |key|
  srcFiles.each { |src|
    linenumber = 0
    File.readlines(src).each { |line|
      linenumber += 1
      if line.include? key then
        puts "found #{key}"
      end
    }
  }
}
 
Robert Klemme

2010/5/3 Qnmt Mndy said:
I am trying to find a set of keys within specific files under a specific
directory. I read the keys from a file and iterate through them, opening
and searching all the files under the specified directory. However, only
the last key seems to be found in the files.

srcFiles = Dir.glob(File.join("**", "*.txt"))
keys = File.readlines("sp.txt")

keys.each { |key|
  srcFiles.each { |src|
    linenumber = 0
    File.readlines(src).each { |line|
      linenumber += 1
      if line.include? key then
        puts "found #{key}"
      end
    }
  }
}

This is likely because you do not postprocess what you get from
File.readlines:

$ echo 111 >| x
$ echo 222 >> x
$ ruby19 -e 'p File.readlines("x")'
["111\n", "222\n"]
$

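Note the trailing line delimiter: each key still ends in "\n", so
line.include? key only matches where the key is immediately followed by
a line break.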
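
To illustrate the point, here is a minimal sketch using in-memory strings instead of files. The sample line and the variable names are made up for the example. One plausible reason why only the last key matched is that the last line of sp.txt has no trailing newline, but that is a guess and not something the posted code confirms.

```ruby
# What File.readlines returns for a two-line key file: every element
# still ends with its line delimiter.
raw_keys = ["111\n", "222\n"]

# A made-up line as it would come out of one of the searched files.
line = "some text containing 111 in the middle\n"

# "111\n" is not a substring of the line, because in the line "111" is
# followed by a space rather than a newline.
raw_match = raw_keys.any? { |key| line.include?(key) }

# Stripping the delimiter with String#chomp makes the comparison work.
keys = raw_keys.map(&:chomp)
clean_match = keys.any? { |key| line.include?(key) }

puts "raw keys match:     #{raw_match}"    # false
puts "chomped keys match: #{clean_match}"  # true
```

In the original script the same effect should be achievable with keys = File.readlines("sp.txt").map(&:chomp).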

Also, your approach is very inefficient: you open and read every file
once per key. You had better swap the outer and inner loops, so that
each file is opened only once and every line is checked against all keys.
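
The loop swap could look roughly like the sketch below. The method name find_keys and the returned array of hits are hypothetical, chosen only for this illustration; dropping empty keys is an extra precaution that is not part of the original script (an empty key would match every line).

```ruby
# Hypothetical helper: returns [file, line number, key] for every hit.
def find_keys(key_file, pattern)
  # Read the keys once, strip the line delimiters, and drop blank lines.
  keys = File.readlines(key_file).map(&:chomp).reject(&:empty?)
  hits = []

  Dir.glob(pattern).each do |src|
    # Each file is opened and read exactly once; line numbers start at 1.
    File.foreach(src).with_index(1) do |line, linenumber|
      keys.each do |key|
        hits << [src, linenumber, key] if line.include?(key)
      end
    end
  end

  hits
end

# Possible usage, mirroring the original script:
#
#   find_keys("sp.txt", File.join("**", "*.txt")).each do |src, n, key|
#     puts "found #{key} in #{src}:#{n}"
#   end
```

File.foreach reads one line at a time, so large files are not loaded into memory as a whole.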

Btw, what you attempt to do can be done by GNU find and fgrep already:

$ find . -type f -name '*.txt' -print0 | xargs -r0 fgrep -f sp.txt

Or, with a shell that knows "**" expansion, e.g. zsh:

$ fgrep -f sp.txt **/*.txt

If you are only interested in file names you can add option -l to fgrep.
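
Where those tools are not available, a rough Ruby counterpart to the file-names-only variant might look like this. It is only a sketch: files_with_matches is a hypothetical name, and the keys are assumed to have had their line delimiters removed already.

```ruby
# Hypothetical helper: names of the files that contain at least one key.
def files_with_matches(keys, pattern)
  Dir.glob(pattern).select do |src|
    # any? stops reading a file at its first matching line.
    File.foreach(src).any? do |line|
      keys.any? { |key| line.include?(key) }
    end
  end
end

# Possible usage:
#
#   keys = File.readlines("sp.txt").map(&:chomp)
#   puts files_with_matches(keys, File.join("**", "*.txt"))
```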

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
Qnmt Mndy

Thanks for your reply and advice, Robert.

The problem really was the missing postprocessing of the File.readlines
result, and your idea of switching the loop order improved the
performance significantly.

As for doing the same thing with GNU commands: I wrote this for a Windows
environment, and I am not sure whether it has such a command-line utility.

cem
 
