File open, read and store in Hash, efficient?

Kev

Hello,

I am writing a class and I require it to open a file, and store the
contents in key, value pairs.
This is my first

def initialize()
  @@store = Hash.new
end

def read_file
  if File.exists?("LocationCopy.csv")
    f = File.open("LocationCopy.csv","r")
    f.each do |line|
      temp = line.split(",")
      @@store[temp[0]] = temp[1]
    end
    f.close
  end
  #puts @@store
end
 
Kev

Hello,

I am writing a class and I require it to open a file, and store the
contents in key, value pairs.
This is my first

def initialize()
  @@store = Hash.new
end

def read_file
  if File.exists?("LocationCopy.csv")
    f = File.open("LocationCopy.csv","r")
    f.each do |line|
      temp = line.split(",")
      @@store[temp[0]] = temp[1]
    end
    f.close
  end
  #puts @@store
end

Unfortunately that's what I call finger trouble. As I was saying, this
is my first attempt at a Ruby application, and I was wondering if there
is a more efficient method for what I am trying to achieve. Would
using f.each_line with a block be better?

Thanks,
Kev
 
Robert Klemme

2007/3/9, Kev said:
Hello,

I am writing a class and I require it to open a file, and store the
contents in key, value pairs.
This is my first

def initialize()
  @@store = Hash.new
end

def read_file
  if File.exists?("LocationCopy.csv")
    f = File.open("LocationCopy.csv","r")
    f.each do |line|
      temp = line.split(",")
      @@store[temp[0]] = temp[1]
    end
    f.close
  end
  #puts @@store
end

Unfortunately that's what I call finger trouble. As I was saying, this
is my first attempt at a Ruby application, and I was wondering if there
is a more efficient method for what I am trying to achieve. Would
using f.each_line with a block be better?

Efficiency is OK. Using the block form of File.open is safer, i.e.
the file is always closed - even in case of error. But you should not
use a class variable, which is shared by every instance of the class;
use the instance variable @store instead.
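
A minimal sketch of those two suggestions, for illustration only. The class name LocationStore, the store reader and the path argument are assumptions that do not appear in the original post; the chomp call is an extra that drops the trailing newline from each line.

```ruby
# Sketch only: LocationStore, #store and the path argument are illustrative names.
class LocationStore
  attr_reader :store

  def initialize
    @store = {}   # instance variable: each object gets its own hash
  end

  def read_file(path)
    # Block form of File.open: the file is closed when the block exits,
    # even if an exception is raised inside it.
    File.open(path, "r") do |f|
      f.each_line do |line|
        key, value = line.chomp.split(",")
        @store[key] = value
      end
    end
    @store
  end
end
```

With this shape, two LocationStore objects reading two different files do not overwrite each other's data, which they would with @@store.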

And you can make your life easier by using the CSV library. Then it
becomes a one-liner:

10:41:07 [~]: cat x
a,b
d,b;c

10:41:08 [~]: ruby -r csv -r enumerator -e 'p CSV.to_enum(:open, "x",
"r", ";").inject({}) {|h,(k,v)| h[k]=v; h}'
{"a,b"=>nil, "d,b"=>"c"}

10:41:32 [~]: ruby -r csv -r enumerator -e 'p CSV.to_enum(:open, "x",
"r", ",").inject({}) {|h,(k,v)| h[k]=v; h}'
{"a"=>"b", "d"=>"b;c"}

CSV.foreach uses "," as default separator:

10:41:49 [~]: ruby -r csv -r enumerator -e 'p CSV.to_enum(:foreach,
"x").inject({}) {|h,(k,v)| h[k]=v; h}'
{"a"=>"b", "d"=>"b;c"}

Explanation: CSV.foreach yields every record to the block. By using
to_enum (which is part of "enumerator") you can treat the CSV reader
like any Enumerable. With #inject, a value is passed as the first
parameter to the block and the block's result is passed to the next
invocation of the block. In this case the hash which is handed to
#inject is simply passed on and on and is ultimately the result of
#inject. "p" then prints it.
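
To see the #inject mechanics on their own, here is a small sketch that skips the file and CSV parsing altogether. The records array is written out by hand to match the sample file "x"; the each_with_object variant is an addition for comparison, not something from the post above.

```ruby
# The two records of the sample file "x", written out by hand.
records = [["a", "b"], ["d", "b;c"]]

# inject: the block has to return the hash so that it is handed to the
# next invocation; that is what the trailing "; h" is for.
h1 = records.inject({}) { |h, (k, v)| h[k] = v; h }

# each_with_object hands the same hash to every invocation, so the
# block's return value does not matter.
h2 = records.each_with_object({}) { |(k, v), h| h[k] = v }

p h1
p h2
```

Both calls build the same hash. Leaving out the trailing "; h" in the inject version would pass the assigned value, not the hash, to the next invocation.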

Kind regards

robert
 
gga

Well, your code is more or less okay. It may be buggy in that you are
also storing the \n (end of line) character with each value. You
probably need something like:
@@store[temp[0]] = temp[1].chomp
to remove it.
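
A quick sketch of the difference, using a made-up line rather than real data from LocationCopy.csv:

```ruby
line = "London,UK\n"            # hypothetical line as read from a file

raw   = line.split(",")         # newline stays attached to the last field
clean = line.chomp.split(",")   # newline removed before splitting

p raw[1]     # "UK\n"
p clean[1]   # "UK"
```

Chomping the whole line before splitting also copes with a line that has no comma, where temp[1] would be nil and temp[1].chomp would raise NoMethodError.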

You can avoid checking whether the file exists: if it does not, an
Errno exception (Errno::ENOENT) will be raised and propagated
upstream. Let the application, instead of your class, deal with what's
probably a user error (providing a missing file).
You can also avoid the explicit file close by doing the work in a
block (let Ruby close the file automatically), and you can use
IO.foreach (File.foreach) to iterate through each line more easily.
If you know you won't have files too big to fit in memory, you can
read all your text into a string or array in a single go (this is
usually called slurping), which can also speed things up a little in
some cases.
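
As a rough sketch of the first point: the reading code does no existence check, and the caller decides what to do about a missing file. The helper name load_pairs and the file name no_such_file.csv are made up for this illustration.

```ruby
# Sketch only: load_pairs is a hypothetical helper, not code from this thread.
def load_pairs(path)
  pairs = {}
  # No existence check: a missing file raises Errno::ENOENT right here.
  File.foreach(path) do |line|
    key, value = line.chomp.split(",")
    pairs[key] = value
  end
  pairs
end

# Application level: this is where a missing file gets reported.
begin
  load_pairs("no_such_file.csv")   # assumed not to exist
rescue Errno::ENOENT => e
  warn "Could not read input: #{e.message}"
end
```

File.foreach opens the file, yields each line and closes it again, so there is no File.open or close to manage.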

Here are some examples of doing the same thing written in different
ways:


require 'yaml'

class ReaderYAML
  def initialize(file)
    # slurp the whole file into a string
    lines = File.read(file)
    # change commas to ": " (yaml hash representation; the space
    # after the colon is needed for yaml to see a key/value pair)
    lines.gsub!(/,/, ': ')
    # create the hash thru yaml
    @h = YAML::load(lines)
  end
end

require 'csv'

class ReaderCSV
  def initialize(file)
    # read the file as a CSV file, flatten the resulting array and
    # make it a hash
    @h = Hash[*(CSV.read(file).flatten)]
  end
end

class ReaderCommas
  def initialize(file)
    @h = {}
    # slurp the file into an array
    lines = File.readlines(file)
    # process each line
    lines.each { |line|
      key, value = line.chomp.split(',')
      @h[key] = value
    }
  end
end

class ReaderCommasBigFile
  def initialize(file)
    @h = {}
    File.foreach(file) do |line|
      key, val = line.chomp.split(',')
      @h[key] = val
    end
  end
end

h = ReaderYAML.new('csv.txt')
p h

h2 = ReaderCSV.new('csv.txt')
p h2

h3 = ReaderCommas.new('csv.txt')
p h3

h4 = ReaderCommasBigFile.new('csv.txt')
p h4


require 'benchmark'

n = 5000
Benchmark.bm(5) do |b|
  b.report('big') { n.times do ReaderCommasBigFile.new('csv.txt'); end }
  b.report('file') { n.times do ReaderCommas.new('csv.txt'); end }
  b.report('csv') { n.times do ReaderCSV.new('csv.txt'); end }
  b.report('yaml') { n.times do ReaderYAML.new('csv.txt'); end }
end


The YAML version does not do exactly the same as the others, but
depending on your data, it might still be what you want. It works
when each line holds a very simple key/value pair. Although YAML
involves a little bit more work, it is still pretty optimized and will
turn numeric data automatically into the appropriate Ruby numeric
class. CSV automatically deals with comma separated files for you,
although it is somewhat slow.
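
A small sketch of that numeric conversion, using made-up sample data instead of a real file. The variable names are for illustration only.

```ruby
require 'yaml'

text = "apples,3\npears,2.5\n"   # made-up sample data

# YAML route: "," becomes ": " (colon plus space marks a key/value pair).
yaml_hash = YAML.load(text.gsub(",", ": "))

# Plain split route: every value stays a String.
split_hash = text.each_line.map { |l| l.chomp.split(",") }.to_h

p yaml_hash    # values are the Integer 3 and the Float 2.5
p split_hash   # values are the Strings "3" and "2.5"
```

Whether that conversion is welcome depends on the data: a value such as a zip code or an ID that merely looks numeric would be converted as well.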

Anyway, hope that gives you some ideas. Overall, unless you are
dealing with huge files, you should not worry too much about speed
while writing your class.
 
