How do you get the rows out of FasterCSV?

Gary · Feb 6, 2007

Hi. I want to add a normalized column to a csv file. That is, I want to
read the file, sum all of a column X, then add another column in which
each X is divided by the sum. So how do I use the CSV rows twice without
reading the file twice?

The examples in the FasterCSV documentation at
http://fastercsv.rubyforge.org/ show class methods that provide
file-like operations. I want the file read once while the data is read,
then closed. While reading the first time, the column sum is calculated.
But then I want to go through the csv rows again, this time writing out
the rows with their new column.

Here's a sketch of what I had in mind. It doesn't work as intended...

# read csv file, summing the values
sum = 0
csv = FasterCSV.new(open(csv_filename), {:headers=>true}).each_with_index do |row, c|
  sum += row["VAL"].to_f
end

# now write
FasterCSV.open("test_csv_file.csv", "w", {:headers=>true}) do |csvout|
  csv.each_with_index do |row, c|
    row["NORMED"] = row["VAL"].to_f / sum
    csvout << row.headers if c==0
    csvout << row
  end
end

Also, is there a more graceful way to have the headers written out?

ChrisH · Feb 6, 2007

## read all data into memory as FasterCSV::Rows
## (may run into memory issues)
csvData = FasterCSV.read('/path/to/infile.csv', :headers=>true)

sum = 0
csvData.each{|row| sum += row['ColX'].to_f}

FasterCSV.open("path/to/outfile.csv", "w") do |csv|
  csvData.each{ |row|
    ## calc and append norm value and output row of CSV DATA
    csv << (row.fields << row['ColX'].to_f / sum)
  }
end

Not tested, or even executed, but I think it's close 9^)

Cheers
Chris
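
A minimal sketch of the read-once, two-pass idea, assuming a current Ruby where FasterCSV lives on as the standard csv library (it was adopted as CSV in Ruby 1.9). The method name add_normalized_column and the sample data are made up for illustration, and the keyword-style options shown here (notably write_headers) belong to the standard library and may not exist in the FasterCSV version discussed in this thread.

```ruby
require 'csv'

# Parse once, keep the rows in memory, then walk them twice:
# first to total the column, then to emit each row with its share.
# (add_normalized_column is a hypothetical helper, not a library method.)
def add_normalized_column(csv_text, column, new_column)
  table = CSV.parse(csv_text, headers: true)   # all rows held in a CSV::Table
  sum   = table[column].sum { |v| v.to_f }     # first pass over the column

  # headers + write_headers makes the writer emit the header line itself
  CSV.generate(headers: table.headers + [new_column], write_headers: true) do |out|
    table.each do |row|                        # second pass, no re-read
      out << row.fields + [row[column].to_f / sum]
    end
  end
end

puts add_normalized_column("f,VAL\na,1\nb,1\nc,2\n", "VAL", "NORMED")
```

Passing the header list to the writer is one way to avoid the "write headers if c==0" check from the original question.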

ara.t.howard · Feb 6, 2007

harp:~ > cat a.rb
require 'rubygems'
require 'fastercsv'

csv = <<-csv
f,x
a,0
b,1
c,2
d,3
csv

fcsv = FasterCSV.new csv, :headers => true
table = fcsv.read

xs = table['x'].map{|x| x.to_i}
sum = Float xs.inject{|sum,i| sum += i}
norm = xs.map{|x| x / sum}

table['n'] = norm
puts table

harp:~ > ruby a.rb
f,x,n
a,0,0.0
b,1,0.166666666666667
c,2,0.333333333333333
d,3,0.5

-a

James Edward Gray II · Feb 6, 2007

require 'rubygems'
require 'fastercsv'

csv = <<-csv
f,x
a,0
b,1
c,2
d,3
csv

fcsv = FasterCSV.new csv, :headers => true
table = fcsv.read

Or just:

table = FCSV.parse(csv, :headers => true)

xs = table['x'].map{|x| x.to_i}

When you want to convert a field, ask FasterCSV to do it for you
while reading. This changes the above to:

table = FCSV.parse(
csv,
:headers => true,
:converters => lambda { |f, info| info.header == "x" ? f.to_i : f }
)

Or using a built-in converter:

table = FCSV.parse(csv, :headers => true, :converters => :integer)

sum = Float xs.inject{|sum,i| sum += i}

This is now simplified to:

sum = Float table['x'].inject{|sum,i| sum += i}

norm = xs.map{|x| x / sum}

table['n'] = norm
puts table

James Edward Gray II
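
For readers on a current Ruby, the two converter styles above can be sketched with the standard csv library, which is FasterCSV's successor. This is an illustration under that assumption, not the FasterCSV 1.x code from the thread; the helper names x_with_lambda and x_with_builtin are invented for the example.

```ruby
require 'csv'

SAMPLE = "f,x\na,0\nb,1\nc,2\nd,3\n"

# Custom converter: a two-argument lambda also receives field info,
# so it can convert only the column whose header is "x".
def x_with_lambda(text)
  by_header = lambda { |field, info| info.header == "x" ? field.to_i : field }
  CSV.parse(text, headers: true, converters: by_header)["x"]
end

# Built-in converter: :integer converts every field that looks like an
# integer and leaves the others (here, column "f") as strings.
def x_with_builtin(text)
  CSV.parse(text, headers: true, converters: :integer)["x"]
end

p x_with_lambda(SAMPLE)
p x_with_builtin(SAMPLE)
```

Both calls should yield the x column as Integers, so the separate map{|x| x.to_i} step is no longer needed.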

Guest · Feb 6, 2007

Thanks, works great!!

If I want to parse some columns as floats, some as ints, and leave the
rest alone, is this the best way to do it?

table = FCSV.parse(open(csv_filename),
:headers => true,
:converters => lambda { |f, info|
case info.header
when "NUM"
f.to_i
when "RATE"
f.to_f
else
f
end
})

How do I write the finished file? My attempts appear without headers in
the csv file?

Guest · Feb 6, 2007

ThisForum said:
How do I write the finished file? My attempts appear without headers in
the csv file?

open("test_csv_file.csv", "w") << table

worked and included the headers. Is there a faster way for large csv
tables?

Thanks

James Edward Gray II · Feb 6, 2007

Thanks, works great!!

If I want to parse some columns as floats, some as ints, and leave the
rest alone, is this the best way to do it?

table = FCSV.parse(open(csv_filename),
:headers => true,
:converters => lambda { |f, info|
case info.header
when "NUM"
f.to_i
when "RATE"
f.to_f
else
f
end
})

Instead of:

FCSV.parse(open(csv_filename) ... )

Use:

FCSV.read(csv_filename ... )

Your converters work fine though, yes. If your numbers can be
recognized by some built-in converters, you might even be able to get
away with

FCSV.read(csv_filename, :headers => true, :converters => :numeric)

How do I write the finished file? My attempts appear without headers in
the csv file?

I would use:

File.open("path/to/file", "w") { |f| f.puts table }

Hope that helps.

James Edward Gray II
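
A small sketch of the :numeric converter and of writing a table back out with its headers, assuming the standard csv library of a current Ruby rather than the FasterCSV gem itself. The column names and the helper parse_numeric are hypothetical.

```ruby
require 'csv'

# :numeric tries the integer converter first, then the float converter,
# and leaves anything else as a string.
def parse_numeric(text)
  CSV.parse(text, headers: true, converters: :numeric)
end

table = parse_numeric("NAME,NUM,RATE\nwidget,3,0.5\n")
p table["NUM"]    # values converted to Integer
p table["RATE"]   # values converted to Float

# A table renders itself as CSV text with the header line first, which is
# why `f.puts table` produces a file that includes the headers.
puts table.to_s
```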

ara.t.howard · Feb 6, 2007

Instead of:

FCSV.parse(open(csv_filename) ... )

Use:

FCSV.read(csv_filename ... )

Your converters work fine though, yes. If your numbers can be recognized by
some built-in converters, you might even be able to get away with

FCSV.read(csv_filename, :headers => true, :converters => :numeric)

alias to FCSV.table

perhaps?

-a

James Edward Gray II · Feb 6, 2007

alias to FCSV.table

perhaps?

Alias read() to table() or read() with those options?

James Edward Gray II

ara.t.howard · Feb 6, 2007

Alias read() to table() or read() with those options?

the latter. i hate typing! ;-)

actually, read with those __default__ options. so

def FCSV.table opts = {}

....

headers = opts[:headers] || opts['headers'] || true
converters = opts[:converters] || opts['converters'] || :converters

....

end

thoughts??

-a

James Edward Gray II · Feb 7, 2007

Alias read() to table() or read() with those options?

the latter. i hate typing! ;-)

actually, read with those __default__ options. so

def FCSV.table opts = {}

....

headers = opts[:headers] || opts['headers'] || true
converters = opts[:converters] || opts['converters'] || :converters

....

end

thoughts??

Yes, FasterCSV doesn't support goofy Rails-like Hashes.

Beyond that though, I released FasterCSV 1.2.0 today with the
addition of:

def self.table(path, options = Hash.new)
read( path, { :headers => true,
:converters => :numeric,
:header_converters => :symbol }.merge(options) )
end

Enjoy.

James Edward Gray II
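
A usage sketch of the table shortcut, assuming the standard csv library of a current Ruby, where the same method is available as CSV.table. The temporary file and the helper load_table exist only to make the example self-contained.

```ruby
require 'csv'
require 'tempfile'

# Write some sample text to a temporary file and load it with CSV.table,
# which applies :headers => true, :converters => :numeric and
# :header_converters => :symbol unless overridden.
def load_table(text)
  Tempfile.create(["demo", ".csv"]) do |file|
    file.write(text)
    file.flush
    CSV.table(file.path)   # fully read before the temp file is removed
  end
end

table = load_table("Name,Num\nwidget,3\n")
p table.headers   # headers become symbols
p table[:num]     # numeric fields are converted
```

Because the defaults are merged with the caller's options, passing for example :converters => nil should switch the numeric conversion back off.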

William James · Feb 7, 2007

harp:~ > cat a.rb
require 'rubygems'
require 'fastercsv'

csv = <<-csv
f,x
a,0
b,1
c,2
d,3
csv

fcsv = FasterCSV.new csv, :headers => true
table = fcsv.read

xs = table['x'].map{|x| x.to_i}
sum = Float xs.inject{|sum,i| sum += i}

Should be sum + i.

norm = xs.map{|x| x / sum}

table['n'] = norm
puts table

harp:~ > ruby a.rb
f,x,n
a,0,0.0
b,1,0.166666666666667
c,2,0.333333333333333
d,3,0.5

array = "\
f,x
a,0
b,1
c,2
d,3
".split.map{|s| s.split ","}

headers = array.shift << 'n'

array = array.transpose
sum = array[1].inject{|a,b| a.to_i + b.to_i}.to_f
array << array[1].map{|x| x.to_i / sum}
array = array.transpose
([headers] + array).each{|x| puts x.join(',')}

--- output -----
f,x,n
a,0,0.0
b,1,0.166666666666667
c,2,0.333333333333333
d,3,0.5
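
On the "sum + i" remark: inject passes the block's return value back in as the accumulator on the next step, so the block only has to return the new total and the assignment in += has no effect. A plain-Ruby sketch, with the hypothetical helpers total and normalize:

```ruby
# The block's return value becomes the accumulator for the next element.
def total(values)
  values.inject { |sum, i| sum + i }
end

# Divide each value by the (float) total to get its share.
def normalize(values)
  sum = Float(total(values))
  values.map { |x| x / sum }
end

p total([0, 1, 2, 3])
p normalize([0, 1, 2, 3])
```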
