How to read a .csv file into a 2D array?

Todd Gardner

Hello everyone,

How can I read the following text from a .csv file into a 2D array?

Is there a prewritten routine allowing me not to have to parse the file myself?

CREDIT 20040505120000[0:GMT] PAYMENT - THANK YOU 1
CREDIT 20040309120000[0:GMT] PAYMENT - THANK YOU 146.8
CREDIT 20040329120000[0:GMT] PAYMENT - THANK YOU 1500
CREDIT 20040409120000[0:GMT] PROFESSIONAL CAREER DEV 1082.05
DEBIT 20040601120000[0:GMT] TARGET 00003236 -21.64
DEBIT 20040502120000[0:GMT] TARGET 00003236 -113.32
DEBIT 20040417120000[0:GMT] TARGET 00003236 -47.02
CREDIT 20040327120000[0:GMT] TARGET 00003236 129.89
DEBIT 20040326120000[0:GMT] USPS 0568370007 -8.85
DEBIT 20040508120000[0:GMT] USPS 5654840286 -7.85
DEBIT 20040510120000[0:GMT] USPS 5654840286 -12.55

Thanks in advance for any suggestions,

Todd
 
Robert Klemme

Todd Gardner said:
Hello everyone,

How can I read the following text from a .csv file into a 2D array?

What's the separator here, is it a space or a tab?
Is there a prewritten routine allowing me not to have to parse the file myself?

CREDIT 20040505120000[0:GMT] PAYMENT - THANK YOU 1
CREDIT 20040309120000[0:GMT] PAYMENT - THANK YOU 146.8
CREDIT 20040329120000[0:GMT] PAYMENT - THANK YOU 1500
CREDIT 20040409120000[0:GMT] PROFESSIONAL CAREER DEV 1082.05
DEBIT 20040601120000[0:GMT] TARGET 00003236 -21.64
DEBIT 20040502120000[0:GMT] TARGET 00003236 -113.32
DEBIT 20040417120000[0:GMT] TARGET 00003236 -47.02
CREDIT 20040327120000[0:GMT] TARGET 00003236 129.89
DEBIT 20040326120000[0:GMT] USPS 0568370007 -8.85
DEBIT 20040508120000[0:GMT] USPS 5654840286 -7.85
DEBIT 20040510120000[0:GMT] USPS 5654840286 -12.55

AFAIR there is something on RAA, but it's not difficult to do it by hand:

records = []

while ( line = gets )
  # parenthesise the regexp so it is not parsed as a division
  records << line.chomp.split(/ /)
end

Or, if you need conversions:

records = []

while ( line = gets )
  rec = line.chomp.split(/ /)

  # turn integer-looking and float-looking fields into numbers
  rec.map! do |elem|
    case elem
    when /^[+-]?\d+$/
      elem.to_i
    when /^[+-]?\d+\.\d+$/
      elem.to_f
    else
      elem
    end
  end

  records << rec
end

etc.

robert
 
gabriele renzi

On 13 Jun 2004 03:26:14 -0700, (e-mail address removed) (Todd Gardner) wrote:
Hello everyone,

How can I read the following text from a .csv file into a 2D array?

Is there a prewritten routine allowing me not to have to parse the file myself?

There is a prewritten module: csv.rb, in the standard distribution.
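
For reference, a minimal sketch of what using the bundled csv library could look like. It assumes a current Ruby, where the separator is passed as the `col_sep:` keyword (the 1.8-era interface discussed elsewhere in this thread differs), and it assumes the fields in the question are separated by tabs. The `data` and `rows` names are only for illustration.

```ruby
require "csv"

# Two sample rows from the question, assumed to be tab-separated.
data = "CREDIT\t20040505120000[0:GMT]\tPAYMENT - THANK YOU\t1\n" \
       "DEBIT\t20040601120000[0:GMT]\tTARGET 00003236\t-21.64\n"

# CSV.parse returns an array of rows, each row being an array of strings.
rows = CSV.parse(data, col_sep: "\t")

puts rows[0][2]   # third cell of the first record
puts rows[1][3]   # fourth cell of the second record
```

Note that every cell comes back as a string, so any numeric conversion is still up to the caller.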
 
Daniele Alessandri

How can I read the following text from a .csv file into a 2D array?
Is there a prewritten routine allowing me not to have to parse the file myself?

Ruby 1.8 comes with a set of classes (see csv.rb) to read and write
CSV files, but I think a raw approach in your case is faster than
using those classes.
CREDIT 20040505120000[0:GMT] PAYMENT - THANK YOU 1
CREDIT 20040309120000[0:GMT] PAYMENT - THANK YOU 146.8
[cut]

Assuming that your CSV columns are delimited by the tab char (as shown
in your text), you can do:

csvarray = []
# File.foreach closes the file once every line has been read
File.foreach("csvdata.csv") { |csvrecord|
  csvarray << csvrecord.chomp.split("\t")
}

Now you have the data in a two-dimensional array:

puts csvarray[0][0] # (1st cell of the 1st record)
puts csvarray[3][2] # (3rd cell of the 4th record)
 
Todd Gardner

Actually, that was a tab-delimited file; here is the .csv.

6/10/2004,-44.87,4INKJETS.COM 888-321-2552 CA
6/8/2004,-107.26,SAFEWAY STORE00014837 SAN JOSE CA
6/7/2004,-24.95,DR *REGSOFT.COM Regsoft.com GA
6/3/2004,114.96,ONLINE PAYMENT
5/28/2004,214.99,ONLINE PAYMENT
5/27/2004,-114.96,SAFEWAY STORE00014837 SAN JOSE CA
5/24/2004,-214.99,NEWEGG COMPUTERS 800-390-1119 CA
3/9/2004,-40,TQ PHONE ADVANCE - CA

How can I do this more elegantly?

ary = []
fi = File.open("test1.csv","r")
fo = File.open("test1.out","w")
fi.each { |line|
a = line.strip.split(',')
ary << a
fo.puts line
}
#~ # test print-out of one arrayrow
i=0
1.times do
puts ary
i+=1
end

fi.close
fo.close

Thanks again for listening to the newbie question!

Todd
 
Gavin Sinclair

How can I do this more elegantly?

I'm not sure what you're trying to achieve.
ary = []
fi = File.open("test1.csv","r")
fo = File.open("test1.out","w")
fi.each { |line|
a = line.strip.split(',')
ary << a
fo.puts line
}

That's just copying fi to fo with no transformation.
#~ # test print-out of one arrayrow
i=0
1.times do
puts ary
i+=1
end


*1* times do something? That must be a mistake.
fi.close
fo.close

What you're actually *doing* (i.e. creating a 2D array) looks fine.
The rest of the stuff seems meaningless.
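
As a rough illustration of keeping only the part that builds the array, here is a sketch. The `read_rows` helper is a made-up name, the demo writes two of the sample lines to a temporary file rather than reading test1.csv, and a plain `split(",")` is only safe as long as no field contains a quoted comma.

```ruby
require "tempfile"

# Hypothetical helper: read a comma-separated file into an array of rows.
# File.readlines opens and closes the file itself; chomp: true drops newlines.
def read_rows(path)
  File.readlines(path, chomp: true).map { |line| line.split(",") }
end

# Demo with two lines from the sample data, written to a temporary file.
ary = Tempfile.create("test1") do |f|
  f.write("6/3/2004,114.96,ONLINE PAYMENT\n3/9/2004,-40,TQ PHONE ADVANCE - CA\n")
  f.flush
  read_rows(f.path)
end

p ary[0]      # first row as an array of strings
p ary[1][1]   # second cell of the second row
```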

Cheers,
Gavin
 
NAKAMURA, Hiroshi

Hi,

Todd said:
How can I read the following text from a .csv file into a 2D array?

Is there a prewritten routine allowing me not to have to parse the file myself?

With ruby/1.8.1;
% ruby18 -rcsv -e 'p CSV.open("text.tsv", "r", ?\t).collect { |row| row.to_a }'

With ruby/1.9;
% ruby -rcsv -e 'p CSV.parse(File.read("text.tsv"), ?\t)'

A method CSV.parse, for parsing a whole string at once (without stream
reading), was added recently. I'll add it to ruby/1.8.2.

Regards,
// NaHi
 
Sven Schott

I was actually just about to ask that. This was how I did it(pretty
much the same way as yours).

file = File.open("file.csv", "r")
arr =[]
file.each { |i| arr << i.chomp.split(/\t/) }

If you want to output it nicely you can do something like

arr.each { |a| puts a.join("\t") }

Or to a file

savefile = File.new("file2.csv", "w")
arr.each { |a| savefile << a.join("\t") + "\n" }

Just my 2c. I think I did it well 'cause I worked on it for a while. I
love blocks.

Sven
 
Robert Klemme

Sven Schott said:
I was actually just about to ask that. This was how I did it(pretty
much the same way as yours).

file = File.open("file.csv", "r")
arr =[]
file.each { |i| arr << i.chomp.split(/\t/) }

This can be nicely done with a one liner:

arr = File.open("file.csv") {|io| io.inject([]) {|a, line| a << line.chomp.split(/\t/)} }

which has the added value of closing the file properly.
If you want to output it nicely you can do something like

arr.each { |a| puts a.join("\t") }

Or to a file

savefile = File.new("file2.csv", "w")
arr.each { |a| savefile << a.join("\t") + "\n" }

File.open("file2.csv", "w") do |savefile|
  # join each row (a), not the whole array (arr)
  arr.each { |a| savefile.puts a.join("\t") }
end
Just my 2c. I think I did it well 'cause I worked on it for a while. I
love blocks.

Then you should get into the habit of using them with File.open(); that
way you ensure that file handles are closed properly. :)
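
A small sketch of that point, writing to a temporary directory instead of a real file2.csv. The `handle` and `contents` variables exist only so the effect can be inspected afterwards, and `arr` is made-up sample data.

```ruby
require "tmpdir"

arr = [["a", "b"], ["c", "d"]]   # sample 2D array, for illustration only
handle = nil     # reference to the File object, inspected after the block
contents = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "file2.csv")

  # Block form: the file is closed when the block exits, even on an exception.
  File.open(path, "w") do |savefile|
    handle = savefile
    arr.each { |a| savefile.puts a.join("\t") }
  end

  contents = File.read(path)
end

puts handle.closed?   # the block form has already closed the handle
print contents
```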

Kind regards

robert

 
Sven Schott

Very nice! I wanted to do a one-liner but I had no idea how.

I love ruby.

Sven
 
Todd Gardner

NAKAMURA said:
Hi,



With ruby/1.8.1;
% ruby18 -rcsv -e 'p CSV.open("text.tsv", "r", ?\t).collect { |row| row.to_a }'

With ruby/1.9;
% ruby -rcsv -e 'p CSV.parse(File.read("text.tsv"), ?\t)'

A method CSV.parse, for parsing a whole string at once (without stream
reading), was added recently. I'll add it to ruby/1.8.2.

Regards,
// NaHi

Hello NaHi,

Do you have any idea what I may be doing wrong in the code below?

E:\Documents and Settings\tnt\ruby\ex\array>ruby -rcsv -e 'p CSV.open("text.csv", "r", ?\t).collect { |row| row.to_a }'

'row' is not recognized as an internal or external command,
operable program or batch file.

Domo Arigato Agimashita I'm trying to say many thanks. I am guessing
you won't be able to tell because of my almost nonexistent Japanese,

Todd
 
NAKAMURA, Hiroshi

Hi, Todd,

Todd said:
Do you have any idea what I may be doing wrong in the code below?

Seems to be a command line escaping problem. Try this on the same console.

C:\home\ruby\bin>ruby -rcsv -e "p CSV.open(ARGV.shift, 'r', ?\t).collect { |row| row.to_a }" \temp\Todd.tsv

It dumps followings for me.

[["CREDIT", "20040505120000[0:GMT]", "PAYMENT - THANK YOU", "1"],
 ["CREDIT", "20040309120000[0:GMT]", "PAYMENT - THANK YOU", "146.8"],
 ["CREDIT", "20040329120000[0:GMT]", "PAYMENT - THANK YOU", "1500"],
 ["CREDIT", "20040409120000[0:GMT]", "PROFESSIONAL CAREER DEV", "1082.05"],
 ["DEBIT", "20040601120000[0:GMT]", "TARGET 00003236", "-21.64"],
 ["DEBIT", "20040502120000[0:GMT]", "TARGET 00003236", "-113.32"],
 ["DEBIT", "20040417120000[0:GMT]", "TARGET 00003236", "-47.02"],
 ["CREDIT", "20040327120000[0:GMT]", "TARGET 00003236", "129.89"],
 ["DEBIT", "20040326120000[0:GMT]", "USPS 0568370007", "-8.85"],
 ["DEBIT", "20040508120000[0:GMT]", "USPS 5654840286", "-7.85"],
 ["DEBIT", "20040510120000[0:GMT]", "USPS 5654840286", "-12.55"]]
Domo Arigato Agimashita I'm trying to say many thanks. I am guessing
you won't be able to tell because of my almost nonexistent Japanese,

"Domo Arigato" is completely a valid Japanese. Thank you for trying to
write in Japanese.
I hope my English accessible.

Regards,
// NaHi
 
Relm

Sven Schott said:
I was actually just about to ask that. This was how I did it(pretty
much the same way as yours).

file = File.open("file.csv", "r")
arr =[]
file.each { |i| arr << i.chomp.split(/\t/) }

This can be nicely done with a one liner:

arr = File.open("file.csv") {|io| io.inject([]) {|a, line| a << line.chomp.split(/\t/)} }

#inject is somewhat scary...

arr = File.open("file.csv") {|io|
io.map {|line| line.chomp.split(/\t/)}
}
 
Robert Klemme

Relm said:
Sven Schott said:
I was actually just about to ask that. This was how I did it(pretty
much the same way as yours).

file = File.open("file.csv", "r")
arr =[]
file.each { |i| arr << i.chomp.split(/\t/) }

This can be nicely done with a one liner:

arr = File.open("file.csv") {|io| io.inject([]) {|a, line| a << line.chomp.split(/\t/)} }

#inject is somewhat scary...

Power often scares people... You can implement virtually all methods in
Enumerable by using #inject. It's great! :)
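
To make that claim concrete, here is a rough sketch of `map` and `select` rebuilt on top of #inject. The helper names `map_via_inject` and `select_via_inject` are invented for this example and are not part of Ruby.

```ruby
# Hypothetical helpers: two Enumerable methods expressed through inject.
def map_via_inject(enum)
  # Start with an empty array and append the transformed element each step.
  enum.inject([]) { |acc, x| acc << yield(x) }
end

def select_via_inject(enum)
  # Append the element only when the block returns a truthy value.
  enum.inject([]) { |acc, x| yield(x) ? acc << x : acc }
end

lines = ["a\tb\n", "c\td\n"]
rows = map_via_inject(lines) { |line| line.chomp.split(/\t/) }
negatives = select_via_inject([1, -2, 3, -4]) { |n| n < 0 }

p rows
p negatives
```

The block passed to inject must always return the accumulator, which is why the `select` version falls back to `acc` when the element is rejected.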
arr = File.open("file.csv") {|io|
io.map {|line| line.chomp.split(/\t/)}
}

Yes, and you can even squeeze that in one line, too. :)

Regards

robert
 
Todd Gardner

NAKAMURA said:
Hi, Todd,

Todd said:
Do you have any idea what I may be doing wrong in the code below?

Seems to be a command line escaping problem. Try this on the same console.

C:\home\ruby\bin>ruby -rcsv -e "p CSV.open(ARGV.shift, 'r', ?\t).collect { |row| row.to_a }" \temp\Todd.tsv

It dumps followings for me.

[["CREDIT", "20040505120000[0:GMT]", "PAYMENT - THANK YOU", "1"],
 ["CREDIT", "20040309120000[0:GMT]", "PAYMENT - THANK YOU", "146.8"],
 ["CREDIT", "20040329120000[0:GMT]", "PAYMENT - THANK YOU", "1500"],
 ["CREDIT", "20040409120000[0:GMT]", "PROFESSIONAL CAREER DEV", "1082.05"],
 ["DEBIT", "20040601120000[0:GMT]", "TARGET 00003236", "-21.64"],
 ["DEBIT", "20040502120000[0:GMT]", "TARGET 00003236", "-113.32"],
 ["DEBIT", "20040417120000[0:GMT]", "TARGET 00003236", "-47.02"],
 ["CREDIT", "20040327120000[0:GMT]", "TARGET 00003236", "129.89"],
 ["DEBIT", "20040326120000[0:GMT]", "USPS 0568370007", "-8.85"],
 ["DEBIT", "20040508120000[0:GMT]", "USPS 5654840286", "-7.85"],
 ["DEBIT", "20040510120000[0:GMT]", "USPS 5654840286", "-12.55"]]
Domo Arigato Agimashita I'm trying to say many thanks. I am guessing
you won't be able to tell because of my almost nonexistent Japanese,

"Domo Arigato" is completely a valid Japanese. Thank you for trying to
write in Japanese.
I hope my English accessible.
^^^^^^^^^^
acceptable

At least, that's what I "think" you are saying. And yes, your English
is quite acceptable.
Regards,
// NaHi

Hello NaHi,

I greatly appreciate all of your input. I don't believe I have
changed the problem; however, I continue to get an error. I am
guessing I don't have the file "Todd.tsv" in the proper location.

INPUT
E:\Documents and Settings\tnt\ruby\ex\array>ruby -rcsv -e "p CSV.open(ARGV.shift, 'r', ?\t).collect { |row| row.to_a }" \temp\Todd.tsv

OUTPUT
c:/ruby/lib/ruby/1.8/csv.rb:228:in `initialize': No such file or directory - \temp\Todd.tsv (Errno::ENOENT)
        from c:/ruby/lib/ruby/1.8/csv.rb:228:in `open'
        from c:/ruby/lib/ruby/1.8/csv.rb:228:in `open_reader'
        from c:/ruby/lib/ruby/1.8/csv.rb:208:in `open'
        from -e:1
===================================================================
E:\Documents and Settings\tnt\ruby\ex\array>dir
Volume in drive E is ROY2D2P2
Volume Serial Number is 60B9-9B87

Directory of E:\Documents and Settings\tnt\ruby\ex\array

06/17/2004 02:48p <DIR> .
06/17/2004 02:48p <DIR> ..
06/13/2004 04:37p 269 010.rbw
06/13/2004 04:48p 344 020.rbw
06/15/2004 12:43p 92 030.rbw
06/15/2004 06:13p 81 040.rbw
06/13/2004 03:42a 353 test.csv
06/13/2004 03:42a 353 test1.csv
06/13/2004 04:37p 353 test1.out
06/17/2004 02:25p 353 Todd.tsv
8 File(s) 2,198 bytes
2 Dir(s) 111,811,252,224 bytes free

Do I have the file Todd.tsv in the proper location?

Domo Arigato,

Todd
 
