Parsing CSV file into a database

N

Nick Hoyle

How would i go about parsing a csv from the web eg yahoo finance stock
data into my database table called stocks?

I would like this process to happen each time a user logs into there
account so it will update the table with the latest data possible from
the csv file.
 
J

Jesús Gabriel y Galán

How would i go about parsing a csv from the web eg yahoo finance stock
data into my database table called stocks?

If you have a url to download the csv, you can read it with something
like open-uri.
Then use a CSV library like FasterCSV for csv parsing, and then I
would use a ORM
(Sequel, ActiveRecord or Datamapper come to mind) to create the rows
in your table.

Hope this helps,

Jesus.
 
N

Nick Hoyle

ok thanks thats great, thanks for the quick response. Have you any
sample code or examples that you could possible provide me?

Thanks for your time

Nick
 
J

James Gray

ok thanks thats great, thanks for the quick response. Have you any
sample code or examples that you could possible provide me?

It's probably as simple as:

require "open-uri"
require "rubygems"
require "faster_csv"
open("url goes here") do |csv|
FCSV.new(csv).each do |row|
# load row into database here...
end
end

Give it a shot and come back with specific questions when you get stuck.

James Edward Gray II
 
N

Nick Hoyle

Ok great! will do

Nick

James said:
It's probably as simple as:

require "open-uri"
require "rubygems"
require "faster_csv"
open("url goes here") do |csv|
FCSV.new(csv).each do |row|
# load row into database here...
end
end

Give it a shot and come back with specific questions when you get stuck.

James Edward Gray II
 
B

Bosko Ivanisevic

Ok great! will do

Nick

Here is a sample for downloading historical data from Yahoo:

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = "/table.csv?s=#{symbol}&g=d" +
"&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}" +
"&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}"
Net::HTTP.start("itable.finance.yahoo.com", 80) { |http|
res = http.get(query)
res.body
}
end
 
7

7stud --

Bosko said:
Here is a sample for downloading historical data from Yahoo:

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = "/table.csv?s=#{symbol}&g=d" +
"&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}" +
"&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}"
Net::HTTP.start("itable.finance.yahoo.com", 80) { |http|
res = http.get(query)
res.body
}
end


require 'date'
require 'net/http'

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = "/table.csv?s=#{symbol}&g=d" +
"&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}" +
"&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}"
Net::HTTP.start("itable.finance.yahoo.com", 80) { |http|
res = http.get(query)
res.body
}
end

p get_historical_data("AAPL", "12/28/2008", "1/7/2009")

--output:--
/usr/lib/ruby/1.8/net/http.rb:1556:in `read_status_line': wrong status
line: "Date,Open,High,Low,Close,Volume,Adj Close" (Net::HTTPBadResponse)
from /usr/lib/ruby/1.8/net/http.rb:1538:in `read_new'
from /usr/lib/ruby/1.8/net/http.rb:833:in `request'
from /usr/lib/ruby/1.8/net/http.rb:615:in `get'
from r1test.rb:13:in `get_historical_data'
from r1test.rb:12:in `start'
from /usr/lib/ruby/1.8/net/http.rb:324:in `start'
from r1test.rb:12:in `get_historical_data'
from r1test.rb:18
 
B

Bosko Ivanisevic

require 'date'
require 'net/http'

def get_historical_data(symbol, startDateString, endDateString)
  startDate = Date.parse(startDateString)
  endDate = Date.parse(endDateString)
  query = "/table.csv?s=#{symbol}&g=d" +
          "&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}" +
          "&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}"
  Net::HTTP.start("itable.finance.yahoo.com", 80) { |http|
    res = http.get(query)
    res.body
  }
end

p get_historical_data("AAPL", "12/28/2008", "1/7/2009")

--output:--
/usr/lib/ruby/1.8/net/http.rb:1556:in `read_status_line': wrong status
line: "Date,Open,High,Low,Close,Volume,Adj Close" (Net::HTTPBadResponse)
        from /usr/lib/ruby/1.8/net/http.rb:1538:in `read_new'
        from /usr/lib/ruby/1.8/net/http.rb:833:in `request'
        from /usr/lib/ruby/1.8/net/http.rb:615:in `get'
        from r1test.rb:13:in `get_historical_data'
        from r1test.rb:12:in `start'
        from /usr/lib/ruby/1.8/net/http.rb:324:in `start'
        from r1test.rb:12:in `get_historical_data'
        from r1test.rb:18

Probably cut-paste caused error. Here is my slightly rewritten script
and output:

require 'net/http'
require 'date'

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = "/table.csv?s=#{symbol}&g=d"
query.concat("&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}")
query.concat("&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}")
Net::HTTP.start("itable.finance.yahoo.com", 80) { |http|
res = http.get(query)
res.body
}
end

puts get_historical_data("MSFT", "12/28/2008", "1/7/2009")

Date,Open,High,Low,Close,Volume,Adj Close
2009-01-07,20.19,20.29,19.48,19.51,72709900,19.51
2009-01-06,20.75,21.00,20.61,20.76,58083400,20.76
2009-01-05,20.20,20.67,20.06,20.52,61475200,20.52
2009-01-02,19.53,20.40,19.37,20.33,50084000,20.33
2008-12-31,19.31,19.68,19.27,19.44,46419000,19.44
2008-12-30,19.01,19.49,19.00,19.34,43224100,19.34
2008-12-29,19.15,19.21,18.64,18.96,58512800,18.96
 
7

7stud --

Bosko said:
Probably cut-paste caused error. Here is my slightly rewritten script
and output:

I get the same error:


require 'net/http'
require 'date'

def get_historical_data(symbol, startDateString, endDateString)
startDate = Date.parse(startDateString)
endDate = Date.parse(endDateString)
query = "/table.csv?s=#{symbol}&g=d"
query.concat("&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}")
query.concat("&d=#{endDate.month-1}&e=#{endDate.mday}&f=#
{endDate.year.to_s}")
Net::HTTP.start("itable.finance.yahoo.com", 80) { |http|
res = http.get(query)
res.body
}
end

puts get_historical_data("MSFT", "12/28/2008", "1/7/2009")

--output:--

$ ruby r1test.rb
/usr/lib/ruby/1.8/net/http.rb:1556:in `read_status_line': wrong status
line: "Date,Open,High,Low,Close,Volume,Adj Close" (Net::HTTPBadResponse)
from /usr/lib/ruby/1.8/net/http.rb:1538:in `read_new'
from /usr/lib/ruby/1.8/net/http.rb:833:in `request'
from /usr/lib/ruby/1.8/net/http.rb:615:in `get'
from r1test.rb:13:in `get_historical_data'
from r1test.rb:12:in `start'
from /usr/lib/ruby/1.8/net/http.rb:324:in `start'
from r1test.rb:12:in `get_historical_data'
from r1test.rb:18
 
S

Siep Korteling

7stud said:
I get the same error:


I hope this resolves the issue:
These 2 lines should be on one line in your editor:
query.concat("&a=#{startDate.month-1}&b=#{startDate.mday}&c=#
{startDate.year}")

And the same is true for the next "query.concat etcetera" lines.

hth,

Siep
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,572
Members
45,046
Latest member
Gavizuho

Latest Threads

Top