script to download Yahoo Finance data

dan roberts

Folks,

This is my first Python project so please bear with me. I need to
download data from Yahoo Finance in CSV format. The symbols are
provided in a text file, and the project details are included below.
Does anyone have some sample code that I could adapt?

Many thanks in advance,
dan

/*---NEED TO DO------*/
Considering IBM as an example, the steps are as follows.

A. Part 1 - download 'Historical Prices' from
http://finance.yahoo.com/q?s=ibm
1. Get the Start Date from the form at the top of this page,
http://finance.yahoo.com/q/hp?s=IBM
(I can provide the end date.)
2. Click on Get Prices
3. Then finally click on Download to Spreadsheet and save the file
with a name like IBM_StartDate_EndDate.csv.
(2) and (3) are equivalent to using this link directly,
http://ichart.yahoo.com/table.csv?s=IBM&a=00&b=2&c=1962&d=05&e=30&f=2004&g=d&ignore=.csv
Can you please post an example of a loop that would do the above for a
series of company symbols, saved in a text file?

B. Part 2 - download 'Options' from http://finance.yahoo.com/q?s=ibm
This seems more difficult because the data is in html format (there's
no option to download CSV files). What's the easiest/best way to take
care of this?
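
For Part 2, one possible approach is to pull the table cells out of the HTML with the standard library's `html.parser` and then write them out as CSV. The sketch below is only an illustration in present-day Python 3: the class name `TableRows`, the helper `table_rows`, and the sample markup are hypothetical, and the real layout of the Yahoo options page is not known here.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row:
                self.rows.append(self._row)
            self._row = None


def table_rows(html_text):
    """Return all table rows found in html_text as lists of strings."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical sample markup, not the real Yahoo page.
sample = ("<table><tr><th>Strike</th><th>Last</th></tr>"
          "<tr><td>85.00</td><td>3.10</td></tr></table>")
rows = table_rows(sample)
```

Each row can then be passed to `csv.writer(...).writerow(row)` to get a CSV file comparable to the one Part 1 downloads directly.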
 
Terry Reedy

dan roberts said:
Folks,

This is my first Python project so please bear with me. I need to
download data from Yahoo Finance in CSV format. The symbols are
provided in a text file, and the project details are included below.
Does anyone have some sample code that I could adapt?

Perhaps someone will post something. In the meantime, the specialized
function you need is urlopen() in urllib (or possibly the same function in
urllib2). This does all the hard work for you. The rest is standard
Python that you should learn. A start (maybe missing some function args):

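import urllib

template = ("http://ichart.yahoo.com/table.csv?s=%s"
            "&a=00&b=2&c=1962&d=05&e=30&f=2004&g=d&ignore=.csv")
# read the symbols, one per line (or separated by whitespace)
symlist = open('whatever').read().split()
for sym in symlist:
    url = template % sym
    stuff = urllib.urlopen(url).read()
    # write to disk, one file per symbol
    out = open(sym + '.csv', 'w')
    out.write(stuff)
    out.close()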

Terry J. Reedy
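
For readers on Python 3, the same loop might be sketched as follows. This is a hedged illustration, not code from the thread: the helper names (`build_url`, `output_name`, `read_symbols`, `download_all`) are made up, the month fields are assumed to be zero-based (inferred from `a=00` and `d=05` in the example URL), and the `ichart.yahoo.com` address is the one quoted in the original post and may no longer respond.

```python
from datetime import date
from urllib.request import urlopen

# URL layout copied from the link in the original post.
TEMPLATE = ("http://ichart.yahoo.com/table.csv?s={sym}"
            "&a={a:02d}&b={b}&c={c}&d={d:02d}&e={e}&f={f}&g=d&ignore=.csv")


def build_url(symbol, start, end):
    """Fill in the template; months are ASSUMED to be zero-based."""
    return TEMPLATE.format(sym=symbol.upper(),
                           a=start.month - 1, b=start.day, c=start.year,
                           d=end.month - 1, e=end.day, f=end.year)


def output_name(symbol, start, end):
    """File name in the requested SYMBOL_StartDate_EndDate.csv form."""
    return "%s_%s_%s.csv" % (symbol.upper(), start.isoformat(), end.isoformat())


def read_symbols(path):
    """Read one symbol per line, skipping blank lines."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


def download_all(symbol_file, start, end, fetch=None):
    """Save one CSV per symbol; fetch(url) must return bytes."""
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    for sym in read_symbols(symbol_file):
        data = fetch(build_url(sym, start, end))
        with open(output_name(sym, start, end), "wb") as out:
            out.write(data)
```

Calling `download_all("symbols.txt", date(1962, 1, 2), date(2004, 6, 30))` would write files such as `IBM_1962-01-02_2004-06-30.csv`; the `fetch` argument is there so the loop can be tried with a stand-in function instead of a live request.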
 
wes weston

Dan,
Note that the parser function is not used here, but
it might come in handy.
wes
#--------------------------------------------------------------------------
import string
import sys
import urllib
import tkSimpleDialog
import tkFileDialog
import tkMessageBox

#--------------------------------------------------------------------------
def GetHistoricalDailyDataForOneSymbol( symbol, day1, day2):
    """
    Download the yahoo historical data as a comma separated file.
    """
    #day1 = "2001-01-01"
    #       0123456789
    y1 = day1[:4]
    m1 = day1[5:7]
    d1 = day1[8:]
    y2 = day2[:4]
    m2 = day2[5:7]
    d2 = day2[8:]
    url_str = "http://chart.yahoo.com/table.csv?s=" + symbol + \
              "&a=" + m1 + "&b=" + d1 + "&c=" + y1 + \
              "&d=" + m2 + "&e=" + d2 + "&f=" + y2 + \
              "&g=d&q=q&y=0" + "&z=" + symbol.lower() + "&x=.csv"
    f = urllib.urlopen(url_str)
    lines = f.readlines()
    f.close()
    return lines
#--------------------------------------------------------------------
def GetStockHistFile(symbol,earlyDate,lateDate,fn):
    list = GetHistoricalDailyDataForOneSymbol( symbol, earlyDate, lateDate)
    if (not list) or (len(list) < 1):
        return 0
    fp = open( fn,"w" )
    for line in list[1:]:       #skip the header line
        fp.write( line )
    fp.close()
    return len(list)
#--------------------------------------------------------------------
# MonthStringToMonthInt was not included in the post; this is a minimal
# stand-in that maps "Jan".."Dec" to 1..12.
MONTHS = ["jan","feb","mar","apr","may","jun",
          "jul","aug","sep","oct","nov","dec"]
def MonthStringToMonthInt(month):
    return MONTHS.index(month.lower()[:3]) + 1
#--------------------------------------------------------------------
def ParseDailyDataLine(line):
    """ 25-Jan-99,80.8438,81.6562,79.5625,80.9375,25769100
        22-Jan-99,77.8125,80.1172,77.625,78.125,20540000
        21-Jan-99,80.875,81.6562,78.875,79.1562,20019300
        20-Jan-99,83.4688,83.875,81.2422,81.3125,31370300
        19-Jan-99,75.6875,79.1875,75.4375,77.8125,25685400
    """
    # The original used WES.MATH.MyMath.Round, a private module that was
    # not posted; the built-in round() is substituted here.
    if line[0] == '<':
        return None
    list = string.split(line,",")
    pos = 0
    for str in list:            #skip header
        #print "str=",str,"pos=",pos
        if pos == 0:            #"9-Jan-01"
            try:
                list1 = string.split( str, "-" )
                day = int(list1[0])
                month = list1[1]
                month = int(MonthStringToMonthInt( month ))
                year = int( list1[2] )
                if year < 70:   # oh well, it will work for 70 years or until yahoo changes
                    year += 2000
                else:
                    year += 1900
                date = "%d-%02d-%02d" % (year, month, day)  #mx.DateTime.DateTime( year, month, day )
            except (ValueError, IndexError):
                print "error in ParseDailyDataLine"
                print "line=["+str+"]"
                return None     # date is unusable, so give up on this line
        elif pos == 1:
            Open = round(float( str ),4)
        elif pos == 2:
            High = round(float( str ),4)
        elif pos == 3:
            Low = round(float( str ),4)
        elif pos == 4:
            Close = round(float( str ),4)
        elif pos == 5:
            Volume = long ( str )
        elif pos == 6:
            AdjClose = round(float( str ),4)
        else:
            print "ret none 1"
            return None
        pos += 1

    if pos == 7:
        return (date,Open,High,Low,Close,Volume)
    else:
        print "ret none 2"
        return None
#--------------------------------------------------------------------
if __name__ == '__main__':
    str = tkSimpleDialog.askstring("","enter <symbol> <early date> <late date>")
    if not str:
        sys.exit(1)     #return
    list = str.split()
    symbol = list[0]
    earlyDate = list[1]
    lateDate = list[2]
    fn = tkFileDialog.asksaveasfilename()
    if not fn:
        sys.exit(1)     #return
    if GetStockHistFile(symbol,earlyDate,lateDate,fn):
        tkMessageBox.showinfo("Added",symbol )
    else:
        tkMessageBox.showinfo("Error Adding",symbol )
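
The line parser above can be written more compactly in present-day Python 3 with `datetime.strptime`. The following is a sketch under assumptions rather than a drop-in replacement: the function name `parse_daily_line` is made up, it assumes English month abbreviations (the default C locale), and it relies on strptime's own two-digit-year pivot instead of the `year < 70` rule used above.

```python
from datetime import datetime


def parse_daily_line(line):
    """Parse 'D-Mon-YY,open,high,low,close,volume[,adjclose]'.

    Returns (iso_date, open, high, low, close, volume), or None for a
    header, an HTML line, or anything else that does not parse.
    """
    fields = line.strip().split(",")
    if len(fields) not in (6, 7):
        return None
    try:
        day = datetime.strptime(fields[0], "%d-%b-%y").date()
        open_, high, low, close = (round(float(x), 4) for x in fields[1:5])
        volume = int(fields[5])
    except ValueError:
        return None
    return (day.isoformat(), open_, high, low, close, volume)


row = parse_daily_line("21-Jun-04,19.32,19.50,19.32,19.42,350200,19.42\n")
header = parse_daily_line("Date,Open,High,Low,Close,Volume,Adj. Close*\n")
```

Here `row` holds the parsed tuple and `header` is `None`, so a caller can simply skip lines for which the function returns `None`.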
 
wes weston

dan,
Aha, yahoo has changed the url for the file
download, and for some reason my past method
no longer seems to work.
I need this too, to download historical data.
At one time I was parsing the data off the screen
but changed to downloading the file when they
made it available. I hope that old code is
still around. If/when I get it working, I'll
pass it on.
wes
 
John Hunter

dan> Folks, This is my first Python project so please bear with
dan> me. I need to download data from Yahoo Finance in CSV
dan> format. The symbols are provided in a text file, and the
dan> project details are included below. Does anyone have some
dan> sample code that I could adapt?

In the matplotlib finance module there is some code to get historical
quotes from yahoo. I'll repost the relevant bit here, but you can
grab the src distribution from http://matplotlib.sf.net and look in
matplotlib/finance.py for more info.

"converter" is defined in the matplotlib.dates module and is used to
convert the data to and from a date time instance, e.g. epoch, mx
datetimes or python2.3 datetimes. If you don't need that
functionality it is easy to strip from the function below.

from urllib import urlopen
from matplotlib.dates import EpochConverter   # see the note on "converter" above

def quotes_historical_yahoo(ticker, date1, date2,
                            converter=EpochConverter()):

    """
    Get historical data for ticker between date1 and date2. converter
    is a DateConverter class appropriate for converting your dates

    results are a list of

    d, open, close, high, low, volume


    where d is an instance of your datetime supplied by the converter
    """

    y,m,d = converter.ymd(date1)
    d1 = (m, d, y)
    y,m,d = converter.ymd(date2)
    d2 = (m, d, y)

    urlFmt = 'http://table.finance.yahoo.com/table.csv?a=%d&b=%d&c=%d&d=%d&e=%d&f=%d&s=%s&y=0&g=d&ignore=.csv'
    url = urlFmt % (d1[0], d1[1], d1[2],
                    d2[0], d2[1], d2[2], ticker)

    ticker = ticker.upper()

    results = []
    try:
        lines = urlopen(url).readlines()
    except IOError, exc:
        print 'urlopen() failure\n' + url + '\n' + exc.strerror[1]
        return None

    # skip the header line, then parse one quote per line
    for line in lines[1:]:

        vals = line.split(',')

        if len(vals)!=7: continue
        datestr = vals[0]

        d = converter.strptime(datestr, '%d-%b-%y')
        open, high, low, close = [float(val) for val in vals[1:5]]
        volume = int(vals[5])

        results.append((d, open, close, high, low, volume))
    # yahoo lists the newest quote first; return oldest first
    results.reverse()
    return results
 
wes weston

Yahoo has changed the history download link from:

url_str = "http://chart.yahoo.com/table.csv?s=" + symbol + \
"&a=" + m1 + "&b=" + d1 + "&c=" + y1 + \
"&d=" + m2 + "&e=" + d2 + "&f=" + y2 + \
"&g=d&q=q&y=0" + "&z=" + symbol.lower() + "&x=.csv"

to:

#"http://ichart.yahoo.com/table.csv?s...06&amp;e=1&amp;f=2004&amp;g=d&amp;ignore=.csv"

Sorry there's some apples and oranges there.

The first one worked within the last month or two but, no more.
The second one is taken from viewing the source. When I try it,
it yields data starting with dates at the first end (a,b,c) and
back about 50 days. Which is weird. The second date is more current
but, seems to be ignored.

If you download the file, it has the dates shown on the screen.

If you use this in the program of my prior post, it should work
as described - wrong but getting data.
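
One possible explanation for the ignored dates, offered as an assumption rather than something confirmed in the thread: a URL copied from the page source has its ampersands written as `&amp;`, and if those entities are sent to the server unchanged, the parameter names no longer match. The Python 3 sketch below uses a made-up URL (`example.invalid`) and a hypothetical helper `query_keys` to show the effect and the fix with `html.unescape`.

```python
from html import unescape


def query_keys(url):
    """Return the parameter names a server would see, splitting on '&'."""
    query = url.split("?", 1)[1]
    return [pair.split("=", 1)[0] for pair in query.split("&")]


# Hypothetical URL in the form it would have inside HTML source.
escaped = "http://example.invalid/table.csv?s=JKHY&amp;a=4&amp;b=1&amp;c=2004"

as_copied = query_keys(escaped)          # names polluted by the entity
cleaned = query_keys(unescape(escaped))  # names after decoding &amp; to &
```

With the entities left in, the names come out as `amp;a`, `amp;b` and `amp;c`, which a server expecting `a`, `b` and `c` would presumably ignore; after `unescape` they are the plain names again.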
 
wes weston

This is really weird. The following test is run 4 times with no modification
and the output shown. As you can see, only two results are the same. Yet,
if you go to the yahoo historical data and download the file, the file
is ok?

Think I'll try another source of data.
wes

import urllib

def GetHistoricalDailyDataForOneSymbol( symbol, day1, day2):
    #day1 = "2001-01-01"
    y1 = day1[:4]
    #m1 = int(day1[5:7]) - 1
    m1 = int(day1[5:7])
    d1 = day1[8:]

    y2 = day2[:4]
    #m2 = int(day2[5:7]) - 1
    m2 = int(day2[5:7])
    d2 = day2[8:]

    url_str = "http://ichart.yahoo.com/table.csv?s...mp;c=%s&amp;d=%d&amp;e=%s&amp;f=%s&amp;x=.csv"\
              % (symbol.upper(),m1,d1,y1, m2,d2,y2 )


    f = urllib.urlopen(url_str)
    lines = f.readlines()
    f.close()
    return lines


#--------------------------------------------------------------------
if __name__ == "__main__":
    d1 = '2004-05-01'
    d2 = '2004-06-01'
    list = GetHistoricalDailyDataForOneSymbol('jkhy',d1,d2)
    print d1,d2,"num=",len(list)
    print list[1][:-1]," to"
    print list[-2][:-1]


2004-05-01 2004-06-01 num= 59
21-Jun-04,19.32,19.50,19.32,19.42,350200,19.42 to
30-Mar-04,19.06,19.30,18.91,19.30,113100,19.26

2004-05-01 2004-06-01 num= 59
21-Jun-04,19.32,19.50,19.32,19.42,350200,19.42 to
30-Mar-04,19.06,19.30,18.91,19.30,113100,19.26

2004-05-01 2004-06-01 num= 67
1-Jul-04,20.08,20.13,19.80,19.88,352900,19.88 to
30-Mar-04,19.06,19.30,18.91,19.30,113100,19.26

2004-05-01 2004-06-01 num= 66
30-Jun-04,19.63,20.10,19.53,20.10,348900,20.10 to
30-Mar-04,19.06,19.30,18.91,19.30,113100,19.26
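
A mismatch like this can be caught automatically by comparing the dates that come back with the dates that were asked for. The Python 3 sketch below is only an illustration; `returned_span` and `covers_only` are hypothetical names, and the sample lines are taken from the first run above.

```python
from datetime import datetime, date


def returned_span(csv_lines):
    """Return (earliest, latest) date found in the first CSV column."""
    days = []
    for line in csv_lines:
        first = line.split(",", 1)[0].strip()
        try:
            days.append(datetime.strptime(first, "%d-%b-%y").date())
        except ValueError:
            continue  # header or blank line
    if not days:
        return None
    return min(days), max(days)


def covers_only(csv_lines, start, end):
    """True if every returned date lies within [start, end]."""
    span = returned_span(csv_lines)
    return span is not None and start <= span[0] and span[1] <= end


lines = [
    "Date,Open,High,Low,Close,Volume,Adj. Close*",
    "21-Jun-04,19.32,19.50,19.32,19.42,350200,19.42",
    "30-Mar-04,19.06,19.30,18.91,19.30,113100,19.26",
]
span = returned_span(lines)
ok = covers_only(lines, date(2004, 5, 1), date(2004, 6, 1))
```

For the first run, `span` runs from 30 March to 21 June 2004 and `ok` is false, which matches the observation that the requested range was not honoured.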
 
wes weston

If you go to yahoo finance historical data for jkhy
and get
Jun 1 2004 to
Jul 1 2004

it returns Jun 1 to Jun 21 !!!!!!!

This is all in netscape/yahoo - no python.

They don't have a feedback link for historical data,
so I did an advertiser feedback.
wes
 
Dave Brueck

Wes said:
If you go to yahoo finance historical data for jkhy
and get
Jun 1 2004 to
Jul 1 2004

it returns Jun 1 to Jun 21 !!!!!!!

This is all in netscape/yahoo - no python.

They don't have a feedback link for historical data,
so I did an advertiser feedback.

Worked fine for me. Note that 21 days worth of info is about right for one
month (since you exclude weekends and holidays), but for me it displayed the
correct dates.

-Dave
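
The "about 21 days" estimate can be checked by counting weekdays in the requested range. This is a small Python 3 sketch with a made-up helper name, `count_weekdays`; it counts Monday to Friday only and does not know about market holidays.

```python
from datetime import date, timedelta


def count_weekdays(start, end):
    """Count Monday-Friday dates from start to end, inclusive."""
    total = 0
    day = start
    while day <= end:
        if day.weekday() < 5:  # 0-4 are Monday-Friday
            total += 1
        day += timedelta(days=1)
    return total


june_2004 = count_weekdays(date(2004, 6, 1), date(2004, 7, 1))
```

For Jun 1 2004 to Jul 1 2004 this gives 23 weekdays, so a complete download should have a row count close to that, minus any holidays.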
 
wes weston

Dave said:
Worked fine for me. Note that 21 days worth of info is about right for one
month (since you exclude weekends and holidays), but for me it displayed the
correct dates.

-Dave

Dave,
I saved the page and could send it to you. In other
tests, I got different results with the same driving
data.
wes
 
wes weston

Dave said:
Worked fine for me. Note that 21 days worth of info is about right for one
month (since you exclude weekends and holidays), but for me it displayed the
correct dates.

-Dave

Dave,
I tried it again. The initial try got
Jun 1 2004 to Jun 25
A repeat got the correct result
Jun 1 2004 to Jul 1 2004 - weird
wes
 
wes weston

Dan,
About the time you posted, yahoo changed the url
to download the history file. I use this too, but
hadn't used it in a few weeks. I posted a little program
and sent it to you directly (I think). Well, later I
tried it and it didn't work.
The odd thing was that totally within netscape,
if I asked for history between Jun 1 2004 and
Jul 1 2004, yahoo returned erratic results. One time
it would end at Jun 21, then Jun 25, then Jul 1!!
I sent them mail about it. The last time I tried it
history worked ok.
The python messages under yours have more details,
like the new url for the file.
wes
 
