Downloading multiple csv files from a website

tkpmep

I'd like to download data from the website
http://www.russell.com/Indexes/performance/daily_values_US.asp. On
this web page, there are links to a number of .csv files, and I'd like
to download all of them automatically each day. The file names are not
visible on the page, but if I click on a link, a csv file opens in
Excel. I've searched this group and looked into urllib, but have not
found functions or code snippets that will allow me to download and
rename each file. Would someone kindly point me to appropriate
libraries/functions and/or code snippets that will get me started?

Thanks in advance

Thomas Philips
 
kyosohma


This link shows how to extract a list of URLs:
http://www.java2s.com/Code/Python/Network/ExtractlistofURLsinawebpage.htm

and this one shows how to download:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/83208
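
Putting the two together, a rough sketch along these lines should get you going. It's untested, and it assumes the links on the Russell page are plain <a href="...csv"> tags, so adjust it to whatever the page actually serves:

import os
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

PAGE = 'http://www.russell.com/Indexes/performance/daily_values_US.asp'

class CSVLinkParser(HTMLParser):
    # Collect the href of every <a> tag that points at a .csv file.
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value and value.lower().endswith('.csv'):
                    self.links.append(value)

parser = CSVLinkParser()
with urllib.request.urlopen(PAGE) as page:
    parser.feed(page.read().decode('latin-1'))

for href in parser.links:
    url = urljoin(PAGE, href)                # resolve relative links
    filename = os.path.basename(url)         # save under the file's own name
    urllib.request.urlretrieve(url, filename)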

Mike
 
tkpmep

Our systems administrator suggested that I try wget, a GNU utility
designed for non-interactive downloading over HTTP and FTP. It may
prove to be the easiest way to get the data I want, so I am going to
try it first.

Thanks again.

Thomas Philips
 
tkpmep

Mike,

Thanks for the pointers. I looked through the ASPN cookbook, but found
a more reliable (and easier to implement) way to get the files I want.
I downloaded GNU Wget from http://users.ugent.be/~bpuype/wget/ (the
current version is 1.10.2) and then ran it from Python as follows:

import os

# One long command string: wget writes the CSV to the given path and
# its progress/debug output to log.txt.
rc = os.system('wget --debug '
               '--output-document="c:\\downloads\\russell1000index_cyr.csv" '
               '--output-file=log.txt '
               'http://www.russell.com/common/indexes/csvs/russell1000index_cyr.csv')


rc is the return code; it is 0 if the download succeeds. I also tried
the subprocess module:

import subprocess

# Popen takes the same command string; wait() returns wget's exit code.
f = subprocess.Popen('wget --debug '
                     '--output-document="c:\\downloads\\russell1000index_cyr.csv" '
                     '--output-file=log.txt '
                     'http://www.russell.com/common/indexes/csvs/russell1000index_cyr.csv')
rc = f.wait()

This, too, works just fine. Wget does a lot more error checking than
the recipe in the Python cookbook, handles FTP as well as HTTP, and
supports OpenSSL: it's essentially a one-stop solution. In addition, I
can write batch files that do all the downloads without Python having
to be installed on the machine.
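
To grab all of the daily files in one go, a short driver like this
would do. The extra file names below are only illustrative; the real
list comes from the links on the Russell page:

import subprocess

BASE = 'http://www.russell.com/common/indexes/csvs/'
FILES = ['russell1000index_cyr.csv',   # illustrative names; substitute
         'russell2000index_cyr.csv',   # the actual links found on the
         'russell3000index_cyr.csv']   # Russell page

for name in FILES:
    rc = subprocess.call(['wget',
                          '--output-document=c:\\downloads\\' + name,
                          '--output-file=log.txt',
                          BASE + name])
    if rc != 0:
        print('download failed: ' + name)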

Thanks again

Thomas Philips
 
