Downloading multiple csv files from a website

tkpmep

I'd like to download data from the website
http://www.russell.com/Indexes/performance/daily_values_US.asp. On
this web page, there are links to a number of .csv files, and I'd like
to download all of them automatically each day. The file names are not
visible on the page, but if I click on a link, a csv file opens in
Excel. I've searched this group and looked into urllib, but have not
found functions or code snippets that will allow me to download and
rename each file. Would someone kindly point me to appropriate
libraries/functions and/or code snippets that will get me started?

Thanks in advance

Thomas Philips
 
kyosohma


This link shows how to extract a list of URLs:
http://www.java2s.com/Code/Python/Network/ExtractlistofURLsinawebpage.htm

and this one shows how to download:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/83208
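
Putting the two together, a rough sketch along these lines should get you going. It's untested, and it assumes the links on the Russell page are plain <a href="...csv"> tags, so adjust it to whatever the page actually serves:

import os
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

PAGE = 'http://www.russell.com/Indexes/performance/daily_values_US.asp'

class CSVLinkParser(HTMLParser):
    # Collect the href of every <a> tag that points at a .csv file.
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value and value.lower().endswith('.csv'):
                    self.links.append(value)

parser = CSVLinkParser()
with urllib.request.urlopen(PAGE) as page:
    parser.feed(page.read().decode('latin-1'))

for href in parser.links:
    url = urljoin(PAGE, href)                # resolve relative links
    filename = os.path.basename(url)         # save under the file's own name
    urllib.request.urlretrieve(url, filename)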

Mike
 
tkpmep

Our systems administrator suggested that I try wget, a GNU utility
designed for non-interactive downloading over HTTP and FTP. It may
prove to be the easiest way to get the data I want, so I am going to
try it first.

Thanks again.

Thomas Philips
 
tkpmep

Mike,

Thanks for the pointers. I looked through the ASPN cookbook, but found
a more reliable (and easier to implement) way to get the files I want.
I downloaded GNU Wget from http://users.ugent.be/~bpuype/wget/ (the
current version is 1.10.2) and then ran it from Python as follows:

import os

# One long command string: wget writes the CSV to the given path and
# its progress/debug output to log.txt.
rc = os.system('wget --debug '
               '--output-document="c:\\downloads\\russell1000index_cyr.csv" '
               '--output-file=log.txt '
               'http://www.russell.com/common/indexes/csvs/russell1000index_cyr.csv')


rc is the return code; it is 0 if the download succeeds. I also tried
the subprocess module:

import subprocess

# Popen takes the same command string; wait() returns wget's exit code.
f = subprocess.Popen('wget --debug '
                     '--output-document="c:\\downloads\\russell1000index_cyr.csv" '
                     '--output-file=log.txt '
                     'http://www.russell.com/common/indexes/csvs/russell1000index_cyr.csv')
rc = f.wait()

This, too, works just fine. Wget does a lot more error checking than
the recipe in the Python cookbook, handles FTP as well as HTTP, and
supports OpenSSL: it's essentially a one-stop solution. In addition, I
can write batch files that do all the downloads without Python having
to be installed on the machine.
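
To grab all of the daily files in one go, a short driver like this
would do. The extra file names below are only illustrative; the real
list comes from the links on the Russell page:

import subprocess

BASE = 'http://www.russell.com/common/indexes/csvs/'
FILES = ['russell1000index_cyr.csv',   # illustrative names; substitute
         'russell2000index_cyr.csv',   # the actual links found on the
         'russell3000index_cyr.csv']   # Russell page

for name in FILES:
    rc = subprocess.call(['wget',
                          '--output-document=c:\\downloads\\' + name,
                          '--output-file=log.txt',
                          BASE + name])
    if rc != 0:
        print('download failed: ' + name)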

Thanks again

Thomas Philips
 
