What's the best way to extract 2 values from a CSV file from each row systematically?

Q

quarantinemiles

Hey guys,

I'm a little new to Python, and am still learning!

I'm test building a web scraper that extracts prices from a website, based on two values I want to extract from a CSV file. The CSV has at least 1000 rows, an example:

0,0,KGD,0,DME,0,,0,0

The values I want to extract are KGD and DME (columns 3 and 5).

Each row in the CSV file contains values in columns 3 and 5 that I'd like to extract. What's the best way to extract these data, so I can insert them as inputs in two different fields in a form? A list, dictionary, or MySQL? I try not to do anything with MySQL as I'm not familiar with it at all.

I'm thinking of dictionary because at least I can make it work as a key/value pair (for columns 3 and 5), but a dictionary is unordered. I'd like to automatically go through each row in the CSV file from beginning to end to extract the two values in columns 3 and 5 and insert them into fields in a form.

I'd really appreciate any suggestions or help, thanks in advance!
 
N

Neil Cerutti

Hey guys,

I'm a little new to Python, and am still learning!

I'm test building a web scraper that extracts prices from a
website, based on two values I want to extract from a CSV file.
The CSV has at least 1000 rows, an example:

0,0,KGD,0,DME,0,,0,0

The values I want to extract are KGD and DME (columns 3 and 5).

Use the csv module.

http://docs.python.org/2/library/csv.html
 
J

Joel Goldstick

Hey guys,

I'm a little new to Python, and am still learning!

I'm test building a web scraper that extracts prices from a website, based
on two values I want to extract from a CSV file. The CSV has at least 1000
rows, an example:

0,0,KGD,0,DME,0,,0,0

The values I want to extract are KGD and DME (columns 3 and 5).

Each row in the CSV file contains values in columns 3 and 5 that I'd like
to extract. What's the best way to extract these data, so I can insert them
as inputs in two different fields in a form? A list, dictionary, or MySQL?
I try not to do anything with MySQL as I'm not familiar with it at all.

I'm thinking of dictionary because at least I can make it work as a
key/value pair (for columns 3 and 5), but a dictionary is unordered. I'd
like to automatically go through each row in the CSV file from beginning to
end to extract the two values in columns 3 and 5 and insert them into
fields in a form.

I'd really appreciate any suggestions or help, thanks in advance!

You should check out the csv module here:
http://docs.python.org/2/library/csv.html#module-csv
It will read your csv file into a list (the file rows) of lists (the
columns). You can easily loop over the data to extract the columns you
want using list indexing
 
T

Tim Chase

based on two values I want to extract from a CSV file. The
CSV has at least 1000 rows, an example:

0,0,KGD,0,DME,0,,0,0 [snip]
I'd like to automatically go through each row in the CSV file from
beginning to end to extract the two values in columns 3 and 5 and
insert them into fields in a form.

The csv module has several tools that make this easy to do. If there
are column-headers, you can do

import csv
with file("myfile.csv", "rb") as f:
for row in csv.DictReader(f)
insert_fields_into_form(
row["KGD"],
row["DME"],
)

which I like for clarity, ease of updating, and robustness (if for
some reason, the columns get moved around, or columns get
added/removed, as long as the headers remain the same, you can get
the data).

If it doesn't have headers, then you'd have to manually pick out the
columns, either by tuple-unpacking:

with file("myfile.csv", "rb") as f:
for _, _, kgd, _, dme in csv.reader(f)
insert_fields_into_form(kgd, dme)

or by directly indexing:

KGD_COL = 3
DME_COL = 5
with file("myfile.csv", "rb") as f:
for row in csv.reader(f)
insert_fields_into_form(
row[KGD_COL],
row[DME_COL],
)

both of which are more fragile than DictReader when it comes to
columns being added/removed. I leave the implementation of
insert_fields_into_form() up to you :)

-tkc
 
A

Alex Lee

Thanks for the help guys! I'll definitely read up on the csv module documentation.

Tim, that's incredibly helpful, thanks a lot! :) My CSV file doesn't have headers, but I'm sure I can just as easily add it in manually.
 
R

Roland Mueller

Hello,

2013/9/24 Alex Lee said:
Thanks for the help guys! I'll definitely read up on the csv module
documentation.

Tim, that's incredibly helpful, thanks a lot! :) My CSV file doesn't have
headers, but I'm sure I can just as easily add it in manually.

a better way to provide column headers is to use the fieldname parameter
when creating the CSV reader. fieldnames is a string array that should
match the number of columns in the csv file:

*class *csv.DictReader(*csvfile*, *fieldnames=None*, *restkey=None*, *
restval=None*, *dialect='excel'*, **args*, ***kwds*)

BR,
Roland

--
 
L

Luca Cerone

I'd really appreciate any suggestions or help, thanks in advance!

Hi Alex if you know that you want only columns 3 and 5, you could also use list comprehension to fetch the values:

import csv

with open('yourfile.csv','rU') as fo:
#the rU means read using Universal newlines
cr = csv.reader(fo)
values_list = [(r[2],r[4]) for r in cr] #you have a list of tuples containing the values you need

Cheers,
Luca
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,757
Messages
2,569,543
Members
45,026
Latest member
camilin05

Latest Threads

Top