parse a csv file into a text file

Dave Angel

Oops. Forgot the newline.
In Python 2.x,

instead of
f.write(a + " " + b + "\n")
you can use
print >> f, a, b

print will add in the space and newline, just as it does to
sys.stdout.
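
A minimal sketch of the two spellings side by side. It is written for Python 3, where the 2.x statement `print >> f, a, b` becomes `print(a, b, file=f)`; the in-memory `io.StringIO` buffers and the sample values are assumptions standing in for the real output file and CSV fields.

```python
import io

a, b = "Toronto", "2503281"  # sample values, assumed for illustration

# Explicit write: the space and the newline must be added by hand.
buf_write = io.StringIO()
buf_write.write(a + " " + b + "\n")

# print to a file object: the separator and newline are added automatically.
# (Python 2.x spelling: print >> f, a, b)
buf_print = io.StringIO()
print(a, b, file=buf_print)

# Both buffers should now hold the same text.
same = buf_write.getvalue() == buf_print.getvalue()
```

Either form is fine; print simply saves you from forgetting the trailing newline.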
 
MRAB

On 2014-02-06 07:52, Zhen Zhang wrote:
> On Wednesday, February 5, 2014
Zhen Zhang said:
Code:
import csv
file = open('raw.csv')
reader = csv.reader(file)

f = open('NicelyDone.text','w')

for line in reader:
    f.write("%s %s"%line[1],%line[5])

Are you using Python 2 or 3?
Here is my question:
1:What is the data format for line[1],

That's something you can easily figure out by printing out the
intermediate values. Try something like:
for line in reader:
    print type(line[1]), repr(line(1))

See if that prints what you expect.
how come f.write() does not work.

What does "does not work" mean? What does get written to the file?
Or do you get some sort of error?

I'm pretty sure I see your error, but I'm trying to lead you to being
able to diagnose it yourself :)

Hi Roy ,

Thank you so much for the reply,
I am currently running Python 2.7.

I ran
print type(line[1]), repr(line(1))
and it tells me that 'list' object is not callable.
"line" is a list and within repr you're using (...) (parentheses)
instead of [...] (square brackets).

It might be clearer if you call the variable "row" because the CSV
reader returns rows, and each row is a list of strings.
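
A small sketch of that point, written for Python 3 (in 2.7 the print call would be a print statement). The single record is adapted from the sample line quoted in the thread, and a plain list of strings stands in for the open file; both are assumptions made for illustration.

```python
import csv

# One sample record; csv.reader accepts any iterable of strings,
# so a list stands in for the open file here.
lines = ['3520005,"Toronto (Ont.)",C ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1']

for row in csv.reader(lines):
    # Square brackets index the list; row(1) would try to call it instead.
    print(type(row).__name__, type(row[1]).__name__, repr(row[1]))

# Keep one parsed row around for inspection.
first_row = next(csv.reader(lines))
```

Note that the csv module has already removed the surrounding double quotes from the second field.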
It seems the entire line is of type list, not of some "line" type
as I thought.

Each line[1] is a string element of that list after all.

f.write("%s %s %s" %(output,location,output)) works great;
as MRAB mentioned, I have to write the arguments as a tuple.

This is the code I am currently using

for line in reader:
    location ="%s"%(line[1])
    if '(' in location:
        bits = location.split('(')
        # at this point, bits = ['Toronto ', 'Ont.)']
        location = bits[0].strip()
    output = "%s %s\n" %(location,line[5])
    f.write("%s" %(output))
A 1-tuple (a tuple containing one item) is:

(item, )

It's actually the comma that makes it a tuple (except for the 0-tuple
"()"); it's just that it's often necessary to wrap it in (...), and
people then think it's those that are making it a tuple, but it's not!
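
A quick sketch of the comma rule, using a made-up value for `item`; it should behave the same in Python 2.7 and 3.

```python
item = "Toronto"  # sample value, assumed for illustration

not_a_tuple = (item)   # parentheses only group: this is still a str
one_tuple = (item,)    # the trailing comma makes it a 1-tuple
bare_tuple = item,     # the same 1-tuple, no parentheses needed
empty_tuple = ()       # the 0-tuple is the one case with no comma

# With % formatting, several values must be passed as one tuple.
text = "%s %s" % (item, "2503281")
```

This is also why "%s %s" % a, b fails: the % operator binds to a alone, and the comma then builds a tuple out of the result.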
It extracts the desired information into a text file as I wanted.
However, the Python program gives me an error after execution:
    location="%s"%(line[1])
IndexError: list index out of range

I failed to figure out why.
What is the value of "line" at that point?
 
Rustom Mody

It's actually the comma that makes it a tuple (except for the 0-tuple
"()"); it's just that it's often necessary to wrap it in (...), and
people then think it's those that are making it a tuple, but it's not!

Interesting viewpoint -- didn't know that!
 
Neil Cerutti

Hi, every one.

I am a second year EE student.
I just started learning python for my project.

I intend to parse a csv file with a format like

3520005,"Toronto (Ont.)",C ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1
[...]
into a text file like the following
Toronto 2503281 [...]
This is what i have so far.


Code:
import csv
file = open('raw.csv')

You must open the file in binary mode, as that is what the csv
module expects in Python 2.7. newline handling can be enscrewed
if you forget.

file = open('raw.csv', 'rb')
 
Mark Lawrence

You must open the file in binary mode, as that is what the csv
module expects in Python 2.7. newline handling can be enscrewed
if you forget.

file = open('raw.csv', 'rb')

I've never opened a file in binary mode to read with the csv module
using any Python version. Where does it state that you must do this?
 
Tim Chase

I've never opened a file in binary mode to read with the csv module
using any Python version. Where does it state that you must do
this?

While the docs don't currently say anything about it, all the
examples at [1] use 'rb' or 'wb' when opening the file. I've long
wondered about that. Especially as I've passed non-file objects like
lists/iterators to the csv.reader/csv.DictReader and had them work
just fine (and would be a little perturbed if they broke).

-tkc

[1] http://docs.python.org/2/library/csv.html
 
Tim Golden

I've never opened a file in binary mode to read with the csv module
using any Python version. Where does it state that you must do this?

If you don't, you tend to get interleaved blank lines. (Presumably
unless your .csv is using \n-only linefeeds).

TJG
 
Neil Cerutti

I've never opened a file in binary mode to read with the csv module
using any Python version. Where does it state that you must do
this?

While the docs don't currently say anything about it, all the
examples at [1] use 'rb' or 'wb' when opening the file. I've
long wondered about that. Especially as I've passed non-file
objects like lists/iterators to the csv.reader/csv.DictReader
and had them work just fine (and would be a little perturbed if
they broke).

They do actually mention it.

From: http://docs.python.org/2/library/csv.html

csv.reader(csvfile, dialect='excel', **fmtparams)

Return a reader object which will iterate over lines in the
given csvfile. csvfile can be any object which supports the
iterator protocol and returns a string each time its next()
method is called — file objects and list objects are both
suitable. If csvfile is a file object, it must be opened with
the ‘b’ flag on platforms where that makes a difference.

So it's stipulated only for file objects on systems where it
might make a difference.
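
For what it's worth, here is a hedged sketch of opening a CSV file in a way that suits either major version. The helper name `open_csv_for_reading` and the two sample rows are hypothetical; the Python 2 branch follows the 'b' flag rule quoted above, while the Python 3 branch uses text mode with newline='', which is what the Python 3 csv documentation asks for instead.

```python
import csv
import os
import sys
import tempfile

def open_csv_for_reading(path):
    """Hypothetical helper: open `path` the way the csv docs ask for."""
    if sys.version_info[0] == 2:
        # Python 2: the 'b' flag, as the quoted documentation stipulates.
        return open(path, 'rb')
    # Python 3: text mode with newline='' so the csv module sees raw line endings.
    return open(path, newline='')

# Write a small file with \r\n line endings, then read it back.
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'wb') as out:
    out.write(b'1,"Toronto (Ont.)"\r\n2,"Ottawa (Ont.)"\r\n')

with open_csv_for_reading(path) as f:
    rows = list(csv.reader(f))
os.remove(path)
```

Opened this way, the \r\n endings should not leak into the parsed fields on any platform.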
 
Tim Chase

[first, it looks like you're posting via Google Groups which
annoyingly double-spaces everything in your reply. It's possible to
work around this, but you might want to subscribe via email or an
actual newsgroup client. You can read more at
https://wiki.python.org/moin/GoogleGroupsPython ]

Does the split make a list or tuple?

In this case, it happens to return a list, which you can check with

print type("one two three".split())

However, also in this case, it doesn't matter, since either indexes
just fine.
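
A short sketch of that check, using the location string from the thread as the sample value:

```python
location = "Toronto (Ont.)"

bits = location.split('(')    # ['Toronto ', 'Ont.)'], as in the thread
kind = type(bits).__name__    # str.split returns a list

# Indexing works the same on a list or a tuple.
city_from_list = bits[0].strip()
city_from_tuple = tuple(bits)[0].strip()
```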
When I do location=line[1],
it gives me an error even though the program did run correctly and
output the correct file:
    location=line[1]
IndexError: list index out of range

Then it looks like you've got a blank line that doesn't actually have
data in it, so when it tries to index into it, there is at most
[0], not [1]. As the message suggests :)
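
One way to guard against that, sketched under some assumptions: the sample lines below are made up (placeholder fields, with a blank line in the middle), and skipping short rows is just one possible policy. For a completely empty line csv.reader yields an empty list, so even [0] would fail.

```python
import csv

# Made-up sample input with a blank line in the middle.
lines = [
    'id1,"Toronto (Ont.)",c2,c3,c4,100',
    '',                      # blank line: csv.reader yields [] for it
    'id2,"Ottawa (Ont.)",c2,c3,c4,200',
]

output = []
for row in csv.reader(lines):
    if len(row) < 6:
        # Too short to have row[1] and row[5]; skip it rather than
        # let the indexing raise IndexError.
        continue
    location = row[1].split('(')[0].strip()
    output.append("%s %s\n" % (location, row[5]))
```

Printing repr(row) just before the failing statement is another quick way to confirm which input line triggers the error.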

-tkc
 
Tim Chase

They do actually mention it.

From: http://docs.python.org/2/library/csv.html

If csvfile is a file object, it must be opened with
the ‘b’ flag on platforms where that makes a difference.

So it's stipulated only for file objects on systems where it
might make a difference.

Ah, I *knew* I'd read that somewhere but my searches in firefox (for
"binary", "rb" and "wb") didn't manage to catch that particular
instance. Thanks for disinterring that.

-tkc
 
