CSV module, DictReader problem (bug?)

Jeff Blaine

It's been a year or so since I've written Python code, so maybe
I am just doing something really dumb, but...

Documentation
=============

class DictReader(csvfile[, fieldnames=None
[, restkey=None[, restval=None[, dialect='excel'
[, *args, **kwds]]]]])


Create an object which operates like a regular reader
but maps the information read into a dict whose keys
are given by the optional fieldnames parameter. If the
fieldnames parameter is omitted, the values in the
first row of the csvfile will be used as the fieldnames.

Code
====

import csv

r = csv.DictReader('C:\Temp\Book1.csv')
print r.next()
# EOF

Output
======

{'C': ':'}
 
Fredrik Lundh

Jeff said:
It's been a year or so since I've written Python code, so maybe
I am just doing something really dumb, but...

Documentation
=============

class DictReader(csvfile[, fieldnames=None
[, restkey=None[, restval=None[, dialect='excel'
[, *args, **kwds]]]]])


Create an object which operates like a regular reader
but maps the information read into a dict whose keys
are given by the optional fieldnames parameter. If the
fieldnames parameter is omitted, the values in the
first row of the csvfile will be used as the fieldnames.

Code
====

import csv

r = csv.DictReader('C:\Temp\Book1.csv')
print r.next()
# EOF

here's the documentation for the regular reader, from Python 2.5:

reader(...)
    csv_reader = reader(iterable [, dialect='excel']
                        [optional keyword args])
        for row in csv_reader:
            process(row)

The "iterable" argument can be any object that returns a line
of input for each iteration, such as a file object or a list.
...

so the reader is simply looping over the characters in your filename. try

r = csv.DictReader(open('C:\Temp\Book1.csv'))

instead.

</F>
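
The behaviour described above is easy to reproduce without touching the disk. The following is a minimal sketch, written for Python 3 (so next(r) replaces the r.next() used in the thread); the sample field names and values in the last step are made up for illustration.

```python
import csv

path = r'C:\Temp\Book1.csv'   # the file name from the question; it is never opened here

# A str is iterable, so csv treats each character as one "line" of input.
rows = list(csv.reader(path))
print(rows[:3])          # [['C'], [':'], ['\\']]

# DictReader takes the first "row" ('C') as the field names and the next (':') as data.
first = next(csv.DictReader(path))
print(dict(first))       # {'C': ':'}

# Any iterable of lines works, for example a list of strings (hypothetical sample data).
lines = ['name,qty', 'widget,3']
print(dict(next(csv.DictReader(lines))))   # {'name': 'widget', 'qty': '3'}
```

This matches the {'C': ':'} output in the original post: the reader never sees the file, only the characters of its name.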
 
John Machin

Jeff said:
It's been a year or so since I've written Python code, so maybe
I am just doing something really dumb, but...

Documentation
=============

class DictReader(csvfile[, fieldnames=None
[, restkey=None[, restval=None[, dialect='excel'
[, *args, **kwds]]]]])


Create an object which operates like a regular reader
but maps the information read into a dict whose keys
are given by the optional fieldnames parameter. If the
fieldnames parameter is omitted, the values in the
first row of the csvfile will be used as the fieldnames.

Code
====

import csv

r = csv.DictReader('C:\Temp\Book1.csv')

Problem 1:

"""csvfile can be any object which supports the iterator protocol and
returns a string each time its next method is called -- file objects
and list objects are both suitable. If csvfile is a file object, it
must be opened with the 'b' flag on platforms where that makes a
difference."""

So, open the file, so that the next() method returns the next chunk of
content, not the next byte in the name of the file.

Note that the arg is called "csvfile", not "csvfilename".
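
As a rough illustration of Problem 1, here is a sketch that passes an open file object rather than a name. It assumes Python 3, where the csv documentation asks for newline='' instead of the 'b' flag quoted above; the temporary file and its contents are invented for the example.

```python
import csv
import os
import tempfile

# Write a small sample file; its contents are made up for this example.
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w', newline='') as f:
    f.write('name,qty\r\nwidget,3\r\n')

# Pass the open file object, not the file name.
with open(path, newline='') as csvfile:
    records = [dict(row) for row in csv.DictReader(csvfile)]

os.remove(path)
print(records)   # [{'name': 'widget', 'qty': '3'}]
```

On Python 2, as used in the thread, the equivalent would be open(path, 'rb').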

Problem 2: [OK in this instance, but that's like saying you have taken
one step in a minefield and are still alive] backslashes and Windows
file names:

If you were to write 'c:\temp\book1.csv', it would blow up ... because
\t -> tab and \b -> backspace. Get into the habit of *always* using raw
strings r'C:\Temp\Book1.csv' for Windows file names (and re patterns).
You could use double backslashing 'C:\\Temp\\Book1.csv' but it's
uglier.
print r.next()
# EOF

Output
======

{'C': ':'}

HTH,
John
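
Problem 2 above can be checked directly in the interpreter. This is only a small sketch of the escape-sequence behaviour; the variable names are arbitrary.

```python
# Lowercase \t and \b are escape sequences in an ordinary string literal.
bad = 'c:\temp\book1.csv'
raw = r'c:\temp\book1.csv'
doubled = 'c:\\temp\\book1.csv'

print('\t' in bad, '\b' in bad)   # True True  (a tab and a backspace crept in)
print(raw == doubled)             # True
print(len(raw), len(bad))         # 17 15
```

The raw string and the doubled-backslash form are the same 17-character path, while the plain literal has silently lost two characters.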
 
Tom Plunket

John said:
If you were to write 'c:\temp\book1.csv', it would blow up ... because
\t -> tab and \b -> backspace. Get into the habit of *always* using raw
strings r'C:\Temp\Book1.csv' for Windows file names (and re patterns).
You could use double backslashing 'C:\\Temp\\Book1.csv' but it's
uglier.

...alternatively you can just use 'unix slashes', e.g.
'c:/temp/book1.csv', since those work just fine 'cause the Windows
APIs deal with them properly.
-tom!
 
John Machin

Tom said:
...alternatively you can just use 'unix slashes', e.g.
'c:/temp/book1.csv', since those work just fine 'cause the Windows
APIs deal with them properly.

Not all APIs do the right thing. If you fire up the cmd.exe shell and
feed it slashes as path separators, it barfs. Example:
C:\junk>dir c:/junk/*.bar
Invalid switch - "junk".
Hence the advice to use raw strings with backslashes -- they work under
all circumstances.
 
skip

John> Not all APIs do the right thing. If you fire up the cmd.exe shell
John> and feed it slashes as path separators, it barfs. Example:
John> C:\junk>dir c:/junk/*.bar
John> Invalid switch - "junk".
John> Hence the advice to use rawstrings with backslashes -- they work
John> under all circumstances.

I think he means "the Windows APIs" within a Python program.

Skip
 
Steve Holden

John said:
Not all APIs do the right thing. If you fire up the cmd.exe shell and
feed it slashes as path separators, it barfs. Example:
C:\junk>dir c:/junk/*.bar
Invalid switch - "junk".
Hence the advice to use rawstrings with backslashes -- they work under
all circumstances.

The command shell is not an API.

regards
Steve
 
John Machin

skip said:
John> Not all APIs do the right thing. If you fire up the cmd.exe shell
John> and feed it slashes as path separators, it barfs. Example:
John> C:\junk>dir c:/junk/*.bar
John> Invalid switch - "junk".
John> Hence the advice to use rawstrings with backslashes -- they work
John> under all circumstances.

I think he means "the Windows APIs" within a Python program.

Skip

I too think he meant that. I left the mental gymnastics of wrapping what
I wrote into an os.system call as an exercise for the reader.

| >>> import os
| >>> os.system("dir c:/junk/*.bar")
| Invalid switch - "junk".
| 1
| >>> os.system(r"dir c:\junk\*.bar")
| [snip]
| 02/11/2006 01:21 PM 8 foo.bar
| [snip]
| 0
| >>>

Cheers,
John
 
Fredrik Lundh

John said:
Not all APIs do the right thing. If you fire up the cmd.exe shell and
feed it slashes as path separators, it barfs. Example:
C:\junk>dir c:/junk/*.bar
Invalid switch - "junk".
Hence the advice to use rawstrings with backslashes -- they work under
all circumstances.

we went through this a couple of days ago; all Windows APIs are
*documented* to accept backward or forward slashes, the command shell is
*documented* to only accept backward slashes.

if you're wrapping some cmd.exe command in an internal API, it's usually
easier to call "os.path.normpath" the last thing you do before you call
"os.system", than to get all the backslashes right in your code.

also see:

http://www.effbot.org/pyfaq/why-can-t-raw-strings-r-strings-end-with-a-backslash.htm

</F>
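
To illustrate the normpath suggestion, here is a short sketch. It imports ntpath explicitly, which is the module os.path resolves to on Windows, so the example behaves the same on any platform; the paths are the ones used earlier in the thread.

```python
import ntpath   # on Windows, os.path is this module

forward = 'c:/junk/*.bar'

# normpath rewrites forward slashes as backslashes, which is what cmd.exe expects.
for_cmd = ntpath.normpath(forward)
print(for_cmd)   # c:\junk\*.bar

print(ntpath.normpath('/program files/subversion/readme.txt'))
# \program files\subversion\readme.txt
```

Inside a program you can keep forward slashes throughout and convert only at the point where a string is handed to the command shell.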
 
Fredrik Lundh

Fredrik said:
if you're wrapping some cmd.exe command in an internal API, it's usually
easier to call "os.path.normpath" the last thing you do before you call
"os.system", than to get all the backslashes right in your code.

also see:

http://www.effbot.org/pyfaq/why-can-t-raw-strings-r-strings-end-with-a-backslash.htm

and as I just pointed out in a comment on that page, getting the slashes
right doesn't help you with spaces in filenames. to write reliable code
for os.system, you want something like:

import os.path
import subprocess

def mysystem(command, *files):
    files = map(os.path.normpath, files)
    files = subprocess.list2cmdline(files)
    return os.system(command + " " + files)

mysystem("more", "/program files/subversion/readme.txt")

but then you might as well use subprocess.call, of course.

</F>
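
A sketch of the subprocess route, assuming Python 3. It uses subprocess.check_output (a sibling of subprocess.call that also captures output) and, as a stand-in for a real command such as "more", a child Python process that only echoes its argument, so it runs on any platform. The child command is an assumption made for the example, not something from the thread.

```python
import subprocess
import sys

path = "/program files/subversion/readme.txt"   # path from the post; the file need not exist

# Stand-in child process: it only echoes its first argument back.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])", path]

# Passing a list means no shell parses the command line, so the space is safe.
received = subprocess.check_output(child).decode().strip()
print(received == path)   # True

# What list2cmdline would produce for a Windows command line.
quoted = subprocess.list2cmdline(["more", r"c:\program files\subversion\readme.txt"])
print(quoted)   # more "c:\program files\subversion\readme.txt"
```

With a list of arguments there is no command string to quote by hand, which is the point of the last remark above.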
 
