Some csv oddities

Francis Avila

Three things that seem odd with csv:

Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
.... delimiter = ':'
.... escapechar = '\\'
.... lineterminator = '\n'
.... quoting = csv.QUOTE_NONE
.... skipinitialspace = False
....
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "D:\PYTHON23\lib\csv.py", line 39, in __init__
    raise Error, "Dialect did not validate: %s" % ", ".join(errors)
_csv.Error: Dialect did not validate: doublequote parameter must be True or False


Now, it seems to me that QUOTE_NONE makes doublequote meaningless, because
there's no quote character. And csv.writer doesn't write the quotechar
escaped--using the above dialect, cw.writerow(['1', '"', '3']) will write
the raw bytes '1:":3\n'. However, the corresponding csv.reader chokes on
those bytes.
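
For comparison, here is a minimal sketch of the same kind of dialect on a current Python 3 interpreter, not the 2.3.2 build shown above. The names ColonDialect, write_rows and read_rows are made up for illustration, and quotechar is spelled out as '"' on the assumption that this mirrors the default in effect in the session above. On current releases the class appears to validate without a doublequote attribute, and the writer escapes a quote character in the data instead of writing it raw, so the rows survive a round trip.

```python
import csv
import io


class ColonDialect(csv.Dialect):
    """Hypothetical stand-in for the dialect described in the post."""
    delimiter = ':'
    escapechar = '\\'
    lineterminator = '\n'
    quoting = csv.QUOTE_NONE
    quotechar = '"'          # assumption: mirrors the old default
    skipinitialspace = False
    # doublequote is deliberately left out


def write_rows(rows, dialect=ColonDialect):
    """Write rows to an in-memory buffer and return the resulting text."""
    buf = io.StringIO()
    csv.writer(buf, dialect=dialect).writerows(rows)
    return buf.getvalue()


def read_rows(text, dialect=ColonDialect):
    """Parse text back into a list of rows with the same dialect."""
    return list(csv.reader(io.StringIO(text), dialect=dialect))


if __name__ == '__main__':
    rows = [['1', ':', '3'], ['1', '"', '3']]
    text = write_rows(rows)
    print(repr(text))
    print(read_rows(text))
```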


Untested patch (csv.py, line 64):

if self.doublequote not in (True, False):
-     errors.append("doublequote parameter must be True or False")
+     if self.quoting != QUOTE_NONE:
+         errors.append("doublequote parameter must be True or False")

Moving on...

I also can't seem to use my own registered dialects:
.... delimiter = ':'
.... escapechar = '\\'
.... lineterminator = '\n'
.... quoting = csv.QUOTE_NONE
.... skipinitialspace = False
.... doublequote = False #Fine
....
csv.register_dialect('unix', unix)
csv.list_dialects() #Worked
['excel-tab', 'excel', 'unix']
fp = file('csvtest', 'wb')
csv.writer(fp, 'unix') #csv.reader fails too, same error
Traceback (most recent call last):
File said:
del ud['__module__']
del ud['__doc__']
cw = csv.writer(fp, **ud) #if **ud works, why not 'unix'?
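
The two routes can be compared side by side on a current Python 3. This is only a sketch under a few assumptions: the surviving text does not show how ud was built, so it is assumed here to be a dict of the dialect class's attributes; the registered name 'colon' and the helper names are hypothetical; and 'unix' is avoided because current Python ships a built-in dialect registered under that name. quotechar is set explicitly so that both routes see the same settings.

```python
import csv
import io


class ColonDialect(csv.Dialect):
    """Hypothetical dialect with the settings listed in the post."""
    delimiter = ':'
    escapechar = '\\'
    lineterminator = '\n'
    quoting = csv.QUOTE_NONE
    quotechar = '"'
    skipinitialspace = False
    doublequote = False


# Route 1: register the class under a name.
csv.register_dialect('colon', ColonDialect)

# Route 2: the same settings as keyword arguments.
# Assumption: ud in the post was built along these lines; dunder
# entries such as __module__ and __doc__ are filtered out.
ud = {k: v for k, v in vars(ColonDialect).items() if not k.startswith('_')}


def write_by_name(row):
    """Write one row using the registered dialect name."""
    buf = io.StringIO()
    csv.writer(buf, 'colon').writerow(row)
    return buf.getvalue()


def write_by_kwargs(row):
    """Write one row using the unpacked attribute dict."""
    buf = io.StringIO()
    csv.writer(buf, **ud).writerow(row)
    return buf.getvalue()


if __name__ == '__main__':
    print(repr(write_by_name(['1', ':', '3'])))
    print(repr(write_by_kwargs(['1', ':', '3'])))
```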

The third issue was mentioned above: csv.reader chokes on the quotechar even
when quoting is QUOTE_NONE and the quotechar should be ignored.
cw.writerow('1 : 3'.split())
cw.writerow('1 " 3'.split())
del cw
fp.flush()
fp.close()
fp = file('csvtest', 'rb')
fp.read() # Looks good
'1:\\::3\n1:":3\n'
fp.seek(0)
cr = csv.reader(fp, **ud)
cr.next()
['1', ':', '3']
cr.next()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
_csv.Error: newline inside string

My guess is that it's trying to grab a string delimited by ", and hits the
newline before getting a matching ":
import StringIO
fp = StringIO.StringIO()
cw = csv.writer(fp, **ud)
cr = csv.reader(fp, **ud)
cw.writerow('1 " 3 " 5'.split())
fp.buflist
['1:":3:":5\n']
cr.next()
['1', ':3:', '5']

This should be ['1', '"', '3', '"', '5'].
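
The expected result can be checked against a current Python 3 reader by feeding it the same text directly. This is a small sketch, with parse_unquoted a hypothetical helper name; it says nothing about 2.3 itself, only that a reader configured with QUOTE_NONE on a recent interpreter appears to treat the quote character as ordinary data.

```python
import csv
import io


def parse_unquoted(text):
    """Parse colon-delimited text with quoting switched off."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=':',
        escapechar='\\',
        lineterminator='\n',   # accepted, though the reader ignores it
        quoting=csv.QUOTE_NONE,
    )
    return list(reader)


if __name__ == '__main__':
    # The same text that fp.buflist held in the session above.
    print(parse_unquoted('1:":3:":5\n'))
```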

One might just set quotechar=None for this dialect, but this raises:
Traceback (most recent call last):

This is contrary to PEP 305's specs:

quotechar specifies a one-character string to use as the quoting character.
It defaults to '"'. Setting this to None has the same effect as setting
quoting to csv.QUOTE_NONE.

So it seems that it is impossible for csv.reader to parse dialects which
don't use quoting.
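
For what it is worth, the quotechar=None spelling from PEP 305 can be tried on a current Python 3, where it seems to be accepted together with QUOTE_NONE. The sketch below assumes a recent interpreter; NO_QUOTE and round_trip are names invented for this example.

```python
import csv
import io

# Keyword arguments shared by writer and reader; quotechar=None is the
# setting under discussion.
NO_QUOTE = dict(
    delimiter=':',
    escapechar='\\',
    lineterminator='\n',
    quoting=csv.QUOTE_NONE,
    quotechar=None,
)


def round_trip(row):
    """Write one row, then parse the written text back."""
    buf = io.StringIO()
    csv.writer(buf, **NO_QUOTE).writerow(row)
    text = buf.getvalue()
    parsed = next(csv.reader(io.StringIO(text), **NO_QUOTE))
    return text, parsed


if __name__ == '__main__':
    print(round_trip(['1', '"', '3']))
```

With no quote character configured, the writer has no reason to escape '"', so the text on disk matches what the 2.3 writer produced in the session above.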
 
