Francis Avila
Three things that seem odd with csv:
Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
...     delimiter = ':'
...     escapechar = '\\'
...     lineterminator = '\n'
...     quoting = csv.QUOTE_NONE
...     skipinitialspace = False
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "D:\PYTHON23\lib\csv.py", line 39, in __init__
    raise Error, "Dialect did not validate: %s" % ", ".join(errors)
_csv.Error: Dialect did not validate: doublequote parameter must be True or False
Now, it seems to me that QUOTE_NONE makes doublequote meaningless, because
there's no quote character. And csv.writer doesn't write the quotechar
escaped--using the above dialect, cw.writerow(['1', '"', '3']) will write
the raw bytes '1:":3\n'. However, the corresponding csv.reader chokes on
those bytes.
Untested patch (csv.py, line 64):

         if self.doublequote not in (True, False):
-            errors.append("doublequote parameter must be True or False")
+            if self.quoting != QUOTE_NONE:
+                errors.append("doublequote parameter must be True or False")
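
To make the intent of that patch concrete, here is a standalone sketch of the check it proposes. The helper name check_doublequote is hypothetical; it is not a function in csv.py, and it only mirrors the patched lines shown above.

```python
import csv

def check_doublequote(quoting, doublequote):
    """Return validation errors for doublequote, as the patch above would.

    Hypothetical helper for illustration only; not part of csv.py.
    """
    errors = []
    if doublequote not in (True, False):
        # Only complain when quoting is actually in use.
        if quoting != csv.QUOTE_NONE:
            errors.append("doublequote parameter must be True or False")
    return errors

# An unset doublequote is tolerated when there is no quoting at all...
print(check_doublequote(csv.QUOTE_NONE, None))
# ...but is still reported for any quoting mode that uses a quote character.
print(check_doublequote(csv.QUOTE_MINIMAL, None))
```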
Moving on...
I also can't seem to use my own registered dialects:
...     delimiter = ':'
...     escapechar = '\\'
...     lineterminator = '\n'
...     quoting = csv.QUOTE_NONE
...     skipinitialspace = False
...     doublequote = False #Fine
...
>>> csv.register_dialect('unix', unix)
>>> csv.list_dialects() #Worked
['excel-tab', 'excel', 'unix']
>>> fp = file('csvtest', 'wb')
>>> csv.writer(fp, 'unix') #csv.reader fails too, same error
Traceback (most recent call last):
>>> del ud['__module__']
>>> del ud['__doc__']
>>> cw = csv.writer(fp, **ud) #if **ud works, why not 'unix'?
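
For comparison, a minimal sketch of the two routes (registered name versus keyword arguments) as they behave on a current Python 3 interpreter, which is an assumption on my part since the session above is 2.3.2. The names ColonSep and 'colon-sep' are made up for the example, and quotechar = None is set explicitly so that the quote character plays no role.

```python
import csv
import io

class ColonSep(csv.Dialect):  # hypothetical name, same settings as above
    delimiter = ':'
    escapechar = '\\'
    lineterminator = '\n'
    quoting = csv.QUOTE_NONE
    quotechar = None
    doublequote = False
    skipinitialspace = False

csv.register_dialect('colon-sep', ColonSep)

# Route 1: refer to the dialect by its registered name.
by_name = io.StringIO()
csv.writer(by_name, 'colon-sep').writerow(['1', ':', '3'])

# Route 2: pass the same formatting parameters as keyword arguments.
params = dict(delimiter=':', escapechar='\\', lineterminator='\n',
              quoting=csv.QUOTE_NONE, quotechar=None,
              doublequote=False, skipinitialspace=False)
by_kwargs = io.StringIO()
csv.writer(by_kwargs, **params).writerow(['1', ':', '3'])

csv.unregister_dialect('colon-sep')  # leave the registry as we found it

print(repr(by_name.getvalue()))
print(repr(by_kwargs.getvalue()))
```

Both routes should produce identical bytes; if they do not, the registry lookup is the place to look.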
Third issue was mentioned above: csv.reader chokes on quotechar, even if
ignored.
>>> cw.writerow('1 : 3'.split())
>>> cw.writerow('1 " 3'.split())
>>> del cw
>>> fp.flush()
>>> fp.close()
>>> fp = file('csvtest', 'rb')
>>> fp.read() # Looks good
'1:\\::3\n1:":3\n'
>>> fp.seek(0)
>>> cr = csv.reader(fp, **ud)
>>> cr.next()
['1', ':', '3']
>>> cr.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
_csv.Error: newline inside string
My guess is that it's trying to grab a string delimited by ", and hits the
newline before getting a matching ":
>>> import StringIO
>>> fp = StringIO.StringIO()
>>> cw = csv.writer(fp, **ud)
>>> cr = csv.reader(fp, **ud)
>>> cw.writerow('1 " 3 " 5'.split())
>>> fp.buflist
['1:":3:":5\n']
>>> cr.next()
['1', ':3:', '5']
This should be ['1', '"', '3', '"', '5'].
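
The guess can be illustrated with a toy splitter. This is only a sketch of the suspected behaviour, not the csv module's real parser; split_line and its honor_quotes flag are hypothetical names. With honor_quotes=True it reproduces both symptoms seen above, and with honor_quotes=False it gives the result I would expect from a QUOTE_NONE dialect.

```python
def split_line(line, delimiter=':', quotechar='"', honor_quotes=True):
    """Toy field splitter; NOT the csv module's actual implementation."""
    fields, current, in_quotes = [], [], False
    for ch in line:
        if in_quotes:
            if ch == quotechar:
                in_quotes = False          # matching quote ends the quoted run
            elif ch == '\n':
                raise ValueError('newline inside string')
            else:
                current.append(ch)         # delimiters are swallowed in quotes
        elif honor_quotes and ch == quotechar:
            in_quotes = True               # start grabbing a quoted string
        elif ch == delimiter:
            fields.append(''.join(current))
            current = []
        elif ch == '\n':
            break
        else:
            current.append(ch)
    fields.append(''.join(current))
    return fields

print(split_line('1:":3:":5\n'))                      # ['1', ':3:', '5']
print(split_line('1:":3:":5\n', honor_quotes=False))  # ['1', '"', '3', '"', '5']
try:
    split_line('1:":3\n')
except ValueError as exc:
    print(exc)                                        # newline inside string
```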
One might just set quotechar=None for this dialect, but this raises:
Traceback (most recent call last):
This is contrary to PEP 305's specs:

    quotechar specifies a one-character string to use as the quoting
    character. It defaults to '"'. Setting this to None has the same
    effect as setting quoting to csv.QUOTE_NONE.
So it seems that it is impossible for csv.reader to parse dialects which
don't use quoting.
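
As a cross-check, here is a minimal round-trip sketch for a current Python 3 interpreter, where (as far as I can tell) a QUOTE_NONE reader treats the quote character as ordinary data. It says nothing about 2.3.2. The fmt dictionary and the in-memory io.StringIO buffer are stand-ins for the dialect and file used above, and passing quotechar=None together with QUOTE_NONE is an assumption made for the example.

```python
import csv
import io

# Assumed formatting parameters: colon-delimited, backslash-escaped, no quoting.
fmt = dict(delimiter=':', escapechar='\\', lineterminator='\n',
           quoting=csv.QUOTE_NONE, quotechar=None,
           doublequote=False, skipinitialspace=False)

rows_in = [['1', ':', '3'],
           ['1', '"', '3', '"', '5']]

buf = io.StringIO()
csv.writer(buf, **fmt).writerows(rows_in)
data = buf.getvalue()

# Read the same text back with the same parameters.
rows_out = list(csv.reader(io.StringIO(data), **fmt))

print(repr(data))
print(rows_out)
```

If the reader honored the quote character here, rows_out would differ from rows_in on the second row.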