Unexpected behaviour of csv module
[QUOTE="Andrew McLean, post: 1946877"]
I have a bunch of csv files that have the following characteristics:

- field delimiter is a comma
- all fields quoted with double quotes
- lines terminated by a *space* followed by a newline

What surprised me was that the csv reader included the trailing space in the final field value returned, even though it is outside of the quotes. I've produced a test program (see below) that demonstrates this. There is a workaround, which is to not pass the csv reader the file iterator, but rather a generator that returns lines from the file with the trailing space stripped.

Interestingly, the same behaviour is seen if there are spaces before the field separator. They are also included in the preceding field value, even if they are outside the quotations. My workaround wouldn't help here.

Anyway is this a bug or a feature? If it is a feature then I'm curious as to why it is considered desirable behaviour.

- Andrew

[CODE]
import csv

filename = "test_data.csv"

# Generate a test file - note the spaces before the newlines
fout = open(filename, "wb")
fout.write('"Field1","Field2","Field3" \n')
fout.write('"a","b","c" \n')
fout.write('"d" ,"e","f" \n')
fout.close()

# Function to test a reader
def read_and_print(reader):
    for line in reader:
        print ",".join(['"%s"' % field for field in line])

# Read the test file - and print the output
reader = csv.reader(open("test_data.csv", "rb"))
read_and_print(reader)

# Now the workaround: a generator to strip the strings before the reader decodes them
def stripped(input):
    for line in input:
        yield line.strip()

reader = csv.reader(stripped(open("test_data.csv", "rb")))
read_and_print(reader)

# Try using lineterminator instead - it doesn't work
reader = csv.reader(open("test_data.csv", "rb"), lineterminator=" \r\n")
read_and_print(reader)
[/CODE]
[/QUOTE]
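For anyone who runs into the same thing, here is a rough Python 3 sketch of a line filter that also covers the "space before the separator" case that plain strip() misses. It is not part of the csv module: `strip_outside_quotes` and the regular expression are made up for illustration. It assumes every field is quoted, that no field contains an embedded newline, and that no field's content contains a double quote followed by whitespace and then a comma or the end of the line (an escaped `""` before a space and a comma would be altered). On the lineterminator attempt: the csv documentation states that the reader is hard-coded to recognise '\r' or '\n' as end-of-line and ignores lineterminator, which would explain why that variant changes nothing.

```python
import csv
import io
import re

# Hypothetical pattern: a closing quote followed by spaces/tabs that run up
# to the next comma or to the end of the line.
_WS_AFTER_QUOTE = re.compile(r'"[ \t]+(?=,|$)')


def strip_outside_quotes(lines):
    """Yield each line with whitespace after a closing quote removed.

    Assumes fully quoted fields and no embedded newlines (see above).
    """
    for line in lines:
        # Drop the line ending first so that `$` marks the true end of the row.
        yield _WS_AFTER_QUOTE.sub('"', line.rstrip("\r\n"))


# Same data as the test file in the post, held in memory instead of on disk.
sample = '"Field1","Field2","Field3" \n"a","b","c" \n"d" ,"e","f" \n'

# Plain reader: the stray spaces end up inside the field values.
raw_rows = list(csv.reader(io.StringIO(sample, newline="")))

# Filtered reader: the spaces are removed before the csv parser sees them.
clean_rows = list(csv.reader(strip_outside_quotes(io.StringIO(sample, newline=""))))

for row in clean_rows:
    print(",".join('"%s"' % field for field in row))
```

The filter runs before the parser rather than calling strip() on the parsed fields, so whitespace that sits inside the quotes is left alone.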