Question on the "csv" library

V

vsoler

I am trying to read a csv file generated by excel.

Although I succeed in reading the file, the format that I get is not
suitable for me.

I've done:
print row


['codigo;nombre;cantidad']
['a;qwe;1']
['b;asd;2']
['c;zxc;3']

My questions are:

1- Why using "print spamReader" I cannot see the data?
I expected to see a list of lists, a kind of a matrix, but I get
nothing

2- Why are the rows in a single string?
I expected a list of fields that, when text, would be delimited by
"
To tell the truth, the file generated by excel does not contain the
strings delimited by ". Isn't it weird?

3- Is there anything I can do to have my data in a list of lists
structure? would another kind of data suit better my needs?

Thank you for your help

Vicente Soler
 
B

Benjamin Kaplan

I am trying to read a csv file generated by excel.

Although I succeed in reading the file, the format that I get is not
suitable for me.

I've done:
       print row


['codigo;nombre;cantidad']
['a;qwe;1']
['b;asd;2']
['c;zxc;3']

My questions are:

1- Why using "print spamReader" I cannot see the data?
   I expected to see a list of lists, a kind of a matrix, but I get
nothing

It's because csv.reader does the same thing open does. It returns an
iterable, not a list. The file is opened but not processed until you
iterate through it. You cannot see the data when you do print
spamReader because of this and because the csv reader object does
define a __str__ method.
2- Why are the rows in a single string?
  I expected a list of fields that, when text, would be delimited by
"
 To tell the truth, the file generated by excel does not contain the
strings delimited by ". Isn't it weird?

CSV stands for comma separated variables. It actually has nothing to
do with the quotes- they're just used in case an element has a comma
in it, so the reader knows it's a string literal and not a separate
column. Your comma separated variable worksheet seems to be semicolon
separated. That's why the reader isn't splitting it.
3- Is there anything I can do to have my data in a list of lists
structure? would another kind of data suit better my needs?
just do a list comprehension
sheet = [row for row in spamReader]
 
A

Andreas Waldenburger

[stuff]
3- Is there anything I can do to have my data in a list of lists
structure? would another kind of data suit better my needs?
just do a list comprehension
sheet = [row for row in spamReader]

Might I humbly suggest

/W
 
A

Andreas Waldenburger

[snip]

Might I humbly suggest

Oh, and while I'm humbly suggesting:

spam_reader instead of spamReader or SpamReader or SpamrEadeR or
suchlike. Caps are "reserved" for classes.

Not a necessity, of course. But it's the dialect around these parts.


/W
 
V

vsoler

Might I humbly suggest

Oh, and while I'm humbly suggesting:

spam_reader instead of spamReader or SpamReader or SpamrEadeR or
suchlike. Caps are "reserved" for classes.

Not a necessity, of course. But it's the dialect around these parts.

/W

Thank you for your answers. Let me however make some comments:

1- the csv file was generated with Excel 2007; no prompts for what the
separator should be; Excel has used ";" by default, without asking
anything

2- about capitalisation, I used the var "spamReader" because I just
copy/pasted from the official python site:

http://docs.python.org/library/csv.html

3- when I try
sheet = [row for row in spamReader]
print sheet
[]

all I get is an empty list; something seems not to be working properly

Same result list: I get an empty list

sheet = list(spamReader)

Thank you again for your help, which is highly appreciated.

Vicente Soler
 
A

Andreas Waldenburger

On Aug 27, 9:42 pm, Andreas Waldenburger <[email protected]>
wrote:

[snip what I wrote]

Thank you for your answers. Let me however make some comments:

1- the csv file was generated with Excel 2007; no prompts for what the
separator should be; Excel has used ";" by default, without asking
anything

Apparently it's possible, but it's a kludge

http://astrochimp.com/2005/12/20/export-csv-with-any-delimiter/comment-page-1/

Anyways, ...

2- about capitalisation, I used the var "spamReader" because I just
copy/pasted from the official python site:

http://docs.python.org/library/csv.html
Oh, switcheroo!

Well, those hypocrites!

3- when I try
sheet = [row for row in spamReader]
print sheet
[]

all I get is an empty list; something seems not to be working properly

Same result list: I get an empty list

sheet = list(spamReader)

Thank you again for your help, which is highly appreciated.

I'm assuming you do that after you've iterated over the file before?
Because it did work in a previous post of yours.

Here's the deal: When you iterate over a file (or read from it in any
other manner) you "advance" in the file. After iteration, you're at the
end, of course. So iterating again yields silch.

Hmm ... don't know how that translates to a CSVReader. But I postulate
that it'l work on a fresh run.


/W
 
M

Mark Lawrence

vsoler said:
[snip]
Might I humbly suggest
sheet = list(spamReader) # ?
Oh, and while I'm humbly suggesting:

spam_reader instead of spamReader or SpamReader or SpamrEadeR or
suchlike. Caps are "reserved" for classes.

Not a necessity, of course. But it's the dialect around these parts.

/W

Thank you for your answers. Let me however make some comments:

1- the csv file was generated with Excel 2007; no prompts for what the
separator should be; Excel has used ";" by default, without asking
anything
I find this difficult to believe, but assuming that you are correct then
use the delimiter=';' argument in the call to csv.reader. See the csv
module documentation section "Dialects and Formatting Parameters" for
more information.
2- about capitalisation, I used the var "spamReader" because I just
copy/pasted from the official python site:

http://docs.python.org/library/csv.html

3- when I try
sheet = [row for row in spamReader]
print sheet
[]

all I get is an empty list; something seems not to be working properly

Same result list: I get an empty list

sheet = list(spamReader)

Thank you again for your help, which is highly appreciated.

Vicente Soler
 
J

John Machin

I find this difficult to believe,

Mark, there exist parallel universes the denizens of which use strange
notation e.g. 1.234,56 instead of 1,234.56 and would you believe they
use ';' instead of ',' as a list separator ... Excel perfidiously
gives them what they expect rather than forcing them to comply with
The One True Way.
 
S

Simon Brunning

2009/8/28 John Machin said:
Mark, there exist parallel universes the denizens of which use strange
notation e.g. 1.234,56 instead of 1,234.56

When displaying data, sure.
and would you believe they
use ';' instead of ',' as a list separator ...

CSV is a data transfer format, not a display format. Locale specific
stuff like this has no place in it. Dates, IMHO, should be in the ugly
but unambiguous ISO 8601 format in a CSV. It's for import and export,
not for looking pretty.

Besides - CSV; the clue's in the name. ;-)
Excel perfidiously
gives them what they expect rather than forcing them to comply with
The One True Way.

When people export to a comma separated value file, they are almost
certainly expecting a file containing values separated by comas. If
Excel isn't giving them this by default, it's broken.
 
M

Mark Lawrence

John said:
Mark, there exist parallel universes the denizens of which use strange
notation e.g. 1.234,56 instead of 1,234.56 and would you believe they
use ';' instead of ',' as a list separator ... Excel perfidiously
gives them what they expect rather than forcing them to comply with
The One True Way.
I suggest a new file type csvewtlsinac i.e. comma seperated value except
when the list seperator is not a comma.:) Does this cover everything?

p.s. is it "separator" or "seperator", after 50+ years I still can't
remember?
 
S

Steven Rumbalski

I am trying to read a csv file generated by excel.

Although I succeed in reading the file, the format that I get is not
suitable for me.

I've done:

        print row

['codigo;nombre;cantidad']
['a;qwe;1']
['b;asd;2']
['c;zxc;3']

My questions are:

1- Why using "print spamReader" I cannot see the data?
    I expected to see a list of lists, a kind of a matrix, but I get
nothing

2- Why are the rows in a single string?
   I expected a list of fields that, when text, would be delimited by
"
  To tell the truth, the file generated by excel does not contain the
strings delimited by ". Isn't it weird?

3- Is there anything I can do to have my data in a list of lists
structure? would another kind of data suit better my needs?

Thank you for your help

Vicente Soler

the csv module can handle any delimiter.

change this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'))
to this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'),
delimiter=';')

hope this helps,
Steven Rumbalski
 
V

vsoler

I am trying to read a csv file generated by excel.
Although I succeed in reading the file, the format that I get is not
suitable for me.
I've done:
<_csv.reader object at 0x01022E70>
        print row
['codigo;nombre;cantidad']
['a;qwe;1']
['b;asd;2']
['c;zxc;3']

My questions are:
1- Why using "print spamReader" I cannot see the data?
    I expected to see a list of lists, a kind of a matrix, but I get
nothing
2- Why are the rows in a single string?
   I expected a list of fields that, when text, would be delimited by
"
  To tell the truth, the file generated by excel does not contain the
strings delimited by ". Isn't it weird?
3- Is there anything I can do to have my data in a list of lists
structure? would another kind of data suit better my needs?
Thank you for your help
Vicente Soler

the csv module can handle any delimiter.

change this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'))
to this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'),
delimiter=';')

hope this helps,
Steven Rumbalski

Thank you very much for all your comments. After reading them I can
conclude that:

1- the CSV format is not standardized; each piece of software uses it
differently

2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
comes from my regional Spanish settings

3- Excel does not even put quotes around litteral texts, not even when
the text contains a blank

But, perhaps, there is no standard alternative to CSV !!!
 
G

Gabriel Genellina

I am trying to read a csv file generated by excel.
['a;qwe;1']
['b;asd;2']
['c;zxc;3']

Thank you very much for all your comments. After reading them I can
conclude that:

1- the CSV format is not standardized; each piece of software uses it
differently

Yes! And that's part of the pain of working with 'csv' files.
2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
comes from my regional Spanish settings

Yes - but not just Excel, other programs do call "CSV" files that are
TAB-separated, by example.
3- Excel does not even put quotes around litteral texts, not even when
the text contains a blank

I guess you'll get quotes around text containing ';' characters
But, perhaps, there is no standard alternative to CSV !!!

Of course there are! You may use SYLK, DIFF, XML, XDR...

"The nice thing about standards is that there are so many to choose from."
(Andrew S. Tanenbaum)

But look for the xlrd package, it lets you read Excel files directly from
Python.
 
D

David Smith

vsoler said:
Thank you very much for all your comments. After reading them I can
conclude that:

1- the CSV format is not standardized; each piece of software uses it
differently

True, but there are commonalities. See
http://en.wikipedia.org/wiki/Comma-separated_values
2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
comes from my regional Spanish settings

The C really does stand for comma. I've never seen MS spit out
semi-colon separated text on a CSV format.
3- Excel does not even put quotes around litteral texts, not even when
the text contains a blank

There is no need to quote text literals with whitespace in them. There
is a need when a newline exists or when the separator character is
embedded in the field.


--David
 
M

Mark Lawrence

vsoler said:
I am trying to read a csv file generated by excel.
Although I succeed in reading the file, the format that I get is not
suitable for me.
I've done:
import csv
spamReader = csv.reader(open('C:\\abc.csv', 'r'))
print spamReader
<_csv.reader object at 0x01022E70>
for row in spamReader:
print row
['codigo;nombre;cantidad']
['a;qwe;1']
['b;asd;2']
['c;zxc;3']
My questions are:
1- Why using "print spamReader" I cannot see the data?
I expected to see a list of lists, a kind of a matrix, but I get
nothing
2- Why are the rows in a single string?
I expected a list of fields that, when text, would be delimited by
"
To tell the truth, the file generated by excel does not contain the
strings delimited by ". Isn't it weird?
3- Is there anything I can do to have my data in a list of lists
structure? would another kind of data suit better my needs?
Thank you for your help
Vicente Soler
the csv module can handle any delimiter.

change this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'))
to this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'),
delimiter=';')

hope this helps,
Steven Rumbalski

Thank you very much for all your comments. After reading them I can
conclude that:

1- the CSV format is not standardized; each piece of software uses it
differently

2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
comes from my regional Spanish settings

3- Excel does not even put quotes around litteral texts, not even when
the text contains a blank

But, perhaps, there is no standard alternative to CSV !!!
This depends on the use case of yourself or your users. If you could
give more detail on what you are trying to achieve then I'm sure that
more help will be forthcoming. For example, could the file be saved in
Excel 97-2003 xls format and then processed with the excellent
xlrd/xlwt/xlutils? They are available here:-
http://pypi.python.org/pypi/xlutils/1.4.0
 
H

Hrvoje Niksic

David Smith said:
The C really does stand for comma. I've never seen MS spit out
semi-colon separated text on a CSV format.

That's because you're running MS Office in a US language environment.
In almost all of Europe MS products use comma as the decimal separator,
and semicolon for CSV field separation.
 
J

John Machin

3- Excel does not even put quotes around litteral texts, not even when
the text contains a blank

Correct. Quoting is necessary only if a text field contains a
delimiter (semicolon/comma), a newline, or the quote character.

You can read Excel CSV output using the Python csv module by
specifying delimiter=";"

If you need to feed a CSV file to some software that demands that text
fields be quoted unconditionally, you have at least two options:

(1) If you know which fields should be text fields, you can read the
Excel-output file with Python csv, convert your non-text fields to
float, and write it back out with quoting=csv.QUOTE_NONNUMERIC.

(2) If you want precise control (and thus precise knowledge), use xlrd
(http://pypi.python.org/pypi/xlrd) to read Excel 97-2003 .xls files --
it will tell you cell by cell (NOT column by column (the user has
control of the type at the cell level)) whether the cell contains text
(reported as a unicode object), a number (float), a Boolean (int, 1 or
0), a date (float, days since ? (read the docs)), or an error code
e.g. #DIV/0! (int, read the docs) ... then you can write it out
however you like.

HTH,
John
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,596
Members
45,141
Latest member
BlissKeto
Top