How to operate Excel with Python?


沂蒙山人

I want to compare the contents of Excel files, but I don't know which module to use!
Can you help me?
 

alex23

沂蒙山人 said:
I want to compare the contents of Excel files, but I don't know which module to use!
Can you help me?

I noticed a package on PyPI today that might be useful to you:

http://www.python.org/pypi/xlrd/0.3a1

The homepage is a little brief, so I clipped their example from the
README:

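import xlrd

# Open the workbook and report what it contains.
book = xlrd.open_workbook("myfile.xls")
print("The number of worksheets is", book.nsheets)
print("Worksheet name(s):", book.sheet_names())

# Look at the first worksheet; row and column indexes are zero-based.
sh = book.sheet_by_index(0)
print(sh.name, sh.nrows, sh.ncols)
print("Cell D30 is", sh.cell_value(rowx=29, colx=3))

# Print every row as a list of cells.
for rx in range(sh.nrows):
    print(sh.row(rx))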

I haven't had cause to use it myself, however.
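
For the comparison the original poster asked about, something along these lines might serve as a starting point. It is only a sketch: `load_grid` and `diff_grids` are names made up for this example, not part of xlrd, and the only xlrd calls used are ones that appear in the README excerpt above (`open_workbook`, `sheet_by_index`, `nrows`, `ncols`, `cell_value`). The comparison itself works on plain lists of rows, so it can be tried without any workbook at hand.

```python
def load_grid(path, sheet_index=0):
    """Read one worksheet into a list of rows (hypothetical helper, needs xlrd)."""
    import xlrd  # imported here so diff_grids can be used without xlrd installed
    sheet = xlrd.open_workbook(path).sheet_by_index(sheet_index)
    return [[sheet.cell_value(rowx=r, colx=c) for c in range(sheet.ncols)]
            for r in range(sheet.nrows)]


def diff_grids(left, right):
    """Return (row, col, left_value, right_value) for every differing cell.

    Cells that exist on one side only are reported with None on the other.
    """
    differences = []
    for r in range(max(len(left), len(right))):
        row_l = left[r] if r < len(left) else []
        row_r = right[r] if r < len(right) else []
        for c in range(max(len(row_l), len(row_r))):
            val_l = row_l[c] if c < len(row_l) else None
            val_r = row_r[c] if c < len(row_r) else None
            if val_l != val_r:
                differences.append((r, c, val_l, val_r))
    return differences


# Small in-memory demonstration; real data would come from load_grid().
old = [["id", "name"], [1.0, "spam"], [2.0, "eggs"]]
new = [["id", "name"], [1.0, "spam"], [2.0, "ham"], [3.0, "toast"]]
for row, col, before, after in diff_grids(old, new):
    print("row %d, col %d: %r -> %r" % (row, col, before, after))
```

With two real files the call would look like `diff_grids(load_grid("old.xls"), load_grid("new.xls"))`, where both file names are placeholders.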

-alex23
 

John Machin

Rune said:
The key is Python for Windows:
http://starship.python.net/crew/mhammond/win32/

See here for an Excel dispatch example:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/325735

When doing such operations, I generally save all the Excel files to CSV
files and do the operations on them using the csv module.
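
As a rough illustration of that CSV route, the sketch below reads two exported files with the csv module and lists the rows found in only one of them. The function names and the sample data are invented for the example, and it compares whole rows as text, which is only one of several reasonable ways to compare.

```python
import csv
import io


def read_rows(csv_file):
    """Return the rows of an open CSV file as a list of tuples of strings."""
    return [tuple(row) for row in csv.reader(csv_file)]


def rows_only_in_first(first, second):
    """Rows present in `first` but absent from `second`, in original order."""
    seen = set(second)
    return [row for row in first if row not in seen]


# In-memory stand-ins for two files saved from Excel as CSV.
old_csv = io.StringIO("id,name\n1,spam\n2,eggs\n")
new_csv = io.StringIO("id,name\n1,spam\n2,ham\n")

old_rows = read_rows(old_csv)
new_rows = read_rows(new_csv)
removed = rows_only_in_first(old_rows, new_rows)
added = rows_only_in_first(new_rows, old_rows)
print("removed:", removed)
print("added:", added)
```

For files on disk, the csv documentation recommends opening them with `open(path, newline="")` before handing them to `csv.reader`.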

Problems with that approach:

1. Unfortunately, "save as CSV" is very much a WYSIWYG operation. If the
"number formats" are not sensible, loss of information can result.

Example: user gets (text) data file which needs extra columns added.
User loads it into Excel, adds extra data, saves as csv. One column is
an identifier which just happens to be numeric -- say a 12-digit credit
card number 123456789012. The user doesn't care that this is showing on
the Excel screen as 1.23457E+11 (default format) as he is using the
6-digit customer account number 654321 to find the extra data he needs
to add. He may not even see the 1.23457E+11 because it's in column BQ
and he's inserting 3 columns in front of column E.

To avoid that problem, one has to check all the columns and reformat
those that do not display all the data. This is not something that a
user can be relied on to do, even when stated clearly in a procedure manual.

2. The tedium and error-proneness of "saving as": (a) you get given a
file named "fubar.csv" but it was "fubar.csv.xls" before the user
renamed it. (b) Excel files can have multiple worksheets, which have to
be saved each as a separate csv file.
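
The information loss in point 1 is easy to reproduce without Excel. The sketch below uses Python's `%E` formatting as a stand-in for the display format in the example (an assumption: it imitates what the screen shows, it does not call Excel), then reads the displayed text back the way a CSV consumer would.

```python
card_number = 123456789012          # the 12-digit identifier from the example

# Imitate the displayed value: 5 decimal places in scientific notation.
displayed = "%.5E" % card_number
print(displayed)

# A "save as CSV" writes what is displayed, so this is all a reader gets back.
recovered = int(float(displayed))
print(recovered)
print("digits lost:", recovered != card_number)
```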

Consequently, an approach which reads the .XLS file directly has
attractions.

One such approach, which unlike the COM approach doesn't need Excel to
be on the reading machine, and doesn't even need Windows, is the free
"xlrd" module [of which I am the author] -- see
http://www.lexicon.net/sjmachin/xlrd.htm
or
http://www.python.org/pypi/xlrd/

Regards,
John
 

Rune Strand

John,

I wrote a script that automates the conversion from XLS to CSV. It's
easy. But your points are still good. Thanks for making me aware of the
xlrd module!
 

Tim Roberts

John Machin said:
Problems with that approach:

1. Unfortunately, "save as CSV" is very much a WYSIWYG operation. If the
"number formats" are not sensible, loss of information can result.

This is a real problem. US postal codes are a particularly nasty issue. The
value "01234", for example, will be imported into Excel as "1234".
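
The same effect in miniature: once a postal code has been turned into a number its leading zero is gone, while the csv module, which hands back every field as a string, keeps it. The sample data is made up, and padding with `zfill(5)` only helps if every value is known to be a five-digit code, which is an assumption.

```python
import csv
import io

# One made-up record with a postal code that starts with zero.
sample = io.StringIO("name,zip\nAlice,01234\n")
record = list(csv.DictReader(sample))[0]

as_text = record["zip"]            # csv returns strings, so the zero survives
as_number = int(as_text)           # what a numeric import effectively does
print(as_text, as_number)

# Repair attempt: pad back to five digits (valid only for five-digit codes).
repaired = str(as_number).zfill(5)
print(repaired)
```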
 
