extract text from ods TableCell using odfpy

frankentux

Hi there,

I'm losing hair trying to figure out how I can actually get the text
out of an existing .ods file. Currently I have:
#!/usr/bin/python
from odf.opendocument import Spreadsheet
from odf.opendocument import load
from odf.table import TableRow, TableCell
from odf import text

doc = load("/tmp/match_data.ods")
d = doc.spreadsheet
rows = d.getElementsByType(TableRow)
for row in rows:
    cells = row.getElementsByType(TableCell)
    for cell in cells:
        # Show what the list of text:p paragraph elements offers
        print(dir(cell.getElementsByType(text.P)))

This is a spreadsheet containing 200 rows, each with 4 cells
containing strings. What I'd like to be able to do is something like:
for row in rows:
    cells = row.getElementsByType(TableCell)
    users.append((cells[0].value, cells[1].value, cells[2].value, cells[3].value))

Thus, what I'd like to know is how to actually get the value out of
the cell. I've read through the odfpy api documentation (which is
almost completely focused on writing, not reading) and googled for
info, but I still haven't found anything.
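
If the goal is only a list of row tuples, one alternative that sidesteps odfpy entirely is to read content.xml out of the .ods archive with the standard library. This is a minimal sketch and not odfpy's API: the function names rows_from_content_xml and rows_from_ods are made up for illustration, it is written for Python 3 rather than the 2.5 used in this thread, and it ignores table:number-columns-repeated, so a run of identical or empty cells shows up as a single cell.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument format.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def rows_from_content_xml(xml_bytes):
    """Return a list of rows; each row is a tuple of cell strings."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for row in root.iter("{%s}table-row" % TABLE_NS):
        cells = []
        for cell in row.findall("{%s}table-cell" % TABLE_NS):
            # A cell may hold several text:p paragraphs; join them with newlines.
            paragraphs = cell.findall("{%s}p" % TEXT_NS)
            cells.append("\n".join("".join(p.itertext()) for p in paragraphs))
        rows.append(tuple(cells))
    return rows


def rows_from_ods(path_or_file):
    """Open an .ods file (a zip archive) and parse its content.xml member."""
    with zipfile.ZipFile(path_or_file) as archive:
        return rows_from_content_xml(archive.read("content.xml"))
```

With that in place, users = rows_from_ods("/tmp/match_data.ods") would hold one tuple of strings per spreadsheet row.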
 
frankentux

Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

#!/usr/bin/python
from odf.opendocument import Spreadsheet
from odf.opendocument import load
from odf.table import TableRow, TableCell
from odf.text import P

doc = load("/tmp/match_data.ods")
d = doc.spreadsheet
rows = d.getElementsByType(TableRow)
for row in rows[:2]:  # only the first two rows, as a quick check
    cells = row.getElementsByType(TableCell)
    for cell in cells:
        tps = cell.getElementsByType(P)
        if len(tps) > 0:
            for x in tps:
                # firstChild is the first node inside the paragraph
                print(x.firstChild)
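
One caveat with firstChild: it only returns the first node of a paragraph, so text that sits inside a nested span, or after one, would be missed. A small recursive walk collects all of it. The sketch below is demonstrated on xml.dom.minidom nodes; it assumes odfpy elements expose the same nodeType, childNodes and data attributes, which I believe they do but have not verified against every odfpy version. The name node_text is hypothetical.

```python
from xml.dom import minidom

# Numeric code for text nodes in xml.dom; assumed to match odfpy's own Node class.
TEXT_NODE = 3


def node_text(node):
    """Concatenate the data of every text node below `node`, in document order."""
    if node.nodeType == TEXT_NODE:
        return node.data
    return "".join(node_text(child) for child in node.childNodes)


# Demonstration on a plain minidom tree with a nested element.
doc = minidom.parseString("<p>Hello <span>big</span> world</p>")
print(node_text(doc.documentElement))  # prints: Hello big world
```

In the loop above, print(node_text(x)) would then take the place of print(x.firstChild). As far as I know odfpy also ships a teletype module with an extractText() helper that does much the same job, so check for that first.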
 
norseman

frankentux said:
Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

=========================
cd /opt
find . -name "*odf*" -print
(empty)
cd /usr/local/lib/python2.5
find . -name "*odf*" -print
(empty)


OK - where is it? :)
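
Searching the filesystem works, but asking the interpreter is more direct, since it reports exactly what an import would pick up. A minimal sketch, assuming Python 3.4 or later (importlib.util.find_spec does not exist in the 2.5 used in this thread); locate_module is a hypothetical helper name.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not installed anywhere on sys.path
    return spec.origin


# "odf" is the package name that odfpy installs.
print(locate_module("odf"))
```

A result of None means the package is not on sys.path for that interpreter, which matches the empty find output above.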


Steve
 
John Machin

frankentux said:
Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

=========================
cd /opt
find . -name "*odf*" -print
(empty)
cd /usr/local/lib/python2.5
find . -name "*odf*" -print
(empty)

OK - where is it? :)

Consider using:
find --http --google "odfpy"
;-)
 
norseman

Ciaran said:
2008/8/26 norseman said:
frankentux said:
Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

=========================
cd /opt
find . -name "*odf*" -print
(empty)
cd /usr/local/lib/python2.5
find . -name "*odf*" -print
(empty)


OK - where is it? :)

Sorry. Stupid of me. The module is not part of the standard library.
It's at http://opendocumentfellowship.com/projects/odfpy

Ciaran
==============
I got the download and all went pretty well. Setup.py compiled OK and
install put it where it belongs.

As a test I tried odflint and keep getting a "zlib not found" error.
zlib is installed (/usr/local/lib) and the Python zlib things (.py, .pyc
and .pyo) all seem present. Not sure what is happening.


I took a look at Python 2.5.2's zipfile.py and changed the statement
"import zlib" to "import libz as zlib"
(ALL libs are prefixed with lib... by convention).
The problem shown below the test happens with or without my change.

Test I ran:

python
(sign on yah de yah yah)
>>> import zipfile
>>> zipfile.is_zipfile("zx")
False
>>> zipfile.is_zipfile("zz.zip")
True
>>> zipfile.is_zipfile("zx.zip")
False    (file nonexistent; no error generated, but the answer is correct)

Thus all returned correct answers. Distro Python code runs as expected.

However:

Both of these commands:

odflint OOstuf2.odt
python /usr/local/bin/odflint OOstuf2.odt

return the following:

Traceback (most recent call last):
  File "/usr/local/bin/odflint", line 213, in <module>
    lint(sys.argv[1])
  File "/usr/local/bin/odflint", line 197, in lint
    content = zfd.read(zi.filename)
  File "/usr/local/lib/python2.5/zipfile.py", line 498, in read
    "De-compression requires the (missing) zlib module"
RuntimeError: De-compression requires the (missing) zlib module

Anybody:
What did I miss correcting? Seems odflint only uses zipfile.references.
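
A note on why the is_zipfile() checks pass while odflint fails: is_zipfile() only checks the file's magic number and never decompresses anything, so it does not need zlib, whereas reading a compressed member does. As far as I know, Python's zlib is a compiled extension module that is built together with the interpreter when the zlib headers are available, so a check from inside Python says more than finding libz under /usr/local/lib. The sketch below assumes Python 3; has_zlib and can_read_deflated are hypothetical helper names.

```python
import io
import zipfile


def has_zlib():
    """Return True if this interpreter can import its zlib extension module."""
    try:
        import zlib  # noqa: F401 (imported only to see whether it exists)
    except ImportError:
        return False
    return True


def can_read_deflated():
    """Write and read back a deflate-compressed member entirely in memory."""
    buffer = io.BytesIO()
    try:
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            archive.writestr("probe.txt", "hello")
        with zipfile.ZipFile(buffer) as archive:
            return archive.read("probe.txt") == b"hello"
    except RuntimeError:
        # zipfile raises RuntimeError when the zlib module is missing.
        return False


print(has_zlib(), can_read_deflated())
```

If has_zlib() comes back False, the usual remedy is to rebuild Python with the zlib development headers installed, rather than editing zipfile.py.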

System: Slackware 10.2 on a 2.4 GHz laptop


Steve
 
