mod_python and xml.dom.minidom

dpapathanasiou

I wrote a python script called xml_utils.py which parses xml using
minidom.

It works when it's run on its own, but when I try to import it and run
it inside a mod_python handler, I get this error:

  File "../common/xml_utils.py", line 80, in parse_item_attribute
  File "/usr/lib/python2.5/xml/dom/minidom.py", line 1924, in parseString
    from xml.dom import expatbuilder
SystemError: Parent module 'xml.dom' not loaded

Basically, it's the same problem I found in this post:
http://mail.python.org/pipermail/python-list/2007-January/424018.html

This site (http://www.dscpl.com.au/wiki/ModPython/Articles/ExpatCausingApacheCrash)
goes through a detailed explanation, and I found that the version of
pyexpat is newer than libexpat:

# ldd /usr/local/apache2/bin/httpd | grep expat
libexpat.so.0 => /usr/local/apache2/lib/libexpat.so.0 (0xb7f71000)
# strings /usr/local/apache2/lib/libexpat.so.0 | grep expat_
expat_1.95.2

$ python
Python 2.5.2 (r252:60911, Jan 4 2009, 17:40:26)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyexpat
>>> pyexpat.version_info
(2, 0, 0)
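
The same version information can be read from a short script, which is handy for running the check inside the Apache-hosted process as well as from the shell. This is only a sketch: the function name expat_versions is made up for illustration, and it reports the Expat that pyexpat itself uses, not the one httpd is linked against.

```python
from xml.parsers import expat


def expat_versions():
    """Return (version tuple, version string) for the Expat used by pyexpat."""
    return expat.version_info, expat.EXPAT_VERSION


info, text = expat_versions()
# info is a (major, minor, micro) tuple; text is a string of the form "expat_X.Y.Z".
print("pyexpat reports %s %r" % (text, info))
```

The string form uses the same "expat_" prefix that the strings/grep check on libexpat.so looks for, so the two values can be compared directly.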

But this is where I'm stuck: the article suggests recompiling apache
with the newer version of expat.

Apache's configure utility (I'm using httpd version 2.2.11) doesn't
explicitly describe an expat library option.

Also, if libexpat is version 1.95.2, wouldn't I have to get version
2.0 to be compatible with pyexpat?

If anyone has any advice or suggestions, I'd appreciate hearing them.
 
Daniel Fetchinson

My only advice is: don't use mod_python. The project is dead; you
should use mod_wsgi instead: http://code.google.com/p/modwsgi/

Cheers,
Daniel
 
Paul Boddie

Apache's configure utility (I'm using httpd version 2.2.11) doesn't
explicitly describe an expat library option.

Also, if libexpat is version 1.95.2, wouldn't I have to get version
2.0 to be compatible with pyexpat?

The aim would be to persuade Apache to configure itself against the
same Expat library that pyexpat is using, which would involve the
headers and libraries referenced during the pyexpat configuration
process, although I seem to recall something about pyexpat bundling
its own version of Expat - that would complicate matters somewhat.

If anyone has any advice or suggestions, I'd appreciate hearing them.

Expat might be getting brought into Apache via mod_dav:

http://www.webdav.org/mod_dav/install.html

Perhaps disabling mod_dav when configuring Apache might drop Expat
from Apache's library dependencies.

Paul
 
Graham Dumpleton

The OP is using Python 2.5, so this shouldn't be an issue: from that
version on, pyexpat properly namespace-prefixes its bundled copy of
Expat. See:

http://code.google.com/p/modwsgi/wiki/IssuesWithExpatLibrary

which explicitly says that the problem only applies to Python versions
prior to 2.5.

His problem is therefore likely to be something completely different.

Graham
 
dpapathanasiou

His problem is therefore likely to be something completely different.

You are correct.

As per the earlier advice, I switched from mod_python to mod_wsgi but
I still see the same error:

[Mon May 11 10:30:21 2009] [notice] Apache/2.2.11 (Unix) mod_wsgi/2.4 Python/2.5.2 configured -- resuming normal operations
[Mon May 11 10:30:26 2009] [error] Traceback (most recent call last):
[Mon May 11 10:30:26 2009] [error]   File "../db/items_db.py", line 38, in <lambda>
[Mon May 11 10:30:26 2009] [error]     db_object.associate(sdb_object, (lambda primary_key, primary_data:xml_utils.parse_item_attribute(primary_data, attribute)))
[Mon May 11 10:30:26 2009] [error]   File "../common/xml_utils.py", line 80, in parse_item_attribute
[Mon May 11 10:30:26 2009] [error]     item_doc = minidom.parseString(item)
[Mon May 11 10:30:26 2009] [error]   File "/usr/lib/python2.5/xml/dom/minidom.py", line 1924, in parseString
[Mon May 11 10:30:26 2009] [error]     from xml.dom import expatbuilder
[Mon May 11 10:30:26 2009] [error] SystemError: Parent module 'xml.dom' not loaded
[Mon May 11 10:30:26 2009] [error] Traceback (most recent call last):
[Mon May 11 10:30:26 2009] [error]   File "../db/items_db.py", line 38, in <lambda>
[Mon May 11 10:30:26 2009] [error]     db_object.associate(sdb_object, (lambda primary_key, primary_data:xml_utils.parse_item_attribute(primary_data, attribute)))
[Mon May 11 10:30:26 2009] [error]   File "../common/xml_utils.py", line 80, in parse_item_attribute
[Mon May 11 10:30:26 2009] [error]     item_doc = minidom.parseString(item)
[Mon May 11 10:30:26 2009] [error]   File "/usr/lib/python2.5/xml/dom/minidom.py", line 1924, in parseString
[Mon May 11 10:30:26 2009] [error]     from xml.dom import expatbuilder
[Mon May 11 10:30:26 2009] [error] SystemError: Parent module 'xml.dom' not loaded

The odd thing is that when xml_utils.py is run outside of either
apache module, xml.dom does load, and the minidom parsing works.
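
One way to compare the two environments is to report what sys.modules holds for the packages named in the traceback, once from the shell and once from inside Apache. The sketch below is an assumption-laden illustration, not something from the thread: describe_modules and NAMES are made-up names, and application is a bare WSGI callable written only for this check.

```python
import sys

# Module names taken from the traceback in this thread.
NAMES = ("xml", "xml.dom", "xml.dom.minidom")


def describe_modules(names=NAMES):
    """Map each name to the file it was loaded from, or None if it is not loaded."""
    report = {}
    for name in names:
        module = sys.modules.get(name)
        report[name] = getattr(module, "__file__", None)
    return report


def application(environ, start_response):
    """Minimal WSGI callable that returns the report as plain text."""
    lines = ["%s: %s" % item for item in sorted(describe_modules().items())]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Calling describe_modules() from the interactive interpreter and comparing it with the output served by Apache shows whether the same files are being picked up in both cases.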

I'm not sure why this is happening, but the next thing I'll do is try
replacing minidom with ElementTree, and see if that has any issues
running under either apache module.
 
dpapathanasiou

For the record, and in case anyone else runs into this particular
problem, here's how I resolved it.

My original xml_utils.py was written this way:

from xml.dom import minidom

def parse_item_attribute (item, attribute_name):
    item_doc = minidom.parseString(item)
    ...

That version worked under the python interpreter, but failed under
both mod_python and mod_wsgi apache modules with an error ("Parent
module 'xml.dom' not loaded").

I found that changing the import statement and the minidom reference
within the function resolved the problem.

I.e., after rewriting xml_utils.py this way, it works under both
apache modules as well as in the python interpreter:

import xml.dom.minidom

def parse_item_attribute (item, attribute_name):
    item_doc = xml.dom.minidom.parseString(item)
    ...
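
For anyone who wants a complete, runnable version of the full-package-path style, here is a minimal sketch. The body of the original parse_item_attribute is not shown in the thread, so the function first_attribute below, its arguments, and the sample XML are all hypothetical stand-ins rather than the original code.

```python
import xml.dom.minidom


def first_attribute(xml_text, tag_name, attribute_name):
    """Return an attribute of the first <tag_name> element, or None if absent."""
    # The parser is reached through the full package path, as in the fix above.
    doc = xml.dom.minidom.parseString(xml_text)
    try:
        elements = doc.getElementsByTagName(tag_name)
        if not elements or not elements[0].hasAttribute(attribute_name):
            return None
        return elements[0].getAttribute(attribute_name)
    finally:
        doc.unlink()  # release the DOM tree once the value has been read


print(first_attribute('<items><item id="42"/></items>', "item", "id"))
```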
 
Graham Dumpleton

FWIW, have just seen someone else raising an issue where something
caused problems unless a full package path was used. In that case it
was the 'email' package.

The common thing between these two packages is that they do funny
stuff with sys.modules as part of import.

For 'email' package it is implementing some sort of lazy loader and
aliasing thing to support old names. For 'xml.dom' it seems to replace
the current module with a C extension variant on the fly if the C
extension exists.
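
As a rough illustration of the kind of sys.modules manipulation described above, the sketch below swaps one module object for another under the same name. It does not reproduce the SystemError and is not the actual code of either package; demo_pkg and demo_pkg_plus are made-up names used only for the demonstration.

```python
import sys
import types

# Two stand-in module objects (hypothetical names).
original = types.ModuleType("demo_pkg")
replacement = types.ModuleType("demo_pkg_plus")
replacement.flavour = "replacement"

sys.modules["demo_pkg"] = original
held = sys.modules["demo_pkg"]  # a reference taken before the swap

# The kind of swap a package can perform on itself while being imported.
sys.modules["demo_pkg"] = replacement

import demo_pkg  # the import statement resolves the name through sys.modules

print(demo_pkg is replacement)  # the new import sees the replacement
print(held is original)         # the earlier reference still points at the old object

del sys.modules["demo_pkg"]     # tidy up the demonstration entry
```

After the swap, code that looked the package up earlier and code that imports it later are holding two different module objects, which is the sort of situation where import behaviour can become surprising.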

Were you getting this issue with xml.dom showing on first request all
the time, or only occasionally occurring? If the latter, were you
running things in a multithreaded configuration and was the server
being loaded with lots of concurrent requests?

For your particular Python installation, does the '_xmlplus' module
exist? I.e., can you import it as '_xmlplus' or 'xml.doc._xmlplus'?

Graham
 
dpapathanasiou

Were you getting this issue with xml.dom showing on first request all
the time, or only occasionally occurring? If the latter, were you
running things in a multithreaded configuration and was the server
being loaded with lots of concurrent requests?

It was the former.

For your particular Python installation, does the '_xmlplus' module
exist? I.e., can you import it as '_xmlplus' or 'xml.doc._xmlplus'?

No, it appears I don't have _xmlplus; neither 'import _xmlplus' nor
'import xml.doc._xmlplus' works.
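
The import check can also be scripted, which makes it easy to run the same test under the plain interpreter and under Apache. This is a minimal sketch; can_import is a made-up helper name, and the module names are simply the ones mentioned in this thread.

```python
def can_import(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


# Names as spelled in the thread, plus the module that is known to work.
for name in ("_xmlplus", "xml.doc._xmlplus", "xml.dom.minidom"):
    print("%s importable: %s" % (name, can_import(name)))
```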

My python installation is the default which came with debian 5.0
(i.e., I didn't build it from source with unorthodox configuration
options, or use apt).

As a final note, I wound up switching to cElementTree for parsing the
xml (not only for performance but also because the code is much more
concise), and I found that I don't need a full package path with that
module.

I.e., the import statement is:

from cElementTree import ElementTree, Element, SubElement, iterparse, tostring, fromstring

and within each function I can simply refer to Element, SubElement,
etc. w/o the full path prefix.
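
For comparison with the minidom version, an attribute lookup in the ElementTree style might look like the sketch below. It is an illustration under assumptions: it imports from the standard library's xml.etree.ElementTree rather than the standalone cElementTree package used above, it relies on Element.iter (on Python 2.5 the equivalent method is getiterator), and item_attribute and the sample XML are hypothetical.

```python
from xml.etree.ElementTree import fromstring


def item_attribute(xml_text, tag_name, attribute_name):
    """Return an attribute of the first matching element, or None if absent."""
    root = fromstring(xml_text)
    # iter() walks the root element and all of its descendants in document order.
    for element in root.iter(tag_name):
        return element.get(attribute_name)
    return None


print(item_attribute('<items><item id="42"/></items>', "item", "id"))
```

Because the names are imported directly, the function body refers to fromstring without any package prefix, matching the usage described above.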
 
