Willem Ligtenberg
I decided to use SAX to parse my xml file.
But the parser crashes on:
  File "/usr/lib/python2.3/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: NCBI_Entrezgene.dtd:8:0: error in processing external entity reference
This is caused by:
<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN"
"NCBI_Entrezgene.dtd">
If I remove it, it parses normally.
I've created my parser like this:
import sys
from xml.sax import make_parser
from handler import EntrezGeneHandler
fopen = open("mouse2.xml", "r")
ch = EntrezGeneHandler()
saxparser = make_parser()
saxparser.setContentHandler(ch)
saxparser.parse(fopen)
And the handler is:
from xml.sax import ContentHandler

class EntrezGeneHandler(ContentHandler):
    """
    A handler to deal with EntrezGene in XML
    """
    def startElement(self, name, attrs):
        print "Start element:", name
So it doesn't do much yet, and still it crashes...
How can I tell the parser not to look at the DOCTYPE declaration?
This website:
http://www.devarticles.com/c/a/XML/Parsing-XML-with-SAX-and-Python/1/
states that the SAX parsers are non-validating, so shouldn't this error
not occur at all?
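For reference, one approach that may avoid the error is to switch off loading of external general entities, which in the expat-based reader also covers the external DTD subset. The sketch below is written for a current Python 3 interpreter, not the Python 2.3 / _xmlplus setup above, so treat it as an assumption that the same feature flag behaves identically there. SAMPLE is a made-up stand-in for mouse2.xml, and parse_without_dtd is a hypothetical helper name.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_external_ges

# Hypothetical stand-in for mouse2.xml. The DTD it names is assumed
# not to exist next to the document, as in the failing case.
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN"
 "NCBI_Entrezgene.dtd">
<Entrezgene-Set><Entrezgene/></Entrezgene-Set>
"""


class EntrezGeneHandler(ContentHandler):
    """Collects element names instead of printing them."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_without_dtd(stream):
    """Parse stream without fetching external entities such as the DTD."""
    handler = EntrezGeneHandler()
    parser = make_parser()
    # Ask the parser not to load external general entities; the
    # expat reader then skips the external DTD subset as well.
    parser.setFeature(feature_external_ges, False)
    parser.setContentHandler(handler)
    parser.parse(stream)
    return handler.names


names = parse_without_dtd(io.BytesIO(SAMPLE))
```

A non-validating parser may still read the DTD to pick up entity definitions and attribute defaults, which would explain why the DOCTYPE line matters even without validation. If entities defined in the DTD are actually used in the document, placing NCBI_Entrezgene.dtd (and any files it references) next to the XML file is probably the safer route.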
Cheers,
Willem