Guest
Hi:
I want to convert HTML to XML.
This is what I am doing:
import sys
from xml.dom.ext.reader import HtmlLib
from xml.dom import ext, Node
from xml.dom.NodeFilter import NodeFilter

def main( argv ):
    # build a DOM tree from the html
    reader = HtmlLib.Reader()
    dom_object = reader.fromUri( argv[1] )
    # getTableInfo is defined elsewhere (not shown)
    info = getTableInfo( dom_object, 9 )
    # release the DOM tree once we are done with it
    reader.releaseNode( dom_object )

if __name__ == "__main__":
    main( sys.argv )
This takes almost a minute on a 6000-line HTML file on a PIII 700 MHz with 256 MB RAM, which is too slow.
Can you suggest another way of doing this in Python?
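For comparison, here is a rough sketch of one possible alternative: streaming through the HTML and writing XML as the tags arrive, instead of building a full DOM tree first. It assumes a current Python 3 and uses only the standard library (html.parser and xml.sax.saxutils). The names HtmlToXml, html_to_xml and VOID_TAGS are made up for illustration. It does not apply HTML's implied-end-tag rules, drops comments and the doctype, and does not guard against duplicate attributes, so treat it as a starting point rather than a finished converter.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never take a closing tag in HTML (hypothetical helper set).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HtmlToXml(HTMLParser):
    """Emit XML text while streaming through the HTML; no DOM tree is built."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # output fragments, joined at the end
        self.open_tags = []  # stack of elements that still need closing

    def _attrs(self, attrs):
        # A bare attribute such as <td nowrap> has value None; reuse its name.
        return "".join(" %s=%s" % (name, quoteattr(name if value is None else value))
                       for name, value in attrs)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            self.parts.append("<%s%s/>" % (tag, self._attrs(attrs)))
        else:
            self.open_tags.append(tag)
            self.parts.append("<%s%s>" % (tag, self._attrs(attrs)))

    def handle_startendtag(self, tag, attrs):
        # Explicit <tag/> in the input: pass it through self-closed.
        self.parts.append("<%s%s/>" % (tag, self._attrs(attrs)))

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag, ignore it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.parts.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.parts.append(escape(data))

    def close(self):
        super().close()
        # Close whatever the HTML left open.
        while self.open_tags:
            self.parts.append("</%s>" % self.open_tags.pop())


def html_to_xml(html_text):
    parser = HtmlToXml()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

To try it, read the file into a string and pass it to html_to_xml(). The result is only a well-formed XML document if the input has a single root element such as <html>. I have not measured whether this is actually faster on the 6000-line file.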