minidom and pulldom

David Pinto

I'm trying to use either the minidom or pulldom to find table tags in
html web pages. I've tried parsing two web pages that show up fine in
my browser, but I get errors when I call minidom.parse, or try to get
events with pulldom. Is there a parser that is as forgiving as web
browsers?
 
Martin v. Löwis

> I'm trying to use either the minidom or pulldom to find table tags in
> html web pages. I've tried parsing two web pages that show up fine in
> my browser, but I get errors when I call minidom.parse, or try to get
> events with pulldom. Is there a parser that is as forgiving as web
> browsers?

minidom is an XML parser. Most Web pages are not XML, but some form of
HTML.
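A minimal sketch of the failure described above, assuming Python 3 and a made-up page fragment (HTML_SNIPPET and parses_as_xml are hypothetical names, not from the thread). The unclosed <br> is fine for a browser but makes the text ill-formed XML, so expat rejects it:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical fragment: acceptable HTML, but <br> is never closed,
# so it is not well-formed XML.
HTML_SNIPPET = "<html><body><table><tr><td>a<br>b</td></tr></table></body></html>"


def parses_as_xml(text):
    """Return True if minidom accepts the text, False if expat rejects it."""
    try:
        doc = minidom.parseString(text)
    except ExpatError:
        return False
    doc.unlink()  # release the DOM tree once we know it parsed
    return True


if __name__ == "__main__":
    print(parses_as_xml(HTML_SNIPPET))
```

Closing the tag as <br/> is enough to make this particular fragment parse, which is essentially what a tidy step does for a whole page.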

You should have better chances with parsing HTML using htmllib.

Regards,
Martin
 
John J. Lee

>> I'm trying to use either the minidom or pulldom to find table tags in
>> html web pages. I've tried parsing two web pages that show up fine in
> [...]
> minidom is an XML parser. Most Web pages are not XML, but some form of
> HTML.
>
> You should have better chances with parsing HTML using htmllib.

Or, better, HTMLParser.HTMLParser -- works better with XHTML.
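As a rough sketch of this suggestion, assuming Python 3, where the module referred to here as HTMLParser is named html.parser: the class below records every <table> start tag it sees. TableFinder, find_tables and SAMPLE are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class TableFinder(HTMLParser):
    """Collect the attributes of every <table> start tag encountered."""

    def __init__(self):
        super().__init__()
        self.tables = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "table":
            self.tables.append(dict(attrs))


def find_tables(html_text):
    """Feed the text to a TableFinder and return the collected attributes."""
    finder = TableFinder()
    finder.feed(html_text)
    finder.close()
    return finder.tables


# Hypothetical sloppy input: upper-case tag, unquoted attribute,
# unclosed <br>, <td>, <tr> and <p>.
SAMPLE = "<html><body><TABLE border=1><tr><td>a<br>b</table><p>unclosed"

if __name__ == "__main__":
    print(find_tables(SAMPLE))
```

This is event-based rather than tree-based, so it suits the "find table tags" case but does not give a document tree.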

If you don't mind dependencies and want a document tree, a good plan
is to shove everything through mxTidy or uTidylib to generate XHTML,
then use the XML API of your choice.
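A sketch of the second half of that plan, assuming the tidy step has already run. The XHTML string below is written by hand as a stand-in for tidy output, and table_ids is a hypothetical helper; only the standard-library minidom calls are real.

```python
from xml.dom import minidom

# Stand-in for the output of a tidy step (assumption: well-formed XHTML).
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<table id="first"><tr><td>a<br/>b</td></tr></table>
<table id="second"><tr><td>c</td></tr></table>
</body>
</html>"""


def table_ids(xhtml_text):
    """Parse well-formed XHTML and return the id of each <table> element."""
    doc = minidom.parseString(xhtml_text)
    try:
        tables = doc.getElementsByTagName("table")
        return [table.getAttribute("id") for table in tables]
    finally:
        doc.unlink()  # break reference cycles in the DOM tree


if __name__ == "__main__":
    print(table_ids(XHTML))
```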


John
 
John J. Lee

> I'm trying to use either the minidom or pulldom to find table tags in
> html web pages. I've tried parsing two web pages that show up fine in
> my browser, but I get errors when I call minidom.parse, or try to get
> events with pulldom. Is there a parser that is as forgiving as web
> browsers?

Didn't this get answered just the other day?

minidom and pulldom are built on XML parsers. HTML is not XML.

If you want a tree, I recommend pushing the HTML through mxTidy
or uTidylib, and feeding the resultant XHTML to the XML API of your
choice.
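If pulldom is the XML API of choice, the same idea can be sketched as follows, again assuming the input is already well-formed XHTML (the sample string is hand-written, not real tidy output, and iter_tables is a hypothetical helper):

```python
from xml.dom import pulldom

# Stand-in for tidied XHTML (assumption: already well-formed).
XHTML = """<html>
<body>
<table id="first"><tr><td>a<br/>b</td></tr></table>
<table id="second"><tr><td>c</td></tr></table>
</body>
</html>"""


def iter_tables(xhtml_text):
    """Yield each top-level <table> element as a fully built DOM node."""
    events = pulldom.parseString(xhtml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "table":
            # expandNode reads ahead to the matching end tag, so tables
            # nested inside this one are not yielded separately.
            events.expandNode(node)
            yield node


if __name__ == "__main__":
    for table in iter_tables(XHTML):
        print(table.getAttribute("id"), len(table.getElementsByTagName("td")))
```

Only the tables are expanded into DOM subtrees; the rest of the page is streamed past, which keeps memory use low on large pages.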


John
 
