Parse a table in an HTML page

Thomas Guettler

Have you looked at Beautiful Soup?
http://www.crummy.com/software/BeautifulSoup/
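For what it's worth, here is a minimal sketch of how the href could be kept alongside the cell text with Beautiful Soup. It assumes the current `bs4` package (Beautiful Soup 4) and the standard-library `html.parser` backend; the helper name `parse_rows` and the wrapping `<table>` tag are only for illustration and are not part of the original script.

```python
from bs4 import BeautifulSoup

# Sample row from the question, wrapped in a <table> for illustration.
html = (
    "<table><tr><td><a href='vaffa.html'>elog</a></td>"
    "<td>normal text</td></tr></table>"
)

def parse_rows(markup):
    """Return one list per <tr>, holding a (text, href) tuple per <td>.

    href is None when the cell contains no link.
    Hypothetical helper, not part of TableParser.py.
    """
    soup = BeautifulSoup(markup, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = []
        for td in tr.find_all("td"):
            # First <a> in the cell that actually carries an href attribute.
            link = td.find("a", href=True)
            href = link["href"] if link else None
            cells.append((td.get_text(strip=True), href))
        rows.append(cells)
    return rows

rows = parse_rows(html)
print(rows)
```

Note that `find_all` searches all descendants, so nested tables would need extra handling.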

antonio_wn8 said:
Hi all,
I need to read and parse a table in an HTML page.

I’m using the following script:
http://trac.davidgrant.ca/browser/src/python/misc/siteuptime/TableParser.py

It works fine, except that it drops the link in the href attribute.

Example:

String to parse:
<tr><td><a href='vaffa.html'>elog</a></td><td>normal text</td></tr>

Output:
[[['elog', 'normal text']]]

As you can see, it loses the href information.
How can I get the value 'vaffa.html'?
 
Stefan Behnel

antonio_wn8 said:
I need to read and parse a table in an HTML page.

I’m using the following script:
http://trac.davidgrant.ca/browser/src/python/misc/siteuptime/TableParser.py

It works fine, except that it drops the link in the href attribute.

Example:

String to parse:
<tr><td><a href='vaffa.html'>elog</a></td><td>normal text</td></tr>

Output:
[[['elog', 'normal text']]]

You should try lxml.html. It gives you various tools like XPath to look for
specific elements and helper functions to find the links in an HTML document.

http://codespeak.net/lxml/
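As a rough sketch of what that could look like for the row above (assuming `lxml` is installed; the wrapping `<table>` tag and the variable names are only for illustration):

```python
import lxml.html

# Sample row from the question, wrapped in a <table> for illustration.
html = (
    "<table><tr><td><a href='vaffa.html'>elog</a></td>"
    "<td>normal text</td></tr></table>"
)

doc = lxml.html.fromstring(html)

# XPath: the href attribute of every <a> that sits directly inside a <td>.
hrefs = doc.xpath("//td/a/@href")

# Text of each cell, including the text of any child elements.
cells = [td.text_content() for td in doc.xpath("//td")]

# iterlinks() yields (element, attribute, link, pos) for every link found.
links = [(el.tag, attr, link) for el, attr, link, pos in doc.iterlinks()]

print(hrefs)
print(cells)
print(links)
```

The XPath query is handy when you only want links inside table cells, while `iterlinks()` walks every link in the document.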

Stefan
 
