htmldata 1.0.4 - Manipulate HTML documents via data structure

C. Barnes

htmldata 1.0.4 is available.

http://oregonstate.edu/~barnesc/htmldata/

The htmldata module translates HTML documents
to and from list data structures, so that HTML
can be read, modified, and written back out
programmatically.
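To illustrate the idea of an HTML document as a list, here is a minimal sketch that uses only the Python 3 standard library. It is not htmldata's own API or list format: the names ListBuilder, to_list and to_html are hypothetical, and the tuple layout is an assumption made for this example. Comments, doctypes and self-closing syntax are not preserved.

```python
from html import escape
from html.parser import HTMLParser


class ListBuilder(HTMLParser):
    """Collect a document as a flat list of text strings and tag tuples."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Opening tag: ("start", name, {attribute: value})
        self.items.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        # Closing tag: ("end", name, {})
        self.items.append(("end", tag, {}))

    def handle_data(self, data):
        # Text between tags is stored as a plain string.
        self.items.append(data)


def to_list(html_text):
    """Parse HTML text into a list of strings and tag tuples."""
    parser = ListBuilder()
    parser.feed(html_text)
    parser.close()
    return parser.items


def to_html(items):
    """Serialize a list produced by to_list() back into HTML text."""
    parts = []
    for item in items:
        if isinstance(item, str):
            parts.append(escape(item, quote=False))
            continue
        kind, tag, attrs = item
        if kind == "start":
            attr_text = "".join(
                ' %s="%s"' % (name, escape(value or "", quote=True))
                for name, value in attrs.items()
            )
            parts.append("<%s%s>" % (tag, attr_text))
        else:
            parts.append("</%s>" % tag)
    return "".join(parts)


doc = '<p class="x">Hello <b>world</b></p>'
items = to_list(doc)
items[3] = "there"          # edit the text inside <b> as ordinary list data
print(to_html(items))
```

Once the document is a list, editing it is ordinary list manipulation, which is the flexibility described above.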

Functions are also available for extracting
and/or modifying all URLs present in the HTML
or stylesheets of a document.
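As a rough illustration of extracting and rewriting URLs, the sketch below again uses only the Python 3 standard library rather than htmldata itself. The names UrlRewriter, rewrite_urls and via_proxy, the proxy path /cgi-bin/proxy.cgi and the base URL are all hypothetical. It only looks at href, src and action attributes; URLs inside stylesheets, comments and doctypes are not handled.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import quote, urljoin

# Attributes assumed to carry URLs in this sketch.
URL_ATTRS = {"href", "src", "action"}


class UrlRewriter(HTMLParser):
    """Record every URL attribute and re-emit the HTML with rewritten URLs."""

    def __init__(self, rewrite):
        super().__init__()
        self.rewrite = rewrite
        self.urls = []
        self.out = []

    def handle_starttag(self, tag, attrs):
        parts = []
        for name, value in attrs:
            if name in URL_ATTRS and value is not None:
                self.urls.append(value)
                value = self.rewrite(value)
            if value is None:
                parts.append(" " + name)  # valueless attribute
            else:
                parts.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(parts)))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def rewrite_urls(html_text, rewrite):
    """Return (original URLs, HTML text with each URL passed through rewrite)."""
    parser = UrlRewriter(rewrite)
    parser.feed(html_text)
    parser.close()
    return parser.urls, "".join(parser.out)


def via_proxy(url, base="http://example.org/dir/",
              proxy="/cgi-bin/proxy.cgi?url="):
    """Make the URL absolute, then wrap it in a hypothetical proxy script."""
    return proxy + quote(urljoin(base, url), safe="")


page = '<a href="a.html">A</a><img src="/i.png">'
urls, wrapped = rewrite_urls(page, via_proxy)
print(urls)
print(wrapped)
```

Passing a different rewrite function (for example one that maps URLs to local file names) would give the mirroring style of use instead of proxying.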

Version 1.0.4 is a bugfix release offering:
* Python 2.0-2.4 support (thanks to Paul Clinch
for the Python 2.2 patch)
* Properly working XHTML parsing.
* Miscellaneous other fixes (see the changelog
for details).

I have found this library useful for writing
robots, for "wrapping" all of the URLs on
websites inside my own proxy CGI script, for
filtering HTML, and for doing flexible wget-like
mirroring.

It keeps things as simple as possible, so it
should be easy to learn.

- Connelly Barnes



