Any HTML/XML handling scripting language?

Bru, Pierre

Hi,

I used to do page scraping and HTML handling with Compaq's Web
Language (AKA WebL, http://research.compaq.com/SRC/WebL/), but this
scripting tool has not been supported for a long time and only knows
HTML up to 3.2 :(

So I wrote a piece of Java to fetch a page and used an HTML parser to
parse the crappy HTML found on the 'net and clean it up (add missing
close tags, etc.).

Now that I have this XML-like HTML, I would like to edit it with a
script: no edit-compile loop when I want to change small things in
the way I modify the page, and easier for people without a Java
compiler to create their own modifications.

For example, with WebL you can delete each table row that contains the
word "foo" with something like:

P=GetURL("http://www.foo.bar/index.html");
each e in Elem(P,"tr") contain Pat(P,"foo") Delete(e); end
PrintLn(Markup(P));

and many other simple manipulations like this one (check the PDF
Reference Manual for more).

Does something like that exist out of the box (or nearly out of the
box)?

TIA,
Pierre.

PS. As the WebL sources are available, I could modify WebL (for my own
use), but I could not redistribute the modified version because of
copyright, so modifying WebL is not a solution :(
 
Martin Honnen

Bru, Pierre wrote:

for ex, with WebL you can delete each table row that contains the word
"foo" with something like:

P=GetURL("http://www.foo.bar/index.html");
each e in Elem(P,"tr") contain Pat(P,"foo") Delete(e); end
PrintLn(Markup(P));

and many other easy handling like this one (check the PDF Reference
Manual for more)

does something like that exist out of the box (or near out out the
box)?

Maybe HttpUnit helps:
http://www.httpunit.org/
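
For comparison, here is a rough sketch of the "delete every table row containing foo" example in a scripting language with no compile step. It is not from the thread: it assumes Python and its standard xml.etree.ElementTree module, and it assumes the tidied page is well-formed XML without a namespace (with an XHTML namespace the tag test would need adjusting). The function name delete_rows_containing and the sample page are made up for illustration.

```python
import xml.etree.ElementTree as ET


def delete_rows_containing(xhtml: str, word: str) -> str:
    """Remove every <tr> whose text contains `word`; return the new markup."""
    root = ET.fromstring(xhtml)
    # ElementTree elements have no parent pointer, so walk every element
    # as a potential parent and remove matching <tr> children from it.
    # Both loops iterate over snapshots (list(...)) so removal is safe.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "tr" and word in "".join(child.itertext()):
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, already-tidied input standing in for the fetched page.
page = (
    "<html><body><table>"
    "<tr><td>foo</td></tr>"
    "<tr><td>bar</td></tr>"
    "</table></body></html>"
)
print(delete_rows_containing(page, "foo"))
```

Fetching the page and cleaning up the tag soup would still be done by the existing Java step; this sketch only covers the scripted edit on its output.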
 
