screenscraping using htmltools and rexml

Peter Bodik

Hi,
I need to do some screen scraping and I've spent a couple of hours trying
to get htmltools and rexml to do the right thing. Here's the code:

parser = HTMLTree::Parser.new(false, false)
parser.feed(res.body)
tree = parser.tree.html_node.as_rexml_document

It works for one page, but for another I get "undefined method `add' for
#<HTMLTree::Element:0x37f9cc8>" in as_rexml_document.

It seems like a library mismatch, but I just downloaded ruby and all
the libraries in the past couple days. Does anybody know what versions
I need to make this work?
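
As a point of comparison, the kind of query the converted document is meant to support can be sketched with REXML alone. This is only an illustration under stated assumptions: the sample markup and the XPath expression are hypothetical, and the input is already well-formed, so the sketch does not exercise htmltools or reproduce the error above.

```ruby
require 'rexml/document'

# Hypothetical, well-formed sample page (not taken from the real site).
html = <<~HTML
  <html>
    <body>
      <a href="/first">First</a>
      <a href="/second">Second</a>
    </body>
  </html>
HTML

# REXML can parse this directly because every tag is properly closed.
doc = REXML::Document.new(html)

# Collect the href attribute of every link using XPath.
links = REXML::XPath.match(doc, '//a').map { |a| a.attributes['href'] }

puts links
```

If a page parses this way but fails through as_rexml_document, that would suggest the problem lies in the conversion step rather than in REXML itself.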

Thanks a lot!
Peter
 
Peter Bodik

I have just tried rexml 3.1.3 and the "stable" version of rexml 2.4.8,
but neither of them works.
 
