Convert HTML to XML

Ninja Li

Hi,

I tried to parse an HTML page using HTML::TreeBuilder, but it is a
little cumbersome. Is there an easier way to parse HTML, say by converting
it from HTML to XML? Which Perl package and methods should I use?

Thanks in advance.

Nick
 
Ninja Li

HTML::TreeBuilder really is the "right" tool for parsing HTML you get
from the web. One of its major strengths is that it can generate reasonable
parse trees from even unreasonable HTML.
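
If XML output is what you are after, something like the following might be a starting point. It is only a minimal sketch: the HTML string is made up for illustration, and it assumes HTML::TreeBuilder is installed from CPAN. The as_XML method is inherited from HTML::Element.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Made-up, deliberately sloppy HTML: unclosed <p> and <b>, a bare <br>.
my $html = '<html><body><p>Earnings <b>table<br>follows</body></html>';

# Build the parse tree; TreeBuilder supplies the missing end tags.
my $tree = HTML::TreeBuilder->new_from_content($html);

# Serialize the tree as XML-style markup (every element is closed).
my $xml = $tree->as_XML;

# Release the tree's memory once we are done with it.
$tree->delete;

print $xml;
```

The result is tag-balanced markup that an XML parser has a much better chance of accepting, though real pages may still need cleanup (entities, scripts) before a strict parser is happy.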

Keep in mind that scraping earnings.com's website may be in violation of
their terms of use, and you should make sure you have appropriate
permission before doing that in an automated way.

--L

Thanks for your help and concern. We are a client of the website and
are trying to move from an Excel-based program to Perl.
 
sln

I looked at the source to the page link you provided.
I hope that's not a violation and the Feds aren't gonna come get me.

I wouldn't call it scraping, would you? I'd guess Yaaahooei/Googleballs
own the web 'cause they do it all the time.

I've heard there is a Perl module that will turn table data into some
kind of hash for you. I have personal software (written by me) that sucks
table data out of HTML/XML like buttaa. Unfortunately, you can't get it.

Look for that module on CPAN or somewhere.
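
One CPAN module that fits that description is HTML::TableExtract; whether it is the one meant here is a guess. Below is a rough sketch of pulling table rows into hashes with it. The table, column names, and values are invented for illustration and are not taken from any real page.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Made-up sample table; the columns and values are illustrative only.
my $html = <<'HTML';
<table>
  <tr><th>Symbol</th><th>Date</th><th>EPS</th></tr>
  <tr><td>ABC</td><td>2009-01-15</td><td>0.42</td></tr>
  <tr><td>XYZ</td><td>2009-01-16</td><td>1.10</td></tr>
</table>
HTML

# Select tables by their header text; cells come back in this column order.
my @headers = ('Symbol', 'Date', 'EPS');
my $te = HTML::TableExtract->new(headers => \@headers);
$te->parse($html);

# Turn each data row into a hash keyed by header name.
my @records;
for my $table ($te->tables) {
    for my $row ($table->rows) {
        my %record;
        @record{@headers} = @$row;
        push @records, \%record;
    }
}

for my $rec (@records) {
    print "$rec->{Symbol} reported $rec->{EPS} on $rec->{Date}\n";
}
```

Matching on header text rather than table position tends to survive layout changes on the page better, which matters if this replaces a spreadsheet that gets refreshed regularly.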

-sln
 
