HTML parsing

worlman385

I need to parse the following HTML page and extract the TV listing data
using VC++:

http://tvlistings.zap2it.com/tvlistings/ZCGrid.do

Is there a good way to extract the data?

Is it easy for VC++ to call a Perl script and apply some regular expressions?

Since the HTML page is not well-formed XML, I cannot use an XML parser,
right?

Are there any other good ways to extract data from an HTML page?
 
Malcolm Dew-Jones

(e-mail address removed) wrote:

: I need to parse the following HTML page and extract TV listing data
: using VC++

: http://tvlistings.zap2it.com/tvlistings/ZCGrid.do

: any good way to extract the data?

: is easy for VC++ to call PERL script and do some regular expression?

: since the HTML page is not XML well formed, I cannot use a XML parser
: right?

: any other good ways to extract HTML page data?

Perl, with HTML::Parser.

#!perl
use strict;
use HTML::Parser;
# ... perl code, etc...

As an aside, this is also an excellent tool for SAX-like parsing of XML.
It has an XML mode that expects properly balanced tags and so on, and
though it doesn't handle all XML features, HTML::Parser comes with
almost all distros of Perl, which means that any script that uses it can
work with almost any installation of Perl, even if you can't install
anything additional (a real life saver in a controlled environment).
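
On the VC++ side of the question, one way to call a Perl script is to start it as a child process and read what it prints. What follows is only a minimal sketch under some assumptions: it uses popen (spelled _popen and _pclose by the Microsoft C runtime), it is meant for a console program, and the helper name run_and_capture is made up for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// The Microsoft C runtime spells these functions with a leading underscore.
#ifdef _WIN32
#define POPEN _popen
#define PCLOSE _pclose
#else
#define POPEN popen
#define PCLOSE pclose
#endif

// Hypothetical helper: run a command line and return everything it
// writes to standard output.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = POPEN(command.c_str(), "r");
    if (!pipe) {
        throw std::runtime_error("could not start: " + command);
    }

    std::string output;
    char buffer[4096];
    std::size_t n;
    // Read the child's output in chunks until the pipe reaches end of file.
    while ((n = std::fread(buffer, 1, sizeof buffer, pipe)) > 0) {
        output.append(buffer, n);
    }

    PCLOSE(pipe);
    return output;
}
```

A call such as run_and_capture("perl extract_listings.pl page.html") would then hand back whatever the script printed, where extract_listings.pl and page.html are placeholder names. Having the script print one record per line keeps the C++ side down to simple string splitting.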
 
Peter Flynn

worlman385 wrote:

: I need to parse the following HTML page and extract TV listing data
: using VC++

: http://tvlistings.zap2it.com/tvlistings/ZCGrid.do

: any good way to extract the data?

: is easy for VC++ to call PERL script and do some regular expression?

: since the HTML page is not XML well formed, I cannot use a XML parser
: right?

: any other good ways to extract HTML page data?

Pass the page through HTML Tidy, which produces well-formed XHTML.
Then use XSLT to extract what you need.

///Peter
 
