Strange behaviour when parsing an XML file

Francesco Moi

Hi.

I want to parse these XML contents:
http://news.search.yahoo.com/news/rss?va=linux (This is an RSS file)

I tried with:
----------------
use strict;
use warnings;

use LWP::Simple qw(get);
use HTML::TokeParser;

my $Url = "http://news.search.yahoo.com/news/rss?va=linux";
my $content = get($Url);
defined $content or die "Could not fetch $Url\n";

my $parser = HTML::TokeParser->new(\$content);
my ($title, $link, $description);

while (my $token = $parser->get_token) {
    my $tag_type = shift @{ $token };

    # Only start tags ('S') are of interest here
    if ($tag_type eq 'S') {
        my ($tag, $attr, $attrseq, $rawtxt) = @{ $token };

        if ($tag eq 'title') { $title = $parser->get_trimmed_text("/title"); }
        if ($tag eq 'link')  { $link  = $parser->get_trimmed_text("/link"); }
        if ($tag eq 'description') {
            $description = $parser->get_trimmed_text("/description");
            print "$title - $link - $description\n\n";
        }
    }
}
------------

But I get this information:
---------
<![CDATA[Foo_Title]]> - Foo_Url -
--------

The title is wrapped in "<![CDATA[ ... ]]>" (I have no idea what that
means), and the description comes back empty.

However if I substitute
"http://news.search.yahoo.com/news/rss?va=linux" with
"http://www.boingboing.net/index.rdf", it works OK.

What am I doing wrong? Regards.
 
Francesco Moi

Hi Tad.

Yes, I would like to get 'Foo_Title' instead of
'<![CDATA[Foo_Title]]>', and 'Foo_Description' instead of nothing.

Regards.
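
For illustration, the desired title output described above could be obtained with a small post-processing step. This is only a minimal sketch: `strip_cdata` is a hypothetical helper, not part of HTML::TokeParser, and it assumes the wrapper shows up literally in the extracted string, as in the output quoted earlier. It does not explain or fix the empty description.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::TokeParser): unwrap any
# <![CDATA[ ... ]]> sections so that only their inner text remains.
sub strip_cdata {
    my ($text) = @_;
    return $text unless defined $text;
    # Non-greedy match; /s lets the section span several lines
    $text =~ s/<!\[CDATA\[(.*?)\]\]>/$1/gs;
    return $text;
}

print strip_cdata('<![CDATA[Foo_Title]]>'), "\n";   # prints "Foo_Title"
```

In the script above, this could be applied to the value returned by `get_trimmed_text("/title")` before printing. A dedicated XML or RSS parser would likely be a more robust choice than an HTML parser for a feed like this.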
 
