finding text-segment in a HTTP::Element tree

Arthur B.

Hello,
I found many examples of how to look for specific tags using
look_down. However, I don't see how to write, using look_down,
something that would call my sub on each text segment (or call my sub
on every node and let the sub check whether it is dealing with a text
segment).

Thank you if anyone can help.
 
ko

I could be wrong, but I don't think there is a HTTP::Element module on
CPAN (at least it doesn't show up in the first 100 doing a search).
HTML::Element has a look_down method. Is this what you are using?

What exactly are you trying to do, and what code have you tried so far?

keith
 
Arthur B.

ko said:
> I could be wrong, but I don't think there is a HTTP::Element module on
> CPAN (at least it doesn't show up in the first 100 doing a search).

HTML::Element, my mistake.

> HTML::Element has a look_down method. Is this what you are using?

Yes.

> What exactly are you trying to do,

See my post: call a sub for each text segment, taking the segment
as a parameter. See HTML::Element for a definition of text segment.

> and what code have you tried so far?

Nothing, I can't figure out how to do it.
 
ko

Two choices I'm aware of:

1. Write a recursive routine that processes all tags to deal with the
text segments.
2. Use objectify_text() to get all text segments. The method turns
text segments into HTML::Element objects and allows direct access to
the text:

#!/usr/bin/perl -w
use strict;
use HTML::TreeBuilder; # inherits from HTML::Element

my $html;
{
    local $/;          # slurp mode: read all of DATA in one go
    $html = <DATA>;
}

my $root = HTML::TreeBuilder->new;
$root->parse($html);
$root->eof;
$root->objectify_text;
my @text_nodes = $root->look_down('_tag','~text');
# your sub to process text
print $_->attr('text'), "\n" foreach @text_nodes;

$root->deobjectify_text;
$root->delete;

__DATA__
<html>
<body>
<p>some text</p>
<p>more text</p>
</body>
</html>

After objectify_text(), each text segment is accessible through its
object's 'text' attribute using the attr() method. So to do a
substitution:

(my $text = $_->attr('text')) =~ s#PATTERN#REPLACEMENT#; #read
$_->attr('text',$text); # make change
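
For the first choice (a recursive routine), here is a minimal sketch of
what it might look like. The names walk_text and the callback are made
up for illustration; the only HTML::Element method relied on is
content_list(), which returns child elements as objects and text
segments as plain strings. It assumes the default parser settings, where
comments are not stored in the tree.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: call $callback once for every text segment
# found anywhere below $node.
sub walk_text {
    my ($node, $callback) = @_;
    foreach my $child ($node->content_list) {
        if (ref $child) {
            walk_text($child, $callback);   # child element: recurse into it
        }
        else {
            $callback->($child);            # plain string: a text segment
        }
    }
    return;
}

my $root = HTML::TreeBuilder->new_from_content(
    '<html><body><p>some text</p><p>more <b>bold</b> text</p></body></html>'
);

# Your sub goes here; this one just prints each segment.
walk_text($root, sub { print "[$_[0]]\n" });

$root->delete;
```

This leaves the tree untouched, so no deobjectify_text() step is needed
afterwards. The trade-off is that the callback only receives the string,
not its parent element, so changing text in place would need the parent
passed along as well.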

I don't know how I can explain look_down() any better than it already
is in the docs. Maybe if you copy/paste the examples from the docs into
a script and play around with them, you'll get a better feel for how to
use it. Perhaps you're not too familiar with using objects? If so,
'perldoc perlboot' should help get you started.
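
As a starting point for playing around, here is a small sketch of the
kinds of criteria look_down() accepts: attribute/value pairs, a regex
as the value, and a code reference that is called with each candidate
element. The sample markup and variable names are invented for the
example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    '<html><body>'
  . '<p class="note">short</p>'
  . '<p>a much longer paragraph</p>'
  . '<div class="note">x</div>'
  . '</body></html>'
);

# Attribute/value pair: '_tag' is the pseudo-attribute holding the tag name.
my @paragraphs = $tree->look_down(_tag => 'p');

# Any real attribute works the same way.
my @notes = $tree->look_down(class => 'note');

# The value may be a regex instead of a literal string.
my @blocks = $tree->look_down(_tag => qr/^(?:p|div)$/);

# A code ref is called with the element; return true to accept it.
my @short_paragraphs = $tree->look_down(
    _tag => 'p',
    sub { length($_[0]->as_text) < 10 },
);

print scalar(@paragraphs),       " paragraph elements\n";
print scalar(@notes),            " elements with class 'note'\n";
print scalar(@blocks),           " p or div elements\n";
print scalar(@short_paragraphs), " short paragraphs\n";

$tree->delete;
```

All criteria in one call must match for an element to be returned. In
list context look_down() returns every match; in scalar context it
returns only the first.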

HTH - keith
 
