Arthur B. said:
> ko wrote:
> [snip]
>> What exactly are you trying to do,
> See my post: call a sub for each text-segment, taking the segment
> as a parameter. See HTML::Element for a definition of text-segment
>> and what code have you tried so far?
> Nothing, I can't figure out how to do it

Two choices I'm aware of:
1. Write a recursive routine that processes all tags to deal with the
text segments.
2. Use objectify_text() to get all text segments. The method turns
text segments into HTML::Element objects and allows direct access to
the text:
#!/usr/bin/perl -w
use strict;
use HTML::TreeBuilder; # inherits from HTML::Element
my $html;
{
    local $/;          # slurp mode: read all of DATA at once
    $html = <DATA>;
}
my $root = HTML::TreeBuilder->new;
$root->parse($html);
$root->eof;
$root->objectify_text;
my @text_nodes = $root->look_down('_tag','~text');
# your sub to process text
print $_->attr('text'), "\n" foreach @text_nodes;
$root->deobjectify_text;
$root->delete;
__DATA__
<html>
<body>
<p>some text</p>
<p>more text</p>
</body>
</html>
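Text segments are accessible through the object's 'text' attribute
using the attr() method. So to do a substitution on every text node
(before calling deobjectify_text()):
foreach (@text_nodes) {
    (my $text = $_->attr('text')) =~ s#PATTERN#REPLACEMENT#; # read, then substitute
    $_->attr('text', $text);                                  # write the change back
}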
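
For choice 1 (the recursive routine), here is a minimal sketch of how
it might look. walk_text() is a name I made up for illustration, it is
not part of HTML::Element. It relies on content_list(), which returns
an element's children as a mix of HTML::Element objects and plain
strings (the text segments). The sample HTML is invented too:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of HTML::Element): call $callback once
# for every text segment found below $element.
sub walk_text {
    my ($element, $callback) = @_;
    foreach my $child ($element->content_list) {
        if (ref $child) {
            walk_text($child, $callback);   # an element: recurse into it
        }
        else {
            $callback->($child);            # a plain string: a text segment
        }
    }
    return;
}

my $root = HTML::TreeBuilder->new;
$root->parse('<html><body><p>some <b>bold</b> text</p><p>more text</p></body></html>');
$root->eof;

# Collect the segments; any sub taking the segment as a parameter works here.
my @segments;
walk_text($root, sub { push @segments, $_[0] });
print "$_\n" foreach @segments;

$root->delete;
```

The callback only receives a copy of each segment, so this version is
fine for reading text but not for changing it in place. For that, the
objectify_text() approach above is probably less work.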
I don't know how I can explain look_down() any better than the docs
already do. Maybe if you copy/paste the examples from the docs into a
script and play around with them, you'll get a better feel for how to
use it. Perhaps you're not too familiar with using objects? If so,
'perldoc perlboot' (or 'perldoc perlootut' on Perl 5.16 and later)
should help get you started.
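
In case it helps as a starting point for playing around, here is a
small sketch of the kinds of criteria look_down() accepts: an
attribute equal to a string, an attribute matching a regex, and a code
ref. The sample HTML, the 'intro' class and the URL are made up for
the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $root = HTML::TreeBuilder->new;
$root->parse(<<'HTML');
<html><body>
<p class="intro">some text</p>
<p>more text</p>
<p>see <a href="http://example.com/">a link</a></p>
</body></html>
HTML
$root->eof;

# '_tag' is the pseudo-attribute holding the element name.
# In list context look_down() returns every matching element.
my @paragraphs = $root->look_down('_tag', 'p');

# Several criteria are ANDed: a <p> whose class attribute equals 'intro'.
my @intro = $root->look_down('_tag', 'p', 'class', 'intro');

# The value may be a regex instead of a string.
my @links = $root->look_down('_tag', 'a', 'href', qr/^http:/);

# A code ref gets the element as its argument and returns true to keep it.
my @with_link = $root->look_down('_tag', 'p', sub { $_[0]->look_down('_tag', 'a') });

# In scalar context only the first match is returned.
my $first      = $root->look_down('_tag', 'p');
my $first_text = $first->as_text;

printf "%d paragraphs, %d intro, %d links, %d with a link\n",
    scalar @paragraphs, scalar @intro, scalar @links, scalar @with_link;
print "first paragraph: $first_text\n";

$root->delete;
```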
HTH - keith