adding a data field to a subclass of HTML::Parser


sinoslav

I'm pretty new to Perl's object system and am trying to figure out how
to add a data field to my subclass of HTML::Parser so that the
subclass can collect information into that data field as it parses an
HTML document. For example, let's say I want to collect an array
@names of all the NAME attributes of my <A> tags, and add a method
that returns that array:

{
    package MyParser;
    use base 'HTML::Parser';

    sub start {
        my ($self, $tag, $attr, $attrseq, $origtext) = @_;

        if ($tag eq "a") {
            push @names, $attr->{name};
        }
    }

    sub get_names {
        return \@names;
    }
    ....

}

How can I declare the @names array as a field of MyParser?

Many thanks!

Roger
 

anno4000

> This follows the "InsideOut" approach:

Yes! It's the only way a class can painlessly add fields to
an existing class. I'd like to point out that your code takes
absolutely no note of the implementation of the base class
HTML::Parser.

With Perl 5.10 (and current bleadperl) you could declare your
hash %names a "FieldHash". That makes the hash thread-safe and its
entries garbage-collected along with their objects, so the DESTROY
method could go, and the comment at the end of your code too. As a
minor convenience, you can use objects as hash keys directly; the
refaddr() step is built in.
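
A minimal sketch of the two properties just described, assuming Perl 5.10 or later (where Hash::Util::FieldHash ships with the core). The class name Thing and the hash %label are made up for illustration:

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(fieldhash);

# %label is a field hash: an object reference can be used as a key directly.
fieldhash my %label;

my $obj = bless {}, 'Thing';    # 'Thing' is a hypothetical class
$label{$obj} = 'first';

print "label: $label{$obj}\n";
my $count_alive = keys %label;
print "entries while the object is alive: $count_alive\n";

# Once the last reference to the object goes away, its entry is removed
# from the field hash, so no DESTROY method is needed for cleanup.
undef $obj;
my $count_gone = keys %label;
print "entries after the object is gone: $count_gone\n";
```

With a plain hash the entry would stay behind after `undef $obj`, which is why the refaddr-based version needs DESTROY.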

I have indicated how your class could be built using
Hash::Util::FieldHash.
> package MyParser;
> use base 'HTML::Parser';
> use Scalar::Util qw/refaddr/;

Not needed. Instead

    use Hash::Util::FieldHash;

> my %names;

    Hash::Util::FieldHash::fieldhash my %names;

> sub start {
>     my ($self, $tag, $attr, $attrseq, $origtext) = @_;
>     if ($tag eq 'a')
>     {
>         push @{$names{refaddr $self}}, $attr->{name};

            push @{$names{ $self}}, $attr->{name};

>     }
>     $self->SUPER::start(@_);
> }
>
> sub get_names {
>     my $self = shift;
>     return $names{refaddr $self};

        return $names{ $self};

> }
>
> sub DESTROY {
>     my $self = shift;
>     delete $names{ refaddr $self };
> }

DESTROY is not needed.

> 1; # not thread-safe

The comment neither :)

> package main;
> use Data::Dumper;
>
> my $p = MyParser->new();
> $p->parse_file(*DATA);
> my $aref = $p->get_names;
> print Dumper $aref;

Nothing changes for the user of the class.
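
Put together, the FieldHash version of the class might look like the sketch below. To keep it runnable without CPAN modules, TinyParser is a hypothetical stand-in for HTML::Parser: it keeps its own state in a blessed hash and calls start() once per tag it is handed. The point is only that MyParser adds its field without touching the base class's storage.

```perl
use strict;
use warnings;

# Hypothetical stand-in for HTML::Parser (an assumption, not the real API).
package TinyParser;
sub new { my $class = shift; return bless { seen => 0 }, $class }
sub parse {
    my ($self, @tags) = @_;
    for my $t (@tags) {
        my ($tag, $attr) = @$t;
        $self->{seen}++;              # the base class's own private state
        $self->start($tag, $attr);
    }
    return $self;
}
sub start { }                         # default handler does nothing

package MyParser;
our @ISA = ('TinyParser');
use Hash::Util::FieldHash qw(fieldhash);

# One field hash per added field, keyed by the object itself.
fieldhash my %names;

sub start {
    my ($self, $tag, $attr) = @_;
    push @{ $names{$self} }, $attr->{name}
        if $tag eq 'a' && defined $attr->{name};
    return $self->SUPER::start($tag, $attr);
}

sub get_names {
    my $self = shift;
    return $names{$self} || [];
}

package main;

my $p = MyParser->new;
$p->parse( [ a => { name => 'top' } ], [ p => {} ], [ a => { name => 'end' } ] );
my @found = @{ $p->get_names };
print "names: @found\n";
```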

Anno
 

anno4000

Michele Dondi said:
> use Hash::Util::FieldHash; [snip]
> my %names;
>
> Hash::Util::FieldHash::fieldhash my %names;
>
> This is all very cool, I only wonder whether it could be made into a
> syntactically sweeter form.

Absolutely. I'm working on it.

> For example I'm not familiar with
> attributes at all, so I don't know if this is plainly utter nonsense,
> but how about something like
>
> my %names : field; # ?

Using attributes is an attractive method for that.
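
For the curious, here is a rough sketch of how an attribute-based declaration could be wired up by hand with the standard MODIFY_HASH_ATTRIBUTES hook. This is only an illustration, not how xfields or any released module does it. The class Counter and the attribute name Field are assumptions; the name is capitalized because all-lowercase attribute names draw a "may clash with future reserved word" warning.

```perl
use strict;
use warnings;

package Counter;    # hypothetical example class
use Hash::Util::FieldHash qw(fieldhash);

# Perl calls this hook for hashes declared with attributes in this package.
# Attribute names it returns are reported as invalid.
sub MODIFY_HASH_ATTRIBUTES {
    my ($package, $hashref, @attrs) = @_;
    my @unknown = grep { $_ ne 'Field' } @attrs;
    fieldhash %$hashref if @unknown < @attrs;    # 'Field' was requested
    return @unknown;
}

my %count : Field;    # becomes a field hash when this statement runs

sub new     { return bless {}, shift }
sub bump    { my $self = shift; return ++$count{$self} }
sub tracked { return scalar keys %count }

package main;

my $c = Counter->new;
$c->bump for 1 .. 3;
my $total = $c->bump;
print "bumped $total times\n";

undef $c;                        # the entry should vanish with the object
my $left = Counter::tracked();
print "objects still tracked: $left\n";
```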

I have gone another way and built a pragmatic module "xfields" (somewhat
similar to the existing "fields") that facilitates the creation of
FieldHash based classes. It isn't quite done yet, but I have put a
workable version up on

http://www.tu-berlin.de/zrz/mitarbeiter/anno4000/xfields/

You can read the pod there or download the tarball and play with
it (you'll need a new-ish version (>= 5.9.4) of bleadperl for that).
See below for a usage example.

I'm still struggling with design issues. I want to make it as simple
as possible, but it's oh-so-hard to decide which features will be
useful and which are going to be cruft. Even a re-design based on
attributes, the way you suggested, isn't out of the question.

On the other hand, I have core ambitions for the module. If I don't
get it done RSN, it'll be too late for 5.10 (if it isn't already).

Anno

PS: From the Examples section of the xfields pod:

#!/usr/local/bin/bleadperl

package Bottle;
use xfields -rw => qw( material cap content);

sub describe {
    my $bottle = shift;
    my $state = $bottle->cap ? '' : 'n open';
    my $descr = "This is a$state " . $bottle->material . ' bottle';
    $descr .= ' with a ' . $bottle->cap . ' cap' if $bottle->cap;
    $descr .= ".";
    if ( my $content = $bottle->content ) {
        $descr .= " It contains $content."
    } else {
        $descr .= " It is empty.";
    }
    print "$descr\n";
    $bottle;
}

package main;

my $bottle = Bottle->new(
    material => 'glass',
);
$bottle->describe;

$bottle->cap = 'red screw-on';
$bottle->content = 'water';
$bottle->describe;

$bottle->material = 'clear plastic';
$bottle->cap =~ s/screw-on/plastic/;
$bottle->content = 'a brown sticky liquid, probably Coca Cola';
$bottle->describe;


__END__
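
The example above assigns to accessors ($bottle->cap = ...) and even modifies them in place with s///. How xfields generates such accessors is not shown here, but a hand-written equivalent can be sketched with an lvalue sub over a field hash. The class Flask is hypothetical and only one field is shown:

```perl
use strict;
use warnings;

package Flask;    # hypothetical class, not part of xfields
use Hash::Util::FieldHash qw(fieldhash);

fieldhash my %cap;

# The object is just a blessed scalar; all data lives in the field hash.
sub new {
    my $class = shift;
    return bless \( my $anon ), $class;
}

# An lvalue sub hands back the hash element itself, so callers can
# assign to it or modify it in place.
sub cap : lvalue { $cap{ $_[0] } }

package main;

my $flask = Flask->new;
$flask->cap = 'red screw-on';
$flask->cap =~ s/screw-on/plastic/;
my $cap_now = $flask->cap;
print "cap: $cap_now\n";
```

The trade-off with lvalue accessors is that the class cannot validate what gets assigned, which may or may not matter for a given field.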
 

anno4000

Michele Dondi said:
> TY for the detailed explanations.
>
> Then don't lose time answering me, just get it done!! ;-)

Ah, no. That's the one great thing about this kind of software work:
it goes at *my* pace. If I lose a chance to make my code more
popular, so what? It can always go on CPAN. The world will have to
make do without xfields until I am satisfied.

Anno
 
