parsing event handler and object data

Discussion in 'Perl Misc' started by Michael Goerz, Oct 6, 2006.

  1. Hi,

    I'm having some trouble with the event-based HTML parser module
    HTML::Parser. See the attached example code. The problem is this:

    The event handlers seem to be completely self-contained: they only get
    the parameters that are passed to them by the parser. However, I'd like
    them to access variables from a higher scope, such as object data from
    the class I'm using the HTML parser in. I suppose the same problem
    arises with other event-based parsers, too. What's the right way to do
    something like this?

    Thanks,
    Michael
     
    Michael Goerz, Oct 6, 2006

  2. On Oct 6, 8:08 am, Michael Goerz wrote:
    > Hi,
    >
    > I'm having some trouble with the event-based HTML parser module
    > HTML::Parser. See the attached example code. The problem is this:
    >
    > The event handlers seem to be completely self-contained: they only get
    > the parameters that are passed to them by the parser. However, I'd like
    > them to access variables from a higher scope, such as object data from
    > the class I'm using the HTML parser in.


    What you are saying is that you want to pass HTML::Parser a callback
    that calls back to an object method rather than just to a subroutine.

    > I suppose the same problem arises with other event-based parsers, too.


    Or any API with callbacks.

    Your question is, in fact, almost a FAQ, but it's perhaps not
    immediately obvious that this is the case.

    The FAQ in question is "How can I pass/return a {Function, FileHandle,
    Array, Hash, Method, Regex}?". One way of looking at it is that you are
    asking "How can I pass a Method?"
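
    To make the "pass a Method" reading concrete, here is a minimal sketch
    with no parser involved. The Counter class and the each_item() function
    are made up for illustration; each_item() stands in for any API that
    accepts a plain code reference and knows nothing about objects.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only.
package Counter;

sub new { my $class = shift; return bless { count => 0 }, $class }

# An ordinary method: expects the object as its first argument.
sub bump { my ( $self, $n ) = @_; $self->{count} += $n; return }

package main;

# Hypothetical callback-taking API: calls $cb once per item.
sub each_item {
    my ( $cb, @items ) = @_;
    $cb->($_) for @items;
    return;
}

my $counter = Counter->new;

# Passing \&Counter::bump alone would lose the object.
# The small closure remembers $counter and forwards the arguments.
each_item( sub { $counter->bump(@_) }, 1, 2, 3 );

print "count = $counter->{count}\n";    # prints "count = 6"
```

    The closure is the whole trick: it is a plain subroutine as far as the
    API is concerned, but it carries the object along with it.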

    > What's the right way to do something like this?


    This is Perl! There's more than one right way.

    > Content-Type: application/x-perl; name="TestParser.pm"


    text/plain please (or simply inline your text).

    [ code slightly simplified for illustrative purposes; the OP's code was
    an excellent *minimal* but *complete* illustration of his point ]

    > sub parse {
    >     my $self = shift;
    >     my $p = HTML::Parser->new( api_version => 3,
    >         start_h => [\&start, "tagname, attr"],
    >     );
    >     $p->parse_file($self->{infile});
    > }
    >
    > sub start {
    >     # Doesn't work, how do I access variables from a higher scope?
    >     $self->{tagname} = shift;
    > }


    Right. There are three approaches that spring to mind.

    1) Move start() inside the lexical scope of parse() so that $self is
    in scope. This is slightly complicated by the fact that Perl doesn't
    have proper named nested subs but does have anonymous closures.

    2) Call start() as a method using a small closure as a shim.

    3) Use package variables for the data that you want to be shared
    between multiple lexical scopes.
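
    On the remark in approach 1 about named nested subs, the following
    sketch shows the difference. All sub names here are made up for
    illustration. Under "use warnings" Perl reports that the variable
    "will not stay shared" for the named version, because a named sub is
    compiled once and binds only to the first instance of the outer
    lexical.

```perl
use strict;
use warnings;

# Named "nested" sub: inner() is compiled once and stays bound to the
# first instance of $x. Perl warns about this at compile time.
sub outer_named {
    my $x = shift;
    sub inner { return $x }
    return inner();
}

# Anonymous sub: a fresh closure over the current $x on every call.
sub outer_anon {
    my $x = shift;
    my $inner = sub { return $x };
    return $inner->();
}

# The second call to outer_named() still sees the value from the first.
print join( ',', outer_named(1), outer_named(2) ), "\n";

# The anonymous version tracks each call's own $x.
print join( ',', outer_anon(1), outer_anon(2) ), "\n";
```

    This is why approach 1 has to be written with an anonymous sub stored
    in a variable rather than by nesting a named start() inside parse().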

    Note: Solution 3 is considered dirty by some. It is generally the
    easiest to debug, unless you are interfacing with an object that will
    persist beyond the context in which it is created, in which case it
    becomes the hardest to debug.

    In your code the HTML::Parser object exists only while parse() is on
    the stack. Furthermore, if parse() is called reentrantly, you can be
    sure that the HTML::Parser object from the outer call to parse() will
    never try to call back during the execution of the inner parse().
    Only because these two conditions are met is it safe to opt for
    solution 3.
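
    The two conditions can be seen in a small sketch that uses no parser
    at all. Here run() is a hypothetical stand-in for parse(), and
    callback() for the event handler; the names and the @seen log are
    invented for illustration.

```perl
use strict;
use warnings;

our $current;    # package variable shared with the callback

my @seen;        # what each callback invocation found in $current

sub callback { push @seen, $current; return }

# Hypothetical stand-in for parse(): localizes $current, optionally
# re-enters itself, then fires the callback.
sub run {
    my ( $name, $nested ) = @_;
    local $current = $name;
    run($nested) if defined $nested;
    callback();
    return;
}

run( 'outer', 'inner' );
print join( ', ', @seen ), "\n";    # prints "inner, outer"

# A callback fired after run() has returned finds nothing: local has
# already restored the previous (undefined) value.
callback();
print defined $seen[-1] ? "late call saw a value\n" : "late call saw undef\n";
```

    Each level of run() sees its own value because its callback fires
    while its own local is in effect. The late call at the end is the
    failure mode described in the note above: a handler that outlives the
    call that set the variable.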


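    # Solution 1

    use HTML::Parser;    # required by all three solutions

    sub parse {
        my $self = shift;

        # Anonymous sub: closes over $self from the enclosing scope.
        my $start = sub {
            $self->{tagname} = shift;
        };

        my $p = HTML::Parser->new(
            api_version => 3,
            start_h     => [ $start, "tagname, attr" ],
        );
        $p->parse_file( $self->{infile} );
    }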

    # Solution 2

    sub parse {
        my $self = shift;
        my $p = HTML::Parser->new(
            api_version => 3,
            # Small closure that forwards the event to a real method.
            start_h     => [ sub { $self->start(@_) }, "tagname, attr" ],
        );
        $p->parse_file( $self->{infile} );
    }

    sub start {
        my $self = shift;    # We're now a method
        $self->{tagname} = shift;
    }

    # Solution 3

    our $self;    # package variable, visible to both subs

    sub parse {
        local $self = shift;    # restored when parse() returns
        my $p = HTML::Parser->new(
            api_version => 3,
            start_h     => [ \&start, "tagname, attr" ],
        );
        $p->parse_file( $self->{infile} );
    }

    sub start {
        $self->{tagname} = shift;
    }
     
    Brian McCauley, Oct 6, 2006
