Don't know what is slowing down my program?

Discussion in 'Perl Misc' started by Harry, Sep 16, 2008.

  1. Harry

    Harry Guest

    Hello,

    I am trying to match tags in an html file, roughly as follows:

     1
     2  # Global variable.
     3  my $lines = "slurped file content here";
     4
     5  sub rule_for_tag_a {
     6
     7      pos($lines) = 0;
     8
     9      while ($lines =~ m@ complex_pattern_without_any_\G_anchors @gsix) {
    10
    11          if(some_condition) {
    12
    13              # pos($lines) == x at this point.
    14
    15              # Retrace / backtrack the pos by a small enough y, where y >= 0.
    16              pos($lines) = x - y;
    17
    18              next; # <--- STEPPING OVER THIS BECOMES **VERY** SLOW AFTER SOMETIME!
    19          }
    20
    21          adhoc_processing_for_tag_a();
    22      }
    23  }
    24  ...

    There are rules for other tags 'b', 'c', 'd', etc. that are *very*
    similar to the rule for tag 'a' (and, you can trust me on this one)...
    they differ only in the 'adhoc_processing_for_tag_*()' subroutines.

    I call these rules one after the other as follows:

    25 rule_for_tag_d ();
    26 rule_for_tag_c ();
    27 rule_for_tag_b ();
    28 rule_for_tag_a ();
    29 # Everything runs very slowly now and then from this point on!
    30 ...

    Now, what I'm noticing is that, after several of the tag rules (for,
    let's say, tags 'd', 'c', and 'b') have run at the usual (and
    expected) very high speed, something suddenly causes the program to
    slow down substantially! I have only been able to narrow down the
    problem to one particular statement -- the 'next;' statement on line 18:
    stepping over line 18 and arriving at line 11 takes longer than
    'expected' (roughly 2 to 3 seconds, which is a lot compared to
    other iterations)! The next 2 or 3 iterations after the slowdown run
    fine before the slowdown surfaces once again. This
    slowdown-fine-slowdown-fine drama continues from this point on till the
    end of the program.

    Could it be that the complexity of my regex pattern and/or the nature
    of the input data ($lines) is causing Perl's Garbage Collector to
    suddenly kick in?

    Don't know what else to try now?
    /HS
     
    Harry, Sep 16, 2008
    #1

  2. A. Sinan Unur

    Harry wrote:

    > Hello,
    >
    > I am trying to match tags in an html file, roughly as follows:


    You give no information of value.

    What purpose do the line numbers below serve but to create clutter?

    >  2  # Global variable.
    >  3  my $lines = "slurped file content here";
    >  4
    >  5  sub rule_for_tag_a {
    >  6
    >  7      pos($lines) = 0;
    >  8
    >  9      while ($lines =~ m@ complex_pattern_without_any_\G_anchors @gsix) {
    > 10
    > 11          if(some_condition) {
    > 12
    > 13              # pos($lines) == x at this point.
    > 14
    > 15              # Retrace / backtrack the pos by a small enough y, where y >= 0.
    > 16              pos($lines) = x - y;
    > 17
    > 18              next; # <--- STEPPING OVER THIS BECOMES **VERY** SLOW AFTER SOMETIME!
    > 19          }


    Maybe it is still lightning fast but it is happening waaaaaaaayyyy too
    many times?
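
    One rough way to check this is to count loop iterations with and
    without the pos() rewind. The sketch below is only an illustration:
    the input string, the `<a>` pattern, the `%seen` guard, and the helper
    name `count_iterations` are all made up, since the original pattern
    and condition were not posted.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slurped file: 1000 short tags.
my $lines = '<a>x</a>' x 1000;

# Count how many times the loop body runs, with or without rewinding pos().
sub count_iterations {
    my ($rewind) = @_;
    my $iterations = 0;
    my %seen;    # guard: rewind at most once per match position

    pos($lines) = 0;
    while ($lines =~ m{<a>}g) {
        $iterations++;
        my $x = pos($lines);

        # Assumed stand-in for "some_condition": rewind the first time we
        # reach each position. Without a guard like this the loop would
        # never terminate, because the same match would repeat forever.
        if ($rewind && !$seen{$x}++) {
            pos($lines) = $x - 3;    # step back to the start of this tag
            next;
        }
    }
    return $iterations;
}

printf "without rewind: %d, with rewind: %d\n",
    count_iterations(0), count_iterations(1);
```

    If the count with rewinding is far larger than the number of tags in
    the input, the time is going into repeated matches rather than into
    any single slow statement.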

    > 21 adhoc_processing_for_tag_a();


    Not very illuminating. This really does not help us help you.

    > Could it be that the complexity of my regex pattern and/or the nature
    > of the input data ($lines) is causing Perl's Garbage Collector to
    > suddenly kick in?


    I don't think the garbage collector works that way in Perl. AFAIK, it is
    a simple reference counting scheme.
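
    A minimal sketch of what reference counting means in practice: an
    object is freed as soon as its last reference goes away, not at some
    later collection pass. The `Tracker` class and the `@log` array are
    hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

my @log;    # records the order in which things happen

{
    package Tracker;    # hypothetical class, for illustration only
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @log, "freed $self->{name}" }
}

{
    my $obj = Tracker->new('inner');
    push @log, 'end of block';
}    # the only reference to the object goes out of scope here
push @log, 'after block';

print join(', ', @log), "\n";
```

    DESTROY runs at the closing brace, before 'after block' is logged,
    which is why a delayed, periodic collector is an unlikely explanation
    for pauses that come and go.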

    I would recommend that you adopt a proper HTML parser. Given the
    structure of your code, HTML::TokeParser may be particularly
    appropriate.
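
    A minimal sketch of that approach, assuming the goal is to visit every
    'a' tag and look at its attributes and text. The sample HTML and the
    `@links` array are invented for illustration; HTML::TokeParser is part
    of the HTML-Parser distribution on CPAN and is not a core module.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # from the HTML-Parser distribution (not core)

# Hypothetical stand-in for the slurped file content.
my $lines = '<p>See <a href="/one">first</a> and '
          . '<A HREF="/two">second <b>link</b></A>.</p>';

# Passing a scalar reference makes the parser read from the string.
my $parser = HTML::TokeParser->new(\$lines)
    or die "Cannot create parser";

my @links;
while (my $tag = $parser->get_tag('a')) {
    my $attr = $tag->[1];                          # hashref of attributes
    my $text = $parser->get_trimmed_text('/a');    # text up to the closing </a>
    push @links, [ $attr->{href}, $text ];
}

printf "%s => %s\n", @$_ for @links;
```

    Each rule_for_tag_*() routine could then become a loop over get_tag()
    for its own tag, with no manual pos() bookkeeping.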

    > Don't know what else to try now?


    First, read the posting guidelines for this group to find out how to
    help others help you.

    Sinan

    --
    A. Sinan Unur

    comp.lang.perl.misc guidelines on the WWW:
    http://www.rehabitation.com/clpmisc/
     
    A. Sinan Unur, Sep 16, 2008
    #2

  3. brian d foy

    brian d foy Guest

    Harry wrote:

    > Hello,
    >
    > I am trying to match tags in an html file, roughly as follows:


    Try running the program under a code profiler that can measure who's
    doing what for how long. For instance, Devel::NYTProf is handy:

    http://search.cpan.org/dist/Devel-NYTProf

    Good luck :)
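
    Devel::NYTProf is run from the command line against the whole program.
    As a much cruder stand-in (this is not NYTProf), the sketch below times
    each match attempt by hand with the core Time::HiRes module. The sample
    input, the pattern, the 0.5 second threshold, and the `@slow` and
    `$matches` variables are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # core module; sub-second wall-clock time

# Hypothetical stand-in for the slurped file: 500 small links.
my $lines = join '', map { qq{<a href="$_">link $_</a>\n} } 1 .. 500;

my $threshold = 0.5;    # seconds; arbitrary cut-off for "slow"
my @slow;               # [offset, seconds] for each slow match
my $matches = 0;

pos($lines) = 0;
my $start = time;
while ($lines =~ m{<a\b[^>]*>}gi) {
    my $elapsed = time - $start;    # time spent finding this match
    $matches++;
    push @slow, [ pos($lines), $elapsed ] if $elapsed > $threshold;

    # ... per-tag processing would go here ...

    $start = time;    # restart the clock just before the next match attempt
}

printf "%d matches, %d slower than %.1fs\n", $matches, scalar @slow, $threshold;
printf "  offset %d took %.3fs\n", @$_ for @slow;
```

    Recording the offset of each slow match shows whether the pauses
    cluster around particular parts of the input.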
     
    brian d foy, Sep 16, 2008
    #3
  4. Jürgen Exner

    Harry wrote:
    >I am trying to match tags in an html file, roughly as follows:


    Which is A Bad Idea(TM).
    [...]

    >Don't know what else to try now?


    Use a tool that is meant to parse HTML; REs are not.
    There are several good HTML parsers on CPAN.

    jue
     
    Jürgen Exner, Sep 16, 2008
    #4
