Don't know what is slowing down my program?

Harry · Sep 16, 2008

Hello,

I am trying to match tags in an html file, roughly as follows:

 1
 2  # Global variable.
 3  my $lines = "slurped file content here";
 4
 5  sub rule_for_tag_a {
 6
 7      pos($lines) = 0;
 8
 9      while ($lines =~ m@ complex_pattern_without_any_\G_anchors @gsix) {
10
11          if (some_condition) {
12
13              # pos($lines) == x at this point.
14
15              # Retrace / backtrack the pos by a small enough y, where y >= 0.
16              pos($lines) = x - y;
17
18              next; # <--- STEPPING OVER THIS BECOMES **VERY** SLOW AFTER SOMETIME!
19          }
20
21          adhoc_processing_for_tag_a();
22      }
23  }
24  ...
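
For readers unfamiliar with the idiom in the outline above, here is a minimal, self-contained sketch of assigning to pos() inside a while (m//g) loop. The input string, the pattern, and the $rewound flag are hypothetical stand-ins, not the poster's actual data or regex; the sketch only shows that the next match attempt starts from whatever offset pos() was last set to.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slurped file content.
my $lines = "<a>one</a><a>two</a><a>three</a>";

my @seen;
my $rewound = 0;    # assumption: rewind only once, to avoid looping forever

pos($lines) = 0;
while ($lines =~ m{<a>(.*?)</a>}gs) {
    my $text = $1;

    if ($text eq 'two' && !$rewound) {
        # $-[0] is the offset where the current match started. Setting
        # pos() back to it makes the next m//g attempt find this tag again.
        $rewound = 1;
        pos($lines) = $-[0];
        next;
    }

    push @seen, $text;
}

print join(",", @seen), "\n";
```

Note that if the rewind condition held on every pass, the loop would keep matching the same text and never finish, so whatever decides to rewind needs a way to eventually stop doing so.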

There are rules for other tags 'b', 'c', 'd', etc. that are *very*
similar to the rule for tag 'a' (and you can trust me on this one).
They differ only in the 'adhoc_processing_for_tag_*()' subroutines.

I call these rules one after the other as follows:

25 rule_for_tag_d ();
26 rule_for_tag_c ();
27 rule_for_tag_b ();
28 rule_for_tag_a ();
29 # Everything runs very slowly now and then from this point on!
30 ...

Now, what I'm noticing is that, after several of the tag rules (say,
for tags 'd', 'c', and 'b') have run at the usual, expected high
speed, something suddenly causes the program to slow down
substantially! I have only been able to narrow down the problem to one
particular statement, the 'next;' statement on line 18: stepping over
line 18 and arriving at line 11 takes longer than expected (roughly 2
to 3 seconds, which is a lot compared to other iterations)! The next 2
or 3 iterations after the slowdown run fine before the slowdown
surfaces once again. This slowdown-fine-slowdown-fine drama continues
from this point on till the end of the program.

Could it be that the complexity of my regex pattern and/or the nature
of the input data ($lines) is causing Perl's Garbage Collector to
suddenly kick in?

Don't know what else to try now?
/HS

A. Sinan Unur · Sep 16, 2008

Harry said:
Hello,

I am trying to match tags in an html file, roughly as follows:

You give no information of value.

What purpose do the line numbers below serve but to create clutter?

 2  # Global variable.
 3  my $lines = "slurped file content here";
 4
 5  sub rule_for_tag_a {
 6
 7      pos($lines) = 0;
 8
 9      while ($lines =~ m@ complex_pattern_without_any_\G_anchors @gsix) {
10
11          if (some_condition) {
12
13              # pos($lines) == x at this point.
14
15              # Retrace / backtrack the pos by a small enough y, where y >= 0.
16              pos($lines) = x - y;
17
18              next; # <--- STEPPING OVER THIS BECOMES **VERY** SLOW AFTER SOMETIME!
19          }

Maybe it is still lightning fast but it is happening waaaaaaaayyyy too
many times?

21 adhoc_processing_for_tag_a();

Not very illuminating. This really does not help us help you.

Could it be that the complexity of my regex pattern and/or the nature
of the input data ($lines) is causing Perl's Garbage Collector to
suddenly kick in?

I don't think the garbage collector works that way in Perl. AFAIK, it is
a simple reference counting scheme.
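
To illustrate the reference-counting point, here is a small sketch; the Tracker class and the @log array are hypothetical and exist only to record when destruction happens. It shows that an object is destroyed as soon as its last reference goes out of scope, rather than at some later collection pause.

```perl
use strict;
use warnings;

package Tracker;    # hypothetical class, used only to observe destruction

my @log;            # file-scoped lexical, visible in both packages below

sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { my ($self) = @_; push @log, "destroyed $self->{name}" }

package main;

{
    my $obj = Tracker->new("inner");
    push @log, "end of block";
}   # the only reference to the object disappears here, so DESTROY runs now

push @log, "after block";

print "$_\n" for @log;
```

The "destroyed inner" entry is recorded between the other two, which is consistent with cleanup happening at the moment the reference count drops to zero.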

I would recommend that you adopt a proper HTML parser. Given the
structure of your code, HTML::TokeParser may be particularly
appropriate.
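
As a rough idea of what that could look like, here is a minimal sketch using HTML::TokeParser (part of the HTML-Parser distribution on CPAN, not a core module). The sample markup and the choice of the 'a' tag are assumptions for illustration; the poster's real per-tag processing is not known.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # from the HTML-Parser distribution on CPAN

# Hypothetical sample input standing in for the slurped file.
my $html = '<p>See <a href="http://example.com/">Example</a> '
         . 'and <a href="/faq">the FAQ</a>.</p>';

# Passing a reference to a scalar parses the string instead of a file.
my $parser = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my @links;
while (my $tag = $parser->get_tag("a")) {
    # For a start tag, get_tag returns [ $tag, \%attr, \@attrseq, $text ].
    my $href = $tag->[1]{href};
    my $text = $parser->get_trimmed_text("/a");    # text up to the closing tag
    push @links, [ $href, $text ];
}

printf "%s -> %s\n", @$_ for @links;
```

A separate get_tag loop per tag name would roughly correspond to one rule_for_tag_*() subroutine, with no manual pos() bookkeeping.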

Don't know what else to try now?

First, read the posting guidelines for this group to find out how to
help others help you.

Sinan

--
A. Sinan Unur

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/

brian d foy · Sep 16, 2008

Harry said:
Hello,

I am trying to match tags in an html file, roughly as follows:

Try running the program under a code profiler that can measure who's
doing what for how long. For instance, Devel::NYTProf is handy:

http://search.cpan.org/dist/Devel-NYTProf
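
Devel::NYTProf is driven from the command line, as noted in the comments below. As a lighter complement, and not a replacement for the profiler, the following sketch uses the core Time::HiRes module to time each match attempt and report the slowest one. The input, the pattern, and the script name yourscript.pl are hypothetical placeholders.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# The profiler itself is run from the shell, not from inside the script:
#   perl -d:NYTProf yourscript.pl    # writes nytprof.out
#   nytprofhtml                      # builds an HTML report from nytprof.out
# (yourscript.pl is a placeholder name.)

# Hypothetical input and pattern standing in for the real ones.
my $lines = join "", map { "<a>item $_</a>\n" } 1 .. 1000;

my ($count, $slowest, $slowest_at) = (0, 0, 0);

my $t0 = time;
while ($lines =~ m{<a>(.*?)</a>}gs) {
    my $elapsed = time - $t0;    # time the regex engine took to find this match
    $count++;

    if ($elapsed > $slowest) {
        $slowest    = $elapsed;
        $slowest_at = pos($lines);
    }

    # ... per-match processing would go here ...

    $t0 = time;                  # restart the clock just before the next attempt
}

printf "%d matches, slowest took %.6fs ending at offset %d\n",
    $count, $slowest, $slowest_at;
```

If one attempt stands out, the reported offset points at the region of input where the pattern is working hardest, which is a reasonable place to start looking.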

Good luck

Jürgen Exner · Sep 16, 2008

Harry said:
I am trying to match tags in an html file, roughly as follows:

Which is A Bad Idea(TM).
[...]

Don't know what else to try now?

Use a tool that is meant to parse HTML; REs are not.
There are several good HTML parsers on CPAN.

jue
