Devel::SmallProf claims "return 1" needs much time !?

W

whumann

I'm trying to do some profiling on DBM::Deep. I started with
Devel::DProf and Devel::profile but for some subs I had no idea *why*
they took long, so I tried Devel::SmallProf for additional detail.
Much of what I see makes sense but results like this puzzle me (these
are the subs that do the low-level disc read and write):

count wall tm cpu time line
0 0.00000 0.00000 103:sub print_at {
314000 0.96161 3.67000 104: my $self = shift;
314000 0.88398 3.34000 105: my $loc = shift;
0 0.00000 0.00000 106:
314000 1.51351 3.89000 107: local ($/,$\);
0 0.00000 0.00000 108:
314000 1.07561 3.94000 109: my $fh = $self->{fh};
314000 4.39820 7.83000 110: if ( defined $loc ) {
0 0.00000 0.00000 111: seek( $fh, $loc + $self-
0 0.00000 0.00000 112: }
0 0.00000 0.00000 113:
314000 3.46850 6.32000 114: print( $fh @_ ) or die
"Internal Error
0 0.00000 0.00000 115:
314000 6.65324 9.86000 116: return 1;
0 0.00000 0.00000 117:}
0 0.00000 0.00000 118:
0 0.00000 0.00000 119:sub read_at {
500507 1.43996 6.05000 120: my $self = shift;
500507 1.60207 5.78000 121: my ($loc, $size) = @_;
0 0.00000 0.00000 122:
500507 2.44265 7.01000 123: local ($/,$\);
0 0.00000 0.00000 124:
500507 1.43653 5.88000 125: my $fh = $self->{fh};
500507 7.14111 11.73000 126: if ( defined $loc ) {
0 0.00000 0.00000 127: seek( $fh, $loc + $self-
0 0.00000 0.00000 128: }
0 0.00000 0.00000 129:
500507 1.41859 6.08000 130: my $buffer;
500507 4.36185 8.77000 131: read( $fh, $buffer, $size);
0 0.00000 0.00000 132:
500507 7.96107 12.04000 133: return $buffer;
0 0.00000 0.00000 134:}

Much of the time is spent in "seek", "print" and "read" -- as expected
(although I'm surprised it's only tenths of microseconds per call -- I
suppose disc caches work their wonders...). But even more time is
spent in the two "return" statements, and I don't know why! I thought
that maybe they contain the overhead of function calling/returning but
coudn't verify that in a simple testscript. Is this an artifact of the
perl debugger? Or what else is the cause for this?
Thanks for any help understanding this.

Wolfram
 
X

xhoster

whumann said:
I'm trying to do some profiling on DBM::Deep. I started with
Devel::DProf and Devel::profile but for some subs I had no idea *why*
they took long, so I tried Devel::SmallProf for additional detail.
Much of what I see makes sense but results like this puzzle me (these
are the subs that do the low-level disc read and write):

I often find SmallProf to be unreliable, especially when trying to profile
code portions which are fast on each execution but are executed very often.
count wall tm cpu time line
0 0.00000 0.00000 103:sub print_at {
314000 0.96161 3.67000 104: my $self = shift;
314000 0.88398 3.34000 105: my $loc = shift;
0 0.00000 0.00000 106:
314000 1.51351 3.89000 107: local ($/,$\);
0 0.00000 0.00000 108:
314000 1.07561 3.94000 109: my $fh = $self->{fh};
314000 4.39820 7.83000 110: if ( defined $loc ) {

This just seems weird. My 3GHz machine does an if defined test 32 times
faster, so unless you have an old computer I would say that this casts
doubt on the entire reliability of the SmallProf output.
....
0 0.00000 0.00000 115:
314000 6.65324 9.86000 116: return 1;
0 0.00000 0.00000 117:} ....

Much of the time is spent in "seek", "print" and "read" -- as expected
(although I'm surprised it's only tenths of microseconds per call -- I
suppose disc caches work their wonders...). But even more time is
spent in the two "return" statements, and I don't know why! I thought
that maybe they contain the overhead of function calling/returning but
coudn't verify that in a simple testscript.

Your return is only taking ~50-100% longer than your "if defined". While
both of them are taking absurdly long, that ratio is about what I find in a
simple test script. Is your machine old and slow? Is it perhaps swapping
itself to death during this test, or heavily loaded with other people's
processes?

If you can post the entire harness (provided it is small) you are using to
profile DBM::Deep, I'll play with it a bit.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
W

Wolfram Humann

This just seems weird.  My 3GHz machine does an if defined test 32 times
faster, so unless you have an old computer I would say that this casts
doubt on the entire reliability of the SmallProf output.

I think you got somthing wrong here. The profiler runs on the comiled
and optimized code where several source-lines may have become one. If
you look at the "count"-column, you will see that line 110 includes
the time for seek() in the next line. Im my experience, the shortest-
running lines in a script (e.g. "my $self = shift;") will indeed vary
a lot from run to run because of the nulltime-compensation im
SmallProf. The lines we're talking about are fairly stable.
If you can post the entire harness (provided it is small) you are using to
profile DBM::Deep, I'll play with it a bit.

Every script that imports heavily in DBM::Deep will do. Mine looks
like this:

BEGIN
{
$DB::profile = 0;
%DB::packages = ( 'DBM::Deep::Engine' => 1,
'DBM::Deep::Engine::Sector' => 1, 'DBM::Deep::File' => 1 );
}

use strict;
use warnings;
use DBM::Deep;

my $text = 'this is always the same dummy text';
my $inner = 500;
my $outer = 2;
my $profilefile = "prof.out";

my $db = DBM::Deep->new( "deeptest2.db" );

for my $j (1..$outer)
{
my $data;

for my $i (1..$inner)
{
my %hash = map { +"subkey$_" => $text } (12..30);
$data->{"key$i"} = \%hash;
}

$DB::profile = 1 if $j == $outer;

my $t = time();
$db->import($data);
print "Import duration: ", time() - $t, "\n";
}

But I can't say exactly how $inner and $outer were set when I created
the posted profile. Also, I already did some changes in DBM::Deep that
might affect performance.

Wolfram
 
X

xhoster

Wolfram Humann said:
I think you got somthing wrong here. The profiler runs on the comiled
and optimized code where several source-lines may have become one. If
you look at the "count"-column, you will see that line 110 includes
the time for seek() in the next line.

While I can't rule that out, I've never seen that done in this setting.
I've seen the condition of an elsif reported as if it were the condition of
the preceding if, but I've never seen that type of conflation happen
between the if condition and if block (except when the block contents are
on the same line as the condition). And I can't reproduce it with test
cases.

I assumed the behavior of the count column was simply because $loc is never
defined in the use-case you used, and therefore line 111 is never executed.
But I could be wrong.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
W

Wolfram Humann

I assumed the behavior of the count column was simply because $loc is never
defined in the use-case you used, and therefore line 111 is never executed..
But I could be wrong.

Well, I'm also not *that* sure about my claim, I should better check
that :)

Wolfram
 
X

xhoster

Wolfram Humann said:
Every script that imports heavily in DBM::Deep will do. Mine looks
like this:

BEGIN
{
$DB::profile = 0;
%DB::packages = ( 'DBM::Deep::Engine' => 1,
'DBM::Deep::Engine::Sector' => 1, 'DBM::Deep::File' => 1 );
}

I'm not sure, but I think that by restricting your packages that way, all
the time spent in a non-monitored package will get attributed to the
most recently executed statement which is in one of the monitored packages.
That statement is likely to be a "return".

I tried profiling your program, but the profiled code looked nothing like
yours. I realized I have an old DBM::Deep. I installed the current
version in a test directory, and holy cow is it slow compared to the old
version. If it ever finishes, I'll see what the profile looks like.

It looks like DBM::Deep is trying to change from a module to tie hashes to
disk with as little differences as possible (behvior-wise) from a regular
Perl hash; and instead turn into a full-fledged ACID database. I think
that that is unfortunate. Perhaps a code fork is in order.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
W

Wolfram Humann

I'm not sure, but I think that by restricting your packages that way, all
the time spent in a non-monitored package will get attributed to the
most recently executed statement which is in one of the monitored packages..
That statement is likely to be a "return".

I never thought of that but it sounds quite possible. Thanks for the
hint.
It looks like DBM::Deep is trying to change from a module to tie hashes to
disk with as little differences as possible (behvior-wise) from a regular
Perl hash; and instead turn into a full-fledged ACID database.  I think
that that is unfortunate.  Perhaps a code fork is in order.

I rather think that during the major rework to version 1.000 a few
things got introduced that make the module unnecessary slow. I posted
here http://groups.google.com/group/DBM-Deep/browse_thread/thread/9ae2ae30130a49f7
what I think is one major problem. I would appreciate comments if you
have the time to look into that.

If my profiling results are worth anything, code like "sub engine
{ $_[0]{engine} }" (in Engine.pm) is also a bad idea because it's
called so often. And then "$e = $self->engine" is just two characters
less than "$e = $self->{engine}" and probably much slower. But I still
need to ask Rob Kinyon if there are any reasons I don't know for these
subs.

Wolfram
 
S

salva

Much of the time is spent in "seek", "print" and "read" -- as expected
(although I'm surprised it's only tenths of microseconds per call -- I
suppose disc caches work their wonders...). But even more time is
spent in the two "return" statements, and I don't know why! I thought
that maybe they contain the overhead of function calling/returning but
coudn't verify that in a simple testscript. Is this an artifact of the
perl debugger? Or what else is the cause for this?
Thanks for any help understanding this.

Hi, I am the current maintainer of Devel::SmallProf and I would be
grateful if you could send to me a bug report for that, including the
details about how to exactly reproduce the problem

Cheers,

- Salva
 
X

xhoster

salva said:
Hi, I am the current maintainer of Devel::SmallProf and I would be
grateful if you could send to me a bug report for that, including the
details about how to exactly reproduce the problem

Cheers,

Hi Salva,

Here is an example:

$ cat foo.pl
use strict;
use warnings;
use constant foobar =>1;

BEGIN
{
%DB::packages = ( 'package' => 1 );
}

foreach (1..5) {
package::bar();
foo();
};

exit;

sub foo {
sleep 1;
return 1;
};

package package;
sub bar {
sleep 1;
return 1;
};
__END__

All the time spent in the (unmonitored) foo is attributed
to the last statement in (monitored) bar.

0 0.00000 0.00000 16:
5 0.00000 0.00000 17:sub foo {
0 0.00000 0.00000 18: sleep 1;
0 0.00000 0.00000 19: return 1;
0 0.00000 0.00000 20:};
0 0.00000 0.00000 21:
0 0.00000 0.00000 22:package package;
5 0.00000 0.00000 23:sub bar {
5 5.01095 0.00000 24: sleep 1;
5 5.00817 0.00000 25: return 1;
0 0.00000 0.00000 26:};
0 0.00000 0.00000 27:

Posted and mailed.

Cheers,

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
X

xhoster

Wolfram Humann said:
I never thought of that but it sounds quite possible. Thanks for the
hint.

And I've verified that this is indeed the case.
I rather think that during the major rework to version 1.000 a few
things got introduced that make the module unnecessary slow. I posted
here
http://groups.google.com/group/DBM-Deep/browse_thread/thread/9ae2ae3013=
0a49f7 what I think is one major problem. I would appreciate comments if
you have the time to look into that.

It looks like your proposed change should work, unless $orig_loc or
$old_loc are objects with operator overloading, which doesn't seem to be
the case. (But I do have to wonder why it was done the way it was in the
first place. It is certainly an oddly contorted way of doing things if it
wasn't done like that for a reason.)

Anyway, I ran the self-tests that come with DBM-Deep with your proposed
change in place and it passes all tests, so I'd call it good. And it does
make it much faster, but as you note, it still isn't all that speedy.
Reverting back to DBM version 0.983 is 20 times faster than even your
improved version.
If my profiling results are worth anything, code like "sub engine
{ $_[0]{engine} }" (in Engine.pm) is also a bad idea because it's
called so often. And then "$e =3D $self->engine" is just two characters
less than "$e =3D $self->{engine}" and probably much slower. But I still
need to ask Rob Kinyon if there are any reasons I don't know for these
subs.

That is classic OO programming to support inheritance. Some subclass might
want to override engine(), which of course it can't do if
"$e = $self->{engine}" is used instead of "$e = $self->engine"

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,776
Messages
2,569,603
Members
45,189
Latest member
CryptoTaxSoftware

Latest Threads

Top