perl hash speed and memory efficiency

odigity

I'm running some tests to try to gauge the speed of perl's hashing
abilities, as well as the memory footprint. I wrote this function:

sub buildhash
{
    my %hash;
    foreach my $foo (1..100_000) {
        foreach my $bar ('a'..'z') {
            $hash{"$foo.$bar"} = 1;
        }
    }
    undef %hash;
}

Then I threw in the Benchmark module, like this:

timethis( 1, "buildhash()" );

It seems to use about 200MB of memory for 2.6 million small key/value
pairs, which is pretty efficient (~80bytes/pair). However, it doesn't
release the memory after the undef (I checked by stopping execution at
that point with a sleep statement and studying memory usage with
`ps`).

It either takes 11 seconds or 75 seconds depending on how I execute
it. Let me explain.

I first tried running it once, and it took 11 seconds. I tried twice,
and it took 86. This didn't make any sense to me. Here's the command
I used:

timethis( 2, "buildhash()" );

Then I tried unrolling it like this:

timethis( 1, "buildhash()" );
timethis( 1, "buildhash()" );

And that took 22 seconds (11/each), as I expected the first time.

So the question that is most driving me crazy is: For the sake of
Pete, why the difference!?

-ofer
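
One way to narrow down a difference like this is to time each call separately rather than relying on the total that timethis reports. The following is only a sketch: it uses Time::HiRes instead of Benchmark, gives buildhash a size parameter that the original does not have, and uses a scaled-down count so that it finishes quickly.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Same loop as the original sub, but with the outer count passed in
# (the $n parameter is an addition for this sketch).
sub buildhash {
    my ($n) = @_;
    my %hash;
    foreach my $foo (1 .. $n) {
        foreach my $bar ('a' .. 'z') {
            $hash{"$foo.$bar"} = 1;
        }
    }
    return scalar keys %hash;
}

# Time each call on its own so a slow second run shows up directly.
my @elapsed;
for my $run (1 .. 2) {
    my $t0 = [gettimeofday];
    buildhash(10_000);
    push @elapsed, tv_interval($t0);
    printf "run %d: %.3f s\n", $run, $elapsed[-1];
}
```

If the second run is consistently slower than the first, the cost is in the repeated call itself rather than in how Benchmark adds up the iterations.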
 
Ben Morrow

Quoth (e-mail address removed) (odigity):
> I'm running some tests to try to gauge the speed of perl's hashing
> abilities, as well as the memory footprint. I wrote this function:
>
> sub buildhash
> {
>     my %hash;
>     foreach my $foo (1..100_000) {
>         foreach my $bar ('a'..'z') {
>             $hash{"$foo.$bar"} = 1;
>         }
>     }
>     undef %hash;

There is no need for this. %hash will be deallocated at end-of-scope.

> }
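
The end-of-scope behaviour can be checked with a small sketch. Tracker is a hypothetical class made up for this illustration: it only counts how many of its instances have been destroyed. The hash is never undef'd explicitly, yet every value should be destroyed by the time the sub returns.

```perl
use strict;
use warnings;

# Hypothetical helper class: counts destroyed instances.
{
    package Tracker;
    our $destroyed = 0;
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

sub build_scoped {
    my %hash;
    $hash{$_} = Tracker->new for 1 .. 5;
    return scalar keys %hash;    # no explicit "undef %hash" here
}

my $count = build_scoped();
print "built $count values, destroyed $Tracker::destroyed\n";
```

This only shows that the hash contents are freed back to perl when the lexical goes out of scope. It says nothing about whether the memory is returned to the operating system.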

> Then I threw in the Benchmark module, like this:
>
> timethis( 1, "buildhash()" );
>
> It seems to use about 200MB of memory for 2.6 million small key/value
> pairs, which is pretty efficient (~80bytes/pair). However, it doesn't
> release the memory after the undef (I checked by stopping execution at
> that point with a sleep statement and studying memory usage with
> `ps`).

<sigh>

Under most circumstances, memory once allocated to a process can never
be released. The memory goes into a process-internal free pool, so if
you were to run the sub again memory usage would not go up any further.

Ben
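
A rough way to watch the free-pool effect described above is to run the sub several times and print the process's resident size after each call. This is a sketch under assumptions: it shells out to ps with the rss output keyword, which is available on Linux and the BSDs but may not be elsewhere, and it uses a scaled-down count with a size parameter that is not in the original. The numbers depend on the platform's malloc, so no particular output is claimed.

```perl
use strict;
use warnings;

# Resident set size in KB as reported by ps, or undef if ps is unavailable.
sub rss_kb {
    my $out = `ps -o rss= -p $$ 2>/dev/null`;
    return defined $out && $out =~ /(\d+)/ ? $1 : undef;
}

# Scaled-down version of the original sub (the $n parameter is an addition).
sub buildhash {
    my ($n) = @_;
    my %hash;
    foreach my $foo (1 .. $n) {
        $hash{"$foo.$_"} = 1 for 'a' .. 'z';
    }
    return scalar keys %hash;
}

my @runs;
for my $run (1 .. 3) {
    my $pairs = buildhash(10_000);
    my $rss   = rss_kb();
    push @runs, { pairs => $pairs, rss => $rss };
    printf "run %d: %d pairs, RSS after return: %s KB\n",
        $run, $pairs, defined $rss ? $rss : 'n/a';
}
```

If the free-pool explanation holds on a given system, the figure after the second and third runs should stay close to the figure after the first.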
 
