Using reference for performance gain?

howa · Jul 5, 2006

hi,

consider the following code...

it is interesting that there is no performance gain when we return
by reference... or has perl already optimized this for us?

#--------------------------------------------------------------------
sub test {
my %h = (
"a" => "asdsaefsayd7asdsdsdsdsdsdsatd7as6fdsa76smndfkjtpyty",
"b" => "vdfdgdgregrehgregdsa6td76satd7as6fdsa76smndfdfdgfhf",
"c" => "zxczxrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76hthth",
"d" => "jhthfgdsadsadhsayd7asysgdsa6td76satd7as6fdsahrhrhrh",
"e" => "sfeghhfhsadsadhsayd7asysgdsa6td76satd7as6fdsahththt",
"f" => "jgjgjgsadsadhsayd7asysgdsa6td76satd7as6fdseretttttg",
"g" => "ekreorrsadsadhsayd7asysgdsa6td76satd7as6fdsadgdgggg",
"h" => "sadsadhsayd7asysgdsa6td76satd7as6fdsa76smndfkjdfdfd",
"i" => "mhjghhtsadsadhsayd7asysgdsa6td76satd7ahghghrrtrtrrt",
"j" => "ewtyhrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76smndf",
"k" => "ykiyuiyutdhsayd7asysgdsa6td76satd7as6fdsa76smndfkjd"
);

# return %h;
return \%h;
}

my $startTime = time();

# call the sub many times; time() only resolves whole seconds
for (my $count = 1; $count < 500000; $count++) {
    my $tmp = test();
}

my $endTime = time();
print $endTime - $startTime, "\n";

anno4000 · Jul 5, 2006

howa said:
hi,

consider the following code...

it is interesting that there is no performance gain when we return
by reference... or has perl already optimized this for us?

For timing Perl code there is the standard module Benchmark.

#--------------------------------------------------------------------
sub test {
my %h = (
"a" => "asdsaefsayd7asdsdsdsdsdsdsatd7as6fdsa76smndfkjtpyty",
"b" => "vdfdgdgregrehgregdsa6td76satd7as6fdsa76smndfdfdgfhf",
"c" => "zxczxrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76hthth",
"d" => "jhthfgdsadsadhsayd7asysgdsa6td76satd7as6fdsahrhrhrh",
"e" => "sfeghhfhsadsadhsayd7asysgdsa6td76satd7as6fdsahththt",
"f" => "jgjgjgsadsadhsayd7asysgdsa6td76satd7as6fdseretttttg",
"g" => "ekreorrsadsadhsayd7asysgdsa6td76satd7as6fdsadgdgggg",
"h" => "sadsadhsayd7asysgdsa6td76satd7as6fdsa76smndfkjdfdfd",
"i" => "mhjghhtsadsadhsayd7asysgdsa6td76satd7ahghghrrtrtrrt",
"j" => "ewtyhrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76smndf",
"k" => "ykiyuiyutdhsayd7asysgdsa6td76satd7as6fdsa76smndfkjd"
);

# return %h;
return \%h;
}

my $startTime = time();

# call the sub many times; time() only resolves whole seconds
for (my $count = 1; $count < 500000; $count++) {
    my $tmp = test();
}

my $endTime = time();
print $endTime - $startTime, "\n";

Your benchmarks don't tell much about the actual performance of
returning a list vs. returning a reference. For one, you are building
the hash on each call to the routine. That is going to swamp out
other performance differences. Secondly, you are assigning the list
that is returned in one case to a scalar. That means that only one
scalar assignment is actually done, the others will be more or less
efficiently optimized away.
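
The context point can be checked directly. The following is only a small sketch, not part of the original benchmark; the helper names context() and get_hash() are made up for illustration. It shows that assigning a call to a scalar puts the call in scalar context, so a returned hash is never flattened into a list and copied.

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2, c => 3);

# Hypothetical helper: reports the context it was called in.
sub context { return wantarray ? 'list' : 'scalar' }

# Returns the hash itself; what the caller gets depends on context.
sub get_hash { return %h }

my $ctx_scalar = context();    # scalar assignment => scalar context
my ($ctx_list) = context();    # list assignment   => list context

my $tmp  = get_hash();    # scalar context: no key/value list is built
my %copy = get_hash();    # list context: every pair is copied into %copy

print "my \$x = f()   is called in $ctx_scalar context\n";
print "my (\$x) = f() is called in $ctx_list context\n";
print "copied ", scalar(keys %copy), " keys\n";
```

So "my $tmp = test()" with "return %h" never measures the cost of copying the hash at all.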

Here is a better benchmark:

use Benchmark qw( cmpthese);

my %h = (
"a" => "asdsaefsayd7asdsdsdsdsdsdsatd7as6fdsa76smndfkjtpyty",
"b" => "vdfdgdgregrehgregdsa6td76satd7as6fdsa76smndfdfdgfhf",
"c" => "zxczxrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76hthth",
"d" => "jhthfgdsadsadhsayd7asysgdsa6td76satd7as6fdsahrhrhrh",
"e" => "sfeghhfhsadsadhsayd7asysgdsa6td76satd7as6fdsahththt",
"f" => "jgjgjgsadsadhsayd7asysgdsa6td76satd7as6fdseretttttg",
"g" => "ekreorrsadsadhsayd7asysgdsa6td76satd7as6fdsadgdgggg",
"h" => "sadsadhsayd7asysgdsa6td76satd7as6fdsa76smndfkjdfdfd",
"i" => "mhjghhtsadsadhsayd7asysgdsa6td76satd7ahghghrrtrtrrt",
"j" => "ewtyhrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76smndf",
"k" => "ykiyuiyutdhsayd7asysgdsa6td76satd7as6fdsa76smndfkjd"
);

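# Not named "ref": a plain ref() call would resolve to the builtin
# CORE::ref instead of the user-defined sub.
sub by_list { %h }
sub by_ref  { \%h }

cmpthese -1, {
    list_ret => 'my %x = by_list()',
    ref_ret  => 'my $x = by_ref()',
};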

...which prints

              Rate list_ret  ref_ret
list_ret   61837/s       --     -98%
ref_ret  3633678/s    5776%       --

In other words, in this case returning a reference is about 60 times
faster than returning the list.

Anno

howa · Jul 5, 2006

(e-mail address removed)-berlin.de wrote:

Your benchmarks don't tell much about the actual performance of
returning a list vs. returning a reference. For one, you are building
the hash on each call to the routine. That is going to swamp out
other performance differences.

i have considered this, but this just reflects the real-world situation.

Secondly, you are assigning the list
that is returned in one case to a scalar. That means that only one
scalar assignment is actually done, the others will be more or less
efficiently optimized away.

yes, you are right.

thanks.

Uri Guttman · Jul 5, 2006

h> (e-mail address removed)-berlin.de

h> i have considered this, but this just reflects the real-world situation.

what real world? if you are building constant hashes in each call, then
your real world is very slow. and as anno said, it will swamp out any
return overhead.

you need to show real world code if you want to get real world
optimizations.
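
A rough way to see that cost in isolation is sketched below. The sub names rebuilt() and hoisted() and the generated data are assumptions made for illustration, not code from this thread. It compares rebuilding a constant hash on every call with building it once at file scope and returning a reference to it.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Built once, at file scope (11 keys, 'a' through 'k').
my %table = map { ($_ => $_ x 50) } 'a' .. 'k';

# Rebuilds the same constant data on every call.
sub rebuilt {
    my %h = map { ($_ => $_ x 50) } 'a' .. 'k';
    return \%h;
}

# Returns a reference to the hash that already exists.
sub hoisted { return \%table }

# Run each variant for at least one CPU second and compare the rates.
cmpthese(-1, {
    rebuilt => sub { my $r = rebuilt() },
    hoisted => sub { my $r = hoisted() },
});
```

Note that hoisted() gives every caller the same hash, so it only fits data that callers do not modify.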

uri

howa · Jul 6, 2006

Uri Guttman wrote:

h> (e-mail address removed)-berlin.de

h> i have considered this, but this just reflects the real-world situation.

what real world? if you are building constant hashes in each call, then
your real world is very slow. and as anno said, it will swamp out any
return overhead.

in the real world, complex data is always generated inside and returned
from a function, and this is the reason why we need to consider using a
reference instead.

of course, purely comparing the speed of reference and value might give
a factor of 100, but in a real-world situation this might be just a
factor of 2, as there is always some minimum overhead from everything
else.
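
How much of the difference survives once the data really has to be built on every call can be measured rather than guessed. The following is only a sketch under assumed conditions: build_list() and build_ref() are hypothetical stand-ins for a function that computes its result, and the 50-entry hash is arbitrary. The list variant is assigned to a hash so that the copy actually happens.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Both subs pay the same construction cost on every call;
# they differ only in how the result is handed back.
sub build_list {
    my %h = map { ($_ => $_ * 2) } 1 .. 50;
    return %h;     # flattened to a list, copied again by the caller
}

sub build_ref {
    my %h = map { ($_ => $_ * 2) } 1 .. 50;
    return \%h;    # only a single reference is returned
}

# Run each variant for at least one CPU second and compare the rates.
cmpthese(-1, {
    list_ret => sub { my %x = build_list() },
    ref_ret  => sub { my $x = build_ref() },
});
```

The ratio will depend on the hash size and on how expensive the construction is compared with the copy.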

thanks anyway.

Uri Guttman · Jul 6, 2006

h> Uri Guttman wrote:

h> in the real world, complex data is always generated inside and returned
h> from a function, and this is the reason why we need to consider using a
h> reference instead.

h> of course, purely comparing the speed of reference and value might give
h> a factor of 100, but in a real-world situation this might be just a
h> factor of 2, as there is always some minimum overhead from everything
h> else.

you aren't getting it but i can't think of any way to explain it to
you. optimization is a skill in itself and it requires understanding of
when things happen. loading large CONSTANT hashes inside a sub is a
killer of cpu power. it also will distort any benchmarks you are
doing. as for the real world, the whole point of a benchmark is to try
to simulate real world conditions. your benchmark was useless for that
and for isolating whether returning a hash or a reference was faster
(besides it being broken in how it returned stuff).

uri
