initialize a hash

Dr.Ruud · Aug 15, 2006

I am investigating what is the fastest way to set up a hash, with the
keys coming from an array, and all values set to a constant like 1.

#!/usr/bin/perl
use warnings ;
use strict ;

use Benchmark qw

hireswallclock cmpthese) ;

my @list = 1..1000 ;

cmpthese( -3, {
mapE => sub { my %a = map(( $_ => 1 ), @list)},
mapB => sub { my %a = map { $_ => 1 } @list },
for => sub { my %a ; $a{ $_ } = 1 for @list },
list => sub { my %a ; @a{ @list } = @list },
range => sub { my %a ; @a{ @list } = 1..@list },
empty => sub { my %a ; @a{ @list } = () },
} ) ;
__END__

Rate mapB mapE list for range empty
mapB 509/s -- -6% -31% -37% -50% -66%
mapE 544/s 7% -- -26% -33% -46% -64%
list 734/s 44% 35% -- -10% -27% -51%
for 812/s 60% 49% 11% -- -20% -46%
range 1012/s 99% 86% 38% 25% -- -32%
empty 1492/s 193% 174% 103% 84% 47% --

From this, I deduce that I need an unlimited supply of ones (in list
context), something like (invalid code follows) C<@1> or C<1...0>.
Suggestions anyone?

Paul Lalli · Aug 15, 2006

Dr.Ruud said:
I am investigating what is the fastest way to set up a hash, with the
keys coming from an array, and all values set to a constant like 1.

#!/usr/bin/perl
use warnings ;
use strict ;

use Benchmark qwhireswallclock cmpthese) ;

my @list = 1..1000 ;

cmpthese( -3, {
mapE => sub { my %a = map(( $_ => 1 ), @list)},
mapB => sub { my %a = map { $_ => 1 } @list },
for => sub { my %a ; $a{ $_ } = 1 for @list },
list => sub { my %a ; @a{ @list } = @list },
range => sub { my %a ; @a{ @list } = 1..@list },
empty => sub { my %a ; @a{ @list } = () },
} ) ;
__END__

Rate mapB mapE list for range empty
mapB 509/s -- -6% -31% -37% -50% -66%
mapE 544/s 7% -- -26% -33% -46% -64%
list 734/s 44% 35% -- -10% -27% -51%
for 812/s 60% 49% 11% -- -20% -46%
range 1012/s 99% 86% 38% 25% -- -32%
empty 1492/s 193% 174% 103% 84% 47% --

From this, I deduce that I need an unlimited supply of ones (in list
context), something like (invalid code follows) C<@1> or C<1...0>.
Suggestions anyone?

Not sure if this is what you meant, but it doesn't seem to be
significantly faster than the 'range' above...:

cmpthese( -3, {
mapE => sub { my %a = map(( $_ => 1 ), @list)},
mapB => sub { my %a = map { $_ => 1 } @list },
for => sub { my %a ; $a{ $_ } = 1 for @list },
list => sub { my %a ; @a{ @list } = @list },
range => sub { my %a ; @a{ @list } = 1..@list },
empty => sub { my %a ; @a{ @list } = () },
x1s => sub { my %a ; @a{ @list } = (1) x @list },
} ) ;
__END__

Rate mapB mapE list for x1s range empty
mapB 243/s -- -0% -9% -41% -48% -48% -62%
mapE 243/s 0% -- -8% -41% -48% -48% -62%
list 266/s 10% 9% -- -36% -43% -43% -58%
for 412/s 70% 69% 55% -- -11% -12% -35%
x1s 464/s 91% 90% 74% 12% -- -1% -27%
range 467/s 93% 92% 76% 13% 1% -- -27%
empty 638/s 163% 162% 140% 55% 38% 37% --

Paul Lalli

Uri Guttman · Aug 15, 2006

R> I am investigating what is the fastest way to set up a hash, with the
R> keys coming from an array, and all values set to a constant like 1.

R> #!/usr/bin/perl
R> use warnings ;
R> use strict ;

R> use Benchmark qw

hireswallclock cmpthese) ;

R> my @list = 1..1000 ;

R> cmpthese( -3, {
R> mapE => sub { my %a = map(( $_ => 1 ), @list)},
R> mapB => sub { my %a = map { $_ => 1 } @list },
R> for => sub { my %a ; $a{ $_ } = 1 for @list },
R> list => sub { my %a ; @a{ @list } = @list },
R> range => sub { my %a ; @a{ @list } = 1..@list },
R> empty => sub { my %a ; @a{ @list } = () },
R> } ) ;

you missed this one:

x => sub { my %a ; @a{ @list } = (1) x @list },

for basic boolean set testing i have used the empty style and exists or
the x style. if i want to have only a single statement i have used the
map styles (you didn't try the variation of using undef for the values
but that needs exists for testing again.).

also try different size lists as that will affect many things. so this
could be run in a loop with different list sizes.

R> Rate mapB mapE list for range empty
R> mapB 509/s -- -6% -31% -37% -50% -66%
R> mapE 544/s 7% -- -26% -33% -46% -64%
R> list 734/s 44% 35% -- -10% -27% -51%
R> for 812/s 60% 49% 11% -- -20% -46%
R> range 1012/s 99% 86% 38% 25% -- -32%
R> empty 1492/s 193% 174% 103% 84% 47% --

R> From this, I deduce that I need an unlimited supply of ones (in list
R> context), something like (invalid code follows) C<@1> or C<1...0>.
R> Suggestions anyone?

i would have guessed empty to be the fastest style as it doesn't
actually allocate an sv for each hash value, just marks them with the
undef flag. this is different than if you actually assigned undef in the
map styles which would allocate a full sv and mark it as undef.

uri

anno4000 · Aug 15, 2006

Dr.Ruud said:
I am investigating what is the fastest way to set up a hash, with the
keys coming from an array, and all values set to a constant like 1.

#!/usr/bin/perl
use warnings ;
use strict ;

use Benchmark qwhireswallclock cmpthese) ;

my @list = 1..1000 ;

cmpthese( -3, {
mapE => sub { my %a = map(( $_ => 1 ), @list)},
mapB => sub { my %a = map { $_ => 1 } @list },
for => sub { my %a ; $a{ $_ } = 1 for @list },
list => sub { my %a ; @a{ @list } = @list },
range => sub { my %a ; @a{ @list } = 1..@list },
empty => sub { my %a ; @a{ @list } = () },
} ) ;
__END__

Rate mapB mapE list for range empty
mapB 509/s -- -6% -31% -37% -50% -66%
mapE 544/s 7% -- -26% -33% -46% -64%
list 734/s 44% 35% -- -10% -27% -51%
for 812/s 60% 49% 11% -- -20% -46%
range 1012/s 99% 86% 38% 25% -- -32%
empty 1492/s 193% 174% 103% 84% 47% --

From this, I deduce that I need an unlimited supply of ones (in list
context), something like (invalid code follows) C<@1> or C<1...0>.
Suggestions anyone?

The repetition operator. Adding

repet => sub { my %a ; @a{ @list } = (1) x @list },

to your benchmark shows it as fast as "range" on my machine, setting
all values to one. Can't beat "empty", though.

Anno

Dr.Ruud · Aug 15, 2006

Uri Guttman schreef:

i would have guessed empty to be the fastest style as it doesn't
actually allocate an sv for each hash value, just marks them with the
undef flag. this is different than if you actually assigned undef in
the map styles which would allocate a full sv and mark it as undef.

Yes, I put it there to get an idea of the speed limit.

Liz Mattijsen came up with these:

empty => sub { my %a ; @a{ @list } = () },
empty1 => sub { my %a ; @a{ @list } = (undef) },
emptyN => sub { my %a ; @a{ @list } = (undef) x @list },

and empty1 is even faster than empty. <g>
(I assume that these just set the undef flag as well.)

Dr.Ruud · Aug 15, 2006

Paul Lalli schreef:

Dr.Ruud:

x1s => sub { my %a ; @a{ @list } = (1) x @list },

Yes, that's more what I am looking for. Thanks, Uri and Anno too of
course.

My 'list' and 'range' (and 'empty') were meant as speed references; they
don't set all values to the same (and defined) constant.

Rate mapB mapE list for x1s range empty
mapB 243/s -- -0% -9% -41% -48% -48% -62%
mapE 243/s 0% -- -8% -41% -48% -48% -62%
list 266/s 10% 9% -- -36% -43% -43% -58%
for 412/s 70% 69% 55% -- -11% -12% -35%
x1s 464/s 91% 90% 74% 12% -- -1% -27%
range 467/s 93% 92% 76% 13% 1% -- -27%
empty 638/s 163% 162% 140% 55% 38% 37% --

On your system, 'list' isn't as near to 'for' as on the freebsd with
perl 5.8.6 here.

I still wish there were a construct that would deliver as many
<constant>s as requested.
Something like C<1...undef> maybe, but no: that would break existing
code.

This one is (just 1 or 2%) faster than "empty":

empty1 => sub { my %a ; @a{ @list } = undef },

Uri Guttman · Aug 16, 2006

R> Uri Guttman schreef:
R> Yes, I put it there to get an idea of the speed limit.

R> Liz Mattijsen came up with these:

R> empty => sub { my %a ; @a{ @list } = () },
R> empty1 => sub { my %a ; @a{ @list } = (undef) },
R> emptyN => sub { my %a ; @a{ @list } = (undef) x @list },

R> and empty1 is even faster than empty. <g>
R> (I assume that these just set the undef flag as well.)

hmm, i would expect the emptyN to assign real undefs to the
elements. the 1 version should assign a real one to the first
slot. there is a way to check for this (i recall with pseudohashes this
was important and maybe defined will help somehow) afaik. for sure some
devel:: module will be able to peek in and tell you what is there.

uri

Brian McCauley · Aug 16, 2006

Dr.Ruud said:
I still wish there were a construct that would deliver as many
<constant>s as requested.
Something like C<1...undef> maybe, but no: that would break existing
code.

The whole design of the internals of Perl5 does not lend itself to
infinite lists. To handle infinite lists you need lazy evaluation
which is a radical overhall of how a language words.

Perl6 is just such a radical overhaul and, no coincidently, has lazy
evaluation to support infinte lists.

Dr.Ruud · Aug 16, 2006

Uri Guttman schreef:

Ruud:

Rate mapE mapB list for range rept emptyN empty
empty1
mapE
534/s# -- -0% -30% -38% -47% -47% -56% -65% -66%
mapB 534/s#
0% -- -30% -38% -47% -47% -56% -65% -66%
list 763/s 43%
43% -- -11% -24% -25% -38% -50% -51%
for 862/s# 62% 61%
13% -- -15% -15% -30% -44% -45%
range 1009/s 89% 89% 32%
17% -- -1% -18% -34% -35%
rept 1016/s# 90% 90% 33% 18%
1% -- -17% -34% -35%
emptyN 1225/s 130% 129% 61% 42% 21%
21% -- -20% -21%
empty 1530/s 187% 186% 100% 78% 52% 51%
25% -- -2%
empty1 1556/s 192% 191% 104% 81% 54% 53% 27%
2% --

The ones with a # actually set all values to a defined constant.

hmm, i would expect the emptyN to assign real undefs to the
elements.

I think so too now (but haven't checked), from the benchmark numbers,
where "emptyN" comes right after "rept" (the "(1) x @list" variant).

the 1 version should assign a real one to the first slot.

Yes. It is still remarkable that "empty1" is a tad faster than "empty".

Dr.Ruud · Aug 16, 2006

Dr.Ruud schreef:

Rate mapE mapB list for range rept emptyN empty
empty1

Oops.

#!/usr/bin/perl
use warnings ;
use strict ;

use Benchmark qw

hireswallclock cmpthese) ;

my @list = 1..1000 ;

cmpthese( -3, {
### mapB => sub { my %a = map { $_ => 1 } @list },
mapE => sub { my %a = map(( $_ => 1 ), @list)},
for => sub { my %a ; $a{ $_ } = 1 for @list },
list => sub { my %a ; @a{ @list } = @list },
range => sub { my %a ; @a{ @list } = 1..@list },
rept => sub { my %a ; @a{ @list } = (1) x @list },
empty => sub { my %a ; @a{ @list } = () },
empty1 => sub { my %a ; @a{ @list } = undef },
emptyN => sub { my %a ; @a{ @list } = (undef) x @list },
} ) ;

Rate mapE list for rept range emptyN empty empty1
mapE 548/s# -- -28% -36% -46% -46% -55% -64% -65%
list 755/s 38% -- -11% -25% -26% -38% -51% -51%
for 853/s# 56% 13% -- -15% -16% -30% -45% -45%
rept 1008/s# 84% 33% 18% -- -1% -18% -35% -35%
range 1021/s 86% 35% 20% 1% -- -17% -34% -34%
emptyN 1225/s 124% 62% 44% 21% 20% -- -20% -21%
empty 1540/s 181% 104% 81% 53% 51% 26% -- -0%
empty1 1545/s 182% 105% 81% 53% 51% 26% 0% --

Hash	2	Mar 15, 2011
My subroutine won't call within a for and if...is this normal?	1	Jan 25, 2022
Read a hash	5	Apr 28, 2011
traversing a hash structure	3	Aug 29, 2011
how do I create a hash from the list	9	Aug 19, 2011
Push regex search result into hash with multiple values	14	May 19, 2014
hash of arrays	1	Sep 13, 2012
data to hash	1	Jul 29, 2011

initialize a hash

Dr.Ruud

Paul Lalli

Uri Guttman

anno4000

Dr.Ruud

Dr.Ruud

Uri Guttman

Brian McCauley

Dr.Ruud

Dr.Ruud

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads