Perl hash and rehash seeds; deterministic hash ordering

O

ozgune

Hi -

I just upgraded to Perl 5.8.6, and am evaluating the effect of random
hash ordering on my systems. Surprisingly, I seem *unaffected* by Perl
hash seeds (The following is a RHEL 3 box).

% perl -v
This is perl, v5.8.6 built for Linux-2.4c2.3-i686-64int

% perl -le '@a="a".."z";@a{@a}=();print keys %a'
HASH_SEED = 10308426221955932222
wraxdjyukhgftienvmslcpqbzo

% perl -le '@a="a".."z";@a{@a}=();print keys %a'
HASH_SEED = 2190859344420079461
wraxdjyukhgftienvmslcpqbzo

I tried this on a Fedora 3 box (5.8.6), and got the same behavior. On
OSX, the ordering is random as expected.

Looking into the source a little deeper, I see two different hash seeds
and two hash functions: PL_hash_seed (used by PERL_HASH) and
PL_rehash_seed (used by PERL_HASH_INTERNAL). PL_rehash_seed is
initialized in perl.c, but PL_hash_seed isn't.

"hv.c" seems to be calling PERL_HASH which relies on the uninitialized
seed. I added some printfs to the code, and ran perl:

% ./perl5.8.8 -le '@a="a".."z";@a{@a}=();print keys %a'
PL_hash_seed = 0
PL_rehash_seed = 17891139134198775779
IN HASH (NOT INTERNAL): PL_hash_seed = 0
IN HASH (NOT INTERNAL): PL_hash_seed = 0
..........
wraxdjyukhgftienvmslcpqbzo


Why are there two different hash seeds and hash functions? How can I
know for sure which one is going to be called?

Thanks,

Ozgun.
 
G

Guest

Hi -

I just upgraded to Perl 5.8.6, and am evaluating the effect of random
hash ordering on my systems. Surprisingly, I seem *unaffected* by Perl
hash seeds (The following is a RHEL 3 box).

% perl -v
This is perl, v5.8.6 built for Linux-2.4c2.3-i686-64int

% perl -le '@a="a".."z";@a{@a}=();print keys %a'
HASH_SEED = 10308426221955932222
wraxdjyukhgftienvmslcpqbzo

% perl -le '@a="a".."z";@a{@a}=();print keys %a'
HASH_SEED = 2190859344420079461
wraxdjyukhgftienvmslcpqbzo
[...]

FWIW, I'm getting the same behavior on Debian 3.1 i386 and Perl 5.9.4.
 
O

ozgune

Any ideas if there is a better group to post this?

From
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2003-06/msg00584.html

../perl -le '@a="a".."z";@a{@a}=();print keys %a'

gives a different result for each run. That _really_ should teach
people
not to expect any particular ordering of hash keys...

On our systems however:

% perl -le '@a="a".."z";@a{@a}=();print keys %a'
wraxdjyukhgftienvmslcpqbzo

% perl -le '@a="a".."z";@a{@a}=();print keys %a'
wraxdjyukhgftienvmslcpqbzo
 
B

Bo Lindbergh

There's a flag in each hash that controls whether the random or the
non-random hash seed is used. Hashes start out non-random and switch to
random if and when the hsplit function thinks it's a good idea. Search for
"Pathological data" in hv.c for the details.

/Bo Lindbergh
 
O

ozgune

Thanks, that clears it; I wrongfully thought the original post on the
issue (Patch #22371) cited the final implementation behavior.

In summary, Perl gives no guarantees about the ordering of an hash's
key value pairs, and documents this behavior. Perl 5.8.8's
implementation will give a deterministic order until the hash split
can't redistribute the entries evenly. (More specifically, after a
split, if the longest chain in the hash still has a lot of entries,
Perl switches to random seeds.)

Ozgun.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,581
Members
45,056
Latest member
GlycogenSupporthealth

Latest Threads

Top