hash() algorithm

  • Thread starter Benoît Dejean

Benoît Dejean

Hi. Is the hash() algorithm standard? Will hash(some_string) always
return the same hash code on every architecture?

I need a checksum-like function, such as md5, but I was also thinking
about hash(), which is obviously simpler. Can I safely rely on hash()'s
behaviour to generate a reasonably strong and portable
identifier/checksum?

thank you
 

Kristofer Pettijohn

Benoît Dejean said:
hi. Is the hash() algorithm standard ? Does hash(some_string) will always
return the same hash code on every arch ?

i need to use a ~checksum function, like md5, but i was also thinking
about hash() which is obviously simpler. So i can safely rely on hash()
behaviour so i can use it to generate ~strong and portable
identifier/checksum ?

I'm not an expert, but I believe so. I just tried three machines:

OS X 10.4 (Python 2.3): 1308370872

Solaris (Python 1.6): 1308370872

FreeBSD 5.2.1 (Python 2.3):
 

David Bolen

Benoît Dejean said:
i need to use a ~checksum function, like md5, but i was also thinking
about hash() which is obviously simpler. So i can safely rely on hash()
behaviour so i can use it to generate ~strong and portable
identifier/checksum ?

I don't believe it's changed since at least 1.5.2, but I'm also pretty
sure there are no guarantees that it will remain the same going forward.

Also, how strong do you want your checksum to be? That is, how much
of a guarantee do you want that you'll be able to detect a change in
the data by a change in the checksum? MD5 will give you a really
strong guarantee, hash() - whether stable/portable or not - will give
you a reasonably weak guarantee since it's not built to be collision
free.

-- David
 

Tim Peters

[Benoît Dejean]
hi. Is the hash() algorithm standard ? Does hash(some_string) will always
return the same hash code on every arch ?

No, and in fact it's almost certain to deliver a different hash on a
32-bit machine than on a 64-bit machine (Python hash codes are the
same size as the native platform C "long" type). Python doesn't
promise to deliver the same hash codes across releases either
(although it usually does anyway).

i need to use a ~checksum function, like md5, but i was also thinking
about hash() which is obviously simpler. So i can safely rely on hash()
behaviour so i can use it to generate ~strong and portable
identifier/checksum ?

It's not strong. It's easy to find distinct strings with the same
Python hash; it's widely thought to be intractable to do the same wrt
MD5 or SHA hashes.
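
The portability point can be checked directly. The sketch below is not from the thread: it assumes a current CPython 3 interpreter, where string hashes are additionally salted per process unless PYTHONHASHSEED is fixed, and the helper name string_hash_in_fresh_process is made up for illustration. It prints the hash width of the running build and the hash of the same string as seen by two separately seeded interpreters.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_process(text, seed):
    """Return hash(text) as computed by a new interpreter.

    Hypothetical helper: it starts a child Python process with
    PYTHONHASHSEED set to `seed` and reads back the printed hash.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out)


# Width of hash values on this build (differs between 32-bit and 64-bit builds).
print("hash width in bits:", sys.hash_info.width)

# Same string, two differently seeded processes: the values should not match.
print(string_hash_in_fresh_process("some_string", 1))
print(string_hash_in_fresh_process("some_string", 2))
```

If the two printed values differ, hash() is unsuitable as a stored or transmitted identifier on that interpreter, whatever its strength.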
 

Paul Rubin

Benoît Dejean said:
hi. Is the hash() algorithm standard ? Does hash(some_string) will always
return the same hash code on every arch ?

I'd say you should not rely on that.

i need to use a ~checksum function, like md5, but i was also
thinking about hash() which is obviously simpler. So i can safely
rely on hash() behaviour so i can use it to generate ~strong and
portable identifier/checksum ?

I don't know what you mean by "strong". I'm sure you can find collisions
in hash() without much effort. It's much harder to do that for md5.
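
As a small illustration of how cheap collisions can be, here is a sketch for integers rather than strings. It assumes CPython 3, where the hash of a non-negative int is its value reduced modulo sys.hash_info.modulus; the function name colliding_ints is hypothetical. It does not show a string collision, only that hash() makes no attempt to avoid collisions.

```python
import sys


def colliding_ints(n):
    """Return two distinct non-negative ints expected to share a hash.

    Assumption: CPython reduces int hashes modulo sys.hash_info.modulus,
    so n and n + modulus collide for any non-negative n.
    """
    return n, n + sys.hash_info.modulus


a, b = colliding_ints(5)
print(a != b, hash(a) == hash(b))
```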
 

Christopher T King

i need to use a ~checksum function, like md5, but i was also thinking
about hash() which is obviously simpler.

md5 is actually very easy to use in Python:
41499123188802761002464065009245263231L

This is a little more verbose than hash(), but it's just as
straightforward, and can more easily be used with large messages (see the
.update() method of the md5 object returned by new()).
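
For readers on a current Python, a minimal sketch of the same idea using hashlib, which replaced the old md5 module. The function name md5_hex and the sample input are illustrative assumptions, not the code from the original post. It feeds the data in chunks through .update(), which is how large messages are handled without loading them into memory at once.

```python
import hashlib


def md5_hex(chunks):
    """Return the MD5 hex digest of an iterable of bytes chunks."""
    digest = hashlib.md5()
    for chunk in chunks:
        # update() may be called repeatedly; the result is the digest
        # of the concatenation of all chunks.
        digest.update(chunk)
    return digest.hexdigest()


whole = md5_hex([b"hello"])
pieces = md5_hex([b"hel", b"lo"])
print(whole, whole == pieces)

# An integer identifier, if one is preferred over the hex string.
print(int(whole, 16))
```

Unlike hash(), the digest depends only on the input bytes, so it is the same on every platform and Python release. Strings must be encoded to bytes (for example with .encode("utf-8")) before hashing.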
 

Benoît Dejean

On Wed, 21 Jul 2004 16:34:16 -0400, David Bolen wrote:
I don't believe it's changed since at least 1.5.2, but I'm also pretty
sure there are no guarantees that it will remain the same going forward.

ok

Also, how strong do you want your checksum to be? That is, how much
of a guarantee do you want that you'll be able to detect a change in
the data by a change in the checksum? MD5 will give you a really
strong guarantee, hash() - whether stable/portable or not - will give
you a reasonably weak guarantee since it's not built to be collision
free.

I know this. I've been using md5 for a long time; I was just wondering if
.... Thank you all.
 
