MD5 hash of object

lczancanella

Hi,

In hashlib, the hash methods only accept strings as parameters. I want to
know how I can digest an object to get an MD5 hash of it.

Thanks,

Luiz
 
Chris Rebert

Hashes are only defined to operate on bytestrings. Since Python is a
high-level language and doesn't permit you to view the internal binary
representation of objects, you're going to have to properly convert
the object to a bytestring first, a process called "serialization".
The `pickle` and `json` serialization modules are included in the
standard library. These modules can convert objects to bytestrings and
back again.
Once you've done the bytestring conversion, just run the hash method
on the bytestring.

Be careful when serializing dictionaries and sets though, because they
are arbitrarily ordered, so two dictionaries containing the same items
and which compare equal may have a different internal ordering, thus
different serializations, and thus different hashes.
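
A minimal sketch of the idea above, assuming the object is made only of JSON-friendly types (dicts, lists, strings, numbers). The helper name `md5_of_jsonable` is hypothetical, not part of hashlib or json. Passing `sort_keys=True` to `json.dumps` is one way to sidestep the dictionary ordering problem:

```python
import hashlib
import json


def md5_of_jsonable(obj):
    """Return the MD5 hex digest of a JSON-serializable object (hypothetical helper)."""
    # sort_keys=True emits dict keys in sorted order, so two equal dicts
    # serialize identically; fixed separators keep the whitespace stable too.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    # hashlib works on bytes, so encode the JSON text before hashing.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


a = {"name": "sabres", "wins": 3}
b = {"wins": 3, "name": "sabres"}  # same items, different insertion order
print(md5_of_jsonable(a) == md5_of_jsonable(b))  # True
```

Sets are not JSON-serializable, so with this approach they would first have to be converted to something ordered, such as a sorted list.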

Cheers,
Chris
--http://blog.rebertia.com
 
Jeff McNeil

I'd think that using the hash of the pickled representation of an
object might be problematic, no? The pickle protocol handles object
graphs in a way that allows it to preserve references back to
identical objects. Consider the following (contrived) example:

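import pickle
from hashlib import md5

class Value(object):
    def __init__(self, v):
        self._v = v

class P1(object):
    def __init__(self, name):
        self.name = Value(name)
        # Same Value object referenced twice.
        self.other_name = self.name

class P2(object):
    def __init__(self, name):
        self.name = Value(name)
        # A second, separate Value object with the same contents.
        self.other_name = Value(name)

h1 = md5(pickle.dumps(P1('sabres'))).hexdigest()
h2 = md5(pickle.dumps(P2('sabres'))).hexdigest()

# The shared reference in P1 is pickled as a back-reference, so the
# two byte strings (and their hashes) differ.
print(h1 == h2)
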
Just something to be aware of. Depending on what you're trying to
accomplish, it may make sense to simply define a method which
generates a byte string representation of your object's state and just
return the hash of that value.
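
A rough sketch of what such a method could look like. The `Player` class and the method names `state_bytes` and `md5_hexdigest` are made up for illustration, and it assumes the state is a couple of simple fields whose `repr()` is stable:

```python
import hashlib


class Player(object):
    """Hypothetical class used only to illustrate the approach."""

    def __init__(self, name, number):
        self.name = name
        self.number = number

    def state_bytes(self):
        # Build the byte string from the fields in a fixed order, so the
        # result depends only on the values, not on object identity.
        return repr((self.name, self.number)).encode("utf-8")

    def md5_hexdigest(self):
        # Hash the explicit state representation rather than a pickle.
        return hashlib.md5(self.state_bytes()).hexdigest()


p1 = Player("sabres", 7)
p2 = Player("sabres", 7)
print(p1.md5_hexdigest() == p2.md5_hexdigest())
```

Because you decide exactly which fields go into the byte string, two objects with equal state hash the same regardless of how references are shared inside them.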

Thanks,

-Jeff
mcjeff.blogspot.com
 
