Phlip
Pythonistas:
Consider this hashing code:
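import hashlib
file = open(path, 'rb')  # binary mode, so the digest covers the raw bytes
m = hashlib.md5()
m.update(file.read())    # reads the entire file into memory at once
digest = m.hexdigest()
file.close()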
If the file were huge, the file.read() would allocate a big string and
thrash memory. (Yes, in 2011 that's still a problem, because these
files could be movies and whatnot.)
So if I do the stream trick - read one byte, update one byte, in a
loop - then I'm essentially dragging that movie through 8 bits of a
64-bit CPU. So that's the same problem; it would still be slow.
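A middle ground between one giant read() and a byte-at-a-time loop is to feed update() a fixed-size block per iteration. The following is only a minimal sketch: the helper name md5_of_file and the 64 KiB chunk_size are assumptions chosen for illustration, not anything hashlib provides. It relies on the fact that calling update() repeatedly is equivalent to one update() on the concatenated data.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Hypothetical helper: hex MD5 of a file, read in fixed-size chunks."""
    m = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b'' at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            m.update(chunk)
    return m.hexdigest()
```

Memory use stays bounded by chunk_size regardless of file size. The best block size depends on the machine and the disk, so the 64 KiB default is a guess worth measuring rather than a recommendation.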
So now I try this:
sum = os.popen('sha256sum %r' % path).read()
Those of you who like to lie awake at night thinking of new ways to
flame abusers of 'eval()' may have a good vent, there.
Does hashlib have a file-ready mode, to hide the streaming inside some
clever DMA operations?
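For readers on a much newer Python than this post assumes: Python 3.11 added hashlib.file_digest(), which takes a file object opened in binary mode and does the chunked reading itself. It is not DMA, but it does hide the loop. A hedged sketch follows; the wrapper name sha256_of_file and the fallback block size are assumptions for illustration, and the fallback branch is there so the code also runs on older interpreters.

```python
import hashlib

def sha256_of_file(path):
    """Hypothetical helper: hex SHA-256 of a file without loading it whole."""
    with open(path, 'rb') as f:
        if hasattr(hashlib, 'file_digest'):
            # Python 3.11+: hashlib reads the file object in chunks itself.
            return hashlib.file_digest(f, 'sha256').hexdigest()
        # Older Pythons: fall back to an explicit chunked loop.
        h = hashlib.sha256()
        for chunk in iter(lambda: f.read(1 << 16), b''):
            h.update(chunk)
        return h.hexdigest()
```

The result is the same hex digest that the sha256sum command prints as its first field, without spawning a shell or quoting the path.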
Prematurely optimizingly y'rs