encrypting files + filestreams?

per9000

Hi python people,

I am trying to figure out the best way to encrypt files in python.

I've built a small script (see below) that encrypts the Ubuntu 7.04
ISO file in 2 minutes (I like python :) ).

But I have some thoughts about it. By pure luck (?) this file happened
to be N*512 bytes long so I do not have to add crap at the end - but
on files of the size N*512 + M (0 < M < 512) I will add some crap to make
it fit in the algorithm. When I later decrypt I will have the stuff I
do not want. How do people solve this? (By writing the number of
relevant bytes in readable text in the beginning of the file?)

Also I wonder if this can be solved with filestreams (Are there
streams in python? The only python file streams I found in the evil
search engine was stuff in other forums.)


Other comments are of course also welcome,
Per


# crypto_hardcoded.py starts here

from Crypto.Cipher import AES

def encrypt2(cryptor, infile, outfile):
    """only encrypt a few bytes at a time"""

    size = 512
    bytes = infile.read(size)

    seek = 0
    interval = 97
    ctr = 0

    while len(bytes) == size:
        seek += size
        if ctr % interval == 0:
            print '\r%15d bytes completed' % (seek),
        ctr += 1

        outfile.write(cryptor.encrypt(bytes))
        # change to this to decrypt
        # outfile.write(cryptor.decrypt(bytes))
        bytes = infile.read(size)

    if len(bytes) != 0:
        # pad the last partial chunk with '#' up to the chunk size
        bytes += "#" * (size - len(bytes))
        outfile.write(cryptor.encrypt(bytes))
        seek += len(bytes)

    print '\r%15d bytes completed' % (seek)

if __name__ == "__main__":
    crptz = AES.new("my-secret_passwd")
    # binary mode, so the data is not altered on platforms that translate line endings
    ifile = file('/tmp/ubuntu-7.04-desktop-i386.iso','rb')
    ofile = file('/tmp/ubuntu-7.04-desktop-i386.iso.out','wb')

    encrypt2(crptz, ifile, ofile)
    ifile.close()
    ofile.close()

# crypto_hardcoded.py ends here
 
Larry Bates


Padding and keeping information in a header is how I solved the problem.
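
Larry did not post his code, so the following is only a minimal sketch of the "padding plus header" idea, written for Python 3. The 8-byte length header, the function names and the `encrypt_block`/`decrypt_block` callables are all assumptions made for illustration; the defaults pass data through unchanged so the framing can be seen without a cipher.

```python
import io
import struct

BLOCK_SIZE = 16                 # AES block size in bytes
HEADER = struct.Struct(">Q")    # hypothetical header: 8-byte big-endian length


def write_with_header(data, outfile, encrypt_block=lambda block: block):
    """Write the true length, then the data padded to whole blocks."""
    outfile.write(HEADER.pack(len(data)))
    padded = data + b"\0" * (-len(data) % BLOCK_SIZE)
    for i in range(0, len(padded), BLOCK_SIZE):
        outfile.write(encrypt_block(padded[i:i + BLOCK_SIZE]))


def read_with_header(infile, decrypt_block=lambda block: block):
    """Read the length header, decrypt every block, cut off the padding."""
    (length,) = HEADER.unpack(infile.read(HEADER.size))
    chunks = []
    while True:
        block = infile.read(BLOCK_SIZE)
        if not block:
            break
        chunks.append(decrypt_block(block))
    return b"".join(chunks)[:length]


# Round trip through an in-memory file, using the pass-through defaults.
buf = io.BytesIO()
write_with_header(b"hello world", buf)
buf.seek(0)
restored = read_with_header(buf)
```

Note that in this sketch the header is stored unencrypted, as in the "readable text" idea from the question, so the exact plaintext length is visible to anyone who has the file.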

-Larry
 
Marshall T. Vandegrift

per9000 said:
I am trying to figure out the best way to encrypt files in python.

Looking at your code and questions, you probably want to pick up a
cryptography handbook of some sort (I'd recommend /Practical
Cryptography/) and give it a read.

per9000 said:
But I have some thoughts about it. By pure luck (?) this file happened
to be N*512 bytes long so I do not have to add crap at the end - but
on files of the size N*512 + M (0 < M < 512) I will add some crap to make
it fit in the algorithm.

BTW, AES has a block size of 16 bytes, not 512.

per9000 said:
When I later decrypt I will have the stuff I do not want. How do
people solve this? (By writing the number of relevant bytes in
readable text in the beginning of the file?)

There are three basic ways of solving the problem with block ciphers.
Like you suggest, you can somehow store the actual size of the encrypted
data. The second option is to store the number of padding bytes
appended to the end of the data. The third is to use the block cipher
in cipher feedback (CFB) or output feedback (OFB) modes, both of which
transform the block cipher into a stream cipher. The simplest choice
coding-wise is to just use CFB mode, but the "best" choice depends upon
the requirements of your project.

per9000 said:
Also I wonder if this can be solved with filestreams (Are there
streams in python? The only python file streams I found in the evil
search engine was stuff in other forums.)

Try looking for information on "file-like objects." Depending on the
needs of your application, one general solution would be to implement a
file-like object which decorates another file-like object with
encryption on its IO operations.
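
As a rough illustration of a file-like object that decorates another one, here is a minimal Python 3 sketch. The class name `EncryptingWriter`, its parameters and the `toy_xor` stand-in are all hypothetical; `toy_xor` is NOT encryption and only marks the place where a real block cipher's encrypt call would go.

```python
import io


def toy_xor(data):
    """Placeholder transform (NOT secure); stands in for a real cipher."""
    return bytes(b ^ 0x5A for b in data)


class EncryptingWriter(object):
    """File-like wrapper that passes whole blocks through `encrypt`."""

    def __init__(self, raw, encrypt, block_size=16):
        self._raw = raw                # the wrapped file-like object
        self._encrypt = encrypt        # callable: bytes -> bytes
        self._block_size = block_size
        self._pending = b""            # bytes that do not fill a block yet

    def write(self, data):
        self._pending += data
        usable = len(self._pending) - len(self._pending) % self._block_size
        if usable:
            self._raw.write(self._encrypt(self._pending[:usable]))
            self._pending = self._pending[usable:]
        return len(data)

    def close(self):
        # Pad the tail to a full block; each pad byte holds the pad length.
        # The wrapped stream is left open for its owner to close.
        n = self._block_size - len(self._pending)
        self._raw.write(self._encrypt(self._pending + bytes([n]) * n))
        self._pending = b""


raw = io.BytesIO()
writer = EncryptingWriter(raw, toy_xor)
writer.write(b"hello ")
writer.write(b"world")
writer.close()
```

Callers write to the wrapper exactly as they would to an ordinary file, and never see the block boundaries.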

per9000 said:
crptz = AES.new("my-secret_passwd")

I realize this is just toy code, but this is almost certainly not what
you want:

- You'll get a much higher quality key -- and allow arbitrary length
passphrases -- by producing the key from the passphrase instead of
using it directly as the key. For example, taking the SHA-256 hash
of the passphrase will produce a key of the correct size for AES
that draws on the whole passphrase, however long it is.

- Instantiating the cipher without specifying a mode and
initialization vector will cause the resulting cipher object to use
ECB (electronic codebook) mode. This causes each identical block in
the input stream to result in an identical block in the output
stream, which opens the door for all sorts of attacks.
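
The first point can be sketched with nothing but the standard library (Python 3 here). The function names are made up for this example. The second function uses PBKDF2, which is not mentioned in the thread; it is included as an assumption about what a slower, salted derivation might look like, and the iteration count is only an illustrative value.

```python
import hashlib
import os


def key_from_passphrase_sha256(passphrase):
    """Derive a 32-byte key by hashing the passphrase once with SHA-256."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()


def key_from_passphrase_pbkdf2(passphrase, salt, iterations=200000):
    """Derive a 32-byte key with salted, iterated PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


key = key_from_passphrase_sha256("my-secret_passwd")

# A random salt has to be stored next to the ciphertext so the same key
# can be derived again when decrypting.
salt = os.urandom(16)
slow_key = key_from_passphrase_pbkdf2("my-secret_passwd", salt)
```

Either function returns 32 bytes, which is a valid AES-256 key length regardless of how long the passphrase is.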

Hope this helps!

-Marshall
 
David Wahler

per9000 said:
But I have some thoughts about it. By pure luck (?) this file happened
to be N*512 bytes long so I do not have to add crap at the end - but
on files of the size N*512 + M (0 < M < 512) I will add some crap to make
it fit in the algorithm. When I later decrypt I will have the stuff I
do not want. How do people solve this? (By writing the number of
relevant bytes in readable text in the beginning of the file?)

The term you're looking for is "padding". See
http://en.wikipedia.org/wiki/Padding_(cryptography) for a brief
introduction, and especially the two RFCs mentioned about halfway
down.
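
One widely used scheme of this kind is PKCS#7-style padding: every padding byte holds the number of padding bytes, and a full extra block is added when the data already fits exactly, so the padding can always be removed unambiguously. The sketch below (Python 3) is an illustration only; the names `pad` and `unpad` are chosen for this example and are not from any post in the thread.

```python
def pad(data, block_size=16):
    """Append n bytes of value n so the length is a multiple of block_size."""
    n = block_size - len(data) % block_size   # always between 1 and block_size
    return data + bytes([n]) * n


def unpad(padded, block_size=16):
    """Check and strip the padding added by pad()."""
    if not padded or len(padded) % block_size:
        raise ValueError("input is not a whole number of blocks")
    n = padded[-1]
    if not 1 <= n <= block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


padded = pad(b"abc")        # 3 data bytes followed by 13 bytes of value 13
original = unpad(padded)
```

Because the padding length travels inside the last block, no separate length header is needed.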

per9000 said:
Also I wonder if this can be solved with filestreams (Are there
streams in python? The only python file streams I found in the evil
search engine was stuff in other forums.)

I don't know how to answer this, because it's not clear what you mean
by "file streams". Python's file objects act similarly to things
called streams in other languages, such as Java's InputStreams and
Readers, if that's what you're asking.

per9000 said:
Other comments are of course also welcome,
Per

# crypto_hardcoded.py starts here
[snip]

I notice there's some repeated code in your main loop. This generally
means there's room for improvement in your program flow. Here's one
possible way you could structure it: separate out the file reading and
padding logic into a generator function that takes a filename or file
object, and yields blocks one at a time, padded to the correct block
size. Then your main loop can be simplified to something like this:

for plaintext_block in read_as_blocks(in_file, block_size):
    ciphertext_block = cryptor.encrypt(plaintext_block)
    out_file.write(ciphertext_block)

Techniques like these can make it easier to follow what's going on,
especially as your programs get more complicated. Don't be afraid to
experiment!
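
David names `read_as_blocks` but does not show it, so the body below is a guess at one possible shape, in Python 3. It assumes the file is opened in binary mode, that `read()` returns short data only at end of file (true for regular files), and that the last block is padded PKCS#7-style, which limits `block_size` to 255.

```python
import io


def read_as_blocks(in_file, block_size):
    """Yield block_size-byte blocks; the final block is always padded."""
    if not 1 <= block_size <= 255:
        raise ValueError("block_size must fit in one padding byte")
    while True:
        block = in_file.read(block_size)
        if len(block) < block_size:
            # Last (possibly empty) piece: fill it up with bytes of value n.
            n = block_size - len(block)
            yield block + bytes([n]) * n
            return
        yield block


# 20 bytes of input become one full block and one padded block.
blocks = list(read_as_blocks(io.BytesIO(b"a" * 20), 16))
```

An input whose length is an exact multiple of the block size gets one extra block made entirely of padding, so a reader can always strip the padding the same way.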

-- David
 
James Stroud

per9000 said:
Also I wonder if this can be solved with filestreams (Are there
streams in python? The only python file streams I found in the evil
search engine was stuff in other forums.)

Check the source to: http://passerby.sf.net

In it you will find the jenncrypt module that makes a file-like wrapper
around streams for encrypting and handles padding, etc. It is designed
for a block cipher. This is not a trivial task, actually, but all the
code is in there. It is not intensely well documented. It is somewhat
well organized but this was my first full size attempt at a python
application. Please read the notes to the "SecureRandom" module inside
if you decide to use that module as the "name is misleading" under
certain circumstances. It is used for the padding, although the
randomness of the padding in a block cipher is in principle not a
practical security issue in most cases. Other than the pad generation,
the encryption algorithm is drop-in and I use the pycrypto
implementation of AES.

Read at least Schneier if you want to get started with such things as
there are many caveats to using cryptographic systems.

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
Lawrence D'Oliveiro

per9000 said:
crptz = AES.new("my-secret_passwd")

You're using ECB mode. Never use ECB mode. At a minimum, use CBC mode.

Also, another common thing is, don't use the actual password to encrypt the
entire file. Instead, randomly generate a "session key" to use for the
actual encryption, and only use the password to encrypt that.

per9000 said:
def encrypt2(cryptor, infile, outfile):
[snip]


Finally, it is recommended that you also compute and encrypt a cryptographic
hash of the plaintext. That way, you can check that it still matches after
decryption, to guard against tampering.
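
A minimal Python 3 sketch of the hashing half of this advice: the digest is computed while the file is streamed, so the data is only read once. The function name `copy_with_digest` and the `transform` parameter are hypothetical; `transform` marks where the cipher call would go and defaults to passing data through unchanged. SHA-256 is an assumed choice of hash.

```python
import hashlib
import io


def copy_with_digest(infile, outfile, transform=lambda chunk: chunk,
                     chunk_size=64 * 1024):
    """Copy infile to outfile through `transform`; return SHA-256 of the input."""
    digest = hashlib.sha256()
    while True:
        chunk = infile.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)              # hash the plaintext ...
        outfile.write(transform(chunk))   # ... and write the transformed data
    return digest.digest()


src = io.BytesIO(b"some plaintext")
dst = io.BytesIO()
plaintext_hash = copy_with_digest(src, dst)
```

After decryption the same function can be run over the recovered plaintext and the two digests compared, for example with `hmac.compare_digest`.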
 
