encrypting files + filestreams?

Discussion in 'Python' started by per9000, Aug 15, 2007.

  1. per9000

    per9000 Guest

    Hi python people,

    I am trying to figure out the best way to encrypt files in python.

    I've built a small script (see below) that encrypts the ubuntu 7.04
    iso file in 2 minutes (I like python :) ).

    But I have some thoughts about it. By pure luck (?) this file happened
    to be N*512 bytes long so I do not have to add crap at the end - but
    on files of the size N*512 + M (0 < M < 512) I will add some crap to make
    it fit in the algorithm. When I later decrypt I will have the stuff I
    do not want. How do people solve this? (By writing the number of
    relevant bytes in readable text in the beginning of the file?)

    Also I wonder if this can be solved with filestreams (Are there
    streams in python? The only python file streams I found in the evil
    search engine were stuff in other forums.)


    Other comments are of course also welcome,
    Per


    # crypto_hardcoded.py starts here

    from Crypto.Cipher import AES

    def encrypt2(cryptor, infile, outfile):
        """only encrypt a few bytes at a time"""

        size = 512
        bytes = infile.read(size)

        seek = 0
        interval = 97
        ctr = 0

        # full chunks: encrypt and report progress every `interval` chunks
        while len(bytes) == size:
            seek += size
            if ctr % interval == 0:
                print '\r%15d bytes completed' % (seek),
            ctr += 1

            outfile.write(cryptor.encrypt(bytes))
            # change to this to decrypt
            # outfile.write(cryptor.decrypt(bytes))
            bytes = infile.read(size)

        # last partial chunk: fill up with '#' so it fits the algorithm
        if len(bytes) != 0:
            bytes += "#" * (size - len(bytes))
            outfile.write(cryptor.encrypt(bytes))
            seek += len(bytes)

        print '\r%15d bytes completed' % (seek)

    if __name__ == "__main__":
        crptz = AES.new("my-secret_passwd")
        # binary mode, since an iso file is not text
        ifile = file('/tmp/ubuntu-7.04-desktop-i386.iso','rb')
        ofile = file('/tmp/ubuntu-7.04-desktop-i386.iso.out','wb')

        encrypt2(crptz, ifile, ofile)
        ifile.close()
        ofile.close()

    # crypto_hardcoded.py ends here
     
    per9000, Aug 15, 2007
    #1

  2. per9000

    Larry Bates Guest

    per9000 wrote:
    > [snip]


    Padding and keeping information in a header is how I solved the problem.
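
    A minimal sketch of the header approach, in Python 3 with only the
    standard library. The 8-byte header layout, the zero padding and the
    function names are assumptions for illustration, not the code actually
    used; the encryption step itself is left out.

```python
import io
import struct

BLOCK_SIZE = 16                # AES block size in bytes
HEADER = struct.Struct(">Q")   # assumed header: 8-byte big-endian length


def write_with_header(plaintext, out):
    """Write the real length first, then the data zero-padded to a full block."""
    out.write(HEADER.pack(len(plaintext)))
    padding = -len(plaintext) % BLOCK_SIZE
    # A real script would encrypt the padded data before writing it.
    out.write(plaintext + b"\0" * padding)


def read_with_header(inp):
    """Read the length header and return only the relevant bytes."""
    (length,) = HEADER.unpack(inp.read(HEADER.size))
    # A real script would decrypt here before cutting off the padding.
    return inp.read()[:length]


buf = io.BytesIO()
write_with_header(b"hello world", buf)
stored = buf.getvalue()
buf.seek(0)
recovered = read_with_header(buf)
```

    Whether the header itself is stored in clear text or encrypted along
    with the data is a separate design decision.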

    -Larry
     
    Larry Bates, Aug 15, 2007
    #2

  3. per9000 <> writes:

    > I am trying to figure out the best way to encrypt files in python.


    Looking at your code and questions, you probably want to pick up a
    cryptography handbook of some sort (I'd recommend /Practical
    Cryptography/) and give it a read.

    > But I have some thoughts about it. By pure luck (?) this file happened
    > to be N*512 bytes long so I do not have to add crap at the end - but
    > on files of the size N*512 + M (0 < M < 512) I will add some crap to make
    > it fit in the algorithm.


    BTW, AES has a block size of 16 bytes, not 512.

    > When I later decrypt I will have the stuff I do not want. How do
    > people solve this? (By writing the number of relevant bytes in
    > readable text in the beginning of the file?)


    There are three basic ways of solving the problem with block ciphers.
    Like you suggest, you can somehow store the actual size of the encrypted
    data. The second option is to store the number of padding bytes
    appended to the end of the data. The third is to use the block cipher
    in cipher feedback (CFB) or output feedback (OFB) modes, both of which
    transform the block cipher into a stream cipher. The simplest choice
    coding-wise is to just use CFB mode, but the "best" choice depends upon
    the requirements of your project.
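
    The second option (storing the number of padding bytes) could look
    roughly like this. It is a sketch of the PKCS#7-style scheme in
    Python 3, where every padding byte holds the padding length; the
    function names pad and unpad are made up for this example.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data, block_size=BLOCK_SIZE):
    """Append n bytes of value n so the length becomes a multiple of block_size."""
    n = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes([n]) * n


def unpad(padded, block_size=BLOCK_SIZE):
    """Remove the padding added by pad(), checking it on the way."""
    if not padded or len(padded) % block_size:
        raise ValueError("length is not a positive multiple of the block size")
    n = padded[-1]
    if not 1 <= n <= block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


padded = pad(b"abc")
original = unpad(padded)
```

    Note that data which already fills a whole number of blocks still gets
    one full block of padding, so that unpad() never has to guess.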

    > Also I wonder if this can be solved with filestreams (Are there
    > streams in python? The only python file streams I found in the evil
    > search engine was stuff in other forums.)


    Try looking for information on "file-like objects." Depending on the
    needs of your application, one general solution would be to implement a
    file-like object which decorates another file-like object with
    encryption on its IO operations.
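
    A rough sketch of such a decorating object in Python 3. The class
    name, the zero padding in close() and the toy transform are all
    assumptions for illustration. The XOR function only stands in for
    something like cryptor.encrypt and is not encryption.

```python
import io


def toy_transform(block):
    """Placeholder for cryptor.encrypt: XOR every byte with 0x5A (NOT encryption)."""
    return bytes(b ^ 0x5A for b in block)


class TransformingWriter:
    """File-like object that transforms whole blocks before passing them on."""

    def __init__(self, raw, transform, block_size=16):
        self._raw = raw              # the decorated file-like object
        self._transform = transform  # only ever called with whole blocks
        self._block_size = block_size
        self._pending = b""          # bytes that do not fill a block yet

    def write(self, data):
        self._pending += data
        usable = len(self._pending) - len(self._pending) % self._block_size
        if usable:
            self._raw.write(self._transform(self._pending[:usable]))
            self._pending = self._pending[usable:]
        return len(data)

    def close(self):
        # Zero-pad the final partial block; the underlying object stays open.
        if self._pending:
            missing = self._block_size - len(self._pending)
            self._raw.write(self._transform(self._pending + b"\0" * missing))
            self._pending = b""


sink = io.BytesIO()
writer = TransformingWriter(sink, toy_transform)
writer.write(b"0123456789")   # too short for a block, stays buffered
writer.write(b"abcdefghij")   # 20 bytes buffered: one block goes out
writer.close()                # remaining 4 bytes are padded and written
```

    The caller can write chunks of any size; the wrapper takes care of
    handing only complete blocks to the transform.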

    > crptz = AES.new("my-secret_passwd")


    I realize this is just toy code, but this is almost certainly not what
    you want:

    - You'll get a much higher quality key -- and allow arbitrary length
    passphrases -- by producing the key from the passphrase instead of
    using it directly as the key. For example, taking the SHA-256 hash
    of the passphrase will produce a much higher entropy key of the
    correct size for AES.

    - Instantiating the cipher without specifying a mode and
    initialization vector will cause the resulting cipher object to use
    ECB (electronic codebook) mode. This causes each identical block in
    the input stream to result in an identical block in the output
    stream, which opens the door for all sorts of attacks.
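
    To illustrate the first point, here is a sketch in Python 3 using
    only hashlib. The function names are made up. The second function
    swaps in PBKDF2, a deliberately slow, salted derivation that is often
    preferred for passphrases; it is an alternative added here, not
    something the advice above requires, and the iteration count is an
    arbitrary assumption.

```python
import hashlib
import os


def key_from_passphrase(passphrase):
    """SHA-256 of the passphrase: always 32 bytes, a valid AES-256 key size."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()


def key_from_passphrase_pbkdf2(passphrase, salt, iterations=200_000):
    """Alternative: PBKDF2-HMAC-SHA256 with a salt (32-byte result by default)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations
    )


key = key_from_passphrase("my-secret_passwd")

salt = os.urandom(16)  # must be stored with the file to derive the key again
slow_key = key_from_passphrase_pbkdf2("my-secret_passwd", salt)
```

    Either key would then be passed to the cipher together with an
    explicit mode and initialization vector instead of the raw passphrase.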

    Hope this helps!

    -Marshall
     
    Marshall T. Vandegrift, Aug 15, 2007
    #3
  4. per9000

    David Wahler Guest

    On 8/15/07, per9000 <> wrote:
    > Hi python people,
    >
    > I am trying to figure out the best way to encrypt files in python.
    >
    > I've built a small script (see below) that encrypts the ubuntu 7.04
    > iso file in 2 minutes (I like python :) ).
    >
    > But I have some thoughts about it. By pure luck (?) this file happened
    > to be N*512 bytes long so I do not have to add crap at the end - but
    > on files of the size N*512 + M (0 < M < 512) I will add some crap to make
    > it fit in the algorithm. When I later decrypt I will have the stuff I
    > do not want. How do people solve this? (By writing the number of
    > relevant bytes in readable text in the beginning of the file?)


    The term you're looking for is "padding". See
    http://en.wikipedia.org/wiki/Padding_(cryptography) for a brief
    introduction, and especially the two RFCs mentioned about halfway
    down.

    > Also I wonder if this can be solved with filestreams (Are there
    > streams in python? The only python file streams I found in the evil
    > search engine was stuff in other forums.)


    I don't know how to answer this, because it's not clear what you mean
    by "file streams". Python's file objects act similarly to things
    called streams in other languages, such as Java's InputStreams and
    Readers, if that's what you're asking.

    > Other comments are of course also welcome,
    > Per
    >
    >
    > # crypto_hardcoded.py starts here

    [snip]

    I notice there's some repeated code in your main loop. This generally
    means there's room for improvement in your program flow. Here's one
    possible way you could structure it: separate out the file reading and
    padding logic into a generator function that takes a filename or file
    object, and yields blocks one at a time, padded to the correct block
    size. Then your main loop can be simplified to something like this:

    for plaintext_block in read_as_blocks(in_file, block_size):
        ciphertext_block = cryptor.encrypt(plaintext_block)
        out_file.write(ciphertext_block)

    Techniques like these can make it easier to follow what's going on,
    especially as your programs get more complicated. Don't be afraid to
    experiment!
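
    One possible shape for the read_as_blocks generator used in the loop
    above, sketched in Python 3. The pad_byte parameter and the choice of
    zero padding are assumptions; with this simple padding the real
    length still has to be recorded somewhere else.

```python
import io


def read_as_blocks(in_file, block_size, pad_byte=b"\0"):
    """Yield block_size-byte blocks from in_file, padding the last one."""
    while True:
        # For regular files and BytesIO, a short read means end of file.
        block = in_file.read(block_size)
        if not block:
            return
        if len(block) < block_size:
            block += pad_byte * (block_size - len(block))
        yield block


# 40 bytes of input become three 16-byte blocks, the last one padded.
blocks = list(read_as_blocks(io.BytesIO(b"a" * 40), 16))
```

    Because all reading and padding lives in the generator, the loop that
    encrypts no longer needs a special case for the final block.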

    -- David
     
    David Wahler, Aug 15, 2007
    #4
  5. per9000

    James Stroud Guest

    per9000 wrote:
    > Also I wonder if this can be solved with filestreams (Are there
    > streams in python? The only python file streams I found in the evil
    > search engine was stuff in other forums.)


    Check the source to: http://passerby.sf.net

    In it you will find the jenncrypt module that makes a file-like wrapper
    around streams for encrypting and handles padding, etc. It is designed
    for a block cipher. This is not a trivial task, actually, but all the
    code is in there. It is not intensely well documented. It is somewhat
    well organized but this was my first full size attempt at a python
    application. Please read the notes to the "SecureRandom" module inside
    if you decide to use that module, as the name is misleading under
    certain circumstances. It is used for the padding, although the
    randomness of the padding in a block cipher is in principle not a
    practical security issue in most cases. Other than the pad generation,
    the encryption algorithm is drop-in and I use the pycrypto
    implementation of AES.

    Read at least Schneier if you want to get started with such things as
    there are many caveats to using cryptographic systems.

    James

    --
    James Stroud
    UCLA-DOE Institute for Genomics and Proteomics
    Box 951570
    Los Angeles, CA 90095

    http://www.jamesstroud.com/
     
    James Stroud, Aug 15, 2007
    #5
  6. In message <>, per9000
    wrote:

    > crptz = AES.new("my-secret_passwd")


    You're using ECB mode. Never use ECB mode. At a minimum, use CBC mode.

    Another common practice: don't use the actual password to encrypt the
    entire file. Instead, randomly generate a "session key" to use for the
    actual encryption, and only use the password to encrypt that.

    > def encrypt2(cryptor, infile, outfile):
    >     """only encrypt a few bytes at a time"""
    >
    >     size = 512
    >     bytes = infile.read(size)
    >
    >     seek = 0
    >     interval = 97
    >     ctr = 0
    >
    >     while len(bytes) == size:
    >         seek += size
    >         if ctr % interval == 0:
    >             print '\r%15d bytes completed' % (seek),
    >         ctr += 1
    >
    >         outfile.write(cryptor.encrypt(bytes))
    >         # change to this to decrypt
    >         # outfile.write(cryptor.decrypt(bytes))
    >         bytes = infile.read(size)
    >
    >     if len(bytes) != 0:
    >         bytes += "#" * (size - len(bytes))
    >         outfile.write(cryptor.encrypt(bytes))
    >         seek += len(bytes)


    Finally, it is recommended that you also compute and encrypt a cryptographic
    hash of the plaintext. That way, you can check that it still matches after
    decryption, to guard against tampering.
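
    A sketch of that check in Python 3 with the standard library. The
    function names and the choice to append the SHA-256 digest to the end
    of the plaintext are assumptions; the encryption and decryption steps
    in between are left out. A keyed MAC such as HMAC is a common
    alternative to a plain hash for this purpose.

```python
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def add_digest(plaintext):
    """Append the SHA-256 digest; the result is what would get encrypted."""
    return plaintext + hashlib.sha256(plaintext).digest()


def check_digest(data):
    """Split off the digest after decryption and verify it."""
    if len(data) < DIGEST_SIZE:
        raise ValueError("data too short to contain a digest")
    plaintext, digest = data[:-DIGEST_SIZE], data[-DIGEST_SIZE:]
    expected = hashlib.sha256(plaintext).digest()
    if not hmac.compare_digest(expected, digest):
        raise ValueError("digest mismatch: data was corrupted or tampered with")
    return plaintext


sealed = add_digest(b"some file contents")
verified = check_digest(sealed)
```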
     
    Lawrence D'Oliveiro, Aug 18, 2007
    #6