read text file byte by byte

Discussion in 'Python' started by daved170, Dec 12, 2009.

  1. daved170

    daved170 Guest

    Hello everybody,
    I need to read a text file byte after byte.
    Each byte is sent to a function that scrambles it,
    and I need to write the result to a binary file.

    I've got some questions -
    1) How do I read the file byte by byte?
    2) Should I use streams? If so, and I get my entire scrambled text in
    a stream, can I just write it to the binary file?

    Thanks
    Dave
     
    daved170, Dec 12, 2009
    #1

  2. daved170

    census Guest

    daved170 wrote:

    > Hello everybody,
    > I need to read a text file byte after byte.
    > Each byte is sent to a function that scrambles it
    > and I need to write the result to a binary file.
    >
    > I've got some questions -
    > 1) How do I read the file byte by byte
    > 2) Should I use streams? If so and I get my entire scrambled text in
    > stream can I just write it to the binary file?
    >
    > Thanks
    > Dave


    f = open ("binaryfile", "r")
    bytearray = map (ord, f.read () )

    Stores the content of binaryfile in the list bytearray.
     
    census, Dec 12, 2009
    #2

  3. daved170

    census Guest

    daved170 wrote:

    > Hello everybody,
    > I need to read a text file byte after byte.
    > Each byte is sent to a function that scrambles it
    > and I need to write the result to a binary file.
    >
    > I've got some questions -
    > 1) How do I read the file byte by byte
    > 2) Should I use streams? If so and I get my entire scrambled text in
    > stream can I just write it to the binary file?
    >
    > Thanks
    > Dave


    OK, here is complete code to read a file byte by byte, scramble each
    byte (with a really complex algorithm in my example) and write the
    output to another file.

    def scramble (a): return (a + 13) % 256

    infile = open ("binaryfile1", "r")
    outfile = open ("binaryfile2", "w")

    bytearray = map (ord, infile.read () )
    scrambled = map (scramble, bytearray)
    map (lambda x : outfile.write (chr (x) ), scrambled)

    infile.close ()
    outfile.flush ()
    outfile.close ()
     
    census, Dec 12, 2009
    #3
  4. On Sat, 12 Dec 2009 10:35:55 +0100, census wrote:

    >> I've got some questions -
    >> 1) How do I read the file byte by byte 2) Should I use streams? If so
    >> and I get my entire scrambled text in stream can I just write it to the
    >> binary file?
    >>
    >> Thanks
    >> Dave

    >
    > f = open ("binaryfile", "r")
    > bytearray = map (ord, f.read () )
    >
    > Stores the content of binaryfile in the list bytearray.


    If it's a binary file, you should open it in binary mode:

    f = open ("binaryfile", "rb")



    --
    Steven
     
    Steven D'Aprano, Dec 12, 2009
    #4
  5. On Fri, 11 Dec 2009 23:16:42 -0800, daved170 wrote:

    > Hello everybody,
    > I need to read a text file byte after byte. Each byte is sent to a
    > function that scrambles it and I need to write the result to a binary file.
    >
    > I've got some questions -
    > 1) How do I read the file byte by byte


    f = open(filename, 'rb')
    f.read(1)

    will read a single byte.
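
    For readers on Python 3, here is a minimal sketch of looping over a file
    with read(1) until end of file. The helper name and the throwaway demo
    file are assumptions for illustration; in binary mode each read(1)
    returns a bytes object of length 1, or b"" at end of file.

```python
import os
import tempfile
from functools import partial


def read_bytes_one_at_a_time(path):
    """Yield each byte of the file at `path` as an int in range(256)."""
    with open(path, "rb") as f:
        # iter(callable, sentinel) calls f.read(1) repeatedly and stops
        # when it returns b"", which is what read() gives at end of file.
        for chunk in iter(partial(f.read, 1), b""):
            yield chunk[0]


# Demo on a throwaway file with hypothetical contents.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"abc")

byte_values = list(read_bytes_one_at_a_time(demo_path))
os.remove(demo_path)
print(byte_values)  # [97, 98, 99]
```

    Each yielded value is an integer, so it can be handed straight to a
    scrambling function that does arithmetic on byte values.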




    > 2) Should I use streams?

    What do you mean by "streams"?



    --
    Steven


     
    Steven D'Aprano, Dec 12, 2009
    #5
  6. daved170

    census Guest

    Steven D'Aprano wrote:

    > On Sat, 12 Dec 2009 10:35:55 +0100, census wrote:
    >
    >>> I've got some questions -
    >>> 1) How do I read the file byte by byte 2) Should I use streams? If so
    >>> and I get my entire scrambled text in stream can I just write it to the
    >>> binary file?
    >>>
    >>> Thanks
    >>> Dave

    >>
    >> f = open ("binaryfile", "r")
    >> bytearray = map (ord, f.read () )
    >>
    >> Stores the content of binaryfile in the list bytearray.

    >
    > If it's a binary file, you should open it in binary mode:
    >
    > f = open ("binaryfile", "rb")
    >
    >
    >


    Add the "b" flag to both the input and output files if you prefer:

    def scramble (a): return (a + 13) % 256

    infile = open ("binin", "rb")
    outfile = open ("binout", "wb")

    bytearray = map (ord, infile.read () )
    scrambled = map (scramble, bytearray)
    map (lambda x : outfile.write (chr (x) ), scrambled)

    infile.close ()
    outfile.flush ()
    outfile.close ()
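
    The code above is written for Python 2, where map() returns a list and
    runs immediately. A rough Python 3 equivalent might look like the sketch
    below. The scramble() function is the one from the post; scramble_file()
    and the temporary file names are assumptions added for the demo.

```python
import os
import tempfile


def scramble(a):
    """Shift one byte value by 13, wrapping around at 256 (as in the post)."""
    return (a + 13) % 256


def scramble_file(in_path, out_path):
    """Read in_path as raw bytes, scramble each byte, write to out_path."""
    # The with-statement closes (and therefore flushes) both files.
    with open(in_path, "rb") as infile, open(out_path, "wb") as outfile:
        data = infile.read()  # iterating over bytes yields ints in Python 3
        outfile.write(bytes(scramble(b) for b in data))


# Demo in a temporary directory; the file names are placeholders.
with tempfile.TemporaryDirectory() as workdir:
    in_path = os.path.join(workdir, "binin")
    out_path = os.path.join(workdir, "binout")
    with open(in_path, "wb") as f:
        f.write(bytes([0, 65, 250]))

    scramble_file(in_path, out_path)
    with open(out_path, "rb") as f:
        result = f.read()

print(list(result))  # [13, 78, 7]
```

    In Python 3 the ord()/chr() round trip is no longer needed, because
    iterating over a bytes object already yields integers.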
     
    census, Dec 12, 2009
    #6
  7. On Sat, 12 Dec 2009 10:46:01 +0100, census <>
    declaimed the following in gmane.comp.python.general:

    >
    > def scramble (a): return (a + 13) % 256
    >

    I'll see your modulo rot 13 and raise with an exclusive or...

    -=-=-=-=-

    import sys

    def scramble(block, key="don't look"):
        # repeat the key so it is at least as long as the block
        copies = int(len(block) / len(key)) + 1
        keystring = key * copies
        # XOR each byte of the block with the matching byte of the key
        return "".join([ chr( ord(block[i])
                              ^ ord(keystring[i]))
                         for i in range(len(block))])


    def process(fin, fout, key=None):
        din = open(fin, "rb")
        dout = open(fout, "wb")
        while True:
            block = din.read(1024)
            if not block: break
            if key is None:
                block = scramble(block)
            else:
                block = scramble(block, key)
            dout.write(block)
        dout.close()
        din.close()

    if __name__ == "__main__":
        fin = sys.argv[1]
        fout = sys.argv[2]
        if len(sys.argv) > 3:
            key = sys.argv[3]
        else:
            key = None
        process(fin, fout, key)
    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Dec 13, 2009
    #7
  8. daved170

    Tim Chase Guest

    Steven D'Aprano wrote:
    >> 2) Should I use streams?

    >
    > What do you mean by "streams"?


    they're what come out of proton packs...just don't cross them.
    It would be bad.

    -tkc

    (I suspect the OP is a Java/C++ programmer where "streams" are
    somewhat akin to generators, but less powerful; so the answer is
    "you can if you want, but it may not get you what you think you
    want")
     
    Tim Chase, Dec 13, 2009
    #8
  9. On 2009-12-13, Tim Chase <> wrote:
    > Steven D'Aprano wrote:
    >>> 2) Should I use streams?

    >>
    >> What do you mean by "streams"?

    >
    > they're what come out of proton packs...just don't cross them.
    > It would be bad.


    No kidding. Try to imagine all life as you know it stopping
    instantaneously, and every molecule in your body exploding at
    the speed of light.

    I hate when that happens.

    --
    Grant
     
    Grant Edwards, Dec 13, 2009
    #9
  10. daved170

    daved170 Guest

    On Dec 13, 2:34 am, Dennis Lee Bieber <> wrote:
    > On Sat, 12 Dec 2009 10:46:01 +0100, census <>
    > declaimed the following in gmane.comp.python.general:
    >
    >
    >
    > > def scramble (a): return (a + 13) % 256

    >
    >         I'll see your modulo rot 13 and raise with an exclusive or....
    >
    > -=-=-=-=-
    >
    > import sys
    >
    > def scramble(block, key="don't look"):
    >     copies = int(len(block) / len(key)) + 1
    >     keystring = key * copies
    >     return "".join([ chr( ord(block[i])
    >                           ^ ord(keystring[i]))
    >                      for i in range(len(block))])
    >
    > def process(fin, fout, key=None):
    >     din = open(fin, "rb")
    >     dout = open(fout, "wb")
    >     while True:
    >         block = din.read(1024)
    >         if not block: break
    >         if key is None:
    >             block = scramble(block)
    >         else:
    >             block = scramble(block, key)
    >         dout.write(block)
    >     dout.close()
    >     din.close()
    >
    > if __name__ == "__main__":
    >     fin = sys.argv[1]
    >     fout = sys.argv[2]
    >     if len(sys.argv) > 3:
    >         key = sys.argv[3]
    >     else:
    >         key = None
    >     process(fin, fout, key)
    > --
    >         Wulfraed         Dennis Lee Bieber               KD6MOG
    >              HTTP://wlfraed.home.netcom.com/



    Thank you all.
    Dennis, I really liked your solution, but I have two
    questions about it:
    1) My original file is a text file, not a binary one.
    2) I need to read 1 byte at a time. I didn't see that in your example
    code.
    Thanks again, all of you
    Dave
     
    daved170, Dec 13, 2009
    #10
  11. On Sat, 12 Dec 2009 22:15:50 -0800, daved170 wrote:

    > Thank you all.
    > Dennis I really liked your solution for the issue but I have two questions
    > about it:
    > 1) My origin file is Text file and not binary


    That's a statement, not a question.


    > 2) I need to read each time 1 byte.


    f = open(filename, 'r') # open in text mode
    f.read(1) # read one byte


    --
    Steven
     
    Steven D'Aprano, Dec 13, 2009
    #11
  12. daved170

    Lie Ryan Guest

    On 12/13/2009 5:15 PM, daved170 wrote:
    > Thank you all.
    > Dennis I really liked your solution for the issue but I have two
    > questions about it:
    > 1) My origin file is Text file and not binary
    > 2) I need to read each time 1 byte. I didn't see that on your example
    > code.


    That's where you're confusing things. The counting unit in text is
    characters, not bytes. Text is binary as well; it's just binary encoded
    in a specific way (like ASCII or UTF-8), and the computer decodes that
    binary stream into characters.

    What do you actually need? Reading the text character by character, OR
    treating the encoded text as binary and reading it byte by byte?

    Better still, why don't you explain the problem you're trying to solve so
    we can see which you actually need?
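
    A small Python 3 illustration of the character/byte distinction. The
    sample word is an arbitrary choice; UTF-8 and Latin-1 are just two
    example encodings.

```python
text = "na\u00efve"  # "naive" with a diaeresis on the i: 5 characters

utf8_bytes = text.encode("utf-8")      # the accented letter takes 2 bytes
latin1_bytes = text.encode("latin-1")  # here it takes 1 byte

print(len(text))          # 5 characters
print(len(utf8_bytes))    # 6 bytes
print(len(latin1_bytes))  # 5 bytes
print(list(utf8_bytes))   # [110, 97, 195, 175, 118, 101]
```

    So "one byte at a time" and "one character at a time" only mean the
    same thing for encodings where every character is a single byte.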
     
    Lie Ryan, Dec 13, 2009
    #12
  13. daved170

    Dave Angel Guest

    daved170 wrote:
    > On Dec 13, 2:34 am, Dennis Lee Bieber <> wrote:
    >
    >> On Sat, 12 Dec 2009 10:46:01 +0100, census <>
    >> declaimed the following in gmane.comp.python.general:
    >>
    >>
    >>
    >>
    >>> def scramble (a): return (a + 13) % 256
    >>>

    >> I'll see your modulo rot 13 and raise with an exclusive or...
    >>
    >> -=-=-=-
    >>
    >> import sys
    >>
    >> def scramble(block, key="don't look"):
    >>     copies = int(len(block) / len(key)) + 1
    >>     keystring = key * copies
    >>     return "".join([ chr( ord(block[i])
    >>                           ^ ord(keystring[i]))
    >>                      for i in range(len(block))])
    >>
    >> def process(fin, fout, key=None):
    >>     din = open(fin, "rb")
    >>     dout = open(fout, "wb")
    >>     while True:
    >>         block = din.read(1024)
    >>         if not block: break
    >>         if key is None:
    >>             block = scramble(block)
    >>         else:
    >>             block = scramble(block, key)
    >>         dout.write(block)
    >>     dout.close()
    >>     din.close()
    >>
    >> if __name__ == "__main__":
    >>     fin = sys.argv[1]
    >>     fout = sys.argv[2]
    >>     if len(sys.argv) > 3:
    >>         key = sys.argv[3]
    >>     else:
    >>         key = None
    >>     process(fin, fout, key)
    >> --
    >> Wulfraed Dennis Lee Bieber KD6MOG
    >> HTTP://wlfraed.home.netcom.com/
    >>

    >
    >
    > Thank you all.
    > Dennis I really liked your solution for the issue but I have two
    > questions about it:
    > 1) My origin file is Text file and not binary
    > 2) I need to read each time 1 byte. I didn't see that on your example
    > code.
    > Thanks again All of you
    > Dave
    >
    >

    If you really need to see each byte of the file, you need to open it as
    binary. You can then decide that the bytes represent text, in some
    encoding. If you don't use the "b" flag, the library may change the
    newline characters out from under you. You have to decide if that matters.

    I may have missed it, but I don't think you ever explained why you
    insist on the data being read one byte at a time. Usually, it's more
    efficient to read it into a buffer, and process that one byte at a
    time. But in any case, you can supply your own count to read().
    Instead of using 1024, use 1.
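
    As a rough Python 3 sketch of that point: the loop below handles bytes
    one at a time either way, and only the count passed to read() changes.
    The function name and the in-memory stream standing in for a file opened
    with "rb" are assumptions for the demo.

```python
import io


def process_in_blocks(stream, handle_byte, block_size=1024):
    """Read `stream` in blocks, calling handle_byte on each byte (an int)."""
    while True:
        block = stream.read(block_size)
        if not block:  # b"" means end of file
            break
        for b in block:
            handle_byte(b)


# io.BytesIO stands in for a file opened in binary mode.
seen_buffered = []
process_in_blocks(io.BytesIO(b"hello"), seen_buffered.append)

seen_single = []
process_in_blocks(io.BytesIO(b"hello"), seen_single.append, block_size=1)

print(seen_buffered == seen_single)  # True
```

    The per-byte processing is identical in both runs; with block_size=1
    there is simply one read() call per byte instead of one per block.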

    DaveA
     
    Dave Angel, Dec 13, 2009
    #13
  14. Hi!

    > If it's a binary file...


    OK, but... what is a "binary" file?

    @+
    --
    Michel Claveau
     
    Michel Claveau - MVP, Dec 13, 2009
    #14
  15. daved170

    Chris Rebert Guest

    Chris Rebert, Dec 13, 2009
    #15
  16. On 2009-12-13, Michel Claveau - MVP <> wrote:
    > Hi!
    >
    >> If it's a binary file...

    >
    > OK, but... what is a "binary" file?


    One containing data encoded in base-2.

    --
    Grant
     
    Grant Edwards, Dec 13, 2009
    #16
  17. daved170

    Tim Chase Guest

    Grant Edwards wrote:
    >>> If it's a binary file...

    >> OK, but... what is a "binary" file?

    >
    > One containing data encoded in base-2.


    Or one of a system of two files that orbits around a common
    center of mass? So if you see two files orbiting around a
    cathedral, they're binary files.

    f.open('binaryfile.bin', 'wb')
    f.write(data.encode('binary'))
    f.close()

    :)

    -tkc
     
    Tim Chase, Dec 13, 2009
    #17
  18. On Sat, 12 Dec 2009 22:15:50 -0800 (PST), daved170 <>
    declaimed the following in gmane.comp.python.general:


    > Thank you all.
    > Dennis I really liked your solution for the issue but I have two
    > questions about it:
    > 1) My origin file is Text file and not binary


    Do you need to process the bytes in the file as they are? Or do you
    accept changes in line-endings? (M$ Windows "text" files use <cr><lf> as
    the line ending, but if you read the file in Python as "text", <cr><lf>
    is converted to a single <lf>.)
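
    The line-ending conversion can be seen with a short experiment. This is
    a Python 3 sketch, where text mode translates <cr><lf> to <lf> by default
    on every platform; the temporary file and its contents are made up for
    the demo.

```python
import os
import tempfile

# Write a small file with Windows-style line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

with open(path, "rb") as f:  # binary mode: bytes exactly as stored
    raw = f.read()

with open(path, "r", encoding="ascii") as f:  # text mode: newlines translated
    text = f.read()

os.remove(path)

print(raw)         # b'one\r\ntwo\r\n'
print(repr(text))  # 'one\ntwo\n'
```

    If the scrambled output must be reversible to the exact original file,
    binary mode is the safer choice, since text mode has already dropped
    the <cr> bytes.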

    > 2) I need to read each time 1 byte. I didn't see that on your example
    > code.


    You've never explained why you need to READ 1 byte at a time, vs
    reading a block (I chose 1KB) and processing each byte IN THE BLOCK.
    After all, if you do use 1 byte I/O, your program is going to be very
    slow, as each read is blocking (suspends) while asking the O/S for the
    next character in the file (this depends upon the underlying I/O library
    implementation -- I suspect any modern I/O system is still reading some
    block size [256 to 4K] and then returning parts of that block as
    needed). OTOH, reading a block at a time makes for one suspension and
    then a lot of data to be processed however you want.

    You originally stated that you want to "scramble" the bytes -- if
    you mean to implement some sort of encryption algorithm you should know
    that most of them work in blocks as the "key" is longer than one byte.

    My sample reads in chunks, then the scramble function XORs each byte
    with the corresponding byte in the supplied key string, finally
    rejoining all the now individual bytes into a single chunk for
    subsequent output.
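
    For Python 3 readers, the same XOR idea might be sketched as below. This
    is an adaptation, not the code posted earlier in the thread: it works on
    bytes instead of str and uses itertools.cycle to repeat the key. As in
    the earlier sample, the key starts over at the beginning of each block.

```python
from itertools import cycle


def scramble(block, key=b"don't look"):
    """XOR each byte of `block` with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(block, cycle(key)))


plain = b"Hello, world!"
scrambled = scramble(plain)
restored = scramble(scrambled)  # XOR with the same key undoes itself

print(scrambled != plain)  # True
print(restored == plain)   # True
```

    Because XOR is its own inverse, running the output through the same
    function with the same key and the same block size recovers the input.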
    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Dec 13, 2009
    #18
  19. daved170

    Nobody Guest

    On Sun, 13 Dec 2009 12:39:26 -0800, Dennis Lee Bieber wrote:

    > You originally stated that you want to "scramble" the bytes -- if
    > you mean to implement some sort of encryption algorithm you should know
    > that most of them work in blocks as the "key" is longer than one byte.


    Block ciphers work in blocks. Stream ciphers work on bytes, regardless of
    the length of the key.
     
    Nobody, Dec 14, 2009
    #19
  20. daved170

    MRAB Guest

    Nobody wrote:
    > On Sun, 13 Dec 2009 12:39:26 -0800, Dennis Lee Bieber wrote:
    >
    >> You originally stated that you want to "scramble" the bytes -- if
    >> you mean to implement some sort of encryption algorithm you should know
    >> that most of them work in blocks as the "key" is longer than one byte.

    >
    > Block ciphers work in blocks. Stream ciphers work on bytes, regardless of
    > the length of the key.
    >

    It's still more efficient to read in blocks, even if you're going to
    process the bytes one at a time.
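
    One way to combine the two points, sketched in Python 3: a generator
    reads in blocks but hands out bytes one at a time, so a byte-oriented
    scrambler never sees the block boundaries. The names, the two-byte key
    and the XOR step are illustrative assumptions, not real encryption.

```python
import io
from itertools import cycle


def iter_bytes(stream, block_size=4096):
    """Yield bytes (as ints) one at a time while reading in blocks."""
    while True:
        block = stream.read(block_size)
        if not block:
            return
        yield from block


def xor_stream(byte_iter, key):
    """Toy byte-at-a-time scrambler; the key position carries across blocks."""
    for b, k in zip(byte_iter, cycle(key)):
        yield b ^ k


data = b"abcdefgh"
key = b"\x01\x02"

# Same result whether the underlying reads are 3 bytes or 1 byte long.
out_blocks = bytes(xor_stream(iter_bytes(io.BytesIO(data), block_size=3), key))
out_single = bytes(xor_stream(iter_bytes(io.BytesIO(data), block_size=1), key))

print(out_blocks == out_single)  # True
```

    The block size here only affects how often read() is called, not the
    bytes that come out.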
     
    MRAB, Dec 14, 2009
    #20
