read text file byte by byte

daved170

Hello everybody,
I need to read a text file byte by byte.
Each byte is sent to a function that scrambles it,
and I need to write the result to a binary file.

I've got some questions -
1) How do I read the file byte by byte?
2) Should I use streams? If so, and I get my entire scrambled text in
a stream, can I just write it to the binary file?

Thanks
Dave
 
census

daved170 said:
Hello everybody,
I need to read a text file byte by byte.
Each byte is sent to a function that scrambles it,
and I need to write the result to a binary file.

I've got some questions -
1) How do I read the file byte by byte?
2) Should I use streams? If so, and I get my entire scrambled text in
a stream, can I just write it to the binary file?

Thanks
Dave

f = open ("binaryfile", "r")
bytearray = map (ord, f.read () )

Stores the content of binaryfile in the list bytearray.
 
census

daved170 said:
Hello everybody,
I need to read a text file byte by byte.
Each byte is sent to a function that scrambles it,
and I need to write the result to a binary file.

I've got some questions -
1) How do I read the file byte by byte?
2) Should I use streams? If so, and I get my entire scrambled text in
a stream, can I just write it to the binary file?

Thanks
Dave

OK, here is complete code to read a file byte by byte, scramble each
byte (with a really complex algorithm in my example), and write the
output to another file.

def scramble (a): return (a + 13) % 256

infile = open ("binaryfile1", "r")
outfile = open ("binaryfile2", "w")

bytearray = map (ord, infile.read () )
scrambled = map (scramble, bytearray)
map (lambda x : outfile.write (chr (x) ), scrambled)

infile.close ()
outfile.flush ()
outfile.close ()
 
Steven D'Aprano

f = open ("binaryfile", "r")
bytearray = map (ord, f.read () )

Stores the content of binaryfile in the list bytearray.

If it's a binary file, you should open it in binary mode:

f = open ("binaryfile", "rb")
 
Steven D'Aprano

Hello everybody,
I need to read a text file byte by byte. Each byte is sent to a
function that scrambles it, and I need to write the result to a binary file.

I've got some questions -
1) How do I read the file byte by byte?

f = open(filename, 'rb')
f.read(1)

will read a single byte.
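
To read the whole file this way, the call has to be repeated until read() returns an empty result. A minimal Python 3 sketch, assuming the file was opened in binary mode; the helper name read_bytes_one_at_a_time is made up for illustration, and io.BytesIO stands in for a real file:

```python
import io

def read_bytes_one_at_a_time(f):
    """Yield each byte of a binary file object as a length-1 bytes object."""
    # f.read(1) returns b"" at end of file, which ends the iteration.
    for byte in iter(lambda: f.read(1), b""):
        yield byte

# In-memory stand-in for open(filename, "rb")
demo = io.BytesIO(b"abc")
print(list(read_bytes_one_at_a_time(demo)))  # [b'a', b'b', b'c']
```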




2) Should I use streams?

What do you mean by "streams"?
 
census

Steven said:
If it's a binary file, you should open it in binary mode:

f = open ("binaryfile", "rb")

Add the "b" flag to both in and out file if you prefer it:

def scramble (a): return (a + 13) % 256

infile = open ("binin", "rb")
outfile = open ("binout", "wb")

bytearray = map (ord, infile.read () )
scrambled = map (scramble, bytearray)
map (lambda x : outfile.write (chr (x) ), scrambled)

infile.close ()
outfile.flush ()
outfile.close ()
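
The code above relies on Python 2 behaviour (read() returning a str, and map() running eagerly). On Python 3, a rough equivalent might look like the sketch below. It assumes the same add-13 scramble; the function name scramble_file is hypothetical, and io.BytesIO objects stand in for files opened with "rb" and "wb":

```python
import io

def scramble(a):
    # Same toy scramble as above: shift each byte value by 13, wrapping at 256.
    return (a + 13) % 256

def scramble_file(infile, outfile):
    data = infile.read()  # bytes; iterating over it yields ints in 0..255
    outfile.write(bytes(scramble(b) for b in data))

# In-memory stand-ins for open("binin", "rb") and open("binout", "wb")
src = io.BytesIO(b"Hi")
dst = io.BytesIO()
scramble_file(src, dst)
print(dst.getvalue())  # b'Uv'
```

With real files, a with-statement around both open() calls would take care of closing them.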
 
Dennis Lee Bieber

def scramble (a): return (a + 13) % 256
I'll see your modulo rot 13 and raise with an exclusive or...

-=-=-=-=-

import sys

def scramble(block, key="don't look"):
    # Repeat the key until it is at least as long as the block.
    copies = int(len(block) / len(key)) + 1
    keystring = key * copies
    # XOR each byte of the block with the matching byte of the key string.
    return "".join([ chr( ord(block[i])
                          ^ ord(keystring[i]))
                     for i in range(len(block))])


def process(fin, fout, key=None):
    din = open(fin, "rb")
    dout = open(fout, "wb")
    while True:
        block = din.read(1024)
        if not block: break
        if key is None:
            block = scramble(block)
        else:
            block = scramble(block, key)
        dout.write(block)
    dout.close()
    din.close()

if __name__ == "__main__":
    fin = sys.argv[1]
    fout = sys.argv[2]
    if len(sys.argv) > 3:
        key = sys.argv[3]
    else:
        key = None
    process(fin, fout, key)
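
For readers on Python 3, where a file opened with "rb" yields bytes and iterating over bytes gives ints, the same XOR idea might be sketched as follows. The names xor_scramble and block_size are illustrative and not from the post above, the key is assumed to be bytes, and io.BytesIO objects stand in for real files:

```python
import io

def xor_scramble(block, key=b"don't look"):
    # Repeat the key until it is at least as long as the block.
    copies = len(block) // len(key) + 1
    keystring = key * copies
    # zip() stops at the end of the block, so extra key bytes are ignored.
    return bytes(b ^ k for b, k in zip(block, keystring))

def process(din, dout, key=b"don't look", block_size=1024):
    # din and dout are binary file objects, e.g. open(path, "rb") / open(path, "wb").
    while True:
        block = din.read(block_size)
        if not block:
            break
        dout.write(xor_scramble(block, key))

# XOR is its own inverse, so a second pass with the same key and
# block size restores the input.
original = b"some text to scramble"
once = io.BytesIO()
process(io.BytesIO(original), once)
twice = io.BytesIO()
process(io.BytesIO(once.getvalue()), twice)
print(twice.getvalue() == original)  # True
```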
 
Tim Chase

Steven said:
What do you mean by "streams"?

they're what come out of proton packs...just don't cross them.
It would be bad.

-tkc

(I suspect the OP is a Java/C++ programmer where "streams" are
somewhat akin to generators, but less powerful; so the answer is
"you can if you want, but it may not get you what you think you
want")
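
If "streams" here means something like a pipeline of values, a generator is probably the closest Python idiom. A small Python 3 sketch under that assumption; byte_stream and scrambled are made-up names, and the bit-flipping lambda is just a placeholder scramble:

```python
import io

def byte_stream(f, chunk_size=4096):
    """Read a binary file object in chunks, yielding one int (0..255) at a time."""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            return
        yield from chunk

def scrambled(stream, func):
    """Lazily apply func to every byte value coming out of stream."""
    for b in stream:
        yield func(b)

src = io.BytesIO(b"\x00\x0f")   # stand-in for open(path, "rb")
dst = io.BytesIO()              # stand-in for open(path, "wb")
dst.write(bytes(scrambled(byte_stream(src), lambda b: b ^ 0xFF)))
print(dst.getvalue())  # b'\xff\xf0'
```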
 
Grant Edwards

they're what come out of proton packs...just don't cross them.
It would be bad.

No kidding. Try to imagine all life as you know it stopping
instantaneously, and every molecule in your body exploding at
the speed of light.

I hate when that happens.
 
daved170

def scramble (a): return (a + 13) % 256

        I'll see your modulo rot 13 and raise with an exclusive or....

-=-=-=-=-

import sys

def scramble(block, key="don't look"):
    copies = int(len(block) / len(key)) + 1
    keystring = key * copies
    return "".join([ chr( ord(block[i])
                          ^ ord(keystring[i]))
                     for i in range(len(block))])

def process(fin, fout, key=None):
    din = open(fin, "rb")
    dout = open(fout, "wb")
    while True:
        block = din.read(1024)
        if not block: break
        if key is None:
            block = scramble(block)
        else:
            block = scramble(block, key)
        dout.write(block)
    dout.close()
    din.close()

if __name__ == "__main__":
    fin = sys.argv[1]
    fout = sys.argv[2]
    if len(sys.argv) > 3:
        key = sys.argv[3]
    else:
        key = None
    process(fin, fout, key)



Thank you all.
Dennis, I really liked your solution, but I have two
questions about it:
1) My original file is a text file, not binary.
2) I need to read 1 byte at a time. I didn't see that in your example
code.
Thanks again, all of you
Dave
 
Steven D'Aprano

Thank you all.
Dennis, I really liked your solution, but I have two questions
about it:
1) My original file is a text file, not binary.

That's a statement, not a question.

2) I need to read 1 byte at a time.

f = open(filename, 'r') # open in text mode
f.read(1) # read one byte
 
Lie Ryan

Thank you all.
Dennis, I really liked your solution, but I have two
questions about it:
1) My original file is a text file, not binary.
2) I need to read 1 byte at a time. I didn't see that in your example
code.

That's where you're confusing things. The counting unit in text is
characters, not bytes. Text is binary as well; it's just binary encoded
in a specific way (like ASCII or UTF-8), and the computer decodes that
binary stream into characters.

What do you actually need? Reading the text character by character, or
treating the encoded text as binary and reading it byte by byte?

Rather, why don't you explain the problem you're trying to solve so we
can see which you actually need?
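
To make the character/byte distinction concrete, here is a small Python 3 sketch. The sample string is an arbitrary choice, and UTF-8 is assumed as the encoding:

```python
# One non-ASCII character: U+00EF (i with diaeresis).
text = "na\u00efve"
encoded = text.encode("utf-8")

print(len(text))      # 5 characters
print(len(encoded))   # 6 bytes, because U+00EF takes two bytes in UTF-8
print(list(encoded))  # [110, 97, 195, 175, 118, 101]
```

Reading such a file one character at a time and reading it one byte at a time therefore give different sequences.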
 
Dave Angel

daved170 said:
def scramble (a): return (a + 13) % 256
I'll see your modulo rot 13 and raise with an exclusive or...

-=-=-=-=-

import sys

def scramble(block, key="don't look"):
    copies = int(len(block) / len(key)) + 1
    keystring = key * copies
    return "".join([ chr( ord(block[i])
                          ^ ord(keystring[i]))
                     for i in range(len(block))])

def process(fin, fout, key=None):
    din = open(fin, "rb")
    dout = open(fout, "wb")
    while True:
        block = din.read(1024)
        if not block: break
        if key is None:
            block = scramble(block)
        else:
            block = scramble(block, key)
        dout.write(block)
    dout.close()
    din.close()

if __name__ == "__main__":
    fin = sys.argv[1]
    fout = sys.argv[2]
    if len(sys.argv) > 3:
        key = sys.argv[3]
    else:
        key = None
    process(fin, fout, key)



Thank you all.
Dennis, I really liked your solution, but I have two
questions about it:
1) My original file is a text file, not binary.
2) I need to read 1 byte at a time. I didn't see that in your example
code.
Thanks again, all of you
Dave

If you really need to see each byte of the file, you need to open it as
binary. You can then decide that the bytes represent text, in some
encoding. If you don't use the "b" flag, the library may change the
newline characters out from under you. You have to decide if that matters.
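
The newline effect can be seen with a small experiment. This is only a sketch for Python 3, where text mode by default translates "\r\n" to "\n" on reading; the helper name read_both_ways is made up, and a temporary file is used so nothing real is touched:

```python
import os
import tempfile

def read_both_ways(raw):
    """Write raw bytes to a temp file, then read them back in binary and text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        with open(path, "rb") as f:                    # bytes exactly as stored
            as_bytes = f.read()
        with open(path, "r", encoding="ascii") as f:   # newline translation applies
            as_text = f.read()
    finally:
        os.remove(path)
    return as_bytes, as_text

as_bytes, as_text = read_both_ways(b"one\r\ntwo\r\n")
print(as_bytes)        # b'one\r\ntwo\r\n'
print(repr(as_text))   # 'one\ntwo\n'
```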

I may have missed it, but I don't think you ever explained why you
insist on the data being read one byte at a time. Usually, it's more
efficient to read it into a buffer, and process that one byte at a
time. But in any case, you can supply your own count to read().
Instead of using 1024, use 1.

DaveA
 
Tim Chase

Grant said:
One containing data encoded in base-2.

Or one of a system of two files that orbits around a common
center of mass? So if you see two files orbiting around a
cathedral, they're binary files.

f.open('binaryfile.bin', 'wb')
f.write(data.encode('binary'))
f.close()

:)

-tkc
 
Dennis Lee Bieber

Thank you all.
Dennis, I really liked your solution, but I have two
questions about it:
1) My original file is a text file, not binary.

Do you need to process the bytes in the file as they are? Or do you
accept changes in line endings (M$ Windows "text" files use <cr><lf> as
the line ending, but if you read it in Python as "text", <cr><lf> is
converted to a single <lf>)?

2) I need to read 1 byte at a time. I didn't see that in your example
code.

You've never explained why you need to READ 1 byte at a time, vs
reading a block (I chose 1KB) and processing each byte IN THE BLOCK.
After all, if you do use 1 byte I/O, your program is going to be very
slow, as each read is blocking (suspends) while asking the O/S for the
next character in the file (this depends upon the underlying I/O library
implementation -- I suspect any modern I/O system is still reading some
block size [256 to 4K] and then returning parts of that block as
needed). OTOH, reading a block at a time makes for one suspension and
then a lot of data to be processed however you want.
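
A rough way to see the difference in the number of read requests is to count them. The sketch below (Python 3) counts calls to read() at the Python level only, not actual operating-system calls, since buffered file objects may already batch disk access as noted above. CountingReader and consume are hypothetical names, and io.BytesIO stands in for a file:

```python
import io

class CountingReader:
    """Wrap a binary file object and count how often read() is called."""
    def __init__(self, raw):
        self.raw = raw
        self.calls = 0

    def read(self, n):
        self.calls += 1
        return self.raw.read(n)

def consume(reader, size):
    """Read everything in pieces of `size` bytes; visit each byte in memory."""
    seen = 0
    while True:
        block = reader.read(size)
        if not block:
            break
        for _byte in block:   # per-byte processing would go here
            seen += 1
    return seen

data = bytes(4096)
one_by_one = CountingReader(io.BytesIO(data))
in_blocks = CountingReader(io.BytesIO(data))
consume(one_by_one, 1)
consume(in_blocks, 1024)
print(one_by_one.calls)  # 4097 (4096 bytes plus the final empty read)
print(in_blocks.calls)   # 5 (4 blocks plus the final empty read)
```

Both runs visit every byte individually; only the number of read requests differs.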

You originally stated that you want to "scramble" the bytes -- if
you mean to implement some sort of encryption algorithm you should know
that most of them work in blocks as the "key" is longer than one byte.

My sample reads in chunks, then the scramble function XORs each byte
with the corresponding byte in the supplied key string, finally
rejoining all the now individual bytes into a single chunk for
subsequent output.
 
Nobody

You originally stated that you want to "scramble" the bytes -- if
you mean to implement some sort of encryption algorithm you should know
that most of them work in blocks as the "key" is longer than one byte.

Block ciphers work in blocks. Stream ciphers work on bytes, regardless of
the length of the key.
 
MRAB

Nobody said:
Block ciphers work in blocks. Stream ciphers work on bytes, regardless of
the length of the key.

It's still more efficient to read in blocks, even if you're going to
process the bytes one at a time.
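
Both points can be combined: read in blocks, but keep the per-byte scrambling state running across block boundaries so the result does not depend on the block size. The Python 3 sketch below uses a repeating key as a toy keystream, which is an assumption made purely for illustration and is not a real stream cipher; stream_scramble is a made-up name, and io.BytesIO stands in for files:

```python
import io
from itertools import cycle

def stream_scramble(din, dout, key=b"don't look", block_size=1024):
    # Toy keystream: the key bytes repeated forever. Not secure.
    keystream = cycle(key)
    while True:
        block = din.read(block_size)   # I/O happens a block at a time
        if not block:
            break
        # Each byte is XORed with the next keystream byte; the keystream
        # position carries over from one block to the next.
        dout.write(bytes(b ^ next(keystream) for b in block))

data = b"read in blocks, process byte by byte"
small, large = io.BytesIO(), io.BytesIO()
stream_scramble(io.BytesIO(data), small, block_size=1)
stream_scramble(io.BytesIO(data), large, block_size=1024)
print(small.getvalue() == large.getvalue())  # True
```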
 
