reading and removing first x bytes of a file

bart_nessux

Hello,

I have some Macbinary files on a PC. I want to recursively read these
files and remove the first 128 bytes of the files if they contain the
macbinary header info. I know how to read directories recursively, but
how would I read the first 128 bytes of each file in the path?

Thanks,
Bart
 
Peter Hansen

bart_nessux said:
I have some Macbinary files on a PC. I want to recursively read these
files and remove the first 128 bytes of the files if they contain the
macbinary header info. I know how to read directories recursively, but
how would I read the first 128 bytes of each file in the path?

os.listdir() can return a list of the names in the path. You
can use os.path.isfile() to check that a given name doesn't
refer to a subdirectory.

Once you've opened one of the files for reading, just use .read(128)
to get a string containing the first 128 bytes from the file.
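As a rough illustration of the two steps above, here is a minimal sketch in Python 3 syntax. The helper names `read_header` and `headers_in_directory` are made up for this example, and it only looks at one directory level, exactly as os.listdir() does.

```python
import os

HEADER_SIZE = 128  # header length taken from the question


def read_header(path, size=HEADER_SIZE):
    """Return up to `size` bytes from the start of the file at `path`."""
    # Binary mode matters on a PC: text mode would translate line endings.
    with open(path, "rb") as f:
        return f.read(size)


def headers_in_directory(directory):
    """Map each regular file name in `directory` to its first bytes."""
    result = {}
    for name in os.listdir(directory):
        full = os.path.join(directory, name)
        if os.path.isfile(full):  # skip subdirectories
            result[name] = read_header(full)
    return result
```

Note that .read(128) returns fewer than 128 bytes when the file is shorter than that, so it is worth checking the length of the result before treating it as a header.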

-Peter
 
Brian Gough

bart_nessux said:
I have some Macbinary files on a PC. I want to recursively read these
files and remove the first 128 bytes of the files if they contain the
macbinary header info. I know how to read directories recursively, but
how would I read the first 128 bytes of each file in the path?

You can use the file object .seek() and .read() methods to move around
in the file and read parts of it.

There is an example in the "Input and Output" chapter of the Python
Tutorial.
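A small sketch of what that looks like, in Python 3 syntax. The function names `read_at` and `split_header` are invented for this example and are not from the tutorial.

```python
def read_at(path, offset, size):
    """Read `size` bytes starting `offset` bytes from the start of the file."""
    with open(path, "rb") as f:
        f.seek(offset)       # move the file position; 0 is the start
        return f.read(size)  # may return fewer bytes near the end of the file


def split_header(path, header_size=128):
    """Return (header, rest) by reading sequentially from one open file."""
    with open(path, "rb") as f:
        header = f.read(header_size)  # the position is now past the header
        rest = f.read()               # everything after the header
    return header, rest
```

Each .read() advances the file position, so the second call in `split_header` needs no explicit .seek().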
 
bart_nessux

Peter said:
os.listdir() can return a list of the names in the path. You
can use os.path.isfile() to check that a given name doesn't
refer to a subdirectory.

Once you've opened one of the files for reading, just use .read(128)
to get a string containing the first 128 bytes from the file.

-Peter

Thanks Peter, the code below recursively reads the first 128 bytes... am I
right in saying that? If so, now that I can read these bytes from all
.bin files in a directory, what would be the safest and fastest way of
removing them?

Bart

import os

def pc_macbinary_fix(path):
    # Walk the tree and show the first 128 bytes of every .bin file.
    for root, dirs, files in os.walk(path):
        for fname in files:
            ext = os.path.splitext(fname)[1]
            if ext == '.bin':
                # Binary mode, and close the file as soon as we are done.
                with open(os.path.join(root, fname), 'rb') as f:
                    macbinary = f.read(128)
                print("The file named:", fname, "contains:", macbinary)

path = input("Absolute path to the directory that contains the bin files: ")
pc_macbinary_fix(path)
 
Peter Hansen

bart_nessux said:
Thanks Peter, the code below recursively reads the first 128 bytes... am I
right in saying that?

Well, your indentation is screwed up, for one thing, so I can't
guarantee the code does what you want. I'll leave actually testing
it up to you...

bart_nessux said:
If so, now that I can read these bytes from all
.bin files in a directory, what would be the safest and fastest way of
removing them?

Define 'safe' and describe how fast you want it to run. <wink>

Anyway, you can't actually "remove" bytes from the files, so
what you really need to do is then read the *rest* of the bytes
(i.e. keep the file open after the first read, and do a .read()
of the rest of the data) and then write the shortened data to
a temporary file (module tempfile can be helpful here), then
once that's worked, use os.remove to remove the old file, and
os.rename to rename the temp file to the same name as the old
file.
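Roughly, that recipe could look like the sketch below (Python 3 syntax). Some assumptions to flag: `strip_header` is a made-up name, the function does not check whether the file really starts with a MacBinary header, and it uses os.replace in place of the separate os.remove and os.rename steps, since os.replace overwrites the destination in one call.

```python
import os
import shutil
import tempfile


def strip_header(path, header_size=128):
    """Rewrite `path` without its first `header_size` bytes via a temp file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename stays
    # on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            src.seek(header_size)         # skip the header
            shutil.copyfileobj(src, dst)  # copy the rest in chunks
        # Both files are closed here; swap the shortened copy into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed: leave the original alone, drop the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The original file is untouched until the very last step, so a failure while copying leaves it intact. Copying in chunks also means the remainder does not have to fit in memory.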

-Peter
 
Terry Reedy

Peter Hansen said:
Anyway, you can't actually "remove" bytes from the files, so
what you really need to do is then read the *rest* of the bytes
(i.e. keep the file open after the first read, and do a .read()
of the rest of the data) and then write the shortened data to
a temporary file (module tempfile can be helpful here), then
once that's worked, use os.remove to remove the old file, and
os.rename to rename the temp file to the same name as the old
file.

If the remainder of the file fits in memory, then it can be 'moved up' by
rewriting the same file after seeking to the beginning of the file. I don't
know how safe this is on Mac OSes (i.e., the degree to which the write is
atomic). Of course, if the data are important, one should have at least a
few minutes of battery backup.
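A minimal sketch of that in-place approach, in Python 3 syntax. The name `strip_header_in_place` is made up here, and the whole remainder is assumed to fit in memory.

```python
def strip_header_in_place(path, header_size=128):
    """Shift everything after the header to the start of the same file."""
    with open(path, "r+b") as f:  # read and write without truncating on open
        f.seek(header_size)
        rest = f.read()           # whole remainder held in memory
        f.seek(0)                 # back to the beginning of the file
        f.write(rest)             # overwrite from the start
        f.truncate()              # cut off the leftover tail at this position
```

Nothing here is atomic: if the process is interrupted between the write and the truncate, the file is left partly rewritten.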

TJR
 
Peter Hansen

Terry said:
If the remainder of the file fits in memory, then it can be 'moved up' by
rewriting the same file after seeking to the beginning of the file. I don't
know how safe this is on Mac OSes (i.e., the degree to which the write is
atomic). Of course, if the data are important, one should have at least a
few minutes of battery backup.

I think it's a fair assumption that when the OP asked for a "safe"
way, he meant at least safer than that. If power failed in the
middle of the update, or a system crash occurred, the file could
well be unrecoverable...

-Peter
 
