Can I change one line in a file without rewriting the whole thing?

J. J. Ramsey

In Perl, there is a module called "Tie::File". What it does is tie a
list to each line of a file. Change the list, and the file is
automatically changed, and on top of this, only the bits of the file
that need to be changed are written to disk. At least, that's the
general idea.

I was wondering if something roughly similar could be done in Python,
or at the very least, if I can avoid doing what amounts to reading the
whole file into memory, changing the copy in memory, and writing it
all out again.
 
David Wahler

J. J. Ramsey said:
> In Perl, there is a module called "Tie::File". What it does is tie a
> list to each line of a file. Change the list, and the file is
> automatically changed, and on top of this, only the bits of the file
> that need to be changed are written to disk. At least, that's the
> general idea.
>
> I was wondering if something roughly similar could be done in Python,
> or at the very least, if I can avoid doing what amounts to reading the
> whole file into memory, changing the copy in memory, and writing it
> all out again.

The mechanism behind Perl's ties -- an array that, when read from or
written to, passes control to a user function -- is easy to implement
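in Python. See the documentation of the mapping protocol. For example
(the class name LineList is only illustrative):

>>> class LineList:
...     def __getitem__(self, index):
...         return "Line %d" % index
...
>>> LineList()[42]
'Line 42'
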
From the documentation for Tie::File, it doesn't look like a trivial
piece of code; for example, it has to maintain a table in memory
containing the offset of each newline character it's seen for fast
seeking, and it has to handle moving large chunks of the file if the
length of a line changes. All this could be implemented in Python, but
I don't know of a ready-made version off the top of my head.
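
As a rough illustration of the offset table mentioned above, here is a minimal sketch, not a port of Tie::File. The helper names build_line_offsets and read_line are made up for this example. It records the byte offset where each line starts, so a single line can be fetched with one seek instead of scanning the file.

```python
import os
import tempfile

def build_line_offsets(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    position = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(position)
            position += len(line)
    return offsets

def read_line(path, offsets, index):
    """Seek straight to line `index` (0-based) and read only that line."""
    with open(path, "rb") as f:
        f.seek(offsets[index])
        return f.readline()

# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"alpha\nbeta\ngamma\n")
    demo_path = tmp.name

demo_offsets = build_line_offsets(demo_path)
demo_line = read_line(demo_path, demo_offsets, 2)
os.remove(demo_path)
```

This only covers reading. Writing a line of a different length would still mean shifting everything after it and updating the table.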

If all you want is to read the file line-by-line without having the
whole thing in memory at once, you can iterate over the file object
directly.
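
A minimal sketch of reading line by line, assuming a plain text file. The function name count_matching_lines and the sample content are invented for the example. Iterating over the file object yields one line at a time, so the whole file is never loaded at once.

```python
import os
import tempfile

def count_matching_lines(path, needle):
    """Count lines containing `needle` without loading the whole file."""
    count = 0
    with open(path) as f:
        for line in f:  # the file object yields lines lazily
            if needle in line:
                count += 1
    return count

# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("spam\neggs\nspam and eggs\n")
    demo_path = tmp.name

demo_count = count_matching_lines(demo_path, "spam")
os.remove(demo_path)
```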
 
Gabriel Genellina

J. J. Ramsey said:
> In Perl, there is a module called "Tie::File". What it does is tie a
> list to each line of a file. Change the list, and the file is
> automatically changed, and on top of this, only the bits of the file
> that need to be changed are written to disk. At least, that's the
> general idea.

That usually means rewriting from the first modified line to the end of
the file.

J. J. Ramsey said:
> I was wondering if something roughly similar could be done in Python,
> or at the very least, if I can avoid doing what amounts to reading the
> whole file into memory, changing the copy in memory, and writing it
> all out again.

Simplest approach:

with open("myfile.txt") as f:
    lines = f.readlines()
del lines[13]  # remove the 14th line; later lines shift up by one
lines[42] = "Look ma! Replacing line 42!\n"
with open("myfile.txt", "w") as f:
    f.writelines(lines)

This of course reads the whole file into memory, but it's a compact
approach if you need random access to lines.
If you can process the lines sequentially in a single pass, try using
the fileinput module with inplace=1.
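
A minimal sketch of the fileinput approach, assuming a single-pass edit. The helper name replace_line_in_place and the sample file are invented for the example. With inplace=True, anything printed inside the loop is written to the file instead of the terminal. The file is still rewritten on disk, but only one line is held in memory at a time.

```python
import fileinput
import os
import tempfile

def replace_line_in_place(path, line_number, new_text):
    """Replace the 1-based line `line_number` with `new_text` in one pass."""
    with fileinput.input(files=[path], inplace=True) as stream:
        for line in stream:
            if stream.lineno() == line_number:
                print(new_text)      # stdout is redirected into the file
            else:
                print(line, end="")  # keep the original line unchanged

# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")
    demo_path = tmp.name

replace_line_in_place(demo_path, 2, "TWO")
with open(demo_path) as f:
    demo_result = f.read()
os.remove(demo_path)
```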

(Having a true Tie::File implementation for Python would be a nice
addition to the available tools...)
 
greg

J. J. Ramsey said:
> if I can avoid doing what amounts to reading the
> whole file into memory, changing the copy in memory, and writing it
> all out again.

Except in very special circumstances, not really.
If you do anything that makes a line longer or
shorter, everything after that line in the file
needs to be re-written, one way or another.
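
One such special circumstance is a replacement with exactly the same byte length as the original. A minimal sketch, assuming the byte offset of the line is already known; the helper name overwrite_same_length is invented for the example:

```python
import os
import tempfile

def overwrite_same_length(path, offset, old, new):
    """Overwrite `old` with `new` at byte `offset`; both must be equally long."""
    if len(old) != len(new):
        raise ValueError("replacement must have exactly the same byte length")
    with open(path, "r+b") as f:
        f.seek(offset)
        if f.read(len(old)) != old:
            raise ValueError("file content at offset does not match `old`")
        f.seek(offset)
        f.write(new)  # nothing after this span is touched

# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"alpha\nbeta\ngamma\n")
    demo_path = tmp.name

overwrite_same_length(demo_path, 6, b"beta\n", b"BETA\n")
with open(demo_path, "rb") as f:
    demo_result = f.read()
os.remove(demo_path)
```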

If you're doing this sort of thing a lot, and
need it to be faster than reading and rewriting the
file, you may need to look into using a more
sophisticated format on disk than a plain file.
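
No particular format is named here. As one possible illustration, and purely as an assumption, the lines could be kept as rows in an SQLite database via the standard sqlite3 module, so that a single row can be updated without the program rewriting the rest. The table and column names below are hypothetical.

```python
import os
import shutil
import sqlite3
import tempfile

# Throwaway database file for the demo (hypothetical path and schema).
demo_dir = tempfile.mkdtemp()
demo_db = os.path.join(demo_dir, "lines.db")

conn = sqlite3.connect(demo_db)
conn.execute("CREATE TABLE lines (line_no INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO lines (line_no, text) VALUES (?, ?)",
    [(1, "alpha"), (2, "beta"), (3, "gamma")],
)

# Change one "line" to a longer value; the other rows are left alone.
conn.execute(
    "UPDATE lines SET text = ? WHERE line_no = ?",
    ("a much longer second line", 2),
)
conn.commit()

demo_rows = [row[0] for row in conn.execute("SELECT text FROM lines ORDER BY line_no")]
conn.close()
shutil.rmtree(demo_dir)
```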
 
