Interaction of mode 'r+', file.write(), and file.tell(): a bug or undefined behavior?

Lie Ryan

In the code:

"""
f = open('input.txt', 'r+')
for line in f:
    s = line.replace('python', 'PYTHON')
    # f.tell()
    f.write(s)
"""

When f.tell() is commented out, 'input.txt' does not change; but when
it is uncommented, f.write() succeeds in writing to 'input.txt'
(surprisingly, but not entirely unexpectedly, at the end of the file).


$ #####################################
$
$ cp orig.txt input.txt
$ cat input.txt
abcde
abc python abc
python abc python
$ python
Python 2.6.4 (r264:75706, Jan 12 2010, 05:24:27)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open('input.txt', 'r+')
>>> for line in f:
...     s = line.replace('python', 'PYTHON')
...     f.write(s)
...
$ cat input.txt
abcde
abc python abc
python abc python
$
$ #####################################
$
$ cp orig.txt input.txt
$ cat input.txt
abcde
abc python abc
python abc python
$ python
Python 2.6.4 (r264:75706, Jan 12 2010, 05:24:27)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open('input.txt', 'r+')
>>> for line in f:
...     s = line.replace('python', 'PYTHON')
...     f.tell()
...     f.write(s)
...
39
45
60
$ cat input.txt
abcde
abc python abc
python abc python
abcde
abc PYTHON abc
$
$ #####################################



Do you think this is a bug, or undefined behavior governed by the
underlying OS and C library? Shouldn't file.tell() be purely
informational, with no side effects?




The machine is Gentoo (amd64, gcc-4.3.4, glibc-2.10.1-r1), Linux
(2.6.31-gentoo-r6), and Python 2.6.4
 
Anthony Tolle

In the code:

"""
f = open('input.txt', 'r+')
for line in f:
    s = line.replace('python', 'PYTHON')
    # f.tell()
    f.write(s)
"""
[snip]

My guess is that there are a few possible problems:

1) In this case, writing to file opened with 'r+' without an explicit
f.seek is probably not a good idea. The file iterator (for line in f)
uses a readahead buffer, which means you can't guarantee what the
current file position will be.

2) It may be necessary to do an explicit f.flush or f.close when
writing to an 'r+' file. In your case, the close should automatically
happen when the f object falls out of scope, which tells me that we're
still looking at some other problem, like not using f.seek.

3) It is possible that f.tell implicitly flushes buffers used by the
file object. That would explain why uncommenting the f.tell causes
the writes to show up.
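
To illustrate points 1 and 2, here is a minimal sketch of an in-place rewrite that avoids the file iterator and repositions explicitly before switching from reading to writing. The helper name replace_in_place is made up for this example, and it assumes the file is small enough to read into memory at once.

```python
def replace_in_place(path, old, new):
    """Rewrite `path` with every `old` replaced by `new` (hypothetical helper)."""
    with open(path, 'r+') as f:
        text = f.read()      # read everything up front; no file iterator involved
        f.seek(0)            # explicit reposition before switching to writing
        f.write(text.replace(old, new))
        f.truncate()         # drop leftover data if the new text is shorter
        f.flush()            # push buffered output out before the file is closed
```

Because the position is set with f.seek(0) before the first write, the result does not depend on where a readahead buffer happened to leave the file pointer.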


What are you trying to accomplish? Overwrite the original file, or
append to it? If you want to overwrite the file, it may be better to
generate a new file, delete the old one, then rename the new one. If
you want to append, then it would be better to open the file with
append mode ('a').
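
The "generate a new file, then rename" approach could look roughly like the sketch below. The function name rewrite_via_temp_file is hypothetical, and os.replace assumes Python 3.3 or later; on the Python 2.6 from the original post, os.rename does the same job on POSIX systems.

```python
import os
import tempfile


def rewrite_via_temp_file(path, old, new):
    """Write a modified copy of `path`, then swap it in (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as dst, open(path, 'r') as src:
            for line in src:             # iterating is fine: src is only read
                dst.write(line.replace(old, new))
        os.replace(tmp_path, path)       # replaces the old file with the new one
    except Exception:
        os.remove(tmp_path)              # do not leave a stray temp file behind
        raise
```

Since the source is only ever read and the destination only ever written, the read/write switching problem never comes up. Note that mkstemp creates the file with restrictive permissions, so the replaced file may not keep the original's mode.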
 
Alf P. Steinbach

* Anthony Tolle:
In the code:

"""
f = open('input.txt', 'r+')
for line in f:
    s = line.replace('python', 'PYTHON')
    # f.tell()
    f.write(s)
"""
[snip]

My guess is that there are a few possible problems:

1) In this case, writing to file opened with 'r+' without an explicit
f.seek is probably not a good idea. The file iterator (for line in f)
uses a readahead buffer, which means you can't guarantee what the
current file position will be.

2) It may be necessary to do an explicit f.flush or f.close when
writing to an 'r+' file. In your case, the close should automatically
happen when the f object falls out of scope, which tells me that we're
still looking at some other problem, like not using f.seek.

3) It is possible that f.tell implicitly flushes buffers used by the
file object. That would explain why uncommenting the f.tell causes
the writes to show up.

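As far as I understand it, the behavior stems from CPython file operations being
implemented fairly directly as forwarding to C library FILE* operations, and the
C standard makes the case above Undefined Behavior: on a stream opened for
update, input may not be directly followed by output without an intervening
file-positioning call (unless the read hit end-of-file), and output may not be
directly followed by input without an intervening flush or positioning call.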

I think the Python language/library specification should specify the effect
(perhaps just as UB, but anyway, specified).

As it stands, the behavior may differ between Python implementations,
meaning that code that works OK with one implementation may fail with
another.
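
A minimal sketch of staying within the C rule for update streams (reposition whenever you switch between reading and writing) might look like this. The helper name upcase_in_place is made up, the file is opened in binary mode to keep offsets simple, and the overwrite only works because 'python' and 'PYTHON' have the same length.

```python
def upcase_in_place(path, old=b'python', new=b'PYTHON'):
    """Overwrite each line in place (hypothetical helper; equal lengths only)."""
    if len(old) != len(new):
        raise ValueError('in-place overwrite needs an equal-length replacement')
    with open(path, 'r+b') as f:
        while True:
            start = f.tell()       # offset of the line about to be read
            line = f.readline()    # readline(), not the file iterator
            if not line:
                break              # end of file
            end = f.tell()         # offset just past the line
            f.seek(start)          # reposition before switching read -> write
            f.write(line.replace(old, new))
            f.seek(end)            # reposition before switching write -> read
```

Every read-to-write and write-to-read transition here is separated by a seek, which is the condition the C standard asks for.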


What are you trying to accomplish? Overwrite the original file, or
append to it? If you want to overwrite the file, it may be better to
generate a new file, delete the old one, then rename the new one. If
you want to append, then it would be better to open the file with
append mode ('a').

Cheers,

- Alf
 
Aahz

f = open('input.txt', 'r+')
for line in f:
    s = line.replace('python', 'PYTHON')
    # f.tell()
    f.write(s)

When f.tell() is commented out, 'input.txt' does not change; but when
it is uncommented, f.write() succeeds in writing to 'input.txt'
(surprisingly, but not entirely unexpectedly, at the end of the file).

Another possible issue is that using a file iterator is generally not
compatible with direct file operations.
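
For what it's worth, the 39 printed by the first f.tell() in the original session matches the size of the three-line input file (assuming each line ends in a single newline), which fits the iterator having read ahead to the end of the file. As a rough illustration of the incompatibility, the sketch below assumes Python 3, where text-mode files refuse tell() while a `for line in f` loop is in progress; that is not the Python 2.6 behavior discussed above. Both function names are made up for this example.

```python
def tell_fails_during_iteration(path):
    """Return True if tell() raises inside a `for line in f` loop (text mode)."""
    with open(path, 'r') as f:
        for _line in f:
            try:
                f.tell()
            except OSError:
                return True    # Python 3 text files disable tell() here
            return False       # tell() worked on the first iteration
    return False               # empty file: the loop body never ran


def offsets_with_readline(path):
    """Collect the start offset of each line using readline() instead."""
    offsets = []
    with open(path, 'rb') as f:
        while True:
            pos = f.tell()     # reliable: readline() does not use the iterator
            if not f.readline():
                break
            offsets.append(pos)
    return offsets
```

If you need positions or want to write while scanning, readline() (or reading the whole file first) is the safer choice over the iterator.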
 
