Caleb said:
Peter
Is there disk access on every iteration? I'm guessing yes? It
shouldn't be an issue in the vast majority of cases, but I'm naturally
curious.
Short answer:
No, it's buffered.
Long answer:
This buffer is actually what causes the problems in interactions between
uses of the next method and readline, seek, etc:
py> f = file('temp.txt')
py> for line in f:
...     print line,
...     break
...
line 1
py> f.read()
''
py> for line in f:
...     print line,
...
line 2
line 3
Using the iteration protocol (specifically, when file.next is called)
causes the file object to read part of the file into a buffer for the
iterator. The read method doesn't use that buffer; it sees that
(because the file is so small) the file position is already at the end
of the file, so it returns '' to signal that the entire file has been
read, even though we have not finished iterating. The iterator, however,
which has access to the buffer, can still complete its iteration.
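To make the mechanism concrete, here is a toy model of the behaviour described above. It is only a sketch, not the real file implementation: the ReadAheadFile class, its attribute names, and the chunk_size default are all made up for illustration, and an in-memory io.StringIO stands in for temp.txt.

```python
import io


class ReadAheadFile(object):
    """Toy model only; hypothetical, not the actual file object."""

    def __init__(self, raw, chunk_size=8192):
        self._raw = raw                # underlying stream with its own position
        self._chunk_size = chunk_size  # assumed read-ahead size
        self._buffer = ''              # read-ahead data only the iterator sees

    def __iter__(self):
        return self

    def __next__(self):
        # Refill the private buffer until it holds a full line
        # or the underlying stream is exhausted.
        while '\n' not in self._buffer:
            chunk = self._raw.read(self._chunk_size)
            if not chunk:
                break
            self._buffer += chunk
        if not self._buffer:
            raise StopIteration
        line, sep, rest = self._buffer.partition('\n')
        self._buffer = rest
        return line + sep

    next = __next__  # older spelling of the iterator method

    def read(self):
        # Ignores the iterator's buffer and reads from wherever
        # the underlying stream position currently is.
        return self._raw.read()


f = ReadAheadFile(io.StringIO(u'line 1\nline 2\nline 3\n'))
first = next(f)      # pulls the whole (small) stream into the buffer
leftover = f.read()  # stream is already at its end
remaining = list(f)  # the iterator still drains its buffer
```

In this model the first next() call swallows the entire stream, so read() finds nothing left, while the iterator can still hand out the lines sitting in its own buffer.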
The moral of the story is that, in general, you should only use the file
as an iterator after you are done calling read, readline, etc. unless
you want to keep track of the file position and do an appropriate
file.seek() call after each use of the iterator.
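A minimal sketch of the safe ordering, assuming a file whose first line is a header: do the direct readline call first, and only then start iterating. The helper name read_header_then_body and the temporary file are hypothetical, and open() is used in place of file() so the snippet also runs on later Python versions.

```python
import os
import tempfile


def read_header_then_body(path):
    """Read one line directly, then iterate over the rest (hypothetical helper)."""
    with open(path) as f:
        header = f.readline()        # direct read calls come first
        body = [line for line in f]  # iteration starts only afterwards
    return header, body


# Build a small three-line file to try it on.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    tmp.write('line 1\nline 2\nline 3\n')

header, body = read_header_then_body(path)
os.remove(path)
```

Because nothing but the iterator touches the file once the loop has started, no lines go missing and no seek call is needed.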
Steve