Re: iterating over the lines of a file - difference between Python 2.7 and 3?

Discussion in 'Python' started by Peter Otten, Jan 17, 2013.

    Wolfgang Maier wrote:

    > I just came across an unexpected behavior in Python 3.3 that has to do
    > with file iterators and their interplay with other file/IO methods such
    > as readline() and tell(). Basically, I got used to the fact that it is a
    > bad idea to mix them, because the iterator uses a hidden read-ahead
    > buffer: what you got from subsequent calls to readline() or tell() was
    > whatever lay beyond that buffer, not the next thing after what the
    > iterator had just returned.
    >
    > Example:
    >
    > in_file_object = open('some_file', 'rb')
    > for line in in_file_object:
    >     print(line)
    >     if in_file_object.tell() > 300:
    >         # assuming that individual lines are shorter
    >         break
    >
    > This wouldn't print anything in Python 2.7 since next(in_file_object)
    > would read ahead beyond the 300 position immediately, as evidenced by a
    > subsequent call to in_file_object.tell() (returning 8192 on my system).
    >
    > However, I find that under Python 3.3 this same code works: it prints some
    > lines from my file, and after the loop completes in_file_object.tell()
    > returns a quite reasonable 314 as the current position in the file.
    >
    > I couldn't find this difference anywhere in the documentation. Is the 3.3
    > behavior official, and if so, when was it introduced and how is it
    > implemented? I assume the read-ahead buffer still exists?


    You can get the Python 3 behaviour in Python 2.7 by using io.open(). There
    is a pure-Python implementation in _pyio.py:

    def tell(self):
        return _BufferedIOMixin.tell(self) - len(self._read_buf) + self._read_pos
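
The tell() above takes the raw file position, subtracts the length of the read-ahead buffer, and adds back the offset already consumed from it, so the buffer still exists but is accounted for. A minimal sketch of how this can be observed, assuming a small temporary file created just for the demonstration (the helper name positions_while_iterating and the file contents are illustrative, not part of the original post):

```python
import io
import os
import tempfile


def positions_while_iterating(path):
    """Return the tell() value seen after each line of a binary-mode file."""
    positions = []
    # io.open() is the built-in open() on Python 3 and also exists on 2.7.
    with io.open(path, "rb") as f:
        for _line in f:
            # The buffered reader corrects for its read-ahead buffer,
            # so this is the offset just past the line returned.
            positions.append(f.tell())
    return positions


# Hypothetical demo file: 10 lines of exactly 10 bytes each.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"123456789\n" * 10)

observed = positions_while_iterating(demo_path)
os.remove(demo_path)
print(observed)
```

With 10-byte lines the reported positions advance in steps of 10 rather than jumping straight to the end of the buffer.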


    > By the way, the 3.3 behavior only works in binary mode. In text mode, the
    > code will raise an OSError: telling position disabled by next() call. In
    > Python 2.7 there was no difference between the binary and text mode
    > behavior. Could not find this documented either.
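
The text-mode behaviour described in the quote can be reproduced with a short sketch. It assumes Python 3 and an ASCII-only temporary file; the helper names are hypothetical. The second helper iterates with readline() instead of the file iterator, which is one way that appears to keep tell() usable in text mode, since the error message names next() specifically:

```python
import io
import os
import tempfile


def tell_inside_for_loop(path):
    """Call tell() inside a for loop over a text-mode file.

    Returns the error message if tell() raises, otherwise None.
    """
    with io.open(path, "r", encoding="ascii") as f:
        for _line in f:
            try:
                f.tell()
            except OSError as exc:
                return str(exc)
            return None
    return None


def tell_with_readline(path):
    """Iterate with readline() and record tell() after each line."""
    positions = []
    with io.open(path, "r", encoding="ascii") as f:
        # iter(callable, sentinel) stops when readline() returns "".
        for _line in iter(f.readline, ""):
            positions.append(f.tell())
    return positions


# Hypothetical demo file: 3 short ASCII lines.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"123456789\n" * 3)

first_error = tell_inside_for_loop(demo_path)
readline_positions = tell_with_readline(demo_path)
os.remove(demo_path)
print(first_error)
print(readline_positions)
```

Note that in text mode the values returned by tell() are opaque cookies intended for seek(), so they should not be relied on as byte offsets in general.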
     
    Peter Otten, Jan 17, 2013
