Re: iterating over the lines of a file - difference between Python2.7 and 3?

Discussion in 'Python' started by Terry Reedy, Jan 17, 2013.

    Terry Reedy Guest

    On 1/17/2013 7:04 AM, Peter Otten wrote:
    > Wolfgang Maier wrote:
    >
    >> I just came across an unexpected behavior in Python 3.3, which has to do
    >> with file iterators and their interplay with other methods of file/IO
    >> class methods, like readline() and tell(): Basically, I got used to the
    >> fact that it is a bad idea to mix them because the iterator would use that
    >> hidden read-ahead buffer, so what you got with subsequent calls to
    >> readline() or tell() was what was beyond that buffer, but not the next
    >> thing after what the iterator just returned.
    >>
    >> Example:
    >>
    >> in_file_object = open('some_file', 'rb')
    >> for line in in_file_object:
    >>     print(line)
    >>     if in_file_object.tell() > 300:
    >>         # assuming that individual lines are shorter
    >>         break
    >>
    >> This wouldn't print anything in Python 2.7 since next(in_file_object)
    >> would read ahead beyond the 300 position immediately, as evidenced by a
    >> subsequent call to in_file_object.tell() (returning 8192 on my system).
    >>
    >> However, I find that under Python 3.3 this same code works: it prints some
    >> lines from my file and after completing in_file_object.tell() returns a
    >> quite reasonable 314 as the current position in the file.
    >>
    >> I couldn't find this difference anywhere in the documentation. Is the 3.3
    >> behavior official, and if so, when was it introduced and how is it
    >> implemented? I assume the read-ahead buffer still exists?

    >
    > You can get the Python 3 behaviour with io.open() in Python 2.7. There is an
    > implementation in Python in _pyio.py:
    >
    > def tell(self):
    >     return (_BufferedIOMixin.tell(self) - len(self._read_buf) +
    >             self._read_pos)
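
The arithmetic in that tell() can be observed from the outside. The following is a minimal sketch, not taken from the thread: it assumes a small temporary file with made-up contents, and the helper name trace_positions is hypothetical. It records the buffered object's tell() next to the position of the underlying raw stream (the .raw attribute of the buffered reader) after each line of a for loop. The raw position is where the read-ahead has actually got to, while tell() reports the position just after the line the iterator returned.

```python
import io
import os
import tempfile


def trace_positions(path):
    """Return (logical_pos, raw_pos) pairs, one per line read by iteration."""
    trace = []
    with io.open(path, "rb") as f:  # a buffered reader over a raw file object
        for line in f:
            # f.tell() subtracts the unread part of the read-ahead buffer;
            # f.raw.tell() is the position of the unbuffered stream underneath.
            trace.append((f.tell(), f.raw.tell()))
    return trace


# Hypothetical sample data: three short lines, 17 bytes in total.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"alpha\nbeta\ngamma\n")

trace = trace_positions(sample_path)
os.remove(sample_path)

for logical_pos, raw_pos in trace:
    print(logical_pos, raw_pos)
# logical positions: 6, 11, 17 (the end of each line just returned)
```

The raw position is never behind the logical one, so the read-ahead buffer still exists; tell() simply compensates for it.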


    In 2.7, open() returns a file object, which is a thin wrapper around the
    particular (proprietary) C compiler's stdio library. These libraries vary
    because the C standard leaves some things implementation-defined, people
    interpret it differently (there was no official test suite, at least not
    originally), and people make mistakes. The io module is intended to bring
    more uniformity, and it comes with a test suite against which other
    implementations can match their actual behavior.
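
For code that has to behave the same way with either kind of file object, one option is to avoid the file iterator and loop over readline() instead, since the hidden read-ahead buffer discussed above belongs to the 2.7 iterator and not to readline(). This is only a sketch under assumptions: the helper name lines_until_offset, the sample data, and the limit of 8 bytes are made up for illustration.

```python
import os
import tempfile


def lines_until_offset(path, limit):
    """Collect lines until the file position exceeds `limit` bytes."""
    collected = []
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.readline() until it
        # returns b"" at end of file, so the file iterator is never used.
        for line in iter(f.readline, b""):
            collected.append(line)
            if f.tell() > limit:
                break
    return collected


# Hypothetical sample data: three short lines.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"alpha\nbeta\ngamma\n")

result = lines_until_offset(sample_path, 8)
os.remove(sample_path)
print(result)
```

With this sample the loop stops after the second line, because tell() is 6 after the first line and 11 after the second.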

    --
    Terry Jan Reedy
     