Re: proposal: another file iterator

Discussion in 'Python' started by Jean-Paul Calderone, Jan 16, 2006.

  1. On 15 Jan 2006 16:44:24 -0800, Paul Rubin <"http://phr.cx"@nospam.invalid> wrote:
    >I find pretty often that I want to loop through characters in a file:
    >
    > while True:
    >     c = f.read(1)
    >     if not c: break
    >     ...
    >
    >or sometimes of some other blocksize instead of 1. It would sure
    >be easier to say something like:
    >
    > for c in f.iterbytes(): ...
    >
    >or
    >
    > for c in f.iterbytes(blocksize): ...
    >
    >this isn't anything terribly advanced but just seems like a matter of
    >having the built-in types keep up with language features. The current
    >built-in iterator (for line in file: ...) is useful for text files but
    >can potentially read strings of unbounded size, so it's inadvisable for
    >arbitrary files.
    >
    >Does anyone else like this idea?


    It's a pretty useful thing to do, but the edge cases are somewhat complex. When I just want the dumb version, I tend to write this:

    for chunk in iter(lambda: f.read(blocksize), ''):
        ...

    Which is only very slightly longer than your version. I would like it even more if iter() had been written with the impending doom of lambda in mind, so that this would work:

    for chunk in iter('', f.read, blocksize):
        ...

    But it's a bit late now. Anyhow, here are some questions about your iterbytes():

    * Would it guarantee the chunks returned were read using a single read? If blocksize were a multiple of the filesystem block size, would it guarantee reads on block-boundaries (where possible)?

    * How would it handle EOF? Would it stop iterating immediately after the first short read or would it wait for an empty return?

    * What would the buffering behavior be? Could one interleave calls to .next() on whatever iterbytes() returns with calls to .read() on the file?
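A minimal sketch of the "dumb version" above, which also shows how it answers the EOF question. It assumes Python 3 and a file opened in binary mode, so the sentinel is `b""` rather than `''`; `read_chunks` is a hypothetical name, and `io.BytesIO` stands in for a real file.

```python
import io

def read_chunks(f, blocksize):
    """Return an iterator over blocks of f, ending at the first empty read."""
    # iter(callable, sentinel) keeps calling the callable until it
    # returns a value equal to the sentinel.
    return iter(lambda: f.read(blocksize), b"")

f = io.BytesIO(b"abcdefghij")       # 10 bytes, stands in for a real file
chunks = list(read_chunks(f, 4))    # the last block is short (2 bytes)
```

With this idiom a short block is passed through like any other, and iteration ends only when `read()` returns an empty result. On pipes or sockets a short read can occur before EOF, which is one reason stopping at the first short read would be the riskier choice.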

    Jean-Paul
     
    Jean-Paul Calderone, Jan 16, 2006
    #1

  2. Paul Rubin wrote:

    Jean-Paul Calderone writes:
    > Which is only very slightly longer than your version. I would like
    > it even more if iter() had been written with the impending doom of
    > lambda in mind, so that this would work:
    >
    > for chunk in iter('', f.read, blocksize):
    >     ...
    >
    > But it's a bit late now.


    Well, iter could be extended to take *args and **kwargs parameters, so
    you'd say

    for chunk in iter(f.read, '', (blocksize,)): ...

    That leaves the params in a clumsy order for backwards compatibility,
    but it's not unbearable.
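A rough sketch of what such an extension might look like, written as a wrapper instead of a change to the built-in. `iter_call` is a hypothetical helper name, not part of Python; it assumes Python 3 with a binary file (hence the `b""` sentinel) and relies on `functools.partial` to bind the arguments.

```python
import io
from functools import partial

def iter_call(func, sentinel, args=(), kwargs=None):
    """Hypothetical helper: like iter(func, sentinel), but calls
    func(*args, **kwargs) each time instead of func()."""
    return iter(partial(func, *args, **(kwargs or {})), sentinel)

f = io.BytesIO(b"abcdefghij")                   # stands in for a real file
blocks = list(iter_call(f.read, b"", (4,)))     # same argument order as proposed
```

In practice `iter(partial(f.read, blocksize), b"")` gives the same effect without any helper, and without a lambda.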

    > Anyhow, here are some questions about your iterbytes():
    >
    > * Would it guarantee the chunks returned were read using a single
    > read? If blocksize were a multiple of the filesystem block size,
    > would it guarantee reads on block-boundaries (where possible)?


    I expect that the iterator's .next() would just get the result
    of f.read(blocksize), which makes no such guarantees.

    > * How would it handle EOF? Would it stop iterating immediately
    > after the first short read or would it wait for an empty return?


    Wait for empty return.

    > * What would the buffering behavior be? Could one interleave
    > calls to .next() on whatever iterbytes() returns with calls to
    > .read() on the file?


    I don't see why not. Iterbytes would just call read() and yield the
    result. You could even have multiple iterators going at once.
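The behavior described here can be sketched as a plain generator. This is only an illustration of the proposal, not an existing file method: `iterbytes` is written as a free function, Python 3 and binary mode are assumed, and `io.BytesIO` stands in for a real file.

```python
import io

def iterbytes(f, blocksize=1):
    """Hypothetical iterbytes(): yield f.read(blocksize) until it is empty."""
    while True:
        block = f.read(blocksize)
        if not block:       # empty return means EOF, so stop iterating
            return
        yield block

f = io.BytesIO(b"abcdefghij")
it1 = iterbytes(f, 2)
it2 = iterbytes(f, 3)

first = next(it1)       # consumes 2 bytes from the shared file position
second = next(it2)      # continues where it1 left off
direct = f.read(1)      # a plain read() interleaved with the iterators
rest = list(it1)        # it1 picks up whatever remains
```

Interleaving works in this sketch only because the generator keeps no buffer of its own; every block comes straight from `f.read()`, so all readers share one file position.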
     
    Paul Rubin, Jan 16, 2006
    #2

  3. On 15 Jan 2006 18:58:53 -0800, Paul Rubin wrote:

    >Jean-Paul Calderone writes:
    >> Which is only very slightly longer than your version. I would like
    >> it even more if iter() had been written with the impending doom of
    >> lambda in mind, so that this would work:
    >>
    >> for chunk in iter('', f.read, blocksize):
    >>     ...
    >>
    >> But it's a bit late now.

    >
    >Well, iter could be extended to take *args and **kwargs parameters, so
    >you'd say
    >
    > for chunk in iter(f.read, '', (blocksize,)): ...
    >

    Whatever "Which" refers to got snipped or may be in a post that hasn't
    become visible for me yet, but I assume Jean-Paul was referring to lambda use
    as in e.g. (untested):

    for chunk in iter(lambda frd=open('yerf', 'rb').read:frd(blocksize), ''): ...

    Does it not do what you want?

    Regards,
    Bengt Richter
     
    Bengt Richter, Jan 16, 2006
    #3
