Iteration on file reading

Discussion in 'Python' started by Paul Watson, Oct 2, 2003.

  1. Paul Watson

    Paul Watson Guest

    for line in sys.stdin:

    Does this statement cause all of stdin to be read before the loop begins?

    I may need to read several GB and I do not want to swamp the machine's
    memory.
    Paul Watson, Oct 2, 2003
    #1

  2. Andrew Dalke

    Andrew Dalke Guest

    Paul Watson
    > for line in sys.stdin:
    >
    > Does this statement cause all of stdin to be read before the loop begins?


    No. It will read a block of text at a time and break that block
    into lines. This gives great performance and is scalable to
    large files (so long as you can afford to keep that extra
    block around). However, it's lousy for interactive work.

    Andrew
    Andrew Dalke, Oct 3, 2003
    #2
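A minimal Python 3 sketch of the point above: iterating over a file object pulls lines on demand, so the loop only needs the current line plus the reader's internal buffer in memory. The helper name `count_lines` is hypothetical, and `io.StringIO` stands in for `sys.stdin` so the example runs on its own.

```python
import io

def count_lines(stream):
    """Count lines and characters without storing the lines themselves."""
    n_lines = 0
    n_chars = 0
    for line in stream:        # each step fetches the next line on demand
        n_lines += 1
        n_chars += len(line)   # the line includes its trailing "\n"
    return n_lines, n_chars

sample = io.StringIO("alpha\nbeta\n")   # stand-in for sys.stdin
totals = count_lines(sample)
print(totals)
```

Passing `sys.stdin` in place of `sample` should process piped input the same way.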

  3. Paul McGuire

    Paul McGuire Guest

    Try a generator. This will just read a line at a time.
    -- Paul

    <code>
    from sys import stdin

    def lineReader( strm ):
        while 1:
            yield strm.readline().rstrip("\n")

    for f in lineReader( stdin ):
        print ">>> " + f
    </code>

    "Paul Watson" <> wrote in message
    news:3f7ca9ea$...
    > for line in sys.stdin:
    >
    > Does this statement cause all of stdin to be read before the loop begins?
    >
    > I may need to read several GB and I do not want to swamp the machine's
    > memory.
    >
    >
    Paul McGuire, Oct 4, 2003
    #3
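One caveat with the generator as posted: `readline()` returns an empty string at end of file, so a bare `while 1` loop keeps yielding `""` forever once the input is exhausted. Below is a hedged Python 3 sketch that stops at end of file. The name `line_reader` is hypothetical, and `io.StringIO` stands in for `stdin`.

```python
import io

def line_reader(stream):
    """Yield lines one at a time, without the trailing newline."""
    while True:
        line = stream.readline()
        if line == "":          # readline() returns "" only at end of file
            return
        yield line.rstrip("\n")

sample = io.StringIO("a\n\nb")  # includes a blank line and no final newline
result = list(line_reader(sample))
print(result)
```

The end-of-file test happens before the newline is stripped, so a blank line in the input (which reads as `"\n"`) is still yielded instead of being mistaken for the end.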
  4. Andrew Dalke

    Andrew Dalke Guest

    Paul McGuire:
    > def lineReader( strm ):
    >     while 1:
    >         yield strm.readline().rstrip("\n")
    >
    > for f in lineReader( stdin ):
    >     print ">>> " + f


    You can simplify that with the iter builtin.

    for f in iter(stdin.readline, ""):
        print ">>> " + f

    (Hmm... maybe I should test it? Naaaaahhh.)

    Andrew
    Andrew Dalke, Oct 4, 2003
    #4
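Since the snippet above was posted untested, here is a small Python 3 check of the two-argument form of `iter`. It is only a sketch: `io.StringIO` stands in for `stdin`, and the sample text is made up.

```python
import io

stream = io.StringIO("first\n\nsecond\n")  # stand-in for sys.stdin

# iter(callable, sentinel) calls stream.readline() repeatedly and stops
# as soon as the call returns the sentinel "" (which means end of file).
lines = list(iter(stream.readline, ""))
print(lines)
```

Each item keeps its trailing newline, and the blank line comes back as `"\n"`, so it does not match the `""` sentinel and does not end the loop early.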
  5. Just

    Just Guest

    In article <3f7ca9ea$>,
    "Paul Watson" <> wrote:

    > for line in sys.stdin:
    >
    > Does this statement cause all of stdin to be read before the loop begins?


    Nope.

    Just
    Just, Oct 4, 2003
    #5
  6. Andrew Dalke wrote:

    > Paul McGuire:
    >> def lineReader( strm ):
    >>     while 1:
    >>         yield strm.readline().rstrip("\n")
    >>
    >> for f in lineReader( stdin ):
    >>     print ">>> " + f

    >
    > You can simplify that with the iter builtin.
    >
    > for f in iter(stdin.readline, ""):
    >     print ">>> " + f
    >
    > (Hmm... maybe I should test it? Naaaaahhh.)


    There is a difference in behavior: the readline method
    returns a line WITH a trailing \n, which then gets
    printed, giving a "double-spaced" effect. Sure, you
    can strip the \n in the loop body, but if you always
    want a sequence of newline-stripped lines, that is
    somewhat repetitious. If the use of readline is
    mandated (i.e., no direct looping on the file for one
    reason or another), my favourite way of expression is:

    def linesof(somefile):
        for line in iter(somefile.readline, ''):
            yield line.rstrip('\n')

    not as concise as either of the above, but, I think,
    a wee little bit clearer.


    Alex
    Alex Martelli, Oct 4, 2003
    #6
  7. "Paul Watson" <> wrote in message news:<3f7ca9ea$>...
    > for line in sys.stdin:
    >
    > Does this statement cause all of stdin to be read before the loop begins?
    >
    > I may need to read several GB and I do not want to swamp the machine's
    > memory.


    Have you considered simply inputting this into an interactive
    interpreter and seeing if it swamps the machine's memory?

    Jeremy
    Jeremy Fincher, Oct 4, 2003
    #7
  8. Andrew Dalke

    Andrew Dalke Guest

    Alex:
    > There is a difference in behavior: the readline method
    > returns a line WITH a trailing \n, which then gets
    > printed, giving a "double-spaced" effect. Sure, you
    > can strip the \n in the loop body, ....


    Quite true.

    As it turns out, the OP wanted to know about

    for line in sys.stdin:

    The post to which I replied changed the spec to
    remove the newline, but the main point was to
    use a generator ... which could, if desired, do extra
    work to get rid of the "\n". It could just as
    easily have converted everything to uppercase or done
    rot13 conversion on the text.

    My reply meant to point out that the iter builtin
    can be used to turn a "function returns the next
    object each time it's called and a sentinel when
    it's done" into an iterable. I just left out the extra
    work his code did since it wasn't needed by the OP.

    Andrew
    Andrew Dalke, Oct 4, 2003
    #8
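The general pattern described above, turning a "call it repeatedly until it returns a sentinel" function into an iterable, is not limited to `readline`. As an illustration only, this Python 3 sketch applies it to fixed-size binary reads. The helper name `read_chunks` and the chunk size are assumptions, not something from the thread.

```python
import io
from functools import partial

def read_chunks(stream, size):
    """Iterate over a binary stream in blocks of at most `size` bytes."""
    # partial(stream.read, size) is a zero-argument callable; read()
    # returns b"" at end of file, which serves as the sentinel.
    return iter(partial(stream.read, size), b"")

chunks = list(read_chunks(io.BytesIO(b"abcdefgh"), 3))
print(chunks)
```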
