Reading from stdin

Discussion in 'Python' started by Luis Zarrabeitia, Oct 7, 2008.

  1. I have a problem with this piece of code:

    ====
    import sys
    for line in sys.stdin:
        print "You said!", line
    ====

    Namely, it seems that stdin buffers the input, so there is no reply until
    a huge amount of text has been written. The iterator returned by xreadlines
    has the same behavior.

    The stdin.readline() function doesn't share that behaviour (it returns as soon
    as I hit 'enter').

    Is there any way to tell stdin's iterator not to buffer the input? Is this
    buffering part of the standard file protocol?

    --
    Luis Zarrabeitia (aka Kyrie)
    Fac. de Matemática y Computación, UH.
    http://profesores.matcom.uh.cu/~kyrie
    Luis Zarrabeitia, Oct 7, 2008
    #1

  2. Luis Zarrabeitia wrote:

    > I have a problem with this piece of code:
    >
    > ====
    > import sys
    > for line in sys.stdin:
    >     print "You said!", line
    > ====
    >
    > Namely, it seems that stdin buffers the input, so there is no reply
    > until a huge amount of text has been written. The iterator returned by
    > xreadlines has the same behavior.
    >
    > The stdin.readline() function doesn't share that behaviour (it returns as
    > soon as I hit 'enter').


    Perhaps line-buffering simply doesn't apply when you use a file object as an
    iterator.
    Lawrence D'Oliveiro, Oct 7, 2008
    #2

  3. Luis Zarrabeitia wrote:
    > I have a problem with this piece of code:
    >
    > ====
    > import sys
    > for line in sys.stdin:
    >     print "You said!", line
    > ====
    >
    > Namely, it seems that stdin buffers the input, so there is no reply until
    > a huge amount of text has been written. The iterator returned by xreadlines
    > has the same behavior.
    >
    > The stdin.readline() function doesn't share that behaviour (it returns as soon
    > as I hit 'enter').
    >
    > Is there any way to tell stdin's iterator not to buffer the input? Is this
    > buffering part of the standard file protocol?


    Not an answer to your actual question, but you can keep the 'for' loop
    instead of rewriting it with 'while' using the iter(function,
    sentinel) idiom:

    for line in iter(sys.stdin.readline, ""):
        print "You said!", line
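
The idiom above can be tried without a terminal. What follows is a minimal sketch, written for Python 3 rather than the Python 2 used in this thread; `echo_lines`, `fake_stdin`, and `captured` are hypothetical names, and an in-memory `io.StringIO` stands in for `sys.stdin`.

```python
import io


def echo_lines(stream, out):
    """Echo each line as soon as stream.readline() returns it."""
    count = 0
    # iter(callable, sentinel) keeps calling stream.readline() until it
    # returns the sentinel "" (which readline() uses to signal end of file).
    for line in iter(stream.readline, ""):
        out.write("You said! " + line)
        out.flush()  # push the reply out right away
        count += 1
    return count


# Hypothetical stand-ins for sys.stdin and sys.stdout.
fake_stdin = io.StringIO("hello\nworld\n")
captured = io.StringIO()
handled = echo_lines(fake_stdin, captured)
```

In an interactive script the call would presumably be `echo_lines(sys.stdin, sys.stdout)`. Because every iteration goes through `readline()`, the loop never touches the file iterator's own read-ahead.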

    George
    George Sakkis, Oct 7, 2008
    #3
  4. On Tuesday 07 October 2008 05:33:18 pm George Sakkis wrote:
    > Not an answer to your actual question, but you can keep the 'for' loop
    > instead of rewriting it with 'while' using the iter(function,
    > sentinel) idiom:
    >
    > for line in iter(sys.stdin.readline, ""):
    >     print "You said!", line


    You're right, it's not an answer to my actual question, but I had completely
    forgotten about the 'sentinel' idiom. Many thanks... I was trying to do it
    with 'itertools', obviously with no luck.

    The question still stands (how to turn off the buffering), but this is a nice
    workaround until it gets answered.

    --
    Luis Zarrabeitia (aka Kyrie)
    Fac. de Matemática y Computación, UH.
    http://profesores.matcom.uh.cu/~kyrie
    Luis Zarrabeitia, Oct 8, 2008
    #4
  5. On Tuesday 07 October 2008 05:12:28 pm Lawrence D'Oliveiro wrote:
    > Luis Zarrabeitia wrote:
    > > I have a problem with this piece of code:
    > >
    > > ====
    > > import sys
    > > for line in sys.stdin:
    > >     print "You said!", line
    > > ====
    > >
    > > Namely, it seems that stdin buffers the input, so there is no reply
    > > until a huge amount of text has been written. The iterator returned by
    > > xreadlines has the same behavior.
    > >
    > > The stdin.readline() function doesn't share that behaviour (it returns as
    > > soon as I hit 'enter').
    >
    > Perhaps line-buffering simply doesn't apply when you use a file object as
    > an iterator.


    You cut out the question you replied to, but left the rest. I got a bit
    confused until I remembered that *I* wrote the email :D.

    Anyway, I changed the program to:

    ===
    buff = file("test")
    for line in buff:
        print "you said", line
    ===

    where 'test' is a named pipe (mkfifo test), to see if the buffering also
    happened with a file object, and it does. As with stdin, nothing gets printed
    until the end of the file or until it receives a large number of lines, but
    using '.readline()' works immediately. So it seems that the buffering
    behavior happens by default on stdin and file. It makes sense, as type(stdin)
    is 'file'. I can't test it now, but I think sockets also do input
    buffering. I guess one doesn't notice it in the general case because disk
    reading happens too fast to see the delay.

    That raises a related question: is there any use case where it is better to
    block the input until a lot of data is received, even when the requested data
    is already available? Output buffering is understandable and desired (how do I
    turn it off, by the way?), and even that one won't block unless requested to
    (flush), but I can't find examples where input buffering helps.
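
On the side question about output buffering: two approaches that should work are starting the interpreter with the `-u` option, or flushing after each write. As far as I recall, the Python 2 description of `-u` also notes that it does not affect the internal buffering of file-object iterators such as `for line in sys.stdin`. The following is a small sketch of the flush approach, written for Python 3; `say` and `CountingStream` are hypothetical names, and the counting stream only exists to make the flushes visible.

```python
import io
import sys


def say(message, out=sys.stdout):
    """Write one line and flush it, so the reader sees it without delay."""
    out.write(message + "\n")
    out.flush()


class CountingStream(io.StringIO):
    """Hypothetical stand-in for a terminal that counts flush() calls."""

    def __init__(self):
        super().__init__()
        self.flushes = 0

    def flush(self):
        self.flushes += 1
        super().flush()


stream = CountingStream()
say("you said hello", out=stream)
say("you said world", out=stream)
```

Each call to `say` reaches the stream immediately instead of waiting for a buffer to fill. In Python 3, `print(message, flush=True)` has the same effect.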

    (full example with pipes)
    $ mkfifo test
    $ cat > test
    [write data here]

    On another console, just execute the script.

    Oh, I forgot:
    Linux 2.6.24, Python 2.5.2, Debian's standard build. I don't have Windows at
    hand to try it.

    --
    Luis Zarrabeitia (aka Kyrie)
    Fac. de Matemática y Computación, UH.
    http://profesores.matcom.uh.cu/~kyrie
    Luis Zarrabeitia, Oct 8, 2008
    #5
  6. On Oct 7, 8:13 pm, Luis Zarrabeitia wrote:
    > On Tuesday 07 October 2008 05:33:18 pm George Sakkis wrote:
    >
    > > Not an answer to your actual question, but you can keep the 'for' loop
    > > instead of rewriting it with 'while' using the iter(function,
    > > sentinel) idiom:
    >
    > > for line in iter(sys.stdin.readline, ""):
    > >     print "You said!", line
    >
    > You're right, it's not an answer to my actual question, but I had completely
    > forgotten about the 'sentinel' idiom. Many thanks... I was trying to do it
    > with 'itertools', obviously with no luck.
    >
    > The question still stands (how to turn off the buffering), but this is a nice
    > workaround until it gets answered.


    The closest answer I found comes from the docs
    (http://docs.python.org/library/stdtypes.html#file-objects):

    """
    In order to make a for loop the most efficient way of looping over the
    lines of a file (a very common operation), the next() method uses a
    hidden read-ahead buffer. As a consequence of using a read-ahead
    buffer, combining next() with other file methods (like readline())
    does not work right.
    """

    I guess the phrasing "hidden read-ahead buffer" implies that buffering
    cannot be turned off (or at least it is not intended to even if it's
    somehow possible).

    George
    George Sakkis, Oct 8, 2008
    #6
  7. On Tuesday 07 October 2008 11:27:19 pm George Sakkis wrote:
    > """
    > In order to make a for loop the most efficient way of looping over the
    > lines of a file (a very common operation), the next() method uses a
    > hidden read-ahead buffer. As a consequence of using a read-ahead
    > buffer, combining next() with other file methods (like readline())
    > does not work right.
    > """
    >
    > I guess the phrasing "hidden read-ahead buffer" implies that buffering
    > cannot be turned off (or at least it is not intended to even if it's
    > somehow possible).


    Hmm. I wonder what those optimizations look like. Apparently, readline() cannot
    read from that read-ahead buffer, and that by itself sounds bad. Currently,
    if you loop a few times with next, you cannot use readline afterwards until
    you seek() to an absolute position.

    Actually, I think I may be replying to myself here. I imagine that 'next' will
    read a block instead of a character, and look for lines in there, and as the
    underlying OS likely blocks until the whole block is read, 'next' cannot
    avoid it. That doesn't explain, though, why readline() can't use next's
    buffer, why next doesn't have a sensible timeout for interactive sessions
    (unless the OS doesn't support it), and why the readahead cannot be turned
    off.
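
The guess above can be illustrated with a toy read-ahead iterator. This is only a sketch of the idea, not CPython's actual implementation; `readahead_lines` and `block_size` are hypothetical names, the code is written for Python 3, and an in-memory `io.StringIO` stands in for the pipe.

```python
import io


def readahead_lines(stream, block_size=8192):
    """Yield lines from stream, fetching block_size characters at a time."""
    pending = ""
    while True:
        # On a real pipe this is the call that can wait: a blocking read
        # returns only once a full block has arrived or EOF is reached.
        block = stream.read(block_size)
        if not block:
            break
        pending += block
        lines = pending.split("\n")
        pending = lines.pop()  # the last piece is an incomplete line
        for line in lines:
            yield line + "\n"
    if pending:
        yield pending  # final line without a trailing newline


source = io.StringIO("one\ntwo\nthree\n")
lines = readahead_lines(source, block_size=8)
first = next(lines)             # reads a whole block, hands back one line
after_mix = source.readline()   # bypasses the iterator's private buffer
```

After a single `next()`, the iterator has already consumed "one" and "two" from the stream, so the direct `readline()` call returns the third line. That is the kind of mismatch the documentation warns about when `next()` is combined with other file methods.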

    I think I'll have to stick for now with the iter(function,sentinel) solution.

    And I may try to find next()'s implementation... I guess I'll be downloading
    python's source when my bandwidth allows it (or find it on a browseable
    repository)

    On a related note, help(file.read) shows:

    =====
    read(...)
    read([size]) -> read at most size bytes, returned as a string.

    If the size argument is negative or omitted, read until EOF is reached.
    Notice that when in non-blocking mode, less data than what was requested
    may be returned, even if no size parameter was given.
    =====

    But it doesn't say how to put the file object in non-blocking mode. (I was
    trying to put the file object in non-blocking mode to test next()'s
    behavior). Any ideas?
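
One possibility on POSIX systems, sketched here under assumptions rather than offered as a tested answer to the question, is to set the `O_NONBLOCK` flag on the underlying file descriptor with the `fcntl` module. `set_nonblocking` is a hypothetical helper, the code is written for Python 3, and a pipe from `os.pipe()` stands in for stdin. How the file iterator's read-ahead behaves on such a descriptor is not shown here.

```python
import errno
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to the status flags of file descriptor fd (POSIX only)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


read_fd, write_fd = os.pipe()
set_nonblocking(read_fd)

try:
    # Nothing has been written yet, so a blocking read would wait here.
    os.read(read_fd, 1024)
    would_block = False
except OSError as exc:
    would_block = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)

os.write(write_fd, b"hello\n")
data = os.read(read_fd, 1024)  # data is available now, so this returns it

os.close(read_fd)
os.close(write_fd)
```

For an existing file object, the descriptor would come from its `fileno()` method, for example `set_nonblocking(sys.stdin.fileno())`.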

    --
    Luis Zarrabeitia (aka Kyrie)
    Fac. de Matemática y Computación, UH.
    http://profesores.matcom.uh.cu/~kyrie
    Luis Zarrabeitia, Oct 8, 2008
    #7
  8. Gabriel Genellina, Oct 14, 2008
    #8
