popen2 with large input

Discussion in 'Python' started by cherico, Jan 29, 2004.

  1. cherico

    cherico Guest

    from popen2 import popen2

    r, w = popen2 ( 'tr "[A-Z]" "[a-z]"' )
    w.write ( t ) # t is a text file of around 30k bytes
    w.close ()
    text = r.readlines ()
    print text
    r.close ()

    This simple script halted on

    w.write ( t )

    Anyone knows what the problem is?
     
    cherico, Jan 29, 2004
    #1

  2. Eric Brunel

    Eric Brunel Guest

    cherico wrote:
    > from popen2 import popen2
    >
    > r, w = popen2 ( 'tr "[A-Z]" "[a-z]"' )
    > w.write ( t ) # t is a text file of around 30k bytes
    > w.close ()
    > text = r.readlines ()
    > print text
    > r.close ()
    >
    > This simple script halted on
    >
    > w.write ( t )
    >
    > Anyone knows what the problem is?


    Yep: deadlock... A pipe only holds a limited amount of data: once it is full,
    a write blocks until the process at the other end reads from it, and a read
    blocks until the other end has written something. If you try the command
    "tr '[A-Z]' '[a-z]'" interactively, you'll see that every time tr receives a
    line, it *immediately* outputs the converted line. So if you write a file
    with several lines to the pipe, tr will start writing its output while you
    are still sending input, and will get stuck as soon as its output pipe fills
    up, since your program is not reading from it. It then stops reading its
    input, so your program gets stuck because it can't write to the pipe any
    more. And they'll wait for each other until the end of time...

    If you really want to use the "tr" command for this stuff, you'd better send
    your text line by line and read the result immediately, as in:

    text = ''
    for line in t.splitlines(1):
        w.write(line)
        w.flush() # Mandatory because of output buffering - see below
        text += r.readline()
    w.close()
    r.close()

    It *may* work better, but you cannot be sure: in fact, you just can't know
    exactly when tr will actually output the converted text. Even worse: since
    output is usually buffered, you'll only see the output from tr when its standard
    output is flushed, and you can't know when that will be...

    (BTW, the script above does not work on my Linux box: the first r.readline()
    never returns...)

    So the conclusion is: don't use pipes unless you're really forced to. They're a
    hell to use, since you never know how to synchronize them.

    BTW, if the problem you posted is your real problem, why on earth don't you do:
    text = t.lower()
    ???

    HTH
    --
    - Eric Brunel <eric dot brunel at pragmadev dot com> -
    PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com
     
    Eric Brunel, Jan 29, 2004
    #2

  3. Jeff Epler

    Jeff Epler Guest

    The connection to the child process created by the popen family has
    some inherent maximum size for data "in flight". I'm not sure how to
    find out what that value is, but it might be anywhere from a few bytes
    to a few K.

    So tr starts to write its output as it gets input, but you won't read
    its output before you've written all of your input to it. If the size of
    tr's output is bigger than the size of the buffer for tr's unread output,
    you'll deadlock.
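
    The "in flight" limit can be measured empirically. The following is only a
    sketch, assuming a POSIX system and a Python recent enough to have
    os.set_blocking (3.5 or later); pipe_capacity is a hypothetical helper name,
    not something from the popen2 module. It fills an empty pipe with a
    nonblocking writer and counts the bytes accepted before a write would block.

```python
import os


def pipe_capacity(chunk_size=1024):
    """Return how many bytes an empty pipe accepts before a write would block."""
    r, w = os.pipe()
    try:
        # With a nonblocking write end, a full pipe raises BlockingIOError
        # instead of hanging the way w.write(t) did in the original script.
        os.set_blocking(w, False)
        total = 0
        chunk = b"x" * chunk_size
        while True:
            try:
                total += os.write(w, chunk)
            except BlockingIOError:
                return total
    finally:
        os.close(r)
        os.close(w)
```

    The result depends on the operating system and kernel version. As far as I
    know, current Linux kernels default to 64 KiB and older ones used a single
    4 KiB page, which would be consistent with a 30k write getting stuck.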

    As an aside, the particular problem you pose can be solved with Python's
    str.translate method. If the actual goal is to "work like tr", then use
    that instead and forget about popen.
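
    A minimal sketch of the str.translate route, written for Python 3, where
    the table is built with str.maketrans (in the Python 2 of this thread the
    equivalent was string.maketrans). TO_LOWER and tr_lower are names made up
    for the example.

```python
import string

# Translation table mapping each uppercase ASCII letter to its lowercase
# form, i.e. the same mapping as tr "[A-Z]" "[a-z]".
TO_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def tr_lower(text):
    """Lowercase ASCII letters only, leaving every other character alone."""
    return text.translate(TO_LOWER)
```

    Unlike str.lower(), this leaves non-ASCII letters untouched, which is
    closer to what the tr command in the question does.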

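    Anyway, to solve the popen2 problem, you'll need to write something like this:
    [untested]
    import fcntl, os, select
    from popen2 import popen2

    def getoutput(command, input):
        r, w = popen2(command)
        rfd, wfd = r.fileno(), w.fileno()
        # Nonblocking write end: os.write() takes only what fits in the pipe
        flags = fcntl.fcntl(wfd, fcntl.F_GETFL)
        fcntl.fcntl(wfd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        rr = [rfd]; ww = [wfd]
        output = []
        while rr:
            # Block until the child can take more input or has output for us
            _r, _w, _ = select.select(rr, ww, [])
            if _w:
                # Write some stuff from input to w
                n = os.write(wfd, input[:4096])
                input = input[n:]
                if not input:
                    # Nothing left: closing w gives the child EOF on its stdin
                    w.close(); ww = []
            if _r:
                # Read some stuff into output
                data = os.read(rfd, 4096)
                if data:
                    output.append(data)
                else:
                    # Nothing to read: the child closed its output
                    rr = []
        if ww:
            w.close()  # child quit before reading everything: probably an error
        r.close()
        return "".join(output)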
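
    For readers on a later Python: the subprocess module, which replaced
    popen2, has Popen.communicate(), and that method interleaves the writing
    and reading for you. A minimal sketch, assuming Python 3 and bytes input;
    run_filter is a hypothetical name, and the command is passed as an argument
    list rather than a shell string.

```python
import subprocess


def run_filter(argv, data):
    """Feed `data` (bytes) to the command and return everything it prints."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() writes to stdin and drains stdout concurrently, then
    # closes stdin and waits for the child, so neither side can stall the other.
    out, _ = proc.communicate(data)
    return out
```

    With the command from the question this would be called as
    run_filter(["tr", "[A-Z]", "[a-z]"], t), with t read as bytes.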

    You could also write 'input' into a temporary file and use
    commands.getoutput() or os.popen(.., "r").
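
    The temporary-file idea could look roughly like this on a later Python.
    This is a sketch under assumptions, not the commands.getoutput() call
    itself: it uses the Python 3 subprocess and tempfile modules, and
    run_with_tempfile is a made-up name. Because the child reads its input
    from a file instead of a pipe, only one pipe is left and the parent just
    has to read it.

```python
import subprocess
import tempfile


def run_with_tempfile(argv, data):
    """Spool `data` (bytes) to a temporary file and use it as the child's stdin."""
    with tempfile.TemporaryFile() as tmp:
        tmp.write(data)
        tmp.flush()
        tmp.seek(0)  # rewind so the child starts reading at the beginning
        result = subprocess.run(argv, stdin=tmp, stdout=subprocess.PIPE, check=True)
    return result.stdout
```

    The cost is an extra copy of the input on disk, which hardly matters for a
    30k text.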

    Jeff
     
    Jeff Epler, Jan 29, 2004
    #3
