Capturing stdout incrementally

Moosebumps

I have a large set of Python scripts that interface with command line
utilities (primarily Perforce). I am currently capturing ALL the text
output in order to get results and such. I am using the popen functions to
get the stdout, stderr streams.

However, some of the operations take a really long time (copying large files
over the network). If you run Perforce directly (or through os.system,
which doesn't return text output), it shows which files are getting copied,
one at a time. However, if I'm calling it through Python's popen, it
appears to hang while it copies all the files, then suddenly all the text
output appears at once after the operation is done.

Does anyone know a way around this? It is problematic because people think
that the program has hung, when it is really just taking a long time. I
would like the normal stdout to be printed on the screen as it is normally
(but also captured by my Python script simultaneously).

I am on Windows by the way, so the utilities are printing to the windows
command shell.

thanks for any advice,
MB
 
Josiah Carlson

Moosebumps said:
I am on Windows by the way, so the utilities are printing to the windows
command shell.

By default, popen in Windows buffers everything coming from the called
app. I believe you can use various Windows system calls in pywin32 in
order to get line buffered output, but I've never done it, so have
little additional advice other than "check out pywin32".

- Josiah
 
David Bolen

Josiah Carlson said:
By default, popen in Windows buffers everything coming from the called
app. I believe you can use various Windows system calls in pywin32 in
order to get line buffered output, but I've never done it, so have
little additional advice other than "check out pywin32".

Typically this behavior on Windows (and at least in my experience on
Unix) comes not from popen buffering its input from the called
program, but the called program buffering its output when it isn't a
normal console. I'm not sure if this is default behavior from the C
library, but it's particularly true of cross-platform code that often
uses isatty() to check.

You can even see this behavior with Python itself (e.g., if you pipe a
copy of python to run a script).
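
A minimal sketch of that observation, assuming a Python 3 interpreter with the subprocess module (the helper name child_sees_tty is made up for illustration). The child reports whether its own stdout looks like a terminal; when it is connected to a pipe, the answer is no, which is the condition many programs use to switch to block buffering.

```python
import subprocess
import sys


def child_sees_tty():
    """Run a child Python with stdout on a pipe and return what it reports."""
    result = subprocess.run(
        # The child prints the result of isatty() on its own stdout.
        [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"],
        stdout=subprocess.PIPE,  # a pipe, not a console
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("child thinks stdout is a tty:", child_sees_tty())
```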

In general, the only way to fix this is through the program being
called, and not the program doing the calling. In some cases, the
program may have a way to disable buffering (for example, the "-u"
command line option with Python), or if it's something you have source
for you can explicitly disable output buffering. For example, Perl
has no command line option, but you can add code to the script to
disable output buffering (setting $| to a nonzero value).
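
As a rough illustration of the "-u" case, here is a sketch that echoes a child's output line by line while also capturing it. It assumes Python 3 and the subprocess module, which postdates this thread; the helper name run_and_tee and the stand-in child command are hypothetical. It only helps when the child actually flushes its output, which is why the stand-in child is started with "-u".

```python
import subprocess
import sys


def run_and_tee(cmd):
    """Echo each line of the child's output as it arrives and also keep it."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        text=True,
        bufsize=1,                 # line-buffered on the parent's side only
    )
    captured = []
    for line in proc.stdout:       # yields as soon as the child emits a line
        sys.stdout.write(line)
        sys.stdout.flush()
        captured.append(line)
    returncode = proc.wait()
    return returncode, captured


if __name__ == "__main__":
    # Stand-in for a slow tool; "-u" makes the child's stdout unbuffered.
    child = [
        sys.executable, "-u", "-c",
        "import time\n"
        "for i in range(3):\n"
        "    print('copied file', i)\n"
        "    time.sleep(0.5)\n",
    ]
    code, lines = run_and_tee(child)
    print("exit code:", code, "captured lines:", len(lines))
```

Without "-u" the same child would typically deliver all three lines at once when it exits, which matches the behavior described in the original question.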

On Unix, a classic way around code for which you have no source is to
run it under expect or some other pty-simulating code rather than a
simple pipe with popen. I'm not sure if there's a good pty/expectish
module that works well under Windows (where simulating what a
"console" is can be tougher).
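
For the Unix side, a sketch of the pty approach using the standard pty module in Python 3 (Unix only; the helper name run_under_pty is invented for this example). The child's stdout is attached to a pseudo-terminal, so a program that checks isatty() should believe it is writing to a terminal.

```python
import os
import pty
import subprocess
import sys


def run_under_pty(cmd):
    """Run cmd with its output attached to a pseudo-terminal (Unix only)."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=slave_fd,
        stderr=slave_fd,
    )
    os.close(slave_fd)  # the parent only needs the master end
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            # Linux reports EIO once the child has closed the slave end.
            break
        if not data:
            break
        chunks.append(data)
    os.close(master_fd)
    proc.wait()
    # The terminal layer usually turns "\n" into "\r\n" in the output.
    return b"".join(chunks).decode(errors="replace")


if __name__ == "__main__":
    out = run_under_pty(
        [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
    )
    print(repr(out))
```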

I've only got an evaluation copy of perforce lying around, but I don't
immediately see any way to control its buffering via command line
options.

-- David
 
Cameron Laird

David Bolen said:
[...]
In general, the only way to fix this is through the program being
called, and not the program doing the calling. In some cases, the
[... still more true stuff ...]
On Unix, a classic way around code for which you have no source is to
run it under expect or some other pty-simulating code rather than a
simple pipe with popen. I'm not sure if there's a good pty/expectish
module that works well under Windows (where simulating what a
"console" is can be tougher).
[...]

This press release just came today:
VANCOUVER, BC - April 6, 2004 - ActiveState, a
leading provider of professional tools for programmers,
today announced Expect for Windows 1.0, releasing the
real power of Expect to the Windows platform. ActiveState
Expect for Windows is up-to-date, quality assured, and ...
<URL: http://activestate.com/Products/Expect/ >.

It's not just a press release, by the way;
it's a real product, and a good one, if
limited by the difficulties of Windows.
 
Josiah Carlson

David Bolen said:
Typically this behavior on Windows (and at least in my experience on
Unix) comes not from popen buffering its input from the called
program, but the called program buffering its output when it isn't a
normal console. I'm not sure if this is default behavior from the C
library, but it's particularly true of cross-platform code that often
uses isatty() to check.

Your experience in unix has colored your impression of popen on Windows.
The trick with Windows is that pipes going to/from apps are not real
file handles, nor do they support select calls (Windows select comes
from Windows' underlying socket library). If they did, then the Python
2.4 module Popen5 would not be required.

Popen5 is supposed to allow the combination of os.popen and os.system on
all platforms. You get pollable pipes and the signal that the program
ended with. As for how they did it on Windows, I believe they are using
pywin32 or ctypes.

- Josiah
 
David Bolen

Josiah Carlson said:
Your experience in unix has colored your impression of popen on
Windows. The trick with Windows is that pipes going to/from apps are
not real file handles, nor do they support select calls (Windows
select comes from Windows' underlying socket library). If they did,
then the Python 2.4 module Popen5 would not be required.

Pipes under Windows (at least for the built-in os.popen* calls) are
true OS file handles (in terms of Windows OS system handles), created
via a CreatePipe call which are connected to a child process created
with CreateProcess. You are correct that you can't select on them,
but that's not because they aren't real file handles, but because
Winsock under Windows is the odd man out. Sockets in Winsock aren't
equivalent to other native OS handles (they aren't the "real" file
handles), and select was only written to work with sockets. That's
also why sockets can't directly play in all of the other Windows
synchronization mechanisms (such as WaitForSingleObject and
WaitForMultipleObjects); you have to tie a socket to a different OS
handle first, and then use that handle in the synchronization call.

Josiah Carlson said:
Popen5 is supposed to allow the combination of os.popen and os.system
on all platforms. You get pollable pipes and the signal that the
program ended with. As for how they did it on Windows, I believe they
are using pywin32 or ctypes.

I'm certainly all for additional portability for child process
management - although the internal os.popen* calls under Windows
already give you the exit code of the child process (which does get
some unique values when the process terminates abruptly), just without
any simulated signal bits.

But implementing popen5 under Windows will still run into the same
problem (that of select simply not working for OS handles other than
sockets), so I agree that will be a challenge, since presumably
they'll want to return a handle that looks like a Python file and
thus does have to have the underlying OS handle for basic I/O to
work.
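
One common workaround for the lack of select on pipes, sketched here with Python 3's threading and queue modules, is to let a background thread do the blocking reads and hand lines to the main thread through a queue. This is not what popen5 does or proposes; the helper names start_reader and drain are hypothetical, and the child still has to flush its own output for lines to arrive promptly.

```python
import queue
import subprocess
import sys
import threading


def start_reader(cmd):
    """Start cmd and a thread that moves its output lines into a queue."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,
    )
    lines = queue.Queue()

    def pump():
        for line in proc.stdout:  # blocking reads happen in this thread
            lines.put(line)
        lines.put(None)           # sentinel: the child closed its output

    threading.Thread(target=pump, daemon=True).start()
    return proc, lines


def drain(proc, lines, poll_interval=0.1):
    """Poll the queue with a timeout until the sentinel arrives."""
    captured = []
    while True:
        try:
            item = lines.get(timeout=poll_interval)
        except queue.Empty:
            continue              # nothing yet; other work could go here
        if item is None:
            break
        captured.append(item)
    proc.wait()
    return captured


if __name__ == "__main__":
    proc, q = start_reader([sys.executable, "-u", "-c", "print('a'); print('b')"])
    print(drain(proc, q))
```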

Last I saw about the module was in January with a PEP announcement on
python-dev, but the PEP still indicates no Windows support (the
example was built on top of the popen2 module), and the python-dev
discussion led to proposing a start with a pure Python module. I
can't find any code in the current CVS tree related to popen5, so I'm
not sure of the status.

Of course, none of this changes the original question in this thread
in that if the child process is going to select output buffering based
on the "tty" or "console" aspect of the pipe to which its output is
connected, you can't override that from the calling program, but have
to deal with the program being executed (or more properly fake it out
so that the controlling pipe appears more tty/console like). I doubt
any popen* changes will affect that.

-- David
 
