A better popen2

P

I've written a couple of apps that required
running a command and grabbing the output,
and I've found the existing interfaces problematic for this.

I think the proliferation of functions and classes
in the popen2 module illustrates the problem
(popen2.{popen2,popen3,popen4,Popen3,Popen4}).
Now if I want to read both stdout and stderr
separately, then it's awkward, to say the least,
to implement that without deadlocking using
the popen2 module. Also, the multiplexing of
stdout and stderr in popen4 and commands.getoutput
is not usually what one requires, IMHO.

There are external solutions like the getCommandOutput recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52296
which has problems that I've commented on there.
There are also very complex solutions like "subproc" from
Ken Manheimer and "task" from Rob Hooft.

Therefore I bit the bullet and wrote my own,
with as simple an interface as I thought possible:
http://www.pixelbeat.org/libs/subProcess.py

Perhaps this could be included in commands.py for e.g.?

Any comments appreciated.

cheers,
Pádraig.

p.s. sorry about the previous case of trigger finger
 
Donn Cave

Pádraig wrote:
There are external solutions like the getCommandOutput recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52296
which has problems that I've commented on there.
There are also very complex solutions like "subproc" from
Ken Manheimer and "task" from Rob Hooft

Therefore I bit the bullet and wrote my own,
with as simple an interface as I thought possible:
http://www.pixelbeat.org/libs/subProcess.py

Perhaps this could be included in commands.py for e.g.?

It looks pretty good to me. A couple of minor points:

- dup2(a, sys.stdout.fileno())

You want 1, not sys.stdout.fileno(). They should be
the same, unless someone has been monkeying around
with stdout, and in that case you must use 1. (And
of course likewise with stderr and stdin, mutatis
mutandis.) After the exec, whatever stdout was is
irrelevant and the new command will write to 1, read
from 0, etc.
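
To make the point about descriptor 1 concrete, here is a minimal sketch in present-day Python 3. It is not the code from subProcess.py (which is not reproduced in this thread); the function name capture_stdout is invented for illustration, and it assumes a Unix system where os.fork is available.

```python
import os


def capture_stdout(argv):
    """Run argv in a child process and return the bytes it writes to descriptor 1.

    Hypothetical helper for illustration only; Unix-only because of os.fork().
    """
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: wire the pipe to descriptor 1 itself. The exec'd program
        # knows nothing about Python's sys.stdout; it simply writes to 1.
        try:
            os.close(read_end)
            if write_end != 1:
                os.dup2(write_end, 1)
                os.close(write_end)
            os.execvp(argv[0], argv)
        finally:
            # Only reached if the exec failed.
            os._exit(127)
    # Parent: close our copy of the write end so we see end-of-file.
    os.close(write_end)
    chunks = []
    while True:
        data = os.read(read_end, 4096)
        if not data:
            break
        chunks.append(data)
    os.close(read_end)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

The literal 1 is the whole point: even if sys.stdout has been replaced inside the Python process, the child still gets the pipe on the descriptor that the new command will actually use.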

- execvp('/bin/sh', ['sh', '-c', cmd])

I think that could be just execv(), since you've
supplied a path already. But I really prefer the
way Popen3 does it, where the input parameter is
argv, not a shell command (and you do use execvp.)
Of course you can supply ['sh', '-c', cmd] as that
parameter, but for me it's more often the case that
the parameters are already separate, and combining
them into a shell command is unnecessary and risky.
(Don't support both, like Popen3 does - that was a
mistake that seems to routinely leave people confused
about how it works.)
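
As a rough illustration of why combining parameters into a shell command is risky, the sketch below passes the same value both ways. It uses today's standard subprocess module rather than Popen3, and the value of filename is made up for the example.

```python
import subprocess

# A made-up argument that happens to contain a shell metacharacter.
filename = "notes; echo INJECTED"

# argv style: the value reaches the command as a single, untouched argument.
argv_out = subprocess.run(
    ["echo", filename], capture_output=True, text=True
).stdout

# Shell style: the value is pasted into a command string, so the shell
# parses the semicolon and runs a second command.
shell_out = subprocess.run(
    ["sh", "-c", "echo " + filename], capture_output=True, text=True
).stdout

print(repr(argv_out))   # 'notes; echo INJECTED\n'
print(repr(shell_out))  # 'notes\nINJECTED\n'
```

With separate parameters there is nothing to quote and nothing for the shell to reinterpret, which is the case Donn describes as the more common one.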

And I haven't really tried to verify the logic or anything,
but it does look like the right idea to me.

I will at least add it to my collection. I don't have
it at hand and haven't looked at it for a while, but
at one time I was thinking I would put together every
one of these things that has come across comp.lang.python,
might be a dozen or so but I don't think I have them all
yet. (I don't think I have "task", good find.)

The most conspicuous recent one was (I think) called
popen5, and rumor has it, will be standard with 2.4,
and will have the select functionality you need, not
only on UNIX but also on Windows, where it isn't so
trivial a feat.

Other than the platform dependency issues, I think
the main reason this wasn't in the core library from
the beginning is that the problem is more complicated
than the solution, if you know what I mean. Or, each
programmer's itch seems to be in a slightly different
place.

Donn Cave, (e-mail address removed)
 
P

Donn said:
It looks pretty good to me. A couple of minor points:

- dup2(a, sys.stdout.fileno()) -> dup2(a, 1)

Doh! Of course. I hate magic numbers,
but what I did was certainly wrong.

- execvp('/bin/sh', ['sh', '-c', cmd])

I think that could be just execv(), since you've
supplied a path already. But I really prefer the
way Popen3 does it, where the input parameter is
argv, not a shell command (and you do use execvp.)

I just copied the way Popen3 did it.

Of course you can supply ['sh', '-c', cmd] as that
parameter, but for me it's more often the case that
the parameters are already separate, and combining
them into a shell command is unnecessary and risky.
(Don't support both, like Popen3 does - that was a
mistake that seems to routinely leave people confused
about how it works.)

Fair enough.

And I haven't really tried to verify the logic or anything,
but it does look like the right idea to me.

I will at least add it to my collection. I don't have
it at hand and haven't looked at it for a while, but
at one time I was thinking I would put together every
one of these things that has come across comp.lang.python,
might be a dozen or so but I don't think I have them all
yet. (I don't think I have "task", good find.)

The most conspicuous recent one was (I think) called
popen5, and rumor has it, will be standard with 2.4,
and will have the select functionality you need, not
only on UNIX but also on Windows, where it isn't so
trivial a feat.

Cool! I missed that: http://www.python.org/peps/pep-0324.html
I had a 10 second look at process.Popen.communicate() and it seems
to do much the same as I did; however, it's missing a timeout
parameter like the one I implemented, which I think is important.

Other than the platform dependency issues, I think
the main reason this wasn't in the core library from
the beginning is that the problem is more complicated
than the solution, if you know what I mean. Or, each
programmer's itch seems to be in a slightly different
place.

I agree that the appropriate interface is hard to define;
however, I think we all agree that popen2 is definitely not it.

thanks a million,

Pádraig.
 
Donn Cave

Donn Cave wrote: ....

doh! of course. I hate magic numbers
but what I did was certainly wrong.

Hm, I feel like there's a lot of not-working stuff out
there because of someone's inability to get comfortable
with the UNIX system - file descriptors, ioctls, etc.
It's beautifully elegant to me. Is an arbitrary name
any better than an arbitrary number?

Cool! I missed that: http://www.python.org/peps/pep-0324.html
I had a 10 second look at process.Popen.communicate() and it seems
to do much the same as I did, however it's missing a timeout
parameter like I implemented which I think is important.

Well, I can't think of a time when I have needed it,
right off hand, but I guess when you need it, you need it.

Too bad he changed the name to "process", since from a
quick read it looks to me to support only command spawning,
not processes in general.

I agree that the appropriate interface is hard to define
however I think we all agree that popen2 is definitely not it.

I don't know. It has some problems that generate a
regular flow of questions on comp.lang.python, but it's
useful enough for the limited range of things that are
likely to work anyway. If you want to solve a really
intractable pipe problem, see if you can get your stuff
working with what existing pty support there is in the
core distribution, for situations where the problem is
that the spawned command block-buffers its output when
talking to a pipe and the whole system deadlocks.
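
For readers who have not used the pty support Donn mentions, here is a minimal sketch of the idea in modern Python 3: the child's output is attached to a pseudo-terminal instead of a pipe, so a program that checks for a terminal sees one. The helper name run_on_pty is invented for illustration, and the code assumes a Unix system.

```python
import os
import pty
import subprocess
import sys


def run_on_pty(argv):
    """Run argv with stdout/stderr on a pseudo-terminal and return its output.

    Hypothetical helper for illustration only.
    """
    master, slave = pty.openpty()
    proc = subprocess.Popen(
        argv, stdin=subprocess.DEVNULL, stdout=slave, stderr=slave
    )
    os.close(slave)  # the parent only needs the master side
    chunks = []
    while True:
        try:
            data = os.read(master, 4096)
        except OSError:
            break  # Linux reports an I/O error once the child side is closed
        if not data:
            break
        chunks.append(data)
    os.close(master)
    proc.wait()
    return b"".join(chunks)


# The child reports whether it thinks it is talking to a terminal.
output = run_on_pty(
    [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
)
```

Note that the terminal layer usually translates newlines, so the captured bytes typically end in "\r\n" rather than "\n".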

Where I've been motivated to write my own, the result
that I've actually used more than once has been a
sort of getstatusoutput() function, but raising an
exception instead of returning a status. Like,

try:
    text = cmdmodule.invoke(cmd)
except cmdmodule.Error, value:
    print >> sys.stderr, repr(cmd[0]), 'aborted:', value
    sys.exit(1)

If I'm not mistaken, that's how we do things in Python,
we don't return status and expect you to check it. But
you need select to do it (because the error value comes
from unit 2.) Should be possible to build that on top
of your module, haven't tried it though.
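
One way such an invoke() could look is sketched below in modern Python 3. Donn's cmdmodule is not shown in the thread, so the names invoke and Error only mirror his snippet; the body is an assumption, built on today's subprocess module with select used to drain stdout and stderr without deadlocking (Unix-only, since select is applied to pipes).

```python
import os
import select
import subprocess


class Error(Exception):
    """Raised when the command exits with a non-zero status."""


def invoke(argv):
    """Run argv, return its stdout as text, or raise Error carrying its stderr.

    Hypothetical sketch; not the implementation Donn refers to.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out_fd, err_fd = proc.stdout.fileno(), proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    pending = {out_fd, err_fd}
    # Read whichever stream has data, so neither pipe can fill up and block.
    while pending:
        readable, _, _ = select.select(list(pending), [], [])
        for fd in readable:
            data = os.read(fd, 4096)
            if data:
                chunks[fd].append(data)
            else:
                pending.discard(fd)  # end of file on this stream
    status = proc.wait()
    proc.stdout.close()
    proc.stderr.close()
    out = b"".join(chunks[out_fd]).decode(errors="replace")
    err = b"".join(chunks[err_fd]).decode(errors="replace").strip()
    if status != 0:
        raise Error(err or "exit status %d" % status)
    return out
```

The caller then writes the try/except shown above (with Python 3 syntax, "except Error as value:") instead of checking a returned status.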

Donn Cave, (e-mail address removed)
 
Peter Astrand

Cool! I missed that: http://www.python.org/peps/pep-0324.html
I had a 10 second look at process.Popen.communicate() and it seems
to do much the same as I did, however it's missing a timeout
parameter like I implemented which I think is important.

The current communicate() only supports a "read and write everything"
mode. communicate() won't return until all data is exchanged, and the
process is finished. In this case, I guess a timeout isn't very useful.

I've recently been in contact with Mark Pettit, who was interested in a
communicate-like method which could be called more than once. In this
case, a timeout is useful. We haven't decided on anything yet, but I've
thought about how to make this functionality available in communicate().
One idea is to add a few new keyword arguments, so that it would look
like:

def communicate(input=None, timeout=None,
                all=True, linemode=False):

If all is true, then it would behave just like before. If all is false, it
will return as soon as any stdout or stderr data is available, or the
timeout has gone off. Additionally, if linemode is true, then it will
return as soon as a complete line has been read, from stdout/stderr.
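
To give a feel for the proposed semantics, here is a rough sketch in modern Python 3. It is not Peter's patch, which is not shown in the thread: read_available is an invented free function, all_data stands in for the proposed "all" keyword (renamed to avoid shadowing the builtin), input and linemode are left out, and it assumes a Unix system and a process started with pipes for stdout and stderr.

```python
import os
import select
import subprocess
import time


def read_available(proc, timeout=None, all_data=True):
    """Return (stdout_bytes, stderr_bytes) read from proc.

    all_data=True:  read until both streams close or the timeout goes off.
    all_data=False: return as soon as any data has arrived, or on timeout.
    Hypothetical sketch of the idea only.
    """
    out_fd, err_fd = proc.stdout.fileno(), proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    pending = {out_fd, err_fd}
    deadline = None if timeout is None else time.monotonic() + timeout
    while pending:
        wait = None if deadline is None else max(0.0, deadline - time.monotonic())
        readable, _, _ = select.select(list(pending), [], [], wait)
        if not readable:
            break  # the timeout went off
        got_data = False
        for fd in readable:
            data = os.read(fd, 4096)
            if data:
                chunks[fd].append(data)
                got_data = True
            else:
                pending.discard(fd)  # end of file on this stream
        if got_data and not all_data:
            break
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd])


# Example: the first call returns early, the second collects the rest.
proc = subprocess.Popen(
    ["sh", "-c", "echo first; sleep 1; echo second"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
first, _ = read_available(proc, timeout=5, all_data=False)
rest, _ = read_available(proc, timeout=5)
proc.wait()
```

Because the function can be called more than once on the same process, the timeout becomes meaningful in a way it is not for a single "read and write everything" call.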

Comments? Should we make a separate method instead?

(I have a patch which implements the idea above, if anyone is interested.)

Btw, the PEP is not updated yet, so anyone interested in my process.py
should look at http://www.lysator.liu.se/~astrand/popen5/ instead.

/Peter Åstrand (e-mail address removed)
 
