Snippet: The leanest Popen wrapper

Phlip

Groupies:

This is either a code snippet, if you like it, or a request for a
critique, if you don't.

I want to call a command and then treat the communication with that
command as an object. And I want to do it as application-specifically
as possible. Anyone could think of a way to productize this:

def command(*cmdz):

    process = Popen( flatten(cmdz),
                     shell= True,
                     stdout= subprocess.PIPE,
                     stderr= subprocess.PIPE,
                     bufsize= 4096 )

    def line():
        return process.stdout.readline().rstrip()

    def s():
        while True:
            l = line()
            if not l: break
            yield l

    line.s = s

    return line

That leads to some syntactic sugar. For example, one truly demented
way to stream in an entire block and then treat it as one big string
is this:

print '\n'.join(command('ls').s())

The point of the command() complex is the ability to start a long
command and then fetch out individual lines from it:

line = command('find', '../..')

print 'lines'
print line()
print line()
print line()

print 'all'
print list(line.s())

If you need different pipe abilities, such as stdin, you can trivially
add them to the contents of command() (it's not productized on
purpose).

So I can take the line() functor and, for example, pin it to a View
object, or put it in another thread now, right?
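For readers revisiting this thread on a modern Python, here is a hedged re-sketch of the same closure idea. The names mirror Phlip's, but the list argv, text=True, and DEVNULL are my substitutions (not in the original post), and s() iterates to real end-of-stream rather than stopping at a stripped-empty line:

```python
import subprocess
import sys

def command(*argv):
    """Start argv and return a line() callable; line.s() yields remaining lines."""
    process = subprocess.Popen(argv,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL,
                               text=True)

    def line():
        # readline() returns '' only at EOF; interior blank lines are '\n'
        return process.stdout.readline().rstrip('\n')

    def s():
        # iterate to real end-of-stream instead of stopping at a blank line
        for raw in process.stdout:
            yield raw.rstrip('\n')
        process.wait()

    line.s = s
    return line

# hypothetical usage, mirroring the post (child simulated with sys.executable):
line = command(sys.executable, '-c', "print('one'); print('two')")
first = line()           # fetch one line
rest = list(line.s())    # stream the remainder
```

As in the original, line() can be handed around like any object; unlike the original, the generator does not mistake an empty line for EOF.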
 
Peter Otten

Phlip said:
> Groupies:

I smell a slight misperception of the audience you are addressing ;)

> This is either a code snippet, if you like it, or a request for a
> critique, if you don't.
>
> I want to call a command and then treat the communication with that
> command as an object. And I want to do it as application-specifically
> as possible. Anyone could think of a way to productize this:
>
> def command(*cmdz):
>
>     process = Popen( flatten(cmdz),
>                      shell= True,
>                      stdout= subprocess.PIPE,
>                      stderr= subprocess.PIPE,
>                      bufsize= 4096 )

Protect your environment, don't let stderr pollute the nearby river ;)

>     def line():
>         return process.stdout.readline().rstrip()
>
>     def s():
>         while True:
>             l = line()

At that point l may be empty because you have read the output completely or
because there was an empty line that you rstripped to look like the end of
file.

>             if not l: break
>             yield l
>
>     line.s = s
>
>     return line

I think you are overdoing that closure/function factory thing a bit...

Seriously, you should reread the subprocess documentation to learn how to
avoid deadlocks.
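Peter's closing hint deserves a concrete shape: the documented way to avoid the pipe-buffer deadlock is communicate(), which drains stdout and stderr together instead of reading one pipe while the other fills. A minimal sketch (the child process here is my stand-in, not from the thread):

```python
import subprocess
import sys

# A child that writes to both stdout and stderr. Reading stdout to EOF
# before touching stderr can deadlock if stderr's pipe buffer fills up;
# communicate() reads both streams concurrently and then waits.
proc = subprocess.Popen(
    [sys.executable, '-c',
     "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
out, err = proc.communicate()
```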
 
Thomas Jollans

> Groupies:
>
> This is either a code snippet, if you like it, or a request for a
> critique, if you don't.
>
> I want to call a command and then treat the communication with that
> command as an object. And I want to do it as application-specifically
> as possible. Anyone could think of a way to productize this:
>
> def command(*cmdz):
>
>     process = Popen( flatten(cmdz),
>                      shell= True,
>                      stdout= subprocess.PIPE,
>                      stderr= subprocess.PIPE,
>                      bufsize= 4096 )
>
>     def line():
>         return process.stdout.readline().rstrip()
>
>     def s():
>         while True:
>             l = line()
>             if not l: break

This will ignore everything after a blank line. Intended?
It may be better not to use readline(), but to use the fact that it's an
iterable, and use next(process.stdout) to get each line. (and deal with
StopIteration accordingly -- or not)
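Thomas's suggestion as a runnable sketch: iterate the stdout file object (or call next() on it) so a blank line is just another item, and use a next() default to detect the real end of stream. The child command here is my stand-in:

```python
import subprocess
import sys

# A child whose second line is blank; iterating the pipe (rather than
# treating a stripped-empty readline() as EOF) keeps the later lines.
proc = subprocess.Popen(
    [sys.executable, '-c', "print('a'); print(); print('b')"],
    stdout=subprocess.PIPE, text=True)

lines = (raw.rstrip('\n') for raw in proc.stdout)
collected = [next(lines, None) for _ in range(4)]
proc.wait()
# collected is ['a', '', 'b', None]: the blank line survives, and None
# (our next() default) marks the genuine end of the stream.
```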
 
Chris Rebert

> Groupies:
>
> This is either a code snippet, if you like it, or a request for a
> critique, if you don't.
>
> I want to call a command and then treat the communication with that
> command as an object. And I want to do it as application-specifically
> as possible. Anyone could think of a way to productize this:
>
> def command(*cmdz):
>
>     process = Popen( flatten(cmdz),

flatten() being defined as...?

>                      shell= True,

I would strongly encourage you to avoid shell=True. You really don't
want to have to worry about doing proper shell escaping yourself.

>                      stdout= subprocess.PIPE,
>                      stderr= subprocess.PIPE,
>                      bufsize= 4096 )

Cheers,
Chris
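Chris's point made concrete: with the default shell=False and a list argv, arguments reach the child verbatim and no escaping is ever needed. The child here is my stand-in that just echoes its first argument:

```python
import subprocess
import sys

# With a list argv and the default shell=False, each element is passed
# to the child verbatim -- no shell ever interprets it, so a string full
# of metacharacters needs no escaping at all.
tricky = "; echo pwned; $(rm -rf /) && `id`"
proc = subprocess.run(
    [sys.executable, '-c', 'import sys; print(sys.argv[1])', tricky],
    capture_output=True, text=True)
# proc.stdout is the tricky string itself, untouched by any shell
```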
 
Phlip

> flatten() being defined as...?

Python Cookbook recipe 4.6:

def flatten(sequence): # CONSIDER: Reconcile with utils...
    for item in sequence:
        if isinstance(item, (list, tuple)):
            for subitem in flatten(list(item)):
                yield subitem
        else:
            yield item

> I would strongly encourage you to avoid shell=True. You really don't
> want to have to worry about doing proper shell escaping yourself.

Tx for helping me avoid reading up on it. I just copied it in.

I keep getting "fab not found" etc. when running 'fab command' through
it. So then I ditch to os.system().

The long-term solution would be 'bash', '-c', 'yack yack yack' if you
want truly shelly things!
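A plausible explanation for the "fab not found" symptom (my inference, not confirmed in the thread): on POSIX, passing a *sequence* together with shell=True does not do what one expects. Only the first element becomes the shell's command string; the rest become the shell's positional parameters. A POSIX-specific sketch:

```python
import subprocess

# POSIX-only behavior: with shell=True and a sequence, Popen effectively
# runs ['/bin/sh', '-c', seq[0], seq[1], ...] -- seq[0] is the whole
# command string and the remaining elements become $0, $1, ... of the
# shell, not arguments of the command. So a two-element list like
# ['fab', 'command'] silently runs just `fab` with no arguments.
proc = subprocess.run(['echo got $0', 'extra-arg'],
                      shell=True, capture_output=True, text=True)
# On POSIX this prints 'got extra-arg': 'extra-arg' became $0.
```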
 
Terry Reedy

> This is either a code snippet, if you like it, or a request for a
> critique, if you don't.

A learning exercise but pretty useless otherwise. As others pointed out,
immediately stripping off \n is a bug relative to *your* function
description. Also, you yourself then give an example of joining with \n,
but that does not restore the final \n. The rest duplicates the
iteration ability of the .stdout file object.

For repeated execution of code like

    process.stdout.readline()

you can create a 'packetized' bound method object once, like so:

    cmdline = process.stdout.readline

and then, without repeating the attribute lookups, repeatedly call:

    cmdline()
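Terry's bound-method trick in a self-contained miniature (io.StringIO stands in for the process pipe):

```python
import io

# Bind the method once; each later call skips the attribute lookup
# that process.stdout.readline() would repeat on every call.
stream = io.StringIO('first\nsecond\n')
readline = stream.readline     # the 'packetized' bound method
a = readline()                 # 'first\n'
b = readline()                 # 'second\n'
```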
 
Thomas Rachel

On 03.08.2011 19:27, Chris Rebert wrote:
> I would strongly encourage you to avoid shell=True.

ACK, but not because it is hard; rather because it is unnecessary and
inelegant at this point.

> You really don't want to have to worry about doing proper shell escaping yourself.

That's nothing to really worry about - just doing

def shellquote(*strs):
    return " ".join([
        "'" + st.replace("'", "'\\''") + "'"
        for st in strs
    ])

would do perfectly: shellquote('echo', "'", '"', " ", "\n")

If you emit a command line over ssh, for example, you don't have another
simple choice.

Worries only arise when the shell at the other end hardly deserves the
name. As you generally cannot know what ugly things the user of your
program does, it is better to avoid the additional shell layer.

So I generally agree with what you say, but it is not the proper shell
escaping one should worry about (it is so simple that one cannot call it
"worry"), but the type of shell one talks with.
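Worth noting alongside the hand-rolled shellquote: since Python 3.3 the stdlib offers shlex.quote (pipes.quote in older versions) with the same single-quote strategy. A sketch of pushing metacharacters safely through a shell, the situation Thomas describes for building a remote command line:

```python
import shlex
import subprocess

# shlex.quote does the same single-quote wrapping as the hand-rolled
# shellquote above, so metacharacters travel through the shell literally.
args = ['echo', "it's", '$HOME', '`id`']
cmdline = ' '.join(shlex.quote(a) for a in args)
proc = subprocess.run(cmdline, shell=True, capture_output=True, text=True)
# the shell hands every word through unexpanded
```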

Thomas
 
Thomas Rachel

On 03.08.2011 17:29, Phlip wrote:
> Groupies:
>
> This is either a code snippet, if you like it, or a request for a
> critique, if you don't.

Well, at first, I cannot see the real point about it...

> def command(*cmdz):
>
>     process = Popen( flatten(cmdz),
>                      shell= True,
>                      stdout= subprocess.PIPE,
>                      stderr= subprocess.PIPE,
>                      bufsize= 4096 )
>
>     def line():
>         return process.stdout.readline().rstrip()
>
>     def s():
>         while True:
>             l = line()
>             if not l: break
>             yield l
>
>     line.s = s
>
>     return line

I find it quite ugly. You get a function object with an attached
generator (which has a strange, non-verbose name) and which might
stop too early due to an empty line. Plus, you have no real control over
the communication; you have no access to stdin or stderr. The latter
might produce a deadlock if the process writes too much on stderr.

Plus, you have no access to the exit code of the program.

And you lose information about whether the stream ends with any
combination of whitespace.

> That leads to some syntactic sugar. For example, one truly demented
> way to stream in an entire block and then treat it as one big string
> is this:
>
> print '\n'.join(command('ls').s())

Which would work just as well via

print Popen( ['ls'], stdout= subprocess.PIPE).stdout.read()

or

print Popen( ['ls'], stdout= subprocess.PIPE).communicate()[0]

> The point of the command() complex is the ability to start a long
> command and then fetch out individual lines from it:
>
> line = command('find', '../..')

sp = Popen( ['find', '../..'], stdout= subprocess.PIPE)
line = sp.stdout.readline
# if you want so: line = lambda: sp.stdout.readline().rstrip() - which
# might lose information as well...

> print 'lines'
> print line()
> print line()
> print line()
>
> print 'all'
> print list(line.s())

print list(iter(line, ''))
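Thomas's iter(line, '') is the two-argument form of iter(), which calls a callable until it returns the sentinel. A self-contained miniature (io.StringIO standing in for the pipe):

```python
import io

# iter(callable, sentinel): call line() until it returns '' -- the same
# trick as line.s(), and with the same caveat that a blank line in the
# middle of the output is indistinguishable from EOF after stripping.
f = io.StringIO('alpha\nbeta\n')
line = lambda: f.readline().rstrip('\n')
collected = list(iter(line, ''))
# collected == ['alpha', 'beta']
```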


Thomas
 
Chris Rebert

> On 03.08.2011 19:27, Chris Rebert wrote:
>> I would strongly encourage you to avoid shell=True.
>
> ACK, but not because it is hard, but because it is unnecessary and
> inelegant at this point.
>
>> You really don't want to have to worry about doing proper shell escaping
>> yourself.
>
> That's nothing to really worry about - just doing
>
> def shellquote(*strs):
>     return " ".join([
>         "'" + st.replace("'", "'\\''") + "'"
>         for st in strs
>     ])
>
> would do perfectly: shellquote('echo', "'", '"', " ", "\n")

I was considering the more general case where one of the strings may
have come from user input. You then need to also escape
$looks_like_a_var, `some_command`, and way more other such stuff that
your simple function doesn't cover. Even if the user is trusted, not
escaping such things can still lead to bizarre unintended
output/results.

If the commands are completely static, then yes, I agree that lack of
necessity then becomes the main argument against shell=True.

Cheers,
Chris
 
Thomas Rachel

On 04.08.2011 10:42, Chris Rebert wrote:
> I was considering the more general case where one of the strings may
> have come from user input. You then need to also escape
> $looks_like_a_var, `some_command`, and way more other such stuff that
> your simple function doesn't cover.

Even these things are harmless when included in ''s.

$ echo '`rm -rf .`' '$RANDOM'
`rm -rf .` $RANDOM

Thomas
 
Yves-Gwenael Bourhis

Phlip wrote on Wednesday 3 August 2011 17:29 in <51b2d157-3fea-4f8e-80b4-
(e-mail address removed)>:

I made this because subprocess.Popen was giving me headaches ;-) :
http://pypi.python.org/pypi/commandwrapper/0.1

I use it here:
http://svn.mandriva.com/cgi-
bin/viewvc.cgi/soft/lab/doc4/django_doc4/doc4/utils/commandwrapper.py?revision=272275&view=markup
And here:
http://svn.deskolo.org/deskolo/browser/trunk/src/deskolo/util/commandwrapper.py

It's worth what it's worth, but I find it less headache-giving than
subprocess.Popen is (the reason I made it).

It's on PyPI, so you can:

easy_install commandwrapper


usage example1:
===============
Ls = WrapCommand( 'ls -l')
GrepPdf = WrapCommand( 'grep pdf')
Wc = WrapCommand( 'wc -l')
Wc(GrepPdf(Ls)) # <- gives the result of "ls -l | grep pdf | wc -l"

usage example2:
===============
Ls = WrapCommand( 'ls -l | grep pdf | wc -l', shell=True)
Ls()


or imagine that the above commands may take a lot of time (and in parallel
you want to do something else):

usage example3:
===============
Ls = WrapCommand( 'ls -l')
GrepPdf = WrapCommand( 'grep pdf')
Wc = WrapCommand( 'wc -l')
Wc.stdin = GrepPdf
GrepPdf.stdin = Ls
Wc.start( )
#Do stuff
....
#finished doing stuff and wait for the result of the commands we launched
Wc.join()
Wc.results
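For comparison, the pipeline that WrapCommand automates can be spelled directly with subprocess. This sketch assumes a POSIX system where ls, grep and wc exist:

```python
import subprocess

# The plain-subprocess equivalent of "ls -l | grep pdf | wc -l":
# each stage's stdout feeds the next stage's stdin.
ls = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE)
grep = subprocess.Popen(['grep', 'pdf'], stdin=ls.stdout,
                        stdout=subprocess.PIPE)
wc = subprocess.Popen(['wc', '-l'], stdin=grep.stdout,
                      stdout=subprocess.PIPE, text=True)
ls.stdout.close()    # allow SIGPIPE to reach `ls` if `grep` exits early
grep.stdout.close()  # same for `grep` if `wc` exits early
out, _ = wc.communicate()
# out is the line count as a string, e.g. '0\n' or '5\n'
```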


I have put a lot of docstring in it as guidelines:

Code to Latest version:
=======================

#! /usr/bin/env python
# -*- coding: utf-8 -*-
#
## This file may be used under the terms of the GNU General Public
## License version 2.0 as published by the Free Software Foundation
## and appearing in the file LICENSE included in the packaging of
## this file.
##
## This file is provided AS IS with NO WARRANTY OF ANY KIND, INCLUDING THE
## WARRANTY OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
##

import shlex, subprocess
from threading import Thread
#from multiprocessing import Process as Thread, Event
from time import sleep

class WrapCommand (Thread):
    r"""Author : Yves-Gwenael Bourhis

    ==================================================
    Wrap a shell command into a python threaded object.
    ==================================================

    Usage:
    ======

    You want to launch the following bash commands in a thread::

        [user@localhost ~]$ ls -l | grep pdf | wc -l
        5

    here is how you can do it::

        >>> Ls = WrapCommand('ls -l')
        >>> GrepPdf = WrapCommand('grep pdf')
        >>> Wc = WrapCommand('wc -l')
        >>> Wc.stdin = GrepPdf
        >>> GrepPdf.stdin = Ls
        >>> Wc.start()
        >>> #Do stuff
        >>> Wc.join()
        >>> Wc.results
        ('5\n', '')

    the 'results' property is a tuple (stdoutdata, stderrdata)

    You can also do it this way::

        >>> Ls = WrapCommand('ls -l | grep pdf | wc -l', shell=True)
        >>> Ls.start()
        >>> #Do stuff
        >>> Ls.join()
        >>> Ls.results[0]
        '5\n'

    You would need to specify 'shell=True' when the command
    you wish to execute is actually built into the shell,
    i.e. on Windows if you use built-in commands such as 'dir' or 'copy':
    http://docs.python.org/library/subprocess.html#subprocess.Popen

    The purpose of doing it in a thread is for when the above commands may
    take a few hours and you want to perform other tasks in the
    meanwhile.
    You can check whether the process is still running with::

        >>> Wc.is_alive()
        False

    'True' would be returned if still running.
    To terminate it prematurely (e.g. if it deadlocked) you have the
    'terminate()', 'kill()' or 'send_signal(signal)' methods, which are
    self-explanatory.
    When you want to wait for the thread to end, use the 'join()' method:
    http://docs.python.org/library/threading.html#threading.Thread.join


    You want to launch the following bash commands without threading::

        [user@localhost ~]$ ls -l | grep pdf | wc -l
        5

    here is how you can do it::

        >>> Ls = WrapCommand('ls -l')
        >>> GrepPdf = WrapCommand('grep pdf')
        >>> Wc = WrapCommand('wc -l')
        >>> Wc(GrepPdf(Ls))
        '5\n'

    Avoid doing this for processes where a large amount of data is piped
    between each command.

    instead, do it this way::

        '5\n'

    Prefer the threaded method instead if this may take a long time and
    you want to perform other tasks in the meanwhile.


    You can specify another shell for running commands::

        Directory : C:\Users\Yves\python_tests

        Mode LastWriteTime Length Name
        ---- ------------- ------ ----
        -a--- 27/01/2011 00:14 7006 commandwrapper.py
        -a--- 27/01/2011 00:15 7048 commandwrapper.pyc


    You can also use Context Management (with_item):
    http://docs.python.org/reference/compound_stmts.html#grammar-token-with_item

    example::

        >>> with WrapCommand('ls -l') as Ls:
        ...     with WrapCommand('grep pdf') as GrepPdf:
        ...         with WrapCommand('wc -l') as Wc:
        ...             Wc.stdin = GrepPdf
        ...             GrepPdf.stdin = Ls
        ...             Wc.start()
        ...             #Do stuff
        ...             Wc.join()
        ...             Wc.results
        ('5\n', '')

    You may also simply want to have a subprocess object::

    the returned object (`lscmd` in the example above) is a standard
    subprocess.Popen object
    """

    pid = None
    returncode = None
    results = [None, None]
    exc_type = None
    exc_value = None
    traceback = None
    sent_signal = None
    dont_auto_communicate = False

    stdin = None
    stdout = subprocess.PIPE
    stderr = subprocess.PIPE

    shell = False
    executable = None
    command = None
    commands = None

    def __init__(self, command, shell=False, executable=None,
                 stdin=None, stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE):
        self.__parent = super(WrapCommand, self)
        self.cmd = None
        self.shell = shell
        self.executable = executable
        self.command = command
        self.commands = shlex.split(command)
        name = self.commands[0]
        self.stdin = stdin
        self.stdout = stdout
        self.stderr = stderr
        self.__parent.__init__(name=name)

    def prepareToRun(self, dont_auto_communicate=False):
        """
        returns the subprocess object::

        'cmd' is a subprocess.Popen object
        See http://docs.python.org/library/subprocess.html#subprocess.Popen

        'result' is a tuple (stdoutdata, stderrdata).
        See
        http://docs.python.org/library/subprocess.html#subprocess.Popen.communicate
        """
        self.dont_auto_communicate = dont_auto_communicate
        if isinstance(self.stdin, str):
            stdin = subprocess.PIPE
        elif isinstance(self.stdin, self.__class__):
            self.stdin.prepareToRun(True)
            stdin = self.stdin.cmd.stdout
        else:
            stdin = self.stdin
        if self.shell:
            command = self.command
        else:
            command = self.commands
        self.cmd = subprocess.Popen(command,
                                    stdin=stdin,
                                    stdout=self.stdout,
                                    stderr=self.stderr,
                                    shell=self.shell,
                                    executable=self.executable)
        self.pid = self.cmd.pid
        return self.cmd

    makeCmd = prepareToRun
    make_cmd = prepareToRun

    def __enter__(self):
        return self

    def __call__(self, stdin=None):
        self.stdin = stdin
        self.run()
        return self.results[0]

    def run(self):
        self.prepareToRun()
        if not self.dont_auto_communicate:
            self.results = self.cmd.communicate(self.stdin)
            self.returncode = self.cmd.returncode

    def send_signal(self, signal):
        if self.is_alive() and self.cmd is not None:
            self.cmd.send_signal(signal)
            self.sent_signal = signal

    def terminate(self):
        if self.is_alive() and self.cmd is not None:
            self.cmd.terminate()
            self.sent_signal = 'SIGTERM'
        if hasattr(self.__parent, 'terminate'):
            self.__parent.terminate()

    def kill(self):
        if self.is_alive() and self.cmd is not None:
            self.cmd.kill()
            self.sent_signal = 'SIGKILL'

    def stop(self):
        if self.is_alive() and self.cmd is not None:
            self.terminate()
        if self.is_alive() and self.cmd is not None:
            sleep(1)
        if self.is_alive() and self.cmd is not None:
            self.kill()

    def __exit__(self, exc_type=None, exc_value=None, traceback=None):
        if self.is_alive() and self.cmd is not None:
            self.stop()
        self.exc_type = exc_type
        self.exc_value = exc_value
        self.traceback = traceback

    def __del__(self):
        self.__exit__()
        if self.cmd is not None:
            self.cmd.wait()
            del self.cmd

class WrapOnceCommand(WrapCommand):
    """
    Same as WrapCommand, but the cmd attribute, which is a subprocess.Popen
    object, will be created once and for all.
    Therefore the run method (or the object) can only be called once.
    The goal is to launch a command in a thread, and to have this command
    easily started/stopped from elsewhere.
    """
    def run(self):
        if self.cmd is None:
            self.prepareToRun()
        if not self.dont_auto_communicate:
            self.results = self.cmd.communicate(self.stdin)
            self.returncode = self.cmd.returncode


Ok, it's not perfect yet, but it makes subprocess.Popen easier.
 
