should these be fixed for python 2.4?

  • Thread starter Alexander Schmolck
  • Start date
A

Alexander Schmolck

Two smallish things that have been bugging me; I'm not sure whether they're
actually broken or not, but anyway here it goes:

1. ``os.system`` (and co): Shouldn't ``os.system``'s signature really be
``os.system(cmd, arg1, arg2,...)`` rather than ``os.system(some_string)``?
Otherwise ``some_string`` almost certainly will be constructed by something
along the lines of ``os.system('convert %s %s' % (infile, outfile))`` which
of course is broken (e.g. ``infile = "with space"``).

I don't think it's even possible to do the right thing
platform-independently. I currently use the following hack (any improvement
suggestions welcome):

def _fixupCmdStr(cmd, args):
escape = lambda s: '"%s"' % re.sub(r'\"$!', r'\\\1',s) #FIXME
return " ".join([cmd] + map(escape, args))

def system(cmd,*args):
return os.system(_fixupCmdStr(cmd, args))


So why not just allow ``os.system('convert',infile, outfile)`` as a
backward compatible extension that "does the right thing"?

BTW, if I'm not mistaken it's something that perl got right a long time
ago.
> perl -e "system('touch', 'a b c', 'd')"
> ll
total 0
-rw-r--r-- 1 aschmolc phd 0 Oct 1 17:48 a b c
-rw-r--r-- 1 aschmolc phd 0 Oct 1 17:48 d


2. pydoc's handling of overriden methods without docstrings: I really think it
should show the documentation of the method in the baseclass; the current
behavior really sucks when working with things like backends.

Even fixing up subclass docstrings by hand is not trivial::

In [25]: class Bar(object):
....: def meth(x):
....: pass
....:

In [26]: Bar.meth.__doc__

In [27]: Bar.meth.__doc__ = "abc"
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)

/auto/home/black4/aschmolc/PhD/Bayes/<console>

AttributeError: 'instancemethod' object attribute '__doc__' is read-only

Going via dict will work (but that of course fails for the general case)::

In [28]: Bar.__dict__['meth'].__doc__ = "abc"


cheers,

'as


P.S. speaking of running external processes -- I'm also unaware of a good,
platform-independent (or even just unix) way to run a process that possibly
accepts input, but will just continue on EOF from stdin and getting back
stdout and stderr (if supported) as strings and the exit status.

I currently use this to run e.g. latex, but it doesn't seem to work absolutely
reliably and is unix only. I'm pretty sure there must be a better way.


def readProcess(cmd, *args):
r"""Like `os.popen3`, but returns 2 strings (stdin, stdout) and the exit
code (unlike popen2, exit is 0 if no problems occured (for some bizzarre
reason popen2 returns None... <sigh>). FIXME: only works for UNIX!"""

cmdStr = _fixupCmdStr(cmd, args)
popen = popen2.Popen3(cmdStr, capturestderr=True)
popen.tochild.close() # XXX make sure we're not waiting forever
exit = popen.wait()
out, err = map(slurpIn, [popen.fromchild, popen.childerr])
return out or "", err or "", exit


def slurpIn(file, binary=False):
if isString(file):
file = open(file, ("r", "rb")[bool(binary)])
try:
result = file.read()
return result
finally:
file.close()
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

Alexander said:
1. ``os.system`` (and co): Shouldn't ``os.system``'s signature really be
``os.system(cmd, arg1, arg2,...)`` rather than ``os.system(some_string)``?

No. This is a direct exposition of the system(3) function of the C
library, and has precisely the same interface like this function.
Use os.spawnv for what you want.
2. pydoc's handling of overriden methods without docstrings: I really think it
should show the documentation of the method in the baseclass; the current
behavior really sucks when working with things like backends.

Feel free to contribute a patch.
I currently use this to run e.g. latex, but it doesn't seem to work absolutely
reliably and is unix only. I'm pretty sure there must be a better way.

I don't think there is.

Regards,
Martin
 
D

Donn Cave

Alexander Schmolck said:
Two smallish things that have been bugging me; I'm not sure whether they're
actually broken or not, but anyway here it goes:
1. ``os.system`` (and co): Shouldn't ``os.system``'s signature really be
``os.system(cmd, arg1, arg2,...)`` rather than ``os.system(some_string)``? ....
BTW, if I'm not mistaken it's something that perl got right a long time
ago.

total 0
-rw-r--r-- 1 aschmolc phd 0 Oct 1 17:48 a b c
-rw-r--r-- 1 aschmolc phd 0 Oct 1 17:48 d

To elaborate on Martin v. Lowis' response, the
spawnv() function, and presumably Perl's system
as shown above, are distinctly different from
system(3), in that they execute the command directly
with the indicated parameters, where system()
executes the shell to interpret shell command text.

In my opinion Python got this much righter than Perl
when it used a different name. Unlike the case with
Popen3, where the same function represents 2 different
APIs, both the string and the list - I see later in your
message you may be among the people who have been led
astray by that.
P.S. speaking of running external processes -- I'm also unaware of a good,
platform-independent (or even just unix) way to run a process that possibly
accepts input, but will just continue on EOF from stdin and getting back
stdout and stderr (if supported) as strings and the exit status.

I currently use this to run e.g. latex, but it doesn't seem to work
absolutely
reliably and is unix only. I'm pretty sure there must be a better way.

Yes, but either you use disk files or it's more work.

The unreliability, I imagine is due to limitations of pipes'
device buffer space. You're trying to execute the whole
thing within those limits - you wait() for process exit
before you have read a byte. It would be better to read
output first, since that will terminate when the process
exits, and then wait() to retrieve the status. A fully
reliable version needs to select on the two pipes to make
sure that the error pipe doesn't fill up while you're trying
to read all the output. Use os.read on the file descriptor
so you're on the same page with select().

Or on the other hand, temporary disk files don't have these
problems and are of course more portable.

Donn Cave, (e-mail address removed)
 
A

Alexander Schmolck

Donn Cave said:
spawnv() function, and presumably Perl's system
as shown above, are distinctly different from
system(3), in that they execute the command directly
with the indicated parameters, where system()
executes the shell to interpret shell command text.

In my opinion Python got this much righter than Perl
when it used a different name.

I think who got it righter depends the respective ratios of buggy code; I'd be
not be surprised if only a fraction of python code that simply attempts to run
an external command with certain arguments (and no shell preprocessing) worked
correctly.

I appreciate your point but I suspect that at the moment many people are
extremely likely to do the wrong thing, because it's much more obvious than
the right thing.

So even if changing system is the wrong step, I think there really ought to be
an obvious and simple way to accomplish what perl's system($cmd, @args) does.

Unlike the case with Popen3, where the same function represents 2 different
APIs, both the string and the list - I see later in your message you may be
among the people who have been led astray by that.

BTW what is one supposed to use in lieu of the various popen* commands which
all only accept a cmd string rather than?
Yes, but either you use disk files or it's more work.

The unreliability, I imagine is due to limitations of pipes'
device buffer space. You're trying to execute the whole
thing within those limits - you wait() for process exit
before you have read a byte. It would be better to read
output first, since that will terminate when the process
exits, and then wait() to retrieve the status.

Hm, I think that might be what I tried first, but that resulted in hangs.
A fully reliable version needs to select on the two pipes to make sure that
the error pipe doesn't fill up while you're trying to read all the output.
Use os.read on the file descriptor so you're on the same page with select().

Based on your suggestion I tried this, but it hangs because the files remain
ready for reading. Do I have to check if nothing has been read from both
stderr and stdout? (I'm not really particularly clued about low-level system
stuff):

def readProcess(cmd, *args):
r"""Like `os.popen3`, but returns 2 strings (stdin, stdout) and the exit
code (unlike popen2, exit is 0 if no problems occured (for some bizzarre
reason popen2 returns None... <sigh>). FIXME: only works for UNIX!"""
BUFSIZE=1024**2
import select
cmdStr = _fixupCmdStr(cmd, args)
popen = popen2.Popen3(cmdStr, capturestderr=True)
which = {id(popen.fromchild): [],
id(popen.childerr): []}
try:
popen.tochild.close() # XXX make sure we're not waiting forever
while True:
ready = select.select([popen.fromchild, popen.childerr],[],[])
for f in ready[0]:
while True:
read = os.read(f.fileno(), BUFSIZE)
if not read:
break
which[id(f)].append(read)
if not ready:
break
out, err = ["".join(which[id(f)])
for f in [popen.fromchild, popen.childerr]]
exit = popen.wait()
finally:
try:
popen.fromchild.close()
finally:
popen.childerr.close()
return out or "", err or "", exit

Or on the other hand, temporary disk files don't have these problems and are
of course more portable.

Can you elucidate a bit further? How do I portably redirect stdout and
stderr to tempfiles, or is that not what you meant?

thanks,

'as
 
G

Grant Edwards

1. ``os.system`` (and co): Shouldn't ``os.system``'s signature really be
``os.system(cmd, arg1, arg2,...)`` rather than ``os.system(some_string)``?

No. It's just a wrapper around system(), so why muck with the
signature?

------------------------------------------------------------------------
SYSTEM(3) Linux Programmer's Manual SYSTEM(3)

NAME
system - execute a shell command

SYNOPSIS
#include <stdlib.h>

int system(const char *string);

------------------------------------------------------------------------
Otherwise ``some_string`` almost certainly will be
constructed by something along the lines of
``os.system('convert %s %s' % (infile, outfile))`` which of
course is broken (e.g. ``infile = "with space"``).

Maybe, maybe not. Why add the command construction overhead to
os.system()?
I don't think it's even possible to do the right thing
platform-independently.

Sorry, I've no idea what you're talking about. os.system() is
supposed to hand a string to the shell. What that string means
is a matter for you and the shell to discuss. How that string
is constructed is up to you. os.system() has nothing to do
with it.
So why not just allow ``os.system('convert',infile, outfile)`` as a
backward compatible extension that "does the right thing"?

Because it's not possible to define what "the right thing" is.
You'd have to know both the semantics and syntax of the shell
and the intent of the programmer to figure out "the right
thing". That's a pretty tall order for something that's
supposed to be a very thin wrapper around Posix system().
 
A

Alexander Schmolck

Martin v. Löwis said:
No. This is a direct exposition of the system(3) function of the C
library, and has precisely the same interface like this function.
Use os.spawnv for what you want.

OK, but let me say that this is *not* obvious -- it is difficult to find in
the docs (I don't really want to spawn anything and it's burried amongst a
zillion cryptically named similar functions with a complex interface) and
difficult to use.

So I think many (most?) people will use os.system instead and in a way that is
broken. If changing system is a bad idea, how about introducing a convenience
function or adding a note to the os.system docs?

Feel free to contribute a patch.

I would if I hadn't trouble with sf.net. I'll try again over the weekend but
if someone else would like to submit it, it's appended below.
I don't think there is.

Regards,
Martin

*** pydoc.py.old Fri Oct 1 19:14:54 2004
--- pydoc.py Fri Oct 1 21:29:42 2004
***************
*** 72,78 ****
return dirs

def getdoc(object):
"""Get the doc string or comments for an object."""
! result = inspect.getdoc(object) or inspect.getcomments(object)
return result and re.sub('^ *\n', '', rstrip(result)) or ''

--- 72,103 ----
return dirs

+ _method_types = (staticmethod, classmethod, types.MethodType,
+ types.BuiltinMethodType, types.FunctionType)
+ def _getdoc_helper(thing):
+ """Like `inspect.getdoc` but for methods with no documentation it
+ retrieves the documentation from a baseclasses corresponding method, if
+ possible."""
+ if inspect.getdoc(thing):
+ return inspect.getdoc(thing)
+ else:
+ def find_method_doc(c, name):
+ v = c.__dict__.get(name)
+ return isinstance(v, _method_types) and inspect.getdoc(v)
+ if isinstance(thing, _method_types):
+ try:
+ the_class = thing.im_class
+ func_name = thing.im_func.func_name
+ for c in inspect.getmro(the_class)[1:]:
+ if find_method_doc(c, func_name):
+ return find_method_doc(c, func_name)
+ return None
+ except AttributeError, msg:
+ print >> sys.stderr,\
+ "Couldn't get doc from baseclass for %s: %s" % (thing, msg)
+ return None
+
def getdoc(object):
"""Get the doc string or comments for an object."""
! result = _getdoc_helper(object) or inspect.getcomments(object)
return result and re.sub('^ *\n', '', rstrip(result)) or ''
 
J

Jeff Shannon

Alexander said:
I think who got it righter depends the respective ratios of buggy code; I'd be
not be surprised if only a fraction of python code that simply attempts to run
an external command with certain arguments (and no shell preprocessing) worked
correctly.

I appreciate your point but I suspect that at the moment many people are
extremely likely to do the wrong thing, because it's much more obvious than
the right thing.

Personally, I suspect that the number of people who're running external
commands that are complex enough for these escaping issues to matter is
rather small. The vast majority of uses of os.system() (IMHO) are very
simple cases with one or two simple arguments. Relatively few of them
involve filenames that contain spaces, or other things that might cause
problems with the naive os.system('cmd %s %s' % (arg1, arg2)) sort of
approach.

At least, *I've* never personally found myself in a situation where I
wanted to execute an external command, and in which I needed to worry
about escaping/quoting the command string. It's an open question
whether your experience is more typical than mine, I guess...

Jeff Shannon
Technician/Programmer
Credit International
 
A

Andrew Dalke

Jeff said:
Personally, I suspect that the number of people who're running external
commands that are complex enough for these escaping issues to matter is
rather small. The vast majority of uses of os.system() (IMHO) are very
simple cases with one or two simple arguments.

I end up making a lot of system() and popen*() calls. Looking
through the 30 or so cases (our of about 6800 LOC) I see that most
of them use hard coded parameters that don't need escaping.
Some of them do you commands.mkarg (the only function I know of in the
standard library that does shell escaping). Those are passed
user-defined filenames.

I would rather not assume the user knows enough to not pass
filenames with a " " or a ";". Sounds like walking too shaky
a tightrope, and making the system open to subtle security
attacks.

Andrew
(e-mail address removed)
 
M

Michele Simionato

Jeff Shannon said:
Personally, I suspect that the number of people who're running external
commands that are complex enough for these escaping issues to matter is
rather small. The vast majority of uses of os.system() (IMHO) are very
simple cases with one or two simple arguments. Relatively few of them
involve filenames that contain spaces, or other things that might cause
problems with the naive os.system('cmd %s %s' % (arg1, arg2)) sort of
approach.

At least, *I've* never personally found myself in a situation where I
wanted to execute an external command, and in which I needed to worry
about escaping/quoting the command string. It's an open question
whether your experience is more typical than mine, I guess...

Well, if I may add a data point, it happened to me various times
of getting in trouble with os.system and quotes when managing
Windows files. I know about os.spawn*, but it is a knowledge that
keeps escaping from my mind. I would keep os.system as it is,
but I would modify the documentation (or maybe the docstring of
os.system) pointing out the existence of os.spawn*.
Just my 0.02c,


Michele Simionato
 
A

Alexander Schmolck

Alexander Schmolck said:
I would if I hadn't trouble with sf.net. I'll try again over the weekend but
if someone else would like to submit it, it's appended below.

patch succesfully submitted now.

'as
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

Alexander said:
OK, but let me say that this is *not* obvious -- it is difficult to find in
the docs (I don't really want to spawn anything and it's burried amongst a
zillion cryptically named similar functions with a complex interface) and
difficult to use.

I fail to see the last point. What you proposed is equally difficult to
use. And yes, you do want to spawn a new process.
So I think many (most?) people will use os.system instead and in a way that is
broken. If changing system is a bad idea, how about introducing a convenience
function or adding a note to the os.system docs?

Adding notes to documentation is always possible. I don't see the point
of adding a convenience function, because os.spawn* is already there.

Regards,
Martin
 
A

Alexander Schmolck

Jeff Shannon said:
Personally, I suspect that the number of people who're running external
commands that are complex enough for these escaping issues to matter is rather
small.

The chances that if you call some external command you will want to pass it
some filename (or something else that might contain more than [a-zA-Z0-9_-.])
that you didn't necessarily (or carefully enough) create yourself seem really
quite high to me (a quick grep through my own stuff showed a ratio of about
20:1 needs-escaping:doesnt-need-escaping) [where needs-escaping of course more
means "should really use os.spawnv" or whatever, but that doesn't resolve the
issue for popen calls]
The vast majority of uses of os.system() (IMHO) are very simple cases
with one or two simple arguments. Relatively few of them involve filenames
that contain spaces, or other things that might cause problems with the naive
os.system('cmd %s %s' % (arg1, arg2)) sort of approach.

Do you mean they don't involve filenames (etc.) with space by necessity or by
chance (they are rare under unix)? By chance -- well certainly and there is
unix stuff out there already that breaks for 'funny' filenames, but I'd rather
python did not further contribute to that (especially since once python code
developed under unix then gets used under windows actual trouble strarts).

'as
 
A

Alexander Schmolck

Andrew Dalke said:
Some of them do you commands.mkarg (the only function I know of in the
standard library that does shell escaping).

Thanks -- I didn't know about that (small wonder -- it's undocumented; no
docstring and not in the manual).

'as
 
A

Alexander Schmolck

Grant Edwards said:
Because it's not possible to define what "the right thing" is.

It is trivial, but os.system was clearly not the right utility for my task and
maybe I even knew that the correct way would be using the os.spawn* thing at
some point but forgot again. That still leaves the popen family of commands,
though, and anyway, as I pointed out in other posts in this thread, although I
accept that os.system should not be changed the current state of affairs
(documentation-wise if nothing else) in all likelihood means plenty of broken
code gets written.

'as
 
N

Nick Craig-Wood

Andrew Dalke said:
I end up making a lot of system() and popen*() calls. Looking
through the 30 or so cases (our of about 6800 LOC) I see that most
of them use hard coded parameters that don't need escaping.
Some of them do you commands.mkarg

Interesting command!

It doesn't (under 2.3) seem to do the right thing under windows
though, eg
hello
0
'a'
0
"a b '\""
0

Wheras if you run that under linux you get what I was expecting
hello
0
a
0
a b '"
0

My experiences lead me to believe Windows is big on " -> "" escaping
in the shell...
 
A

Alexander Schmolck

Martin v. Löwis said:
I fail to see the last point. What you proposed is equally difficult to
use.

``os.run('touch', 'foo')`` [1]

vs.

``os.spawnvp(os.P_WAIT, 'touch', ['touch', 'foo'])``

If you don't think that second is not about 1 order of magnitude more
difficult to use for Joe Programmer, you are likely to be mistaken.

Even assuming that Joe, who presumably doesn't have a clue what spawning a
process might be and just wants to "run touch" (or whatever) has somehow
miraculously discovered that it's not `os.system` but spawning a process what
he really needs he's still faced with several hurdles (Which spawn* is the
right one? What does all that funny jargon in the docs mean? What were the
right arguments again? Ooops, I forgot to pass the command I wanted to run as
an argument... Finally -- what are the chances of getting all these right from
memory?)
And yes, you do want to spawn a new process.

I expressed myself poorly: what I meant is that me and other people who
mistakenly reach for `os.system' don't go to the docs thinking "I want to
spawn a new process, how do I best accomplish that?", but rather "I'd like to
run an external command for my utility script, how do I best accomplish that?".

This is conceptually simple and I'd imagine frequent desire; so I think it
should be accommodated by a simple function.
Adding notes to documentation is always possible. I don't see the point
of adding a convenience function, because os.spawn* is already there.

But it's not *convenient* (or simple to understand and remember; for example
you almost *never* want to pass a different name as first arg so it's very
easy to forget; it's also entirely non-obvious unless you happen to know that
under unix some ostensibly different commands are just symlinks to a single
exectuable that behaves differently based on what it finds in argv[0]).

Plus you equally want something with (cmd, *args) as opposed to (cmdstring)
semantics for popen-like stuff, I should think (and unless I'm again missing
something there is no function that already does it).

So for symmetry, I'd ideally like to see two additions, say `run` (returning
exit status) and `run_and_read` (or `read_process`) (returning exit status and
process (standard) output).

If that's asking too much I'd at least suggest:

1) putting something in the os.system heading that gives a simple example when
*not* to use os.system and why and what to do instead (see [1]).
2) Providing something like ``commands.mkarg`` that works reasonably reliably
and is documented


'as


Footnotes:
[1] Here's a mock implementation (untested):

def run(cmd, *args):
"""Run `cmd` (which is searched for in the executable path) with `args` and
return the exit status.

In general (unless you know what you're doing) use::

run('program', filename)

rather than::

os.system('program %s' % filename)

because the latter will not work as expected if `filename` contains
spaces or shell-metacharacters.

If you need more fine-grained control look at ``os.spawn*``.
"""
return spawnvp(P_WAIT, cmd, (cmd,) + args)
 
A

Andrew Dalke

Alexander said:
Thanks -- I didn't know about that (small wonder -- it's undocumented; no
docstring and not in the manual).

It's also cumbersome to use and I worry that some day
it will disapper. Does anyone actually use the 'commands'
module?

Andrew
(e-mail address removed)
 
A

Andrew Dalke

Alexander said:
``os.run('touch', 'foo')`` [1]

vs.

``os.spawnvp(os.P_WAIT, 'touch', ['touch', 'foo'])``

I for one would rather use the first. I know about the
spawn* functions but ...
(Which spawn* is the
right one? What does all that funny jargon in the docs mean? What were the
right arguments again? Ooops, I forgot to pass the command I wanted to run as
an argument... Finally -- what are the chances of getting all these right from
memory?)

Bingo. I'll add the above to my toolbox. Even then I'll
need to figure out which of

os.spawnl os.spawnlp os.spawnv os.spawnvp
os.spawnle os.spawnlpe os.spawnve os.spawnvpe

to use. Yes I know there's a mnemonic (p = path, e =
environment, v = something, l = something else) but I
forget them.

Andrew "Joe Programmer"
(e-mail address removed)
 
H

Hans-Joachim Widmaier

Am Sat, 02 Oct 2004 15:53:06 +0000 schrieb Andrew Dalke:
It's also cumbersome to use and I worry that some day
it will disapper. Does anyone actually use the 'commands'
module?

Yes, me. Quite a lot, in fact. getstatusoutput() is rather frequent in
some of my scripts. That doesn't mean I like the name, though. ;-)

Hans-Joachim
 
S

Steve Holden

Andrew said:
Alexander said:
``os.run('touch', 'foo')`` [1]

vs.

``os.spawnvp(os.P_WAIT, 'touch', ['touch', 'foo'])``


I for one would rather use the first. I know about the
spawn* functions but ...
(Which spawn* is the
right one? What does all that funny jargon in the docs mean? What were
the
right arguments again? Ooops, I forgot to pass the command I wanted to
run as
an argument... Finally -- what are the chances of getting all these
right from
memory?)


Bingo. I'll add the above to my toolbox. Even then I'll
need to figure out which of

os.spawnl os.spawnlp os.spawnv os.spawnvp
os.spawnle os.spawnlpe os.spawnve os.spawnvpe

to use. Yes I know there's a mnemonic (p = path, e =
environment, v = something, l = something else) but I
forget them.
"l" is a list of arguments and "v" is a variable number - which of
course in C requires a varargs signature.

To support your contention that they aren't easy to use, I had to look
them up in the docs (which I did to improve my mind (which could use
some improvement, believe me))

regards
Steve
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,012
Latest member
RoxanneDzm

Latest Threads

Top