Opinion on best practice...

Anssi Saari · Feb 6, 2013

Dennis Lee Bieber said:
PowerShell is meant to be used for administrative level scripting,
replacing such things as WSH.

Yeah and WSH has been included since Windows 98... So Windows has been
at least OK with shell scripting VBScript and JScript for the last 15
years or so. And I can't believe I'm defending Windoze.

Neil Cerutti · Feb 6, 2013

The OP stated explicitly that the target OS was Linux:

Don't get me wrong -- I think Python is great -- but when the
target OS is Linux, and what you want to do are file find,
move, copy, rename, delete operations, then I still say bash
should be what you try first.

I had Cygwin on my office computer for many years, and wrote
shell scripts to do things like reconcile fund lists from
separate source files, and generate reports of the differences.

Here's the top-level script. Both files have to be preprocessed
before comparing them.

#!/bin/sh
awk -f step1.awk -v arg=$1.spec $1 | sort | awk -f step2.awk > t1.out
awk -f step1.awk -v arg=$2.spec $2 | sort | awk -f step2.awk > t2.out
../recon.pl

I'm bad at shell scripts, obviously (step1? step2? Why doesn't
recon.pl use the command line arguments?). I used several
programs together that I knew only how to read the docs for.
Honestly, my first Python scripts from those days aren't that
much better, and have nearly all been themselves abandoned.

Today I'm much more productive with Python. Virtually all my
programs are data conversion and comparisons, with a good deal of
archiving and batch processing of large groups of files. I can
imagine a programmer who is learned in sh, sed, awk, sort and
sundry who would do just fine at these tasks, but I can't imagine
myself going back.

Grant Edwards · Feb 6, 2013

The impression I got was that he was looking for something that would
be cross-platform and work in both Windows and Linux. Python is a
reasonable choice for that if you're not willing to deal with Cygwin.

True, the _language_ of Python is cross-platform, but doing much of
anything filesystem/administration related in a cross-platform manner
can be painful.

Grant Edwards · Feb 6, 2013

I had Cygwin on my office computer for many years, and wrote
shell scripts to do things like reconcile fund lists from
separate source files, and generate reports of the differences.

Back when the earth was young, I used some pretty extensive Bourne
shell scripts to perform design checks on multi-volume software
requirements specifications that were written in LaTeX -- and that was
on VAX/VMS with DECShell (sort of the VMS equivalent of Cygwin). It
worked fine but it was excruciatingly slow because of the high fork()
overhead in VMS.

Grant Edwards · Feb 6, 2013

I have to disagree with much of this. bash is a poorly designed
language which, in my opinion, is only good enough for short (under
20 lines) throw-away scripts.

And the OP wanted to write something that was what, about 3 lines?

This is how you test whether something is not the name of a directory:

[[ -d $dir ]] || {
    echo "$FUNCNAME: $dir is not a directory" >&2
    return 2
}

It can be written more clearly.

http://www.bashcookbook.com/bashinfo/source/bash-4.2/examples/functions/emptydir

Arithmetic requires either calling an external program, or special magic
syntax:

z=`expr $z + 3`
i=$(( i + 1 ))

Agreed: Bash is not good at math. It should not be used for numerical
analysis.

Spaces are magic -- these two lines do radically different things:

So? Leading spaces are magic in Python.

bash is even more weakly typed than Perl. As far as I can tell, bash
is completely untyped -- everything is text, all the time, even
arrays.

Correct. In bash, everything is a string.

If you're doing something other than manipulating files (and therefore
filenames and paths), then bash is probably the wrong language to use.

Dennis Lee Bieber · Feb 6, 2013

It feels silly enough translating this OS/2 batch script:

@logon SOME_USER /pSOME_PASS /vl
@e:\rexx\load
@db2 start database manager
@exit

into this REXX script:

/* */
"@logon SOME_USER /pSOME_PASS /vl"
"@e:\rexx\load"
"@db2 start database manager"
"@exit"

And that's a change I've made myriad times (to \startup.cmd on many an
OS/2 boot drive). It's far, far worse translating it into Python, Pike,
or any other "good language".

Though that is the nice feature of REXX*... Anything that wasn't
parsable as a REXX statement was automatically sent to the current
command processor.

Converting such to Python turns into a mishmash of (pick your
favorite) os.system(), subprocess.Popen(), etc. with their particulars
of how to pass command line arguments.

* {the Amiga variant was really powerful as it didn't take much to
code an application so that it could be a command processor, including
another ARexx script}

Steven D'Aprano · Feb 6, 2013

Dennis said:
Though that is the nice feature of REXX*... Anything that wasn't
parsable as a REXX statement was automatically sent to the current
command processor.

Nice? Are you being sarcastic? What you're describing sounds like a
classic "Do What I Mean" system, which invariably ends up being followed by
anguished shouts of "Noooo, I didn't mean that!!!".

The Zen of Python is not just design principles for *Python*. They hold for
other languages as well. Having to explicitly send a command to an external
command processor (say, the shell) is much better **and safer** than
implicitly doing so on SyntaxError, even if such a requirement makes your
program more verbose.

Converting such to Python turns into a mishmash of (pick your
favorite) os.system(), subprocess.Popen(), etc. with their particulars
of how to pass command line arguments.

Why on earth would anyone use a "mishmash" within the one script? Pick one
and stick to it. If you only wish to give a few commands, and don't
care about getting data back from them, use os.system which is
embarrassingly easy and almost as concise as the REXX script above:

from os import system as s
s("@logon SOME_USER /pSOME_PASS /vl")
s(r"@e:\rexx\load")  # raw string, otherwise "\r" becomes a carriage return
s("@db2 start database manager")
s("@exit")

If you find yourself needing complicated string escapes, use subprocess.call
instead.
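
A minimal sketch of the subprocess.call route, assuming the command is passed as a list of arguments so that no shell (and no string escaping) is involved. The helper name run_quietly and the demonstration command are illustrative only and are not part of the scripts above:

```python
import subprocess
import sys

def run_quietly(argv):
    """Run a command given as a list of arguments; return its exit status."""
    # No shell is involved, so spaces, quotes and backslashes in the
    # arguments are passed through to the program unchanged.
    return subprocess.call(argv)

# Illustrative command: ask the current Python interpreter to print one
# awkward argument back. A real script would name its own program here.
status = run_quietly([
    sys.executable, "-c", "import sys; print(sys.argv[1])",
    r'e:\rexx\load "with spaces" & more',
])
```

The return value is the child's exit status, so a non-zero result can be checked instead of being silently dropped.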

If you do care about getting data back from the external system, then you
need to integrate the external code with your Python (or REXX) code, and
that requires more than just firing off a few system commands and
forgetting about them. That's inherently complicated, because there are
many possible factors to care about:

- do you care about stdout, stderr, stdin?
- start a new subshell?
- deal with shell metacharacters?
- deadlocks?

etc. Lack of terse syntax is the least of your worries here; doing these
things *correctly* is hard in the shell. Having to type a few extra symbols
doesn't make it any harder in any meaningful way, and readable syntax can
actually make it simpler to reason about the code.
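
As a rough illustration of those concerns in Python (a sketch under stated assumptions, not a recipe): subprocess.run with pipes gathers stdout and stderr through communicate() internally, which avoids the classic deadlock of reading one pipe while the child is blocked writing to the other, and passing a list skips the shell and its metacharacters. The helper name run_and_capture and the child command are hypothetical, and capture_output/text assume Python 3.7 or later:

```python
import subprocess
import sys

def run_and_capture(argv, input_text=None, timeout=30):
    """Run argv without a shell and return (status, stdout, stderr)."""
    completed = subprocess.run(
        argv,                 # a list, so shell metacharacters are inert
        input=input_text,     # text sent to the child's stdin, if any
        capture_output=True,  # collect stdout and stderr separately
        text=True,            # decode bytes to str
        timeout=timeout,      # give up rather than hang forever
    )
    return completed.returncode, completed.stdout, completed.stderr

# Illustrative child: upper-case whatever arrives on stdin and write a
# short note to stderr.
child = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper()); "
    "sys.stderr.write('done')",
]
status, out, err = run_and_capture(child, input_text="hello\n")
```

Each keyword argument answers one of the questions in the list above explicitly, which is the trade being argued for: more typing, fewer surprises.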

If you say "Anything that isn't parsable is automatically sent to the
shell", it doesn't sound too bad. But when you say "Unparseable junk is
implicitly treated as code and sent off to be executed by something which
traditionally tends to be forgiving of syntax errors and has the ability to
turn your file system into so much garbage", it sounds a tad less
appealing.

Chris Angelico · Feb 7, 2013

Nice? Are you being sarcastic? What you're describing sounds like a
classic "Do What I Mean" system, which invariably ends up being followed by
anguished shouts of "Noooo, I didn't mean that!!!".

If you say "Anything that isn't parsable is automatically sent to the
shell", it doesn't sound too bad. But when you say "Unparseable junk is
implicitly treated as code and sent off to be executed by something which
traditionally tends to be forgiving of syntax errors and has the ability to
turn your file system into so much garbage", it sounds a tad less
appealing.

You misunderstand. It's actually a very simple rule. Python follows
C's principle of accepting that any return value from an expression
should be ignored if you don't do anything with it. REXX says that any
"bare expression" used as a statement is implicitly addressed to the
default host, which is usually a shell (though I built myself a MUD
system where the default would send text to the client, and shell
execution required ADDRESS CMD "some_command" explicitly); it's very
simple and doesn't feel like a DWIM system at all.

ChrisA

Steven D'Aprano · Feb 7, 2013

You misunderstand. It's actually a very simple rule. Python follows C's
principle of accepting that any return value from an expression should
be ignored if you don't do anything with it.

Return values are safe. They don't do anything, since they are *being
ignored*, not being executed as code. You have to explicitly choose to do
something with the return value before it does anything.

If C said "if you don't do anything with the return result of an
expression, execute it as code in the shell", would you consider that a
desirable principle to follow?

def oh_my_stars_and_garters():
    return "rm -rf /"

oh_my_stars_and_garters()

REXX says that any "bare
expression" used as a statement is implicitly addressed to the default
host, which is usually a shell (though I built myself a MUD system where
the default would send text to the client, and shell execution required
ADDRESS CMD "some_command" explicitly); it's very simple and doesn't
feel like a DWIM system at all.

Are you saying that Dennis' description of REXX sending unparsable text
to the shell for execution is incorrect?

Chris Angelico · Feb 7, 2013

Return values are safe. They don't do anything, since they are *being
ignored*, not being executed as code. You have to explicitly choose to do
something with the return value before it does anything.

If C said "if you don't do anything with the return result of an
expression, execute it as code in the shell", would you consider that a
desirable principle to follow?

def oh_my_stars_and_garters():
    return "rm -rf /"

oh_my_stars_and_garters()

Naming a function is safe, too.

def earth_shattering():
    os.system("rm -rf /")

earth_shattering;

But putting parentheses after it suddenly makes it dangerous. Wow!
Python's pretty risky, right?

In REXX, you simply don't *do* that sort of thing. (You'd use the CALL
statement, for instance.)

ChrisA

Dennis Lee Bieber · Feb 7, 2013

If C said "if you don't do anything with the return result of an
expression, execute it as code in the shell", would you consider that a
desirable principle to follow?

def oh_my_stars_and_garters():
    return "rm -rf /"

oh_my_stars_and_garters()

Return values wouldn't be sent to the shell... The function call was
parseable as language syntax.

Are you saying that Dennis' description of REXX sending unparsable text
to the shell for execution is incorrect?

E:\UserData\Wulfraed\My Documents>type test.rx
/* */
say time('n')
'say' time('n')
echo time('n')
'echo' time('n')
'echo' "time('n')"
echo "time('n')"
what = 'ho'
'ec' || what time('n')
ec || what time('n')
time('n')
say Now Calling "time()"
call time 'n'
say "time() was called"

E:\UserData\Wulfraed\My Documents>regina test.rx
13:50:06
'say' is not recognized as an internal or external command,
operable program or batch file.
3 *-* 'say' time('n')
+++ RC=1 +++
13:50:06
13:50:06
time('n')
time('n')
13:50:06
13:50:06
The filename, directory name, or volume label syntax is incorrect.
11 *-* time('n')
+++ RC=1 +++
NOW CALLING time()
time() was called

E:\UserData\Wulfraed\My Documents>

Line 1 (treating the /**/ as line 0) is the REXX "say" command,
outputting the result of calling the REXX time() function.

Line 2 quotes the "say", making it a string, and not a parseable
REXX statement; Windows command shell produces an error.

Line 3 has unquoted "echo" which is not a REXX command; it is
considered an external command and is passed the /result/ of calling
REXX time() -- where Windows executes it.

Line 4 quotes "echo" but is otherwise the same.

Lines 5 and 6 quote the "time()" call -- making that a string.

Lines 7-9 illustrate creating lines from string expressions.

Line 10 is treating the result of an internal function as the
command to be executed. So yes, REXX does not throw away function return
values -- you must catch them; if you don't want the return value, then
you need to "call" the function explicitly, shown in lines 11-13.

Plain REXX didn't have equivalents to "import", but could call
"subprograms" by just naming them on a statement... Making REXX closer
to BASH, cmd.exe, PowerShell, VMS DCL... Whereas Python is closer to
BASIC -- a byte-code compiled/interpreted language.

Steven D'Aprano · Feb 7, 2013

Chris said:
Naming a function is safe, too.

def earth_shattering():
    os.system("rm -rf /")

earth_shattering;

But putting parentheses after it suddenly makes it dangerous. Wow!

Yes, that is correct. Because now you are executing code, which could do
something, and some things are dangerous. But Python will never execute
code unless you explicitly tell it to. There's no concept in Python of
falling back onto code execution if parsing fails.

Python's pretty risky, right?

Not really, because Python never executes anything you don't tell it to.

But on the other hand, compared to the sandboxing capabilities of Java and
Javascript, Python *is* pretty risky. Hell, we know how risky Javascript
and Actionscript are (Flash vulnerabilities are now the number 1 source of
malware, or so I understand), and they're designed to run in a secure
sandbox. Python isn't. Some rather innocent-looking things *could* involve
code execution (e.g. imports, attribute access). But given the assumption
that you know what you're doing, and you don't eval() or exec() untrusted
strings, Python never executes code *by mistake*.

You might write the wrong code (you can do that in any language) but Python
will never cause something to be executed *because it can't parse it*. If
something doesn't parse, you get a syntax error.
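
A small sketch of that claim, assuming the built-in compile() as a stand-in for what the interpreter does when it loads a script. The function name try_compile and the sample strings are made up for illustration:

```python
def try_compile(source):
    """Compile source without running it; report whether it parsed."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        # Nothing was executed and nothing was handed to a shell.
        return "SyntaxError: " + str(exc.msg)
    return "parsed OK (nothing was executed)"

# A shell-style command line is not valid Python, so it is rejected.
result_bad = try_compile("echo hello world")
# Valid Python parses, but compile() alone still does not run it.
result_good = try_compile("print('hello')")
```

Either way, the text is only ever parsed here; running it would take a separate, explicit step.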

In REXX, you simply don't *do* that sort of thing. (You'd use the CALL
statement, for instance.)

Well, Dennis claims that he *does* do it, and that it is one of the better
features of REXX. And in the code snippet you published earlier, I saw
plenty of code intended for the shell, but no CALL statement in sight.

I note that you ignored my question. If REXX reaches code that fails to
parse, does it send it to the shell to be executed by default? If the
answer was No, I expect you would have said so.

Steven D'Aprano · Feb 7, 2013

Dennis Lee Bieber wrote:

Line 3 has unquoted "echo" which is not a REXX command; it is
considered an external command and is passed the /result/ of calling
REXX time() -- where Windows executes it

Good lord, that's even worse than I feared. So it's not just unparsable
non-REXX code that is implicitly sent to the shell, but the equivalent to
Python's NameErrors. And you can implicitly mix calls to the shell and REXX
function calls in the same line.

I know that sometimes Python's lack of declarations for variables causes
problems. If you make a typo when assigning to a variable, Python will go
ahead and create a new variable. But if you make a typo when *calling* a
function, or try to call something that doesn't exist, you get an
exception, not silently pushing the typo off to some other process to be
executed.
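
For illustration only, a minimal sketch of the calling case, with a deliberately misspelled name (pirnt is the hypothetical typo, and call_misspelled is an invented helper):

```python
def call_misspelled():
    """Call a name that was never defined and report what Python does."""
    try:
        pirnt("hello")  # deliberate typo for print
    except NameError as exc:
        # The typo is reported to the program; no other process sees it.
        return type(exc).__name__
    return "no error"

outcome = call_misspelled()
```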

Chris Angelico · Feb 8, 2013

Steven D'Aprano wrote:
Yes, that is correct. Because now you are executing code, which could do
something, and some things are dangerous. But Python will never execute
code unless you explicitly tell it to. There's no concept in Python of
falling back onto code execution if parsing fails.

A bare expression IS explicitly calling for code execution. That's
simply the way the language is. If you don't want the return value to
be sent to the host, you either assign it to a variable, or use CALL
(if it's a function call).

I note that you ignored my question. If REXX reaches code that fails to
parse, does it send it to the shell to be executed by default? If the
answer was No, I expect you would have said so.

Anything that doesn't parse is a syntax error. It's only expression
results that go to the host (shell).

Good lord, that's even worse than I feared. So it's not just unparsable
non-REXX code that is implicitly sent to the shell, but the equivalent to
Python's NameErrors. And you can implicitly mix calls to the shell and REXX
function calls in the same line.

If it was meant to be a name, then it'll be in an expression, which -
see above - is an explicit request for it to be handled by the host.
The only thing that'll trip you up there is misspelling a language
keyword:

iff blah blah blah blah

which will be sent to the host instead of being an 'IF' statement.

ChrisA

Dennis Lee Bieber · Feb 8, 2013

Well, Dennis claims that he *does* do it, and that it is one of the better
features of REXX. And in the code snippet you published earlier, I saw
plenty of code intended for the shell, but no CALL statement in sight.

I've not used it in years, but yes... Since on the Amiga many
applications opened a "rexxport" they were candidates for "current
command processor". REXX implementations on Windows and Linux aren't as
flexible -- for the most part the only viable command processor is
spawning a shell to handle one statement (a la Python's os.system() ).

On the Amiga, one could script the system's text editor by using
something similar to (my manuals are in storage):

/* */
address EDITOR
open "somefile.txt"
3d
i
date('n')
<esc>
save
address COMMAND
copy somefile.txt df0:

(I don't recall the notation used to actually escape out of insert mode.)

Translated:

Define the editor as the command processor
Editor open file command
Move down three lines
Enter insert mode
Insert current date (date is the REXX date function; what is
inserted is the return value from calling date())
Exit insert mode
Save the file
Revert to the normal command shell
Use normal shell command to copy the file to the first floppy drive.

It's the interactive nature seen in the editor portion that is not
easy to perform using Python -- even with subprocess.Popen() and the
various read/write operations.
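
For comparison, a rough sketch of what the Popen version of such a conversation might look like, assuming a child program that reads one command per line and answers with exactly one line. The "editor" here is a stand-in written in Python, and the names CHILD and drive are hypothetical. A real program that replies with several lines, or none, would need more careful handling to avoid blocking:

```python
import subprocess
import sys

# Stand-in for an interactive program: acknowledges each line it reads.
CHILD = r"""
import sys
for line in iter(sys.stdin.readline, ""):
    print("ack: " + line.strip(), flush=True)
"""

def drive(commands):
    """Send commands one at a time and collect one reply line for each."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode
        bufsize=1,                # line buffered
    )
    replies = []
    try:
        for command in commands:
            proc.stdin.write(command + "\n")
            proc.stdin.flush()  # make sure the child sees it now
            replies.append(proc.stdout.readline().strip())
    finally:
        proc.stdin.close()  # end of input lets the child exit
        proc.wait()
        proc.stdout.close()
    return replies

replies = drive(["open somefile.txt", "save"])
```

The explicit write, flush and readline for every command is the bookkeeping that the ARexx "address EDITOR" style hides.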

Dennis Lee Bieber · Feb 8, 2013

Steven D'Aprano wrote:

Good lord, that's even worse than I feared. So it's not just unparsable
non-REXX code that is implicitly sent to the shell, but the equivalent to
Python's NameErrors. And you can implicitly mix calls to the shell and REXX
function calls in the same line.

You have to remember -- REXX was not created to be a stand-alone
programming language, as Python was. It was created (in the late 70s/early
80s) to be a "better" "shell" scripting language on IBM mainframes in
place of JCL and similar ("shell" in quotes as neither the interactive
command line nor submitted batch files were considered to run in
"shells"). Being able to transparently make use of the parent command
processor was the point (or switching to a different command processor --
have the script preprocess some data files, for example, then invoke the
system editor, switch to the editor as command processor, and edit a file
based upon what contents are in the preprocessed data).

Common practice is to quote the first word of anything being sent
to the command processor -- to make it more explicit and to avoid
conflicts with REXX keywords. Hypothetically, if "DO" were the command
to invoke some system command (dump object file?):

DO sourcefile

would generate a syntax error as it doesn't match the REXX "DO" loop
syntax (as I recall, I originally indicated that stuff that doesn't
parse as a REXX /statement/ was sent to the command -- this parses as a
REXX loop with a syntax error); whereas

'DO' sourcefile

is a string, and will be sent to the command processor. The emphasis is
on "statement". Statements basically fall into

keyword blah blah blah
or
variable = blah blah blah

E:\UserData\Wulfraed\My Documents>type test.rx
/* */
operation.1 = 'date'
operation.3 = 'time'
do i = 1 to 4
say operation.i
operation.i '/t'
end
call operation
E:\UserData\Wulfraed\My Documents>type operation.4
/* */
say inside "operation.4"

E:\UserData\Wulfraed\My Documents>type operation.rx
/* */
say inside "operation.rx"

E:\UserData\Wulfraed\My Documents>regina test.rx
date
Fri 02/08/2013
OPERATION.2
'OPERATION.2' is not recognized as an internal or external command,
operable program or batch file.
6 *-* operation.i '/t'
+++ RC=1 +++
time
01:28 PM
OPERATION.4
INSIDE operation.rx

E:\UserData\Wulfraed\My Documents>

(Note: OPERATION.4 brings up a Windows requester that it doesn't know
how to execute that file type -- but the file was found, so no REXX
error; the CALL operation finds operation.rx, and executes it as a
subprogram)

If the line doesn't begin with a keyword and isn't of the form "variable
=", then first translations are done for any word in the line that
matches a variable or function name (variables are replaced by their
value, functions are called and replaced by their return value), and
whatever is left of the line is passed to the command processor.
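
A toy Python model of that three-way rule, strictly as described in this post. It is not a REXX parser; the keyword set is a small, assumed subset, and it only classifies a line without evaluating anything:

```python
import re

# A small, incomplete subset of REXX keywords, for illustration only.
KEYWORDS = {"address", "call", "do", "end", "if", "parse", "say"}

# "variable = ..." where the "=" is not the start of "==".
ASSIGNMENT = re.compile(r"^[A-Za-z_][\w.]*\s*=(?!=)")

def classify(clause):
    """Return 'assignment', 'keyword', 'command' or 'empty' for one line."""
    text = clause.strip()
    if not text:
        return "empty"
    if ASSIGNMENT.match(text):
        return "assignment"
    first_word = text.split()[0]
    if first_word.lower() in KEYWORDS:
        return "keyword"
    # Anything else would be evaluated and handed to the command processor.
    return "command"

kinds = [classify(line) for line in (
    "operation.1 = 'date'",
    "DO sourcefile",
    "'DO' sourcefile",
    "echo time('n')",
)]
```

Under this model the quoted 'DO' falls through to the command processor, while the bare DO is claimed by the language as a keyword, matching the example above.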

If you want the real nightmare -- look into the IBM "queue" scheme
(not many REXX implementations except on IBM mainframes support that).
One can push lines onto the queue, such that when the script exits, the
command processor reads those lines first before reading from
keyboard... Or push lots of text in a way that the next script to start
reads it without using a temporary file. IBM mainframes didn't
"multitask" too well <G>; no easy creation of processes with pipes
between them.

Chris Angelico · Feb 8, 2013

If you want the real nightmare -- look into the IBM "queue" scheme
(not many REXX implementations except on IBM mainframes support that).
One can push lines onto the queue, such that when the script exits, the
command processor reads those lines first before reading from
keyboard... Or push lots of text in a way that the next script to start
reads it without using a temporary file. IBM mainframes didn't
"multitask" too well <G>; no easy creation of processes with pipes
between them.

Heh. The OS/2 implementation of REXX has that too, but also has much
easier piping mechanisms... and, ironically, provides a convenient way
to pipe text into your script using the RXQUEUE external command:

"some_command | rxqueue /fifo"
do while queued()>0
parse pull blah blah blah
end

ChrisA

John Ladasky · Feb 8, 2013

To do anything meaningful in bash, you need to be an expert on
passing work off to other programs... [snip]
If you took the Zen of Python,
and pretty much reversed everything, you might have the Zen of Bash:

I have to agree.

Recently I needed to write some glue code which would accept some input; run a few Linux command-line programs which were supplied that input; run some Matplotlib scripts of my own to graph the results; and finally, clean up some unwanted intermediate files.

I realized that bash was the "right" way to get the job done... but after struggling with bash for a day, I decided to try Python.

I wrote a shell script that starts with "#!/usr/bin/env python". My program imports os, sys, and shlex.split. I had my first working version within about four hours, even though I had never written a command-line Python program before.

Over the next several months, I returned to the program to make several improvements. I can't imagine maintaining a bash script that does what my Python script does.
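
A minimal sketch of the shape such a glue script might take, not the actual program. It assumes a POSIX system, and the use of subprocess, the helper names run_step and clean_up, and the do-nothing demonstration command are all assumptions added for illustration:

```python
import os
import shlex
import subprocess
import sys
import tempfile

def run_step(command_line):
    """Split a command line the way a POSIX shell would, then run it."""
    argv = shlex.split(command_line)
    return subprocess.call(argv)

def clean_up(paths):
    """Delete the given intermediate files; return those actually removed."""
    removed = []
    for path in paths:
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed

# Illustrative run: create a scratch file, run a do-nothing step, clean up.
handle, scratch = tempfile.mkstemp(suffix=".tmp")
os.close(handle)
status = run_step(shlex.quote(sys.executable) + ' -c "pass"')
removed = clean_up([scratch])
```

A real script would replace the do-nothing step with the command-line programs and plotting scripts, and check each exit status before moving on.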
