porting shell scripts: system(list), system_pipe(lists)

Discussion in 'Python' started by eichin@metacarta.com, Oct 2, 2003.

  1. Guest

    One of my recent projects has involved taking an accretion of sh and
    perl scripts and "doing them right" - making them modular, improving
    the error reporting, making it easier to add even more features to
    them. "Of course," I'm redoing them in python - much of the cut&paste
    reuse has become common functions, which then get made more robust and
    have a common style and are callable from other (python) tools
    directly, instead of having to exec scripts to get at them. The usual
    "glorious refactoring."

    Most of it has been great - os.listdir+i.endswith() instead of
    globbing, exception handling instead of "exit 1", that sort of thing.
    I've run into one weakness, though: executing programs.
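As a small illustration of the listdir-plus-endswith idiom mentioned above, here is a minimal sketch. The function name `files_with_suffix` is a hypothetical helper, not something from the original scripts.

```python
import os

def files_with_suffix(directory, suffix):
    """Return the sorted names in `directory` that end with `suffix`."""
    return sorted(name for name in os.listdir(directory)
                  if name.endswith(suffix))
```

One difference from shell globbing is worth keeping in mind: `os.listdir` also returns names starting with a dot, which a shell pattern such as `*.gz` would not match.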

    Python has, of course, os.fork and os.exec* corresponding to the raw
    unix functions. It also has the higher level os.system, popen,
    expect, and commands.get* functions. The former need a bunch of
    stylized operations performed; the latter *all* involve passing in
    strings, which leads to quoting issues that can be serious
    risks in some applications.
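To make the quoting risk concrete, here is a minimal sketch of what happens when an unquoted value is pasted into a shell command string. It assumes a current Python 3 (3.7 or later) and a POSIX `sh`; the `subprocess` and `shlex` calls postdate this thread, and the helper name `lines_via_shell` is hypothetical.

```python
import shlex
import subprocess

def lines_via_shell(command_string):
    """Run a command string through the shell and return its output lines."""
    done = subprocess.run(command_string, shell=True,
                          capture_output=True, text=True, check=True)
    return done.stdout.splitlines()

# A value containing a space and a shell metacharacter.
name = "my file; echo injected"

# Pasted in raw: the shell splits on the space and runs the second command.
unquoted = lines_via_shell("printf '%s\\n' " + name)

# Quoted with shlex.quote: the shell sees one literal argument.
quoted = lines_via_shell("printf '%s\\n' " + shlex.quote(name))
```

Here `unquoted` contains three lines, the last of which comes from the injected `echo`, while `quoted` contains the original value as a single line.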

    Perl had one very helpful interface for this kind of thing: system and
    exec will both take array arguments:
    $ perl -e 'system("echo", "*")'
    *
    $ perl -e 'exec("echo", "*")'
    *
    versus
    $ perl -e 'exec("echo *")'
    #.newsrc-dribble# CVS stuff ...
    This has always struck me as "correct" - not the overloading,
    necessarily, but the use of a list.

    So, implementing system this way is easy enough:

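    import os
    import traceback

    class ExecFailed(Exception):
        """Raised when the child exits with a nonzero wait status."""

    def system(cmd):
        pid = os.fork()
        if pid > 0:
            # Parent: wait for the child. The second argument is the
            # waitpid options word (0 = block); os.P_WAIT is a spawn mode.
            p, st = os.waitpid(pid, 0)
            if st == 0:
                return
            raise ExecFailed(str(cmd), st)
        elif pid == 0:
            try:
                os.execvp(cmd[0], cmd)
            except OSError:
                traceback.print_exc()
            # Only reached if the exec failed; never return into the caller.
            os._exit(113)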

    [The try/except is an interesting issue: if cmd[0] isn't found,
    os.execvp throws -- but it is already in the child, and this walks up
    the stack to any surrounding try/except, which then continues,
    possibly disastrously, whatever that code had been doing *in a
    duplicate process*. The _exit explicitly short cuts this.]
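For comparison, a minimal sketch of the same list-based call using the `subprocess` module, which was added to the standard library after this thread and is assumed here (Python 3.7 or later). The names `run_list` and `try_run` are hypothetical. With `subprocess`, a missing program is reported as an exception in the parent, so the duplicate-process hazard described above does not arise.

```python
import subprocess

def run_list(argv):
    """Run argv (a list of strings) with no shell; return (status, stdout)."""
    done = subprocess.run(argv, stdout=subprocess.PIPE, text=True)
    return done.returncode, done.stdout

def try_run(argv):
    """Like run_list, but report a missing program as None instead of raising."""
    try:
        return run_list(argv)
    except FileNotFoundError:
        # Raised in the parent; no forked copy keeps running our code.
        return None
```

Because no shell is involved, `run_list(["echo", "*"])` passes the asterisk through literally, matching the Perl list form shown earlier.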

    So, this makes a big difference when porting simple bits of shell (and
    usually, just in passing, fixing quoting bugs - if you had code that
    used to do "ci -l $foo" and it is now "system(['ci', '-l', foo])"
    you now properly handle spaces and punctuation in the value of foo,
    "for free".) However, the other thing you tend to find in
    "advanced"[1] shell scripts is lengthy pipelines. (Sure, you find
    while loops and case statements and such - but python's control
    structures handle those fine.)

    Implementing pipelines takes rather a bit more work, and one might
    (not unreasonably) throw up one's hands and just use os.system and
    some re.sub's to do the quoting. However, I had enough cases where
    the goal really was to run a complex shell pipeline (I also had cases
    where the pipeline converted nicely to some inline python code,
    especially with the help of the gzip module) that I sat down and
    cooked up a pipeline class.

    The interface I ended up with is pretty simple:
    g_pipe = pipeline()
    g_pipe.stdin(open("blort.gz", "r"))
    g_pipe.append(["gunzip"])
    g_pipe.append(["sort", "-u"])
    g_pipe.append(["wc", "-l"])
    g_pipe.stdout(open("blort.count", "w"))
    print g_pipe.run()

    is equivalent to the sh:
    gunzip < blort.gz | sort -u | wc -l > blort.count

    pipeline also has obvious stderr and chdir methods; pipeline.run
    actually returns an array with the return status of *each* pipeline
    element (which leads to "if filter(None, st): deal_with_error" being a
    useful idiom for noticing failures that a shell script would typically
    miss.)
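The poster's class is not shown in the thread, so the following is only a sketch of how such an interface could be built, assuming a current Python 3 and the `subprocess` module. The class name `Pipeline` and its internals are hypothetical; only the method names `append`, `stdin`, `stdout` and `run` follow the interface described above, and the stderr and chdir methods are omitted.

```python
import subprocess

class Pipeline:
    """Hypothetical list-based pipeline builder (not the poster's class)."""

    def __init__(self):
        self._commands = []
        self._stdin = None    # None means: inherit from this process
        self._stdout = None

    def append(self, argv):
        self._commands.append(list(argv))

    def stdin(self, fileobj):
        self._stdin = fileobj

    def stdout(self, fileobj):
        self._stdout = fileobj

    def run(self):
        """Start every stage, wait for all, return one exit status per stage."""
        if not self._commands:
            raise ValueError("pipeline has no commands")
        procs = []
        source = self._stdin
        last = len(self._commands) - 1
        for index, argv in enumerate(self._commands):
            sink = self._stdout if index == last else subprocess.PIPE
            proc = subprocess.Popen(argv, stdin=source, stdout=sink)
            # Drop the parent's copy of the previous read end, so an
            # early-exiting downstream stage delivers SIGPIPE upstream.
            if procs:
                procs[-1].stdout.close()
            source = proc.stdout
            procs.append(proc)
        return [proc.wait() for proc in procs]
```

A caller would build it the same way as `g_pipe` above and then check the result with `if any(statuses):`. In Python 3, `filter` returns a lazy iterator that is always truthy, so `any` is the equivalent of the `filter(None, st)` idiom. This sketch does not clean up earlier stages if a later program fails to start.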

    This has led me to a few questions:

    1. Am I being dense? Are there already common modules (included or
    otherwise) that do this, or solve the problem some other way?
    2. Is there a more pythonic way of expressing the construction?
    Would exposing the internal array of commands make more sense,
    possibly by "passing through" various array operations on the
    class to the internal array (as the use of "append" hints at)? Or
    maybe "exec" objects that a "pipe" combiner operates on?
    3. Should an interface like this be in a "battery" somewhere? shutil
    didn't seem to quite match...
    4. Any reason to even try porting this interface to non-unix systems?
    Is there a close enough match to os.pipe/os.fork/os.exec/os.wait,
    or some other construct that works on microsoft platforms?

    _Mark_

    [1] in the Invader Zim sense :)
    Oct 2, 2003

  2. wrote:
    > One of my recent projects has involved taking an accretion of sh and
    > perl scripts and "doing them right" - making them modular, improving
    > the error reporting, making it easier to add even more features to
    > them. "Of course," I'm redoing them in python - much of the cut&paste
    > reuse has become common functions, which then get made more robust and
    > have a common style and are callable from other (python) tools
    > directly, instead of having to exec scripts to get at them. The usual
    > "glorious refactoring."
    >

    <<SNIP>>
    >
    > Implementing pipelines takes rather a bit more work, and one might
    > (not unreasonably) throw up one's hands and just use os.system and
    > some re.sub's to do the quoting. However, I had enough cases where
    > the goal really was to run a complex shell pipeline (I also had cases
    > where the pipeline converted nicely to some inline python code,
    > especially with the help of the gzip module) that I sat down and
    > cooked up a pipeline class.
    >
    > The interface I ended up with is pretty simple:
    > g_pipe = pipeline()
    > g_pipe.stdin(open("blort.gz", "r"))
    > g_pipe.append(["gunzip"])
    > g_pipe.append(["sort", "-u"])
    > g_pipe.append(["wc", "-l"])
    > g_pipe.stdout(open("blort.count", "w"))
    > print g_pipe.run()
    >
    > is equivalent to the sh:
    > gunzip < blort.gz | sort -u | wc -l > blort.count
    >

    <<SNIP>>
    >
    > _Mark_
    >
    > [1] in the Invader Zim sense :)


    I think that your pipeline code looks nothing like the original sh
    script pipeline, which to me counts heavily against it.
    Just playing at the cygwin prompt...
    $ ls -l|wc -l > /tmp/lines_in_dir
    $ cat /tmp/lines_in_dir
    465
    $ python
    >>> from os import system
    >>> system(r'''/bin/ls -l|/bin/wc -l > /tmp/lines_in_dir2''')

    0
    >>> system(r'''/bin/cat /tmp/lines_in_dir2''')

    463
    0

    I prefer the above because it looks like the original sh command.
    Of course, if script security is very important then you may want to
    change the way things are implemented again.

    Cheers, Paddy.
    Donald 'Paddy' McCarthy, Oct 2, 2003

  3. Donn Cave Guest

    Quoth :
    ....
    | 1. Am I being dense? Are there already common modules (included or
    | otherwise) that do this, or solve the problem some other way?

    I can't tell you whether any of them has come to be common, but
    there have been a handful of efforts along these lines - process
    and pipeline creation.

    | 2. Is there a more pythonic way of expressing the construction?
    | Would exposing the internal array of commands make more sense,
    | possibly by "passing through" various array operations on the
    | class to the internal array (as the use of "append" hints at)? Or
    | maybe "exec" objects that a "pipe" combiner operates on?

    Only thing that comes to mind is error handling. It certainly is
    not characteristic of Python functions to return an error status,
    rather they typically raise exceptions. Ideally, I would think
    the exception type for this would carry the exit status, other
    information in the status word, and text from error/diagnostic
    output. That last one is particularly important and particularly
    awkward to get.
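A minimal sketch of an exception along these lines, assuming a current Python 3 and the `subprocess` module. The names `CommandError` and `checked_run` are hypothetical, and capturing stderr through a pipe is one possible way to collect the diagnostic text, not necessarily the approach meant above.

```python
import subprocess

class CommandError(Exception):
    """Hypothetical exception carrying what a failed command left behind."""

    def __init__(self, argv, returncode, stderr_text):
        super().__init__("%r failed with status %d" % (argv, returncode))
        self.argv = argv
        self.returncode = returncode
        # subprocess reports death by signal N as returncode -N.
        self.signal = -returncode if returncode < 0 else None
        self.stderr_text = stderr_text

def checked_run(argv):
    """Run argv (a list); raise CommandError on any nonzero status."""
    done = subprocess.run(argv, stderr=subprocess.PIPE, text=True)
    if done.returncode != 0:
        raise CommandError(argv, done.returncode, done.stderr)
    return done.returncode
```

The caller can then use an ordinary try/except and inspect `returncode`, `signal` and `stderr_text` on the caught exception. Capturing stderr this way means the diagnostics no longer reach the terminal unless the caller prints them.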

    See appended example for a trick to deal with the special case
    where a Python exception is caught in the fork.

    | 3. Should an interface like this be in a "battery" somewhere? shutil
    | didn't seem to quite match...

    No one ever likes anyone else's version of this, so it's typically
    reinvented as required.

    | 4. Any reason to even try porting this interface to non-unix systems?
    | Is there a close enough match to os.pipe/os.fork/os.exec/os.wait,
    | or some other construct that works on microsoft platforms?

    There's os.spawnv, if you haven't noticed that.
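A minimal sketch of `os.spawnv`, which takes a mode, a program path and an argument list. The wrapper name `spawn_and_wait` is hypothetical, and the current interpreter is used as the child program only so that the example is self-contained.

```python
import os
import sys

def spawn_and_wait(path, argv):
    """Run the program at `path` with argument list `argv`; return its exit code."""
    return os.spawnv(os.P_WAIT, path, argv)

# Run a child interpreter that exits with status 3.
status = spawn_and_wait(sys.executable,
                        [sys.executable, "-c", "import sys; sys.exit(3)"])
```

With `os.P_NOWAIT` instead of `os.P_WAIT`, the call returns without waiting for the child.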

    Donn Cave
    -----------
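    import fcntl
    import posix
    import sys
    import pickle

    def spawn_wnw(wait, file, args, env):
        # The pipe carries a pickled exception back if the exec fails.
        p0, p1 = posix.pipe()
        pid = posix.fork()
        if pid:
            # Parent: read until EOF. A successful exec closes p1
            # (close-on-exec), so the read returns an empty string.
            posix.close(p1)
            ps = posix.read(p0, 1024)
            posix.close(p0)
            if wait:
                junk, ret = posix.waitpid(pid, 0)
            else:
                ret = pid
            if ps:
                e, v = pickle.loads(ps)
                # Re-raise the child's exception instance in the parent.
                raise v
            else:
                return ret
        else:
            try:
                fcntl.fcntl(p1, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
                posix.close(p0)
                posix.execve(file, args, env)
            except:
                e, v, t = sys.exc_info()
                s = pickle.dumps((e, v))
                posix.write(p1, s)
                posix._exit(117)

    def spawnw(file, args, env):
        # Wait for the child; returns its wait status.
        return spawn_wnw(1, file, args, env)

    def spawn(file, args, env):
        # Do not wait; returns the child's pid.
        return spawn_wnw(0, file, args, env)

    # /bin/bummer presumably does not exist, so this call raises the
    # child's OSError in the parent.
    pid = spawn('/bin/bummer', ['bummer', '-ever', 'summer'], posix.environ)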
    Donn Cave, Oct 2, 2003
