Yet Another Command Line Parser

Discussion in 'Python' started by Manlio Perillo, Oct 26, 2004.

  1. Regards.
    In the standard library there are two modules for command line
    parsing: optparse and getopt.
    In the Python Cookbook there is another simple method for parsing,
    using a docstring.

    However, sometimes (actually, in all my small scripts) one has a simple
    function whose arguments are chosen on the command line.

    For this reason I have written a simple module, optlist, that parses
    the command line as if it were a function's argument list.

    It is simpler to show an example:


    import optlist


    def main(a, b, *args, **kwargs):
        print 'a =', a
        print 'b =', b

        print 'args:', args
        print 'kwargs:', kwargs

    optlist.call(main)


    And on the shell:
    shell: script.py 10, 20, 100, x=1
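
    To make the mechanism concrete, here is a minimal Python 3 sketch of what happens to that command line, assuming the shell delivers the arguments as the list shown. The helper name build_call_source and the captured dictionary are illustrative only and are not part of optlist.

```python
def build_call_source(argv):
    # Join the arguments with spaces, as optlist does, and wrap them
    # in a call expression.
    options = ' '.join(argv[1:])
    return 'func(' + options + ')'

captured = {}

def main(a, b, *args, **kwargs):
    # Record what the function received so it can be inspected.
    captured.update(a=a, b=b, args=args, kwargs=kwargs)

# argv as assumed for: script.py 10, 20, 100, x=1
source = build_call_source(['script.py', '10,', '20,', '100,', 'x=1'])
print(source)  # func(10, 20, 100, x=1)

# The string is executed as Python code with 'func' bound to main.
exec(source, {'func': main})
print(captured)
```

    The first two values bind to a and b, the extra 100 lands in args, and x=1 lands in kwargs, all as Python objects rather than strings.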



    Since one sometimes needs to keep the options around, I have provided
    an alternative syntax. Here is an example:


    import optlist

    optlist.setup('a, b, *args, **kwargs')

    print 'a =', optlist.a
    print 'b =', optlist.b

    print 'args:', optlist.args
    print 'kwargs:', optlist.kwargs



    Finally, the module is so small that I post it here:

    -------------------------- optlist.py --------------------------------

    import sys

    # add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
    _options = ' '.join(sys.argv[1:])

    def call(func):
        """
        Call func, passing to it the arguments from the command line
        """
        exec('func(' + _options + ')')

    def setup(template):
        """
        Template is a string containing the argument list.
        The command line options are evaluated according to the template
        and the values are stored in the module dictionary
        """
        exec('def helper(' + template +
             '):\n\tglobals().update(locals())')
        exec('helper(' + _options + ')')

    ----------------------------------------------------------------------



    I hope that this is not 'Yet Another Useless Module' and that the
    code is correct.

    The only problem is that error messages are ugly.



    Regards Manlio Perillo
    Manlio Perillo, Oct 26, 2004
    #1

  2. Andrew Dalke (Guest)

    Manlio Perillo wrote:
    > # add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
    > _options = ' '.join(sys.argv[1:])
    >
    > def call(func):
    >     """
    >     Call func, passing to it the arguments from the command line
    >     """
    >     exec('func(' + _options + ')')


    > The only problem is that error messages are ugly.


    And it's a huge security hole. What if I did


    script.py "x=6)\
    import os
    os.system('ls -l')"

    Even if not a security hole, it's tricky to handle the
    combined shell and Python escaping rules

    script.py x="This is a string"

    won't work, while

    script.py 'x="This is a string"'

    should. Embedding ! and \escaped characters should be
    even more fun.
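
    The two quoting cases can be checked without a real shell. This is a small Python 3 sketch that assumes shlex.split is a fair stand-in for POSIX shell word splitting; the helper python_accepts is a hypothetical name used only for this illustration.

```python
import shlex

def python_accepts(argument):
    # True if 'func(<argument>)' compiles as a Python expression.
    try:
        compile('func(' + argument + ')', '<argv>', 'eval')
        return True
    except SyntaxError:
        return False

# The shell strips the double quotes before Python ever sees them.
unquoted = shlex.split('script.py x="This is a string"')
# Single quotes protect the inner double quotes.
quoted = shlex.split('''script.py 'x="This is a string"' ''')

print(unquoted[1], python_accepts(unquoted[1]))
print(quoted[1], python_accepts(quoted[1]))
```

    In the first case Python receives x=This is a string, which is not a valid argument list; in the second the inner quotes survive and the text parses as a keyword argument.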

    Andrew
    Andrew Dalke, Oct 26, 2004
    #2

  3. On Tue, 26 Oct 2004 19:33:42 GMT, Andrew Dalke wrote:

    >Manlio Perillo wrote:
    >> # add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
    >> _options = ' '.join(sys.argv[1:])
    >>
    >> def call(func):
    >>     """
    >>     Call func, passing to it the arguments from the command line
    >>     """
    >>     exec('func(' + _options + ')')

    >
    >> The only problem is that error messages are ugly.

    >
    >And it's a huge security hole.


    I know that executing arbitrary code is a security hole.
    However, the module is intended for 'personal' use.
    This way, my scripts need only a single line of code
    for option handling.
    Later, for production code, one can use getopt.


    >What if I did
    >
    >
    >script.py "x=6)\
    >import os
    >os.system('ls -l')"
    >


    A solution is to use eval, but this does not handle keyword arguments.
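
    One possible way to accept keyword arguments without executing arbitrary code is to parse the argument list and allow only literals. The sketch below is Python 3 and relies on the standard ast module and ast.literal_eval, which are assumptions about a modern interpreter and not something mentioned in this thread; parse_call_args is a hypothetical name.

```python
import ast

def parse_call_args(options):
    """Parse 'a, b, k=v' into (args, kwargs), accepting only literals."""
    # Wrap the text in a dummy call so the parser handles positional
    # and keyword arguments for us. Nothing is executed here.
    tree = ast.parse('f(' + options + ')', mode='eval')
    call = tree.body
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)):
        raise ValueError('not a plain argument list')
    # literal_eval raises ValueError for anything that is not a literal
    # (names, calls, attribute access, *args unpacking, ...).
    args = [ast.literal_eval(node) for node in call.args]
    kwargs = {}
    for keyword in call.keywords:
        if keyword.arg is None:
            raise ValueError('** unpacking is not allowed')
        kwargs[keyword.arg] = ast.literal_eval(keyword.value)
    return args, kwargs

print(parse_call_args('10, 20, 100, x=1'))
```

    A caller would then invoke func(*args, **kwargs). Input such as a function call or an import is rejected with ValueError or SyntaxError instead of being run.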

    >Even if not a security hole, it's tricky to handle the
    >combined shell and Python escaping rules
    >
    > script.py x="This is a string"
    >
    >won't work, while
    >
    > script.py 'x="This is a string"'
    >
    >should. Embedding ! and \escaped characters should be
    >even more fun.
    >


    I'm not a shell expert, but isn't the solution simply to use ' or ''?

    script.py x='\n'



    Thanks and regards Manlio Perillo
    Manlio Perillo, Oct 26, 2004
    #3
  4. Ian Bicking (Guest)

    Manlio Perillo wrote:
    > Regards.
    > In the standard library there are two modules for command line
    > parsing: optparse and getopt.
    > In the Python Cookbook there is another simple method for parsing,
    > using a docstring.
    >
    > However, sometimes (actually, in all my small scripts) one has a simple
    > function whose arguments are chosen on the command line.
    >
    > For this reason I have written a simple module, optlist, that parses
    > the command line as if it were a function's argument list.
    >
    > It is simpler to show an example:
    >
    >
    > import optlist
    >
    >
    > def main(a, b, *args, **kwargs):
    >     print 'a =', a
    >     print 'b =', b
    >
    >     print 'args:', args
    >     print 'kwargs:', kwargs
    >
    > optlist.call(main)
    >
    >
    > And on the shell:
    > shell: script.py 10, 20, 100, x=1


    I think it would be better if this was called like
    script 10 20 100 --x=1

    With something like:

    import sys

    def parse_args(args):
        kw = {}
        pos = []
        for arg in args:
            if arg.startswith('--') and '=' in arg:
                # strip the leading '--' so the key is a valid keyword name
                name, value = arg[2:].split('=', 1)
                kw[name] = value
            else:
                pos.append(arg)
        return pos, kw

    def call(func, args=None):
        if args is None:
            args = sys.argv[1:]
        pos, kw = parse_args(args)
        func(*pos, **kw)


    This isn't exactly what you want, since you want Python expressions
    (e.g., 10 instead of '10'). But adding expressions (using eval) should
    be easy. Or, you can be more restrictive, and thus safer:

    def coerce(arg_value):
        # int() and float() raise ValueError for strings they cannot parse
        try:
            return int(arg_value)
        except ValueError:
            pass
        try:
            return float(arg_value)
        except ValueError:
            pass
        return arg_value

    Or a little less restrictive, allowing for dictionaries and lists, but
    still falling back on strings:

    def coerce(arg_value):
        # as above for int and float
        # [:1] rather than [0] so an empty string does not raise IndexError
        if arg_value[:1] in ('[', '{'):
            return eval(arg_value)
        return arg_value
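
    Putting the pieces together, a Python 3 sketch of this approach might look as follows. It swaps eval for ast.literal_eval, which only accepts literals; that substitution is an assumption of this sketch and not part of the suggestion above, and the sample arguments are made up for illustration.

```python
import ast

def coerce(arg_value):
    # Try the restrictive conversions first.
    for convert in (int, float):
        try:
            return convert(arg_value)
        except ValueError:
            pass
    # Allow list and dict literals, but never arbitrary expressions.
    if arg_value[:1] in ('[', '{'):
        try:
            return ast.literal_eval(arg_value)
        except (ValueError, SyntaxError):
            pass
    # Fall back on the plain string.
    return arg_value

def parse_args(args):
    pos, kw = [], {}
    for arg in args:
        if arg.startswith('--') and '=' in arg:
            name, value = arg[2:].split('=', 1)
            kw[name] = coerce(value)
        else:
            pos.append(coerce(arg))
    return pos, kw

print(parse_args(['10', '2.5', 'hello', '--x=1', '--tags=[1, 2]']))
```

    Anything that does not convert cleanly stays a string, so a malformed list such as [1, 2 is passed through unchanged instead of raising.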


    --
    Ian Bicking / http://blog.ianbicking.org
    Ian Bicking, Oct 26, 2004
    #4
  5. Andrew Dalke wrote:
    ...
    > > exec('func(' + _options + ')')

    >
    > > The only problem is that error messages are ugly.

    >
    > And it's a huge security hole. What if I did
    >
    > script.py "x=6)\
    > import os
    > os.system('ls -l')"


    Not to defend exec (ugly thing it is), but in this case I'm not sure
    what the security hole would be. If I enter that tricky commandline at
    a shell prompt, it will be just as if I had executed the 'ls -l' at the
    same shell prompt; weird, but where is the huge security hole? It's not
    as if there were setuid shell scripts (is there...? I sure hope not!-).

    IOW, what's the difference between that and the commandline

    script.py 'x=6' && ls -l

    for example? The latter is no security hole, after all.

    I understand and agree with the other criticisms you extend to the OP's
    code, but this one leaves me perplexed. exec is a huge security hole if
    you're doing it on untrusted data, data supplied by somebody other than
    the uid running the script; but how are commandline arguments
    'untrusted'...?


    Alex
    Alex Martelli, Oct 26, 2004
    #5
  6. Andrew Dalke (Guest)

    Alex:
    > Not to defend exec (ugly thing it is), but in this case I'm not sure
    > what the security hole would be.


    In some sense we're both right, or wrong. Security depends on
    the system. If someone saw that code, found it interesting, added
    it to a script, which passed through a few people to someone
    who uses it as part of a public service, then it's possible a
    malicious user of that service may be able to execute arbitrary
    code on the server.


    > If I enter that tricky commandline at
    > a shell prompt, it will be just as if I had executed the 'ls -l' at the
    > same shell prompt; weird, but where is the huge security hole? It's not
    > as if there were setuid shell scripts (is there...? I sure hope not!-).


    In that environment there are fewer problems.

    > but how are commandline arguments 'untrusted'...?


    I had to think about that for a bit. Much of the work I do
    (for money or otherwise) ends up being called by some sort
    of web interface or is the interface to such code. Much of
    the data I use can come from untrusted sources. So I've
    developed a programming habit of being distrustful of any
    data I get, even if it's from me.

    As a consequence that also means I don't need to think about
    the multiple levels in the system.

    Andrew
    Andrew Dalke, Oct 27, 2004
    #6
  7. Andrew Dalke wrote:
    ...
    > > Not to defend exec (ugly thing it is), but in this case I'm not sure
    > > what the security hole would be.

    >
    > In some sense we're both right, or wrong. Security depends on


    Yeah, I see your POV.

    > > but how are commandline arguments 'untrusted'...?

    >
    > I had to think about that for a bit. Much of the work I do
    > (for money or otherwise) ends up being called by some sort
    > of web interface or is the interface to such code. Much of
    > the data I use can come from untrusted sources. So I've
    > developed a programming habit of being distrustful of any
    > data I get, even if it's from me.
    >
    > As a consequence that also means I don't need to think about
    > the multiple levels in the system.


    Yes: never trusting any data anywhere is a safer habit to acquire, and
    if you do get into that mindset your code will have fewer risks of
    vulnerabilities. An "Only the paranoids survive" kind of stance.

    However, it's an interesting characteristic of security that it is _not_
    free: each security measure, precaution and stance carries a cost in
    terms of convenience and productivity. In any given situation, there
    _are_ upper limits to the total amount of such costs that can and will
    be borne in the name of security. Thus, I believe it's _good_ for
    security to be aware of exactly how much security you're buying, what
    threat you are warding off and to what extent, with each security
    measure you are taking -- a cost/benefit analysis.

    Many practices that weaken security _also_ damage code quality in other
    ways, for example by being prone to hard-to-reproduce,
    hard-to-track-down bugs. The 'exec' statement that you criticized
    surely falls into that category. For _those_ practices, I believe that
    cost/benefit analysis may well be nearly superfluous: the old slogan
    "quality is free" has some truth to it, inasmuch as the costs of
    making good quality code today tend to be repaid with interest in
    lowering maintenance costs in the future, enhancing reusability, etc.

    So I think a "knee-jerk reaction" against some kinds of 'code smells' is
    quite OK. More generally, I'm not sure "knee-jerk security" is a net
    win, though. The classic example is forcing people to use
    12-character-long, randomly assigned passwords that they can't change:
    inevitably they _will_ write those passwords down somewhere, creating
    far worse security risks than if some cost-benefit analysis had been
    done to find a reasonable compromise between security and practicality.

    These are rather general considerations, musings if you will, about
    security and development practices, and in no way meant to defend the
    'exec' statement you were criticizing.


    Alex
    Alex Martelli, Oct 27, 2004
    #7
  8. Andrew Dalke (Guest)

    Alex:
    > However, it's an interesting characteristic of security that it is _not_
    > free: each security measure, precaution and stance carries a cost in
    > terms of convenience and productivity.


    Almost completely agreed, though I think there are cases where
    a solution with better security doesn't have that tradeoff.

    Using Python is one .. haven't had to worry much about stack
    overflows, etc. and I've been much more productive. :)

    Andrew
    Andrew Dalke, Oct 27, 2004
    #8
  9. Andrew Dalke wrote:

    > Alex:
    > > However, it's an interesting characteristic of security that it is _not_
    > > free: each security measure, precaution and stance carries a cost in
    > > terms of convenience and productivity.

    >
    > Almost completely agreed, though I think there are cases where
    > a solution with better security doesn't have that tradeoff.
    >
    > Using Python is one .. haven't had to worry much about stack
    > overflows, etc. and I've been much more productive. :)


    Right, of course: more generally, increasing code quality tends to
    enhance security as a side effect (since many bugs might potentially be
    subject to security exploits), as I indicated, yet it also tends to
    lower lifetime costs (since maintenance costs are such a high part of
    lifetime costs).

    Using Python _is_ a case of "increasing code quality"!-)


    Alex
    Alex Martelli, Oct 27, 2004
    #9
  10. On Tue, 26 Oct 2004 19:33:42 GMT, Andrew Dalke wrote:

    >Manlio Perillo wrote:
    >> # add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
    >> _options = ' '.join(sys.argv[1:])
    >>
    >> def call(func):
    >>     """
    >>     Call func, passing to it the arguments from the command line
    >>     """
    >>     exec('func(' + _options + ')')

    >
    >> The only problem is that error messages are ugly.

    >
    >And it's a huge security hole. What if I did
    >
    >
    >script.py "x=6)\
    >import os
    >os.system('ls -l')"
    >


    I'm not sure (it does not work in the Windows 'shell'); have you run
    this code? Doesn't it raise a SyntaxError?
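
    The question can be checked without running any shell command. The Python 3 sketch below assumes the argument reaches Python with its newlines intact (a POSIX shell may instead drop the backslash-newline pair, which still leaves an unbalanced parenthesis). The names build_source, compiles, and log are hypothetical, and the adjusted payload only appends to a list.

```python
def build_source(options):
    # Same string construction as optlist.call.
    return 'func(' + options + ')'

def compiles(source):
    # Compile only; nothing is executed by this check.
    try:
        compile(source, '<argv>', 'exec')
        return True
    except SyntaxError:
        return False

as_posted = "x=6)\nimport os\nos.system('ls -l')"
adjusted = "x=6); log.append('injected'"

log = []

def func(**kwargs):
    log.append(kwargs)

print(compiles(build_source(as_posted)))  # False: trailing ')' is unmatched
print(compiles(build_source(adjusted)))   # True: the wrapper closes the call

exec(build_source(adjusted), {'func': func, 'log': log})
print(log)
```

    So the payload as posted fails to compile and nothing runs, but a payload that leaves its last parenthesis open for the wrapper to close does execute the extra statement.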

    >Even if not a security hole, it's tricky to handle the
    >combined shell and Python escaping rules
    >
    > script.py x="This is a string"
    >
    >won't work, while
    >
    > script.py 'x="This is a string"'
    >


    Actually on Windows the right syntax is
    script.py x='"This is a string"'

    >should. Embedding ! and \escaped characters should be
    >even more fun.
    >



    Thanks and regards Manlio Perillo
    Manlio Perillo, Oct 27, 2004
    #10