Using python for writing models: How to run models in restricted python mode?

Discussion in 'Python' started by vinjvinj, Nov 7, 2005.

  1. vinjvinj

    vinjvinj Guest

    I have an application which allows multiple users to write models.
    These models get distributed on a grid of compute engines. Users submit
    their models through a web interface. I want to:

    1. Restrict the user from doing any file I/O, exec, import, eval, etc. I
    was thinking of writing a plugin for pylint to do all the checks. Is
    this a good way, given that there is no restricted Python? What are
    the things I should search for in Python code?

    2. Restrict the amount of memory a module uses as well. For instance,
    how can I stop a user from doing a = range(10000000000) or similar
    tasks, so that my whole compute farm does not come down?

    Thanks for your help
    vinjvinj, Nov 7, 2005
    #1

  2. vinjvinj

    Mike Meyer Guest

    Re: Using python for writing models: How to run models in restricted python mode?

    "vinjvinj" <> writes:
    > 1. Restrict the user from doing any file I/O, exec, import, eval, etc. I
    > was thinking of writing a plugin for pylint to do all the checks. Is
    > this a good way, given that there is no restricted Python? What are
    > the things I should search for in Python code?


    Um - I've got a restricted python module: rexec.py. Of course, it
    doesn't work correctly, in that it isn't really secure. Python is very
    powerful, and creating a secure sandbox is difficult - so much so that
    the task has never been accomplished. If you want something that will
    keep the obvious things from working, rexec.py might be for you - but
    don't kid yourself that it's secure. If you need real security, I'd
    consider switching to Jython, which at least has a VM which was
    designed with building such sandboxes as a possibility.

    > 2. Restrict the amount of memory a module uses as well. For instance,
    > how can I stop a user from doing a = range(10000000000) or similar
    > tasks, so that my whole compute farm does not come down?


    This is equivalent to trying to limit the amount of CPU time the
    module uses, which is better known as the halting problem. There's no
    algorithmic solution to that. If you want to verify that some module will
    only use so much memory before executing it, the best you can do is
    verify that it doesn't do anything obvious. If you want to restrict
    modules while they are running, you can probably get the OS to
    help. Exactly how will depend on your requirements and the OS
    involved.

    <Mike
    --
    Mike Meyer <> http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
    Mike Meyer, Nov 7, 2005
    #2

  3. vinjvinj

    vinjvinj Guest

    While I understand 2 is very hard (if not impossible) to do in a single
    Unix process, I'm not sure why 1 would be hard to do, since I have
    complete control over what code I allow or disallow on my grid. Can
    I not just search for certain strings and disallow the model if it
    fails certain conditions? It might not be 100% secure, but will it not
    get me to 90%?
    vinjvinj, Nov 7, 2005
    #3
  4. vinjvinj

    Mike Meyer Guest

    Re: Using python for writing models: How to run models in restricted python mode?

    "vinjvinj" <> writes:

    > While I understand 2 is very hard (if not impossible) to do in a single
    > Unix process, I'm not sure why 1 would be hard to do, since I have
    > complete control over what code I allow or disallow on my grid. Can
    > I not just search for certain strings and disallow the model if it
    > fails certain conditions? It might not be 100% secure, but will it not
    > get me to 90%?


    Sure you can search for certain strings. Python lets you build strings
    dynamically, so you'd have to search for every possible way to create
    those strings. Further, Python provides lots of tools for
    introspection, meaning there are lots of ways to find these
    "forbidden" objects other than mentioning their name.

    You can get to *every* builtin function through any Python module. For
    instance, are you going to prevent them from using regular
    expressions? If not, consider:

    >>> import re
    >>> getattr(re, ''.join([chr(x + 1) for x in [94, 94, 97, 116, 104, 107, 115, 104, 109, 114, 94, 94]]))['fi' + 'le'] is open
    True
    >>>


    String searches only prevent the most obvious abuses, and may well
    miss things that are merely not quite so obvious. If you think of your
    "security" as a notice to the end user that they are doing something
    wrong, as opposed to a tool that will prevent them from doing it, then
    you'll have the right idea. In which case, I'd still recommend looking
    into the rexec module.
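
    To make the limits of string searching concrete, here is a minimal
    sketch in Python 3 (the session above is Python 2, where `file` was
    still a builtin). The names FORBIDDEN, passes_string_filter, and
    smuggled_object are hypothetical, invented for illustration; the filter
    is only an assumption about what a simple substring check might look
    like.

```python
import re  # any pure-Python module would do; it only serves as a stepping stone

# Hypothetical blacklist of substrings a naive checker might reject.
FORBIDDEN = ("open", "eval", "exec", "import", "__builtins__")


def passes_string_filter(source):
    """Return True when none of the forbidden substrings occur in the source."""
    return not any(word in source for word in FORBIDDEN)


# The obvious attempt mentions the forbidden name directly ...
honest = "handle_factory = open"

# ... but the same object is reachable without ever spelling its name.
sneaky = "handle_factory = getattr(re, '__built' + 'ins__')['op' + 'en']"


def smuggled_object(source):
    """Run the source with only `re` visible and return what it bound."""
    namespace = {"re": re}
    exec(source, namespace)
    return namespace["handle_factory"]
```

    The filter rejects `honest` and accepts `sneaky`, yet
    smuggled_object(sneaky) hands back the real open() builtin.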

    <mike
    --
    Mike Meyer <> http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
    Mike Meyer, Nov 7, 2005
    #4
  5. Re: Using python for writing models: How to run models in restricted python mode?

    vinjvinj wrote:

    > While I understand 2 is very hard (if not impossible) to do in a single
    > Unix process, I'm not sure why 1 would be hard to do, since I have
    > complete control over what code I allow or disallow on my grid. Can
    > I not just search for certain strings and disallow the model if it
    > fails certain conditions? It might not be 100% secure, but will it not
    > get me to 90%?


    You might be able to think of and disallow the most
    obvious security holes, but how confident are you that
    you will think of the bad code that your users will
    think of?

    Are you concerned about malicious users, or just
    incompetent users?

    I suspect your best bet might be to write a
    mini-language using Python, and get your users to use
    that. You will take a small performance hit, but
    security will be very much improved.
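
    As a rough sketch of what such a mini-language could look like, the
    following Python 3 code accepts only numeric constants, named inputs,
    basic arithmetic, comparisons, and and/or, and rejects everything else.
    It borrows Python's own parser through the standard ast module; the
    names evaluate, BINARY, and COMPARE are hypothetical, and the choice of
    allowed operations is an assumption, not a recommendation from this
    thread.

```python
import ast
import operator

# Whitelisted operators; anything not listed here is rejected.
BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
          ast.Mult: operator.mul, ast.Div: operator.truediv}
COMPARE = {ast.Lt: operator.lt, ast.LtE: operator.le, ast.Gt: operator.gt,
           ast.GtE: operator.ge, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def evaluate(expression, variables):
    """Evaluate a tiny expression language over the given numeric variables."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Only numeric and boolean literals; string literals are refused.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float, bool):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY:
            return BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in COMPARE):
            return COMPARE[type(node.ops[0])](walk(node.left),
                                              walk(node.comparators[0]))
        if isinstance(node, ast.BoolOp):
            # Simplification: always yields a bool and does not short-circuit.
            values = [walk(value) for value in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        raise ValueError("not allowed: " + type(node).__name__)

    return walk(ast.parse(expression, mode="eval"))
```

    With this, evaluate("price * 2 > limit and qty < 10", {...}) works,
    while a function call or a string literal raises ValueError before
    anything runs.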

    What do others think?


    --
    Steven.
    Steven D'Aprano, Nov 8, 2005
    #5
  6. vinjvinj

    Paul Rubin Guest

    Steven D'Aprano <> writes:
    > I suspect your best bet might be to write a mini-language using
    > Python, and get your users to use that. You will take a small
    > performance hit, but security will be very much improved.
    >
    > What do others think?


    That is the only approach that makes any sense. Even with restricted
    execution there's no way to stop memory exhaustion with restricted
    Python statements. Consider

    xxx = 'x'*10000000000
    Paul Rubin, Nov 8, 2005
    #6
  7. vinjvinj

    vinjvinj Guest

    I'm more worried about incompetent users than malicious users. I'm
    going to take the following steps:

    1. My users will be paying a decent amount of money to run models on
    the compute grid. If they are intentionally writing malicious code, then
    their account will be disabled.

    2. Since their models will be fairly basic, I can require:
    - No imports in the code.
    - No special characters allowed.
    - No access to special builtins.

    The users write functions which get called many, many times with
    different variables. I'm not sure how this would work with the rexec
    module, especially since I'll be passing values to the functions and the
    functions will be returning either None, True, or False.

    3. Pylint has a pretty cool way to write your own custom plugins. You
    can write custom handlers for each sort of available node listed at:
    http://www.python.org/doc/current/lib/module-compiler.ast.html
    This will allow me to compile a module and give users feedback on what
    is wrong and what is not allowed.

    4. I'll set up a test sandbox where the models will be run with a
    smaller dataset before they can be pushed into production. If the
    models pass the sandbox test, then they will be run in production.
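
    The per-node handlers from step 3 can be sketched without pylint at
    all. This is a minimal Python 3 illustration using the standard ast
    module (the compiler.ast module linked above is Python 2 only); it is
    not pylint's plugin API. ModelChecker, check_model, and the contents of
    BANNED_NAMES are hypothetical, and a check like this gives users
    feedback rather than real security.

```python
import ast

# Hypothetical list of builtins a model is not supposed to touch.
BANNED_NAMES = {"open", "eval", "exec", "compile", "__import__",
                "getattr", "setattr", "globals", "locals", "vars"}


class ModelChecker(ast.NodeVisitor):
    """Collect human-readable complaints about disallowed constructs."""

    def __init__(self):
        self.problems = []

    def _report(self, node, message):
        self.problems.append("line %d: %s" % (node.lineno, message))

    def visit_Import(self, node):
        self._report(node, "imports are not allowed")

    # "from x import y" is handled the same way as "import x".
    visit_ImportFrom = visit_Import

    def visit_Name(self, node):
        # Catches any reference to a banned name, not only direct calls.
        if node.id in BANNED_NAMES:
            self._report(node, "use of %r is not allowed" % node.id)

    def visit_Attribute(self, node):
        if node.attr.startswith("_"):
            self._report(node, "underscore attribute %r is not allowed" % node.attr)
        self.generic_visit(node)


def check_model(source):
    """Return a list of problems; ast.parse raises SyntaxError on bad source."""
    checker = ModelChecker()
    checker.visit(ast.parse(source))
    return checker.problems
```

    A model such as "def model(x):\n    return x > 3\n" yields an empty
    list, while "import os" or "x.__class__" each produce one message with
    a line number that can be shown to the user.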

    I'm going to have to write some custom performance monitoring functions to
    get notified when some models are running forever and be able to
    terminate them.
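
    One simple way to get "terminate it when it runs forever" is to run the
    model in a child interpreter with a deadline. The sketch below assumes
    Python 3.7+ and that the model can be handed over as source text;
    run_with_timeout is a hypothetical helper name. A separate interpreter
    does not share in-memory objects with the parent, so this fits the test
    sandbox better than a shared-memory production setup.

```python
import subprocess
import sys


def run_with_timeout(source, seconds):
    """Run source in a fresh interpreter; return its stdout, or None on timeout."""
    try:
        finished = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child here.
        return None
    return finished.stdout
```

    A well-behaved model returns its printed output; an endless loop such
    as "while True: pass" comes back as None once the deadline passes.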

    vinjvinj
    vinjvinj, Nov 8, 2005
    #7
  8. vinjvinj

    vinjvinj Guest

    I have so many things to do to get this to production and writing a
    mini language would be a full project in itself. :-<.

    Is there an easy way to do this? If not, I'll go with the steps
    outlined in my other post.

    vinjvinj
    vinjvinj, Nov 8, 2005
    #8
  9. Re: Using python for writing models: How to run models in restricted python mode?

    vinjvinj wrote:
    > I have so many things to do to get this to production and writing a
    > mini language would be a full project in itself. :-<.
    >
    > Is there an easy way to do this? If not, I'll go with the steps
    > outlined in my other post.


    Do you really think it will be faster to start parsing Python code,
    looking for potentially dangerous constructs?
    Jeffrey Schwab, Nov 8, 2005
    #9
  10. vinjvinj

    vinjvinj Guest

    vinjvinj, Nov 8, 2005
    #10
  11. vinjvinj

    Paul Rubin Guest

    Paul Rubin, Nov 8, 2005
    #11
  12. vinjvinj

    vinjvinj Guest

    vinjvinj, Nov 8, 2005
    #12
  13. vinjvinj

    Magnus Lycka Guest

    Re: Using python for writing models: How to run models in restricted python mode?

    vinjvinj wrote:
    > I have an application which allows multiple users to write models.
    > These models get distributed on a grid of compute engines. Users submit
    > their models through a web interface. I want to:
    >
    > 1. Restrict the user from doing any file I/O, exec, import, eval, etc. I
    > was thinking of writing a plugin for pylint to do all the checks. Is
    > this a good way, given that there is no restricted Python? What are
    > the things I should search for in Python code?


    I'm not sure why you want to prevent, e.g., all file I/O. Let the jobs run
    as users with very limited permissions.

    > 2. Restrict the amount of memory a module uses as well. For instance,
    > how can I stop a user from doing a = range(10000000000) or similar
    > tasks, so that my whole compute farm does not come down?


    Use Sun Grid Engine. http://gridengine.sunsource.net/documentation.html
    Magnus Lycka, Nov 8, 2005
    #13
  14. vinjvinj wrote:

    > 2. Restrict the amount of memory a module uses as well. For instance,
    > how can I stop a user from doing a = range(10000000000) or similar
    > tasks, so that my whole compute farm does not come down?


    The safest way to do this on Unix is to run the model in a separate process,
    and use ulimit (or the resource module) to limit the memory usage.
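
    A minimal sketch of the resource-module route, assuming Python 3.7+ on
    Linux (RLIMIT_AS is not enforced the same way on every Unix).
    run_with_memory_cap is a hypothetical helper name, and the 1 GiB figure
    used in the usage note is an arbitrary example value.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(source, max_bytes):
    """Run source in a child interpreter whose address space is capped."""

    def apply_limit():
        # Runs in the child just before exec. Never ask for more than the
        # hard limit that is already in force.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    return subprocess.run(
        [sys.executable, "-c", source],
        preexec_fn=apply_limit,
        capture_output=True,
        text=True,
    )
```

    With a cap of 1024**3 bytes, a small model runs normally, while
    "x = 'x' * 10000000000" fails inside the child with a non-zero exit
    status instead of taking the node down.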

    --
    Jeremy Sanders
    http://www.jeremysanders.net/
    Jeremy Sanders, Nov 9, 2005
    #14
  15. vinjvinj

    vinjvinj Guest

    Unfortunately this is not an option, since all the processes share
    objects in memory which are about 1 GB for each node. Having a copy of
    this in each user process is just not an option. I think I'm going to
    use RestrictedPython from zope3 svn, which should take care of 70-80%
    of the problem.
    vinjvinj, Nov 9, 2005
    #15
  16. vinjvinj wrote:

    > Unfortunately this is not an option, since all the processes share
    > objects in memory which are about 1 GB for each node. Having a copy of
    > this in each user process is just not an option. I think I'm going to
    > use RestrictedPython from zope3 svn, which should take care of 70-80%
    > of the problem.


    I wonder whether it is possible to fork() the program, restricting the
    memory usage for the forked program. In most Unix variants, forked
    programs share memory until that memory is written to. Of course this may
    not be useful if there's data going back and forth all the time.
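
    A rough sketch of that idea, assuming Python 3 on Linux: the child
    created by os.fork() sees the parent's data without copying it, lowers
    its own address-space limit, runs the model, and sends the result back
    through a pipe. SHARED, run_in_fork, and the use of pickle for the
    result are hypothetical choices for illustration. Note that the limit
    counts the memory inherited from the parent too, so it has to be set
    above the parent's current size.

```python
import os
import pickle
import resource

# Stands in for the large read-only objects every model needs to see.
SHARED = list(range(1000))


def run_in_fork(func, arg, max_bytes):
    """Call func(arg) in a forked, memory-capped child; None means it failed."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: inherits SHARED via copy-on-write, then restricts itself.
        try:
            os.close(read_fd)
            _, hard = resource.getrlimit(resource.RLIMIT_AS)
            cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
            resource.setrlimit(resource.RLIMIT_AS, (cap, cap))
            result = func(arg)
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(result, out)
            os._exit(0)
        except BaseException:
            # Includes MemoryError raised once the cap is hit.
            os._exit(1)
    # Parent: read everything first, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as incoming:
        payload = incoming.read()
    _, status = os.waitpid(pid, 0)
    if status != 0 or not payload:
        return None
    return pickle.loads(payload)
```

    Here run_in_fork(lambda n: sum(SHARED[:n]), 10, 4 * 1024**3) returns
    the sum computed in the child, whereas a function that tries to build
    a 10 GB bytes object under the same cap returns None.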

    --
    Jeremy Sanders
    http://www.jeremysanders.net/
    Jeremy Sanders, Nov 10, 2005
    #16
