Embedding a restricted python interpreter

Rolf Magnus

Hi,

I would like to embed a python interpreter within a program, but since that
program would be able to automatically download scripts from the internet,
I'd like to run those in a restricted environment, which basically means
that I want to allow only a specific set of modules to be used by the
scripts, so that it wouldn't be possible for them to remove files from the
hard drive, kill processes or do other nasty stuff.
Is there any way to do that with the standard python interpreter?
 
Paul Rubin

Rolf Magnus said:
I would like to embed a python interpreter within a program, but since that
program would be able to automatically download scripts from the internet,
I'd like to run those in a restricted environment, which basically means
that I want to allow only a specific set of modules to be used by the
scripts, so that it wouldn't be possible for them to remove files from the
hard drive, kill processes or do other nasty stuff.
Is there any way to do that with the standard python interpreter?

Don't count on it.
 
Maurice LING

Rolf said:
Hi,

I would like to embed a python interpreter within a program, but since that
program would be able to automatically download scripts from the internet,
I'd like to run those in a restricted environment, which basically means
that I want to allow only a specific set of modules to be used by the
scripts, so that it wouldn't be possible for them to remove files from the
hard drive, kill processes or do other nasty stuff.
Is there any way to do that with the standard python interpreter?

I wouldn't really count on that. In my opinion, which may be wrong, Python
is not constructed to work in a sandbox like Java. Java does it by
passing every class that it loads through a security manager. What
you seem to want is for Python to have Java applet-style restrictions.

You can try to use 'exec' to run your scripts in a constructed
environment. For example,

global_ns = {}
local_ns = {}

# ... your stuff ...

statements = [] # to hold the lines of the script to run

for line in statements:
    exec(line, global_ns, local_ns)

global_ns and local_ns are the global and local namespaces respectively.
Although this was explained to me before, I can't recall the
details of how it works. In short, you may be able to craft a global and
local environment for your script to run in.
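As a rough illustration of the constructed environment described above, here is a minimal Python 3 sketch. The names SAFE_BUILTINS and run_script are made up for this example, and the allow-list is an assumption. It only narrows which names a script can look up; it should not be read as a secure sandbox.

```python
# Hypothetical allow-list of builtins exposed to the script (an assumption).
SAFE_BUILTINS = {"len": len, "range": range, "sum": sum}

def run_script(source):
    """Execute source in a hand-built namespace and return that namespace."""
    # When "__builtins__" is present in the globals dict, name lookups for
    # builtins go through it instead of the real builtins module.
    global_ns = {"__builtins__": SAFE_BUILTINS}
    exec(source, global_ns)
    return global_ns

ns = run_script("total = sum(range(5))")
print(ns["total"])  # 10

try:
    run_script("open('/etc/passwd')")
except NameError as exc:
    # 'open' is not in the allow-list, so the lookup fails before any file access.
    print("blocked:", exc)
```

Anything not listed in the dictionary is simply an undefined name to the script, which is the effect the crafted global namespace is meant to achieve.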

I do not know if it is possible to disable or override 'import'......

maurice
 
Craig Ringer

I wouldn't really count on that. In my opinion, which may be wrong, Python
is not constructed to work in a sandbox like Java.

That is my understanding. In fact, I'd say with Python it's nearly
impossible given how dynamic everything is and the number of tricks that
can be used to obfuscate what you're doing. Think of the fun that can be
had with str.encode / str.decode and getattr/hasattr .
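A small Python 3 sketch of the kind of obfuscation meant here. The function naive_filter and its banned-word list are hypothetical, standing in for any check that inspects the script text. The script builds an attribute name from string pieces and fetches it with getattr, so the banned word never appears in the source.

```python
def naive_filter(source):
    """Hypothetical textual check: accept scripts that mention no banned word."""
    banned = ("__subclasses__", "import", "open")
    return not any(word in source for word in banned)

# The attribute name is assembled at run time, so the filter never sees it.
script = (
    "name = '__sub' + 'classes__'\n"
    "classes = getattr(object, name)()\n"
)

print(naive_filter(script))  # the filter accepts the script

# Even a tiny builtin set is enough for the script to enumerate loaded classes.
ns = {"__builtins__": {"getattr": getattr, "object": object}}
exec(script, ns)
print(len(ns["classes"]) > 0)
```

Once a script can list every loaded class, it has a starting point for reaching functionality that was never handed to it explicitly.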

I looked into this, and my conclusion ended up being "Well, I'm using
Python because I want its power and flexibility. If I want a secure
scripting environment, I should use something like Lua or Qt Script for
Applications instead."

AFAIK that's why the rexec module is disabled - it's just not
practical to make a restricted Python execution environment.

You can try to use 'exec' to run your scripts in a constructed
environment. For example,

global_ns = {}
local_ns = {}

# ... your stuff ...

statements = [] # to hold the lines of the script to run

for line in statements:
    exec(line, global_ns, local_ns)

global_ns and local_ns are the global and local namespaces respectively.
Although this was explained to me before, I can't recall the
details of how it works. In short, you may be able to craft a global and
local environment for your script to run in.

I do not know if it is possible to disable or override 'import'......

You can do a fair bit by wrapping or replacing __builtin__.__import__.
Preventing people from getting around what you've done, though... not
sure.
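A minimal sketch of wrapping the import hook, written for Python 3, where the module is called builtins rather than __builtin__. ALLOWED_MODULES, restricted_import and run_script are names invented for this example, and the allow-list contents are an assumption. The wrapper is handed only to the script's namespace instead of being installed globally.

```python
import builtins

ALLOWED_MODULES = {"math", "string"}  # hypothetical allow-list

def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Delegate to the real __import__ only for allow-listed top-level modules."""
    root = name.split(".")[0]
    if root not in ALLOWED_MODULES:
        raise ImportError("import of %r is not allowed" % name)
    return builtins.__import__(name, globals, locals, fromlist, level)

def run_script(source):
    """Run source with a copy of the builtins whose __import__ is wrapped."""
    safe_builtins = dict(vars(builtins))
    safe_builtins["__import__"] = restricted_import
    ns = {"__builtins__": safe_builtins}
    exec(source, ns)
    return ns

ns = run_script("import math\nroot = math.sqrt(16)")
print(ns["root"])  # 4.0

try:
    run_script("import os")
except ImportError as exc:
    print("blocked:", exc)
```

This only governs the import statement itself. It does nothing about modules a script can reach through objects it already holds, which is the part that is hard to close off.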
 
Paul Rubin

Maurice LING said:
I wouldn't really count on that. In my opinion, which may be wrong,
Python is not constructed to work in a sandbox like Java. Java does it
by passing every class that it loads through a security
manager. What you seem to want is for Python to have Java applet-style
restrictions.

Java has also been subject to years and years of attacks against the
sandbox, followed by patches, followed by more attacks and more
patches, so at this point it's not so easy to get past the security
any more. But in the beginning it was full of bugs, and it may still
have bugs. Python's rexec never attracted the attention of serious
attackers.

If you really have to do restricted execution, your best bet is to put
the sandbox in a separate process chrooted to where it can't mess with
the file system, and have it communicate with your application through
a socket. I think there may be a way now to trap any system calls
that it attempts, too. Of course none of that stops resource
exhaustion attacks, etc.
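A minimal Python 3 sketch of the separate-process part of this idea. It uses a pipe via the subprocess module rather than a socket, and it does not set up the chroot, which needs operating-system support and privileges outside the scope of a short example. The function name run_untrusted and the timeout value are assumptions.

```python
import subprocess
import sys

def run_untrusted(source, timeout=5.0):
    """Run source in a separate interpreter process.

    Returns the child's standard output, or None if it ran past the timeout.
    """
    try:
        completed = subprocess.run(
            # -I starts the child in isolated mode (no user site, no PYTHON* env vars).
            [sys.executable, "-I", "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return None
    return completed.stdout

print(run_untrusted("print(6 * 7)"))
```

On its own this gives crash isolation and a crude time limit only. The child still has the parent's file system and network access until it is confined by something like a chroot or a jail.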

I don't have direct knowledge but it seems to me that there's
potential for the situation to improve under PyPy, whose interpreter
will have an extra layer where various bad operations can be trapped,
if my impression is correct. So the long term prospects for secure
rexec may be better than the immediate ones.
 
Fuzzyman

Fredrik Lundh (at www.effbot.org) was working on a 'cut down python'
that only implements the bits of python he likes! It would be great
if the core of that interpreter could be used as a 'restricted
interpreter'.

If you could externally disable os, sys, os.path modules etc and limit
the set of modules, then you could have a useful restricted
environment. It would need a special interpreter though - so NO is the
short answer.

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml
 
Andy Gross

Check out
http://mail.python.org/pipermail/python-dev/2003-January/031851.html
for a historical thread on rexec.py's vulnerabilities.

Right now, the answer for people who want restricted execution is
usually "wait for pypy", due to the number of tricks that can subvert
the rexec model. There are probably some one-off, application-specific
things you can do that might meet your requirements, like special
import hooks, sys.settrace() callbacks that inspect each running frame
(and are slow), and namespace restrictions on stuff passed to exec or
eval. If you really need sandboxing, you're probably out of luck.
Setting up a usermode linux instance or chrooted jail is probably the
best bet today.
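To make the sys.settrace() option concrete, here is a minimal Python 3 sketch of an application-specific callback. It counts executed lines and aborts a script that exceeds a budget. The names BudgetExceeded and run_with_line_budget, and the budget value, are invented for this example. Trace callbacks only see Python-level frames, so calls into C functions pass unnoticed.

```python
import sys

class BudgetExceeded(Exception):
    """Raised by the trace callback when a script runs too many lines."""

def run_with_line_budget(source, max_lines=1000):
    """Execute source, aborting once more than max_lines lines have run."""
    executed = 0

    def tracer(frame, event, arg):
        nonlocal executed
        if event == "line":
            executed += 1
            if executed > max_lines:
                raise BudgetExceeded("script ran more than %d lines" % max_lines)
        # Returning the callback keeps tracing active inside this frame.
        return tracer

    ns = {}
    sys.settrace(tracer)
    try:
        exec(source, ns)
    finally:
        sys.settrace(None)
    return ns

print(run_with_line_budget("x = 1 + 1")["x"])  # 2

try:
    run_with_line_budget("n = 0\nwhile True:\n    n += 1\n", max_lines=50)
except BudgetExceeded as exc:
    print("stopped:", exc)
```

The callback runs for every line of every traced frame, which is where the slowdown comes from.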

/arg
 
Peter Maas

Craig said:
That is my understanding. In fact, I'd say with Python it's nearly
impossible given how dynamic everything is and the number of tricks that
can be used to obfuscate what you're doing. Think of the fun that can be
had with str.encode / str.decode and getattr/hasattr .

It would certainly be difficult to track all harmful code constructs.
But AFAIK the idea of a sandbox is not to look at the offending code
but to protect the offended objects: files, databases, URLs, sockets
etc. and to raise a security exception when some code tries to offend
them. Jython is as dynamic as CPython and yet it generates class
files behaving well under the JVM's security regime.
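A sketch of what guarding the resource instead of the code could look like, in Python 3. SecurityError, make_guarded_open and the single-directory policy are all assumptions made for this example. This is not how the JVM security manager works internally, and in standard CPython a script that can reach the real open is not bound by it.

```python
import os
import tempfile

class SecurityError(Exception):
    """Raised when a script touches a protected resource."""

def make_guarded_open(allowed_dir):
    """Return an open() replacement limited to reading files under allowed_dir."""
    root = os.path.realpath(allowed_dir)

    def guarded_open(path, mode="r"):
        if mode not in ("r", "rb"):
            raise SecurityError("only read access is allowed")
        # Resolve symlinks and ".." before comparing against the allowed root.
        real = os.path.realpath(path)
        if os.path.commonpath([root, real]) != root:
            raise SecurityError("%s is outside the sandbox directory" % path)
        return open(real, mode)

    return guarded_open

# Demonstration with a throwaway directory.
sandbox_dir = tempfile.mkdtemp()
with open(os.path.join(sandbox_dir, "data.txt"), "w") as handle:
    handle.write("hello")

guarded_open = make_guarded_open(sandbox_dir)
with guarded_open(os.path.join(sandbox_dir, "data.txt")) as handle:
    print(handle.read())  # hello
```

The check sits at the point where the file is actually opened, so it does not matter how the script obfuscated the call that led there.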

I looked into this, and my conclusion ended up being "Well, I'm using
Python because I want its power and flexibility. If I want a secure
scripting environment, I should use something like Lua or Qt Script for
Applications instead."

It would be good for Python if it would offer a secure mode. Some
time ago I asked my hosting provider whether I could use mod_python
with apache to run Python scripts in the same way as PHP scripts.
He declined, pointing to Python security issues and to PHP's safe
mode. Python IS powerful, but there are many areas where it is of
vital interest who is allowed to use its power and what can be done
with it. I think it would be a pity to exclude Python from these
areas where a lot of programming/computing is done.

Python is a very well designed language but progress is made by
criticism not by satisfaction ;)
 
Doug Holton

Rolf said:
Hi,

I would like to embed a python interpreter within a program, but since that
program would be able to automatically download scripts from the internet,
I'd like to run those in a restricted environment, which basically means
that I want to allow only a specific set of modules to be used by the
scripts, so that it wouldn't be possible for them to remove files from the
hard drive, kill processes or do other nasty stuff.
Is there any way to do that with the standard python interpreter?

Hi, there is a page on this topic here:
http://www.python.org/moin/SandboxedPython

The short answer is that it is not possible to do this with CPython,
but you can run sandboxed code on other virtual machines, such as Java's
JVM with Jython, or .NET/Mono's CLR with Boo or IronPython.

In the future it may also be possible to do this with PyPy or Parrot.
 
Craig Ringer

Peter Maas said:
It would certainly be difficult to track all harmful code constructs.
But AFAIK the idea of a sandbox is not to look at the offending code
but to protect the offended objects: files, databases, URLs, sockets
etc. and to raise a security exception when some code tries to offend
them.

That's a good point. I'm not sure it's really all that different in the
end though, because in order to control access to those resources you
have to restrict what the program can do.

It'd probably be valid to implement a restricted mode at CPython level
(in my still-quite-new-to-the-Python/C-API view) by checking at the
"exit points" for important resources such as files, etc. I guess that's
getting into talk of something like the Java sandbox, though - something
Java proved is far from trivial to implement. Of course, CPython is just
a /tad/ smaller than Java ;-) .

Personally, I'd be worried about the amount of time it'd take and the
difficulty of getting it right. One wouldn't want to impart a false
sense of security.

My original point, though, was that I don't think you can use the
standard interpreter to create a restricted environment that will be
both useful and even vaguely secure. I'd be absolutely delighted if
someone could prove me wrong.

Python is a very well designed language but progress is made by
criticism not by satisfaction ;)

Heh, I'm hardly complacent... I run into quite enough problems,
especially with embedding and with the C API. Maybe one day I'll have
the knowledge - and the time - to have a chance at tackling them.

I'd love a restricted mode - it'd be great. I'm just not very optimistic
about its practicality.
 
Michael Sparks

Rolf said:
I would like to embed a python interpreter within a program, but since
that program would be able to automatically download scripts from the
internet, I'd like to run those in a restricted environment, which
basically means that I want to allow only a specific set of modules to be
used by the scripts, so that it wouldn't be possible for them to remove
files from the hard drive, kill processes or do other nasty stuff.
Is there any way to do that with the standard python interpreter?

Current advice seems to be essentially "no".

I've been pondering adding limited scripting to some personal apps I've
written, and because of this toyed around with the idea of a simple parser
that only uses ":" and whitespace to indicate blocks, with the aim of it
being a generic/"universal"(*) language parser that could be used for many
little "languages". (ie no keywords, just "pure" structure)

(*) By "universal" I mean something that allows a variety of different
styles of syntax to be used, whilst technically still sharing the
same underlying syntax. (Since that's a rather bogus statement,
that's why it has quotes :)

In the end I sat down and wrote such a beast largely as a fun exercise. (It
uses PLY and is an SLR grammar.) It *doesn't* have any backend so you get to
decide how restricted it can be, but, for example, the following code
would parse happily:
(It's not quite python, but it's close syntactically)

class Grammar(object):
    from Lexer import Tokens as tokens
    precedence = ( ( "left", "DOT"))
    def p_error(self,p):
        print "Syntax error at", p
    end
end

This parses as follows:

A class function is provided with 3 arguments:
* Grammar(object)
* A code block
* A lexical token "end" (Which could be anything)

The code block then contains 3 statements
* The first is a function call, to a function called "from"
* The second is an assignment statement
* The third is a function call to the function "def" (which in turn takes
  3 arguments - a signature, a codeblock and a trailing token (the
  trailing token allows "else" clauses and try/except style blocks))

etc

However it will also parse happily:

EXPORT PROC compare(field::pTR TO person,type=>NIL) OF person:
    DEF result=FALSE
    IF type:
        SELECT type:
            CASE NAME:
                result:=compare_name(self.name,field)
            CASE PHONE:
                result:=compare_telephone(self.telephone,field)
            CASE ADDRESS:
                result:=compare_address(self.address,field)
            ENDCASES
        ENDSELECT
    ELSE:
        result:=compare_name(self.name,field,ORDER) # if type = NIL, ordering
    ENDIF
ENDPROC result

And also programs of the form:

shape square:
    pen down
    repeat 4:
        forward 10
        rotate 90
    end
    pen up
end

repeat (360/5):
    square()
    rotate 5
end

and so on.

If you're prepared to write backends to traverse an AST then you might find
it useful. (I also wrote the parser as an exercise in trying to generate a
parser in a test first manner)
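To give a feel for "structure only" parsing plus a backend, here is a toy Python 3 sketch. It is not the PLY grammar described in this post. It assumes a much simpler rule (a line ending in ":" opens a block and a bare "end" line closes it), and parse_blocks and count_statements are made-up names.

```python
def parse_blocks(text):
    """Parse lines into a tree of strings and (header, children) tuples.

    Simplified rule for this sketch: a line ending in ':' opens a block,
    a line that is exactly 'end' closes it, anything else is a statement.
    """
    root = []
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "end":
            if len(stack) == 1:
                raise SyntaxError("unmatched 'end'")
            stack.pop()
        elif line.endswith(":"):
            children = []
            stack[-1].append((line[:-1], children))
            stack.append(children)
        else:
            stack[-1].append(line)
    if len(stack) != 1:
        raise SyntaxError("missing 'end'")
    return root

def count_statements(tree):
    """Toy backend: walk the tree and count the plain statements."""
    total = 0
    for node in tree:
        if isinstance(node, tuple):
            total += count_statements(node[1])
        else:
            total += 1
    return total

program = """
shape square:
    pen down
    repeat 4:
        forward 10
        rotate 90
    end
    pen up
end
"""
tree = parse_blocks(program)
print(tree)
print(count_statements(tree))  # 4
```

Since the parser attaches no meaning to words such as "shape" or "repeat", whatever a script can do is decided entirely by the backend that walks the tree.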

If you're curious as to the sorts of languages it could parse the test cases
are here:
* http://thwackety.com/viewcvs/viewcvs.cgi/Scratch/SWP/progs/

Some rather random examples are:
29, A copy of the parser file at that point in time, but rewritten in a
python-esque language parsable by the parser
33, A simple program in a logo type language
34, A simple program based on declarative l-systems for modelling
biological growth systems.
35, A simple SML-like language file implementing a stack
37, An implementation of a "Person" object module in an Amiga-E like
language.

(NB, here "language" means whatever AST a given backend might understand,
since they're all technically the same language)

http://thwackety.com/viewcvs/viewcvs.cgi/Scratch/SWP/README?rev=1.1

Describes the grammar, etc. (31 explicit rules, or alternatively 13
aggregate rules)

If you think it might be useful to you, feel free to do an anonymous
checkout:

cvs -d :pserver:[email protected]:2401/home/cvs/cvsroot login
cvs -d :pserver:[email protected]:2401/home/cvs/cvsroot co Scratch/SWP/

Since there is *no* backend at all at present this would be a bit of work.
(I've been tempted to investigate putting a lisp backend on the back, but
not found the time to do so. If I did though this would be a brackets free
lisp :) You can find PLY here: http://systems.cs.uchicago.edu/ply/ .

Best Regards,


Michael.
 
Dieter Maurer

Doug Holton said:
...
Hi, there is a page on this topic here:
http://www.python.org/moin/SandboxedPython

The short answer is that it is not possible to do this with the
CPython, but you can run sandboxed code on other virtual machines,
such as Java's JVM with Jython, or .NET/Mono's CLR with Boo or
IronPython.

Zope contains a "RestrictedPython" implementation.

It uses a specialized compiler that prevents dangerous bytecode operations
from being generated and enforces a restricted builtin environment.
 
Paul Rubin

Dieter Maurer said:
It uses a specialized compiler that prevents dangerous bytecode operations
from being generated and enforces a restricted builtin environment.

Does it stop the user from generating his own bytecode strings and
demarshalling them?
 
Dieter Maurer

Paul Rubin said:
Does it stop the user from generating his own bytecode strings and
demarshalling them?

I almost surely do not understand you:

In the standard setup, the code has no access to most
of Python's runtime library. Only a few selected modules
are deemed to be safe and can be imported (and used) in
"RestrictedPython". "marshal" or "unmarshal" are not considered safe.
Security declarations can be used to make more modules importable -- but
then, this is an explicit decision by the application developer.

*If* the framework decided to exchange byte code between
user and interpreter, then there would be no security at
all, because the interpreter is the standard interpreter
and security is built into the compilation process.
Of course, you should not step in *after* the secured step ;-)

Thus, "RestrictedPython" expects that the user sends
Python source code (and not byte code!), it compiles
this source code into byte code that enforces a strict
access and facility policy.
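The general idea of enforcing policy at compile time can be sketched with the standard ast module in Python 3. This is not Zope's RestrictedPython code or its API. Restrictor, compile_checked and the two rules shown (no imports, no underscore-prefixed names or attributes) are assumptions chosen for the example.

```python
import ast

class Restrictor(ast.NodeVisitor):
    """Reject constructs that this sketch's policy does not allow."""

    def visit_Import(self, node):
        raise SyntaxError("import statements are not allowed")

    # "from x import y" is handled the same way.
    visit_ImportFrom = visit_Import

    def visit_Attribute(self, node):
        if node.attr.startswith("_"):
            raise SyntaxError("access to %r is not allowed" % node.attr)
        self.generic_visit(node)

    def visit_Name(self, node):
        if node.id.startswith("_"):
            raise SyntaxError("the name %r is not allowed" % node.id)

def compile_checked(source, filename="<script>"):
    """Parse source, apply the policy, then compile the tree to a code object."""
    tree = ast.parse(source, filename)
    Restrictor().visit(tree)
    return compile(tree, filename, "exec")

ns = {}
exec(compile_checked("doubled = [n * 2 for n in range(3)]"), ns)
print(ns["doubled"])  # [0, 2, 4]

for bad in ("import os", "().__class__"):
    try:
        compile_checked(bad)
    except SyntaxError as exc:
        print("rejected:", exc)
```

As the post says, this only holds if the script arrives as source text and passes through the checking step. Accepting ready-made bytecode would skip the check entirely.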


Dieter
 
