Embedding Python in Python


Robey Holderith said:
Anyone know a good way to embed python within python?

Now before you tell me that's silly, let me explain
what I'd like to do.

I'd like to allow user-defined scriptable objects. I'd
like to give them access to modify pieces of my classes.
I'd like to disallow access to pretty much the rest of
the modules.

Any ideas/examples?

Use the rexec module, or see how Zope does it.


Klaus Schilling
 

Benjamin Niemann

Well it seems that this is impossible to do with the current Python. But
it is a feature that would be important for certain applications.
Actually I've been searching for this, too - and only found
abandoned/deprecated modules.

If you want to use the current Python interpreter to execute the code,
you'd have to remove many language features, because they could provide
a backdoor for malicious code. This could be done by defining a grammar
for a subset of Python (perhaps with some semantic checks), and verify
that the code satisfies the grammar before you feed it into eval(). This
could either be easy (resulting in a small subset of Python that is
probably too small for real use...), or difficult (resulting in a usable
subset, but with a large amount of complex grammar rules - with at least
one rule that introduces a security leak...).
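
A minimal sketch of the "verify against a subset, then eval()" idea described above, assuming Python 3.8 or later and the standard `ast` module. The names `ALLOWED_NODES`, `check_subset`, and `safe_eval` are hypothetical, and the whitelist (integer arithmetic only) is chosen purely for illustration. It is not a vetted security policy.

```python
import ast

# Hypothetical whitelist: integer arithmetic only. Anything not listed
# here (names, calls, attribute access, strings, ...) is rejected.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.FloorDiv, ast.Mod,
    ast.UAdd, ast.USub,
)

def check_subset(source):
    """Parse source as an expression and reject nodes outside the whitelist."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
        # bool is a subclass of int, so compare the exact type.
        if isinstance(node, ast.Constant) and type(node.value) is not int:
            raise ValueError("only integer literals are allowed")
    return tree

def safe_eval(source):
    """Evaluate source only after it has passed the subset check."""
    code = compile(check_subset(source), "<user>", "eval")
    return eval(code, {"__builtins__": {}}, {})

print(safe_eval("2 + 3 * 4"))
```

This sits at the "easy but tiny subset" end of the trade-off: every feature added to the whitelist has to be argued safe on its own.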

A good solution has to be implemented in the Python interpreter. Are
there any plans for future versions of Python? I've seen the phrase
"security initiative" on this list. Was that a "there is a ..." or
"there should be a ..."? I couldn't find anything on the web (but didn't
search very deep).

My first idea:

- extend the C API (an alternative to Py_Initialize?) for embedding Python
to provide a 'stripped down' interpreter: no builtins with side effects
(like open()...), ...
I don't know anything about Python's internals or embedding Python, so I
can't say whether this is easy or possible at all.

- communication of the embedded script to the outside world (file or
network I/O...) must be provided by the hosting application that is
responsible for enforcing the desired security limitations.

- wrap it into a Python module. Then you can start the isolated embedded
Python from 'real' Python code.

The interesting (and most difficult) question is which parts of Python's
standard library rely on "dangerous" features. This could drastically
reduce the usability of this approach (until you build your own 'secure'
library).
Using this model, the secure interpreter runs in the same process
context as the insecure host. A bug in Python could result in unchecked
access to resources of the host. For higher security, a separate process
should be started.
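
A rough sketch of the separate-process variant, assuming Python 3.7 or later and the standard `subprocess` module. The helper name `run_isolated` is hypothetical. Note that this only separates address spaces: the child still runs with the same OS privileges as the host, so any real confinement would have to come from the operating system.

```python
import subprocess
import sys

def run_isolated(source, timeout=5.0):
    """Run source in a fresh interpreter; return (exit_code, stdout)."""
    proc = subprocess.run(
        # -I is CPython's isolated mode: it ignores PYTHON* environment
        # variables and the user site directory.
        [sys.executable, "-I", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired on overrun
    )
    return proc.returncode, proc.stdout

code, out = run_isolated("print(2 + 2)")
print(code, out.strip())
```

A crash or interpreter bug in the child then cannot corrupt the host's memory, and the host decides what, if anything, to pass in or read back.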
 

JCM

Paul Rubin said:
Hint:
e = vars()['__builtins__'].eval
print e('2+2')

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars, etc.
I don't see how. Your rules were to disallow:
1) exec statements. My example doesn't use it.
2) eval identifier. My example uses eval as an attribute and not an
identifier. You can eliminate the use of eval as an attribute with
e = getattr(vars()['__builtins__'], 'ev'+'al').
Now not even the string 'eval' appears in one piece.

You've used eval as an identifier (at least by the terminology to
which I'm accustomed), just not as a variable.
3) identifiers like __this__. My example doesn't use any. It
uses a constant string of that form, not an identifier. The
string could be computed instead, like the eval example above.
4) import statements. My example doesn't use them.
Conclusion: my example gets past your suggested rules. I also
didn't use compile, execfile, input, or reload. I did use vars but
there are probably other ways to do the same thing. You can't take
something full of holes and start plugging holes until you think you
found them all. You have to start with something that has no holes.
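
The point about plugging holes can be illustrated with a small sketch, assuming Python 3 and the standard `tokenize` module. The filter `naive_filter` and the set `FORBIDDEN_NAMES` are hypothetical stand-ins for the four rules listed above (no exec, eval, or import as identifiers, and no `__this__`-style identifiers). The snippet is only checked here, not executed.

```python
import io
import tokenize

# Hypothetical rule set modelled on the four rules discussed above.
FORBIDDEN_NAMES = {"exec", "eval", "import"}

def naive_filter(source):
    """Return True if source contains no forbidden identifier token."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.NAME:
            continue  # string literals are never inspected
        if tok.string in FORBIDDEN_NAMES:
            return False
        if tok.string.startswith("__") and tok.string.endswith("__"):
            return False
    return True

# The forbidden name is assembled from string pieces at run time.
snippet = "e = getattr(vars()['__builtins__'], 'ev' + 'al')"

print(naive_filter("e = eval"))  # direct use is caught
print(naive_filter(snippet))     # the computed name passes the filter
```

Whether `__builtins__` is bound to a module or a dict depends on the Python version and the context the code runs in, but the filter accepts the snippet either way.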

It's fine to look at it that way. Start with a subset of Python that
you know to be safe, for example only integer literal expressions.
Keep adding more safe features until you're satisfied with the
expressiveness of your subset.
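
As one concrete starting point of this kind, current Python versions ship `ast.literal_eval`, which accepts only literal syntax (numbers, strings, tuples, lists, dicts, and similar). The wrapper name `read_literal` below is hypothetical; this is a sketch of the "start from literals" idea, not a complete sandbox.

```python
import ast

def read_literal(text):
    """Return the value of a Python literal; raise ValueError otherwise."""
    try:
        return ast.literal_eval(text)
    except SyntaxError as exc:
        # Normalise parse errors so callers handle one exception type.
        raise ValueError("not a literal: %r" % text) from exc

print(read_literal("42"))
print(read_literal("[1, 'a', (2, 3)]"))
```

Names, calls, and attribute access such as `vars()` are rejected with ValueError, so nothing in the input can reach the builtins.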
The Python crowd has been through this many times already; do some
searches for rexec/Bastion security.

I did do a [quick] search, and saw a lot of articles about how rexec
and Bastion were insecure; but I didn't find any arguments about how
it's (too) difficult to come up with a safe subset of Python, for some
definition of "safe".
 

Paul Rubin

JCM said:
It's fine to look at it that way. Start with a subset of Python that
you know to be safe, for example only integer literal expressions.
Keep adding more safe features until you're satisfied with the
expressiveness of your subset.

Well ok, but then you haven't got Python, you've got some subset, with
a completely different implementation than the Python that it's
embedded in.
 
