Embedding Python in Python

Phil Frost · Aug 18, 2004

You probably want something like this:

globalDict = {}
exec(stringOfPythonCodeFromUser, globalDict)

globalDict is now the global namespace of whatever was in
stringOfPythonCodeFromUser, so you can grab values from that and
selectively import them into your namespace.
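
A minimal sketch of that idea, written for Python 3 (where exec is a function) rather than the Python 2 of this thread. The names user_source, greeting and double are made up for illustration, and nothing here restricts what the user's code can do:

```python
# user_source stands in for stringOfPythonCodeFromUser (hypothetical content).
user_source = """
greeting = "hello"

def double(x):
    return 2 * x
"""

global_dict = {}
exec(user_source, global_dict)

# exec also inserts a '__builtins__' entry into global_dict, so pick out
# the names you want explicitly instead of copying everything across.
greeting = global_dict["greeting"]
double = global_dict["double"]
print(greeting, double(21))
```

Picking names one at a time keeps the user's other globals out of your own namespace.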

Paul Rubin · Aug 18, 2004

Robey Holderith said:
Anyone know a good way to embed python within python?
No.

I'd like to allow user-defined scriptable objects. I'd
like to give them access to modify pieces of my classes.
I'd like to disallow access to pretty much the rest of
the modules.

There was a feature called rexec/Bastion for that purpose in older
versions of Python, but it was removed because it was insecure.

Any ideas/examples?

Run your sensitive stuff in a separate process (or separate computer)
and allow the hostile clients to communicate through sockets.
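
A rough sketch of the separate-process idea in Python 3, assuming a pipe to a child interpreter instead of the sockets mentioned above. The helper name run_in_child is hypothetical. A process boundary on its own is not a sandbox: the child still runs with the parent's OS permissions unless the operating system restricts it.

```python
import subprocess
import sys

def run_in_child(source, timeout_seconds=5):
    """Run source in a separate Python process; return (exit code, stdout)."""
    completed = subprocess.run(
        # -I starts the child in isolated mode (no user site, no PYTHON* env vars).
        [sys.executable, "-I", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # raises subprocess.TimeoutExpired on overrun
    )
    return completed.returncode, completed.stdout

code, out = run_in_child("print(2 + 2)")
print(code, out.strip())
```

The host only ever sees text coming back over the pipe, so a crash or hang in the user's code cannot corrupt the host's own objects.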

JCM · Aug 18, 2004

Paul Rubin said:
There was a feature called rexec/Bastion for that purpose in older
versions of Python, but it was removed because it was insecure.

Run your sensitive stuff in a separate process (or separate computer)
and allow the hostile clients to communicate through sockets.

If you're concerned about security, another possibility is to parse
the user's code and look for anything potentially dangerous. You'll
need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.
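
For illustration, here is roughly what such a checker could look like in Python 3, using the standard ast module (which postdates this thread). The banned list and the function name find_violations are assumptions. Treat it as a sketch of the rules listed above, not as evidence that they are sufficient.

```python
import ast

BANNED_NAMES = {"eval", "exec"}  # assumed list; exec is a function in Python 3

def is_dunder(name):
    """True for identifiers of the __this__ form."""
    return name.startswith("__") and name.endswith("__")

def find_violations(source):
    """Return a list of rule violations found in the parsed source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            violations.append("import statement")
        elif isinstance(node, ast.Name):
            if node.id in BANNED_NAMES or is_dunder(node.id):
                violations.append("name " + node.id)
        elif isinstance(node, ast.Attribute):
            if node.attr in BANNED_NAMES or is_dunder(node.attr):
                violations.append("attribute " + node.attr)
    return violations

print(find_violations("import os"))
print(find_violations("x = 1 + 2"))
```

The checker only sees names that appear literally in the source. Names built from strings at run time are invisible to it.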

Robey Holderith · Aug 18, 2004

Anyone know a good way to embed python within python?

Now before you tell me that's silly, let me explain
what I'd like to do.

I'd like to allow user-defined scriptable objects. I'd
like to give them access to modify pieces of my classes.
I'd like to disallow access to pretty much the rest of
the modules.

Any ideas/examples?

-Robey

Phil Frost · Aug 18, 2004

No. An easy way to escape that is to start one's code with
'del __builtins__', then python will add the default __builtins__ back
to the namespace. Restricting what arbitrary code can do has been
discussed many, many times, and it seems there is no way to do it short
of reimplementing a python interpreter.

Paul Rubin · Aug 18, 2004

Robey Holderith said:
Would this be secure?
No.

Paul, what's your take on this?

Don't count on it.

Paul Rubin · Aug 18, 2004

JCM said:
If you're concerned about security, another possibility is to parse
the user's code and look for anything potentially dangerous. You'll
need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.

JCM · Aug 18, 2004

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

Robey Holderith · Aug 18, 2004

You probably want something like this:

globalDict = {}
exec(stringOfPythonCodeFromUser, globalDict)

globalDict is now the global namespace of whatever was in
stringOfPythonCodeFromUser, so you can grab values from that and
selectively import them into your namespace.

So using this (with a little additional reading) it looks like I
can do this:

globalDict = {'__builtins__': <my modules here>}
exec(<pythonCodeFromUser>, globalDict)

And that this will disallow both importing of new modules and direct
access to my namespace. It will however allow access to the

Would this be secure?

Paul, what's your take on this?

-Robey
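
A Python 3 sketch of the restricted-__builtins__ idea, assuming a hand-picked whitelist dict (allowed_builtins is a made-up name) in place of a module. It shows that imports fail, and also that object introspection still works with no builtins at all, which is one reason the answer to "would this be secure?" is no.

```python
# Hypothetical whitelist of builtins exposed to the user's code.
allowed_builtins = {"len": len, "range": range}
global_dict = {"__builtins__": allowed_builtins}

# Whitelisted names work as usual.
exec("count = len(range(5))", global_dict)
print(global_dict["count"])

# The import statement needs __import__, which the whitelist omits.
try:
    exec("import os", global_dict)
    import_blocked = False
except ImportError:
    import_blocked = True
print(import_blocked)

# Attribute access on a literal needs no builtins, so the user's code can
# still walk from a tuple to object and on to every loaded class.
exec("subclasses = ().__class__.__base__.__subclasses__()", global_dict)
print(len(global_dict["subclasses"]) > 0)
```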

Jack Diederich · Aug 18, 2004

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

This is a job for the operating system and not python.
Google groups for rexec and Bastion if you want to read ten lengthy
discussions of why this is the OS's job.

-Jack

JCM · Aug 18, 2004

Jack Diederich said:
I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

Also would be rejected by my original set of rules (can't use
__dict__). But I'd disallow vars too.

JCM · Aug 18, 2004

I'm going to have to agree with Paul on this one. I do not feel up to
the task of thinking of every possible variant of malicious code. There
are far too many ways of writing the exact same thing. I think it would
be much easier to write my own interpreter.

Well it certainly isn't easier to write your own interpreter if you're
talking about the effort you'd need to put into it. And I'm not
convinced it's that tricky to come up with a set of syntax rules to
decide whether a piece of code is simple/safe enough to run. It
basically comes down to disallowing certain statements and certain
identifiers. Of course you'll end up rejecting a lot of code that
isn't malicious.

If you're interested enough, I'll try to throw a safety-checker
together. You'd have to be pretty interested though (I'm lazy).

Robey Holderith · Aug 18, 2004

By the time you're done with all that, you may as well design a new
restricted language and interpret just that.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

Even Java keeps getting new holes found, and Python is not anywhere
near Java when it comes to this kind of thing.

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

I'm going to have to agree with Paul on this one. I do not feel up to
the task of thinking of every possible variant of malicious code. There
are far too many ways of writing the exact same thing. I think it would
be much easier to write my own interpreter.

-Robey

Jack Diederich · Aug 18, 2004

Jack Diederich said:

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

Also would be rejected by my original set of rules (can't use
__dict__). But I'd disallow vars too.

Google groups for this topic, it's been dead horse kicked.
You would have to eliminate getattr too and any C func that can
result in an infinite loop.

Not-python's-job-ly,

-Jack

JCM · Aug 18, 2004

Jack Diederich said:

On Wed, Aug 18, 2004 at 07:44:47PM +0000, JCM wrote: ...
I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars,
etc.

foo = "ev" + "al"
e = vars()['__builtins__'].__dict__[foo]
print e('2+2')

Also would be rejected by my original set of rules (can't use
__dict__). But I'd disallow vars too.

Google groups for this topic, it's been dead horse kicked.
You would have to eliminate getattr too and any C func that can
result in an infinite loop.

Infinite loops (and other resource use) are a different story, not
addressed by source code inspection. I worked on a project which
needed to run untrusted code, and we dealt with the infinite-loop
situation by always running untrusted code on the main thread and
signalling it if it took too long to execute (this worked on unix--I
don't know what you'd do on Windows). I realize this could leave data
in a bad state. Infinite loops are harder to deal with.
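
A sketch of that main-thread-plus-signal approach in Python 3, assuming a Unix system with SIGALRM. The names run_with_time_limit and TimeLimitExceeded are invented here. Python runs signal handlers between bytecode instructions, so this interrupts a pure-Python loop but not a long-running call inside C code.

```python
import signal

class TimeLimitExceeded(Exception):
    """Raised inside the running code when the time limit expires."""

def _on_alarm(signum, frame):
    raise TimeLimitExceeded()

def run_with_time_limit(func, seconds):
    """Call func() on the main thread, interrupting it after `seconds`."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func()
    finally:
        # Cancel the timer and restore whatever handler was installed before.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)

def spin_forever():
    while True:
        pass

try:
    run_with_time_limit(spin_forever, 0.2)
    timed_out = False
except TimeLimitExceeded:
    timed_out = True
print(timed_out)
```

As noted above, interrupting code at an arbitrary point can leave shared data half-updated, so the host should not trust any state the interrupted code was modifying.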

Robey Holderith · Aug 18, 2004

No. An easy way to escape that is to start one's code with
'del __builtins__', then python will add the default __builtins__ back
to the namespace. Restricting what arbitrary code can do has been
discussed many, many times, and it seems there is no way to do it short
of reimplementing a python interpreter.

Out of curiosity I tried the following in 2.3.4

#------Begin Code

import random

globalDict = {'__builtins__':random}
localDict = {}
execfile("test2.py", globalDict, localDict)

print globalDict
print localDict

localDict['move']()

#------- End Code

Where test2.py looked like this:

#---------Begin Code

print __builtins__

try:
    del __builtins__
    print 'del worked'
except:
    pass

try:
    exec('del __builtins__')
    print('exec del worked')
except:
    pass

try:
    import sys
    print 'Import Worked'
except:
    pass

try:
    f = file('out.tmp','w')
    f.write('asdfasdf')
    f.close()
    print 'File Access Worked'
except:
    pass

seed()

def move():
    print __builtins__

#------ End Code

I'm sure it has a crack in it somewhere, but it doesn't
seem to be del __builtins__ .

-Robey

Robey Holderith · Aug 18, 2004

I've found the crack in the armor. See additions below.

-Robey

Where test2.py looked like this:

#---------Begin Code

print __builtins__

try:
    del __builtins__
    print 'del worked'
except:
    pass

try:
    exec('del __builtins__')
    print('exec del worked')
except:
    pass

try:
    import sys
    print 'Import Worked'
except:
    pass

try:
    f = file('out.tmp','w')
    f.write('asdfasdf')
    f.close()
    print 'File Access Worked'
except:
    pass

seed()

def move():

    #Add the following for a nice security hole
    global __builtins__
    del __builtins__

Paul Rubin · Aug 18, 2004

JCM said:
need to be aggressive, but I believe it's possible. For example,
disallow exec statements, the identifier "eval", any identifier of
__this__ form, import statements, etc. This is overly restrictive,
but it will provide security.

Hint:
e = vars()['__builtins__'].eval
print e('2+2')

I don't think it's as difficult as you think. Your snippet of code
would be rejected by the rules I suggested. You'd also want to
prohibit other builtins like compile, execfile, input, reload, vars, etc.

I don't see how. Your rules were to disallow:

1) exec statements. My example doesn't use it.

2) eval identifier. My example uses eval as an attribute and not an
identifier. You can eliminate the use of eval as an attribute with
e = getattr(vars()['__builtins__'], 'ev'+'al').
Now not even the string 'eval' appears in one piece.
3) identifiers like __this__. My example doesn't use any. It
uses a constant string of that form, not an identifier. The
string could be computed instead, like the eval example above.
4) import statements. My example doesn't use them.

Conclusion, my example gets past your suggested rules. I also didn't
use compile, execfile, input, or reload. I did use vars but there are
probably other ways to do the same thing. You can't take something
full of holes and start plugging holes until you think you found them
all. You have to start with something that has no holes. The Python
crowd has been through this many times already; do some searches for
rexec/Bastion security.
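
To make the point concrete, here is a Python 3 sketch. The checker passes_rules is a hypothetical implementation of the four rules, and is slightly stricter than them because it also rejects banned attribute names. The sample user code assembles both names from string pieces, is accepted by the checker, and still reaches eval.

```python
import ast

# Hypothetical checker: no import statements, and no identifier or
# attribute named eval/exec or written in the __this__ form.
def passes_rules(source):
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if isinstance(node, ast.Name):
            ident = node.id
        elif isinstance(node, ast.Attribute):
            ident = node.attr
        else:
            continue
        if ident in ("eval", "exec"):
            return False
        if ident.startswith("__") and ident.endswith("__"):
            return False
    return True

user_code = '''
b = vars()["__built" + "ins__"]   # key assembled at run time
name = "ev" + "al"
# __builtins__ can be a dict or a module depending on context
e = b[name] if isinstance(b, dict) else getattr(b, name)
result = e("2+2")
'''

accepted = passes_rules(user_code)
namespace = {}
exec(user_code, namespace)
print(accepted, namespace["result"])
```

Banning vars, getattr and isinstance as well would stop this particular snippet, but each additional ban invites the next workaround, which is the argument being made above.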

Michael J. Fromberger · Aug 18, 2004

Robey Holderith said:
Anyone know a good way to embed python within python?

-M

Robey Holderith · Aug 18, 2004

Well it certainly isn't easier to write your own interpreter if you're
talking about the effort you'd need to put into it. And I'm not
convinced it's that tricky to come up with a set of syntax rules to
decide whether a piece of code is simple/safe enough to run. It
basically comes down to disallowing certain statements and certain
identifiers. Of course you'll end up rejecting a lot of code that
isn't malicious.

If you're interested enough, I'll try to throw a safety-checker
together. You'd have to be pretty interested though (I'm lazy).

Don't do it on my behalf. I started far too many projects doing something
similar before I realized that the only effective way to do security was
from the bottom up. The problem looks something like this (assuming each
function has 10 places where it is implemented):

Level | Malicious Variation Count
-----------------------------------------
0 | 10^0
1 | 10^1
2 | 10^2
x | 10^x

Suffice to say that in simple code... it is doable. In a
mature interpreter... near impossible.

-Robey
