Secure Python

Fredrik Tolf

Hi List!

I was thinking about secure Python code execution, and I'd really
appreciate some comments from those who know Python better than I do.

I was thinking that maybe it could be possible to load and run untrusted
Python code, simply by loading it in a module with a modified version of
__builtins__. Without any reachable functions that do unsafe operations,
code running from there shouldn't be able to do evil things.

Or? What would happen to `import'? Would it be possible to set a null
import path for a specific module? Are there any other ways to reach
modules/functions that would make this impossible (I don't seem to be
able to remember, but aren't there cross-references somewhere to the
defining modules of data passed to the code in the secure module)?
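
To make the question concrete, here is a minimal sketch (Python 3 syntax, which postdates this thread) of what an emptied __builtins__ does and does not block. The names is_blocked, restricted and reachable_classes are illustrative only. It suggests that import and open become unreachable, but that ordinary object introspection still reaches classes defined outside the restricted module, which is the kind of cross-reference asked about above.

```python
# Sketch only: an emptied __builtins__ is NOT a security boundary.
restricted = {"__builtins__": {}}

def is_blocked(source):
    """Return True if the source fails for lack of a builtin."""
    try:
        exec(source, {"__builtins__": {}})
    except (NameError, ImportError):
        return True
    return False

import_blocked = is_blocked("import os")        # no __import__ to call
open_blocked = is_blocked("open('somefile')")   # name 'open' is undefined

# The restricted code can still walk from a tuple to object and from
# there to every class currently loaded in the interpreter.
exec("found = ().__class__.__base__.__subclasses__()", restricted)
reachable_classes = restricted["found"]
print(import_blocked, open_blocked, len(reachable_classes) > 0)
```

So removing names from __builtins__ hides functions, but it does not hide the objects that other objects point to.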

If this doesn't work, might there be some other way to run untrusted
code that I haven't thought of (apart from using O/S-specific stuff like
SECCOMP, of course)?

Thank you very much for your time!

Fredrik Tolf
 
Steven D'Aprano

Hi List!

I was thinking about secure Python code execution, and I'd really
appreciate some comments from those who know Python better than I do.

I was thinking that maybe it could be possible to load and run untrusted
Python code, simply by loading it in a module with a modified version of
__builtins__. Without any reachable functions that do unsafe operations,
code running from there shouldn't be able to do evil things.

How would you prevent a Denial Of Service attack like this?

# don't try this at home kids! leave this to the professionals!
n = 10000**4
L = []
for i in range(n):
    L.append(str(2L**n))

Here's an interesting one. Bug or deliberate attack?


def evens():
    # iterator returning even numbers
    i = 0
    while True:
        yield i
        i += 2

# now get all the even numbers up to 15
L = [n for n in evens() if n < 15]
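
For comparison, a sketch of how the second example could be written so that it terminates (Python 3 syntax). The list comprehension above keeps pulling values from the infinite generator forever, because its if clause only filters and never stops the iteration; itertools.takewhile stops at the first value that fails the test.

```python
from itertools import takewhile

def evens():
    # iterator returning even numbers, never ends on its own
    i = 0
    while True:
        yield i
        i += 2

# takewhile stops consuming the generator once n < 15 is false
L = list(takewhile(lambda n: n < 15, evens()))
print(L)  # [0, 2, 4, 6, 8, 10, 12, 14]
```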
 
timmy

Steven said:
Hi List!

I was thinking about secure Python code execution, and I'd really
appreciate some comments from those who know Python better than I do.

I was thinking that maybe it could be possible to load and run untrusted
Python code, simply by loading it in a module with a modified version of
__builtins__. Without any reachable functions that do unsafe operations,
code running from there shouldn't be able to do evil things.


How would you prevent a Denial Of Service attack like this?

# don't try this at home kids! leave this to the professionals!
n = 10000**4
L = []
for i in range(n):
    L.append(str(2L**n))

Here's an interesting one. Bug or deliberate attack?


def evens():
    # iterator returning even numbers
    i = 0
    while True:
        yield i
        i += 2

# now get all the even numbers up to 15
L = [n for n in evens() if n < 15]

congratulations, you have discovered loops and their misuse
 
Fredrik Lundh

timmy said:
congratulations, you have discovered loops and their misuse

if you don't know what the phrase "denial of service attack" means, you
can always google for it.

</F>
 
Steven D'Aprano

congratulations, you have discovered loops and their misuse

Did you have a point in your utterly inane comment, or did you just want
to see your name on Usenet?

In any case, it isn't just "loops" that are dangerous.

print 2**512**512

No loop there, but it will operate as a lovely DoS attack if you run it.
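
A quick sketch of why this one-liner is so expensive (Python 3 syntax). The ** operator is right-associative, so the expression means 2**(512**512), and the result would need 512**512 + 1 bits. The code below only inspects the exponent and never computes the full power.

```python
# ** groups right to left: 2**512**512 == 2**(512**512)
exponent = 512 ** 512

# 512 is 2**9, so the exponent itself is 2**(9 * 512) == 2**4608
assert exponent == 2 ** 4608

# Number of bits needed just to write the exponent down
exponent_bits = exponent.bit_length()
print(exponent_bits)  # 4609

# The full result 2**exponent would need exponent + 1 bits, far more
# than any machine's memory, so it is deliberately not evaluated here.
```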

The Original Poster is suggesting running UNTRUSTED code. That means you
have to assume that it will be actively hostile, but even if it isn't
deliberately hostile, there will be bugs which the developer can't control.

He wants to run this untrusted (hostile or buggy or both) code in an
environment where it can't do bad things. "Bad things" include Denial of
Service attacks. So, Timmy, let's hear your brilliant scheme for
preventing DoS attacks when running hostile code in Python.
 
Stephan Kuhagen

Fredrik said:
If this doesn't work, might there be some other way to run untrusted
code that I haven't thought of (apart from using O/S-specific stuff like
SECCOMP, of course)?

There was a module called rexec which tried to give you a restricted
environment for executing code. But it seems that it is not maintained
anymore, because there were too many problems with it. It seems that it is
very complicated to get a restricted execution environment without losing
too much of Python's functionality.

One question is what you want to achieve. As another posting in this thread
mentioned, you can't get around denial of service attacks, even in
restricted or trusted environments. So I assume that what you want is
something like a sandbox, where specific file actions (deleting files,
access to specific parts of the FS at all) and some other things can be
restricted or forbidden. I think this should be possible, even for some
DoS attacks (e.g. restricting the amount of memory that can be used by the
script, or the max stack size, recursion depth limits etc.), but it is a
hard job to find all places where code can break out of your sandbox. For
a full load of bad examples, simply have a look at JavaScript...

For an IMHO really good implementation of the sandbox idea, have a look at
the "safe interp" in Tcl. A short description (by no means complete) of
the safe interp: you run a second and completely independent interpreter
with all possibly dangerous commands completely removed and a
one-way channel to inject commands and scripts into its evaluation loop
from the trusted interpreter. Depending on how much faith you have in the
untrusted script, you can selectively allow additional commands in the safe
interp or map common commands to other restricted or monitored versions of
them, which you implement yourself inside your trusted environment. I do
not know how complex it would be to do this in Python (Tcl may look
a little old-fashioned to some people, but it has some unique features that
help especially with this kind of problem, such as having no keywords,
which makes it possible to change the semantics of even the most basic
constructs in the language from the scripting level), but I think it would
be a really useful feature for Python to have a sandbox mechanism to run
untrusted code.
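
As a rough Python analogue of the "allow selected commands, map others to monitored versions" idea, the trusted side can hand the untrusted code a small table of permitted names. This is a sketch and not a safe interp: Python has no built-in equivalent, and object introspection can escape a replaced __builtins__. The helper make_restricted_globals and the monitored print are hypothetical names invented for this illustration, and Python 3 syntax is assumed.

```python
def make_restricted_globals(log):
    """Build globals exposing only a few whitelisted builtins."""
    def monitored_print(*args):
        # Restricted stand-in for print: record output instead of writing it
        log.append(" ".join(str(a) for a in args))

    allowed = {"len": len, "range": range, "print": monitored_print}
    return {"__builtins__": allowed}

log = []
exec("print(len(range(5)))", make_restricted_globals(log))
print(log)  # ['5']

# Anything outside the table is simply not defined for the script
try:
    exec("open('somefile')", make_restricted_globals(log))
    open_allowed = True
except NameError:
    open_allowed = False
```

The trusted code decides what each exposed name really does, which is the part of the Tcl design this sketch tries to mirror.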

Regards
Stephan
 
timmy

Fredrik said:
if you don't know what the phrase "denial of service attack" means, you
can always google for it.

</F>
maybe you should google "linux kernel limit" and you can prevent any
user/process maxing out your system
 
timmy

Steven said:
Did you have a point in your utterly inane comment, or did you just want
to see your name on Usenet?

In any case, it isn't just "loops" that are dangerous.

print 2**512**512

No loop there, but it will operate as a lovely DoS attack if you run it.

The Original Poster is suggesting running UNTRUSTED code. That means you
have to assume that it will be actively hostile, but even if it isn't
deliberately hostile, there will be bugs which the developer can't control.

He wants to run this untrusted (hostile or buggy or both) code in an
environment where it can't do bad things. "Bad things" include Denial of
Service attacks. So, Timmy, let's hear your brilliant scheme for
preventing DoS attacks when running hostile code in Python.

as posted before, linux kernel limit.

then you and your users can go as crazy as you want and you won't take
out your system.

maybe you should think a little more before going on the attack like that.
 
Fredrik Lundh

timmy said:
maybe you should google "linux kernel limit" and you can prevent any
user/process maxing out your system

one would have thought that the phrase "apart from OS-specific stuff"
might have meant that the OP wasn't asking for Linux-specific solutions.

</F>
 
Stephan Kuhagen

timmy <"timothy at open-networks.net"> wrote:

This sub-thread is starting to become a flame war, isn't it? Calm down, both of
you... No need to fight when only some ideas for a technical question are
requested.

as posted before, linux kernel limit.

then you and your users can go as crazy as you want and you won't take
out your system.

The problem with Linux kernel limits is that they won't work very well
on MacOSX and Windows... OTOH the idea is the right one, but the effect can
be achieved inside of Python. Python byte-compiles the code, and
the interpreter evaluates each byte code token in one evaluation step, so the
interpreter could be extended for such use cases to count and limit the
number of evaluation steps allowed for an untrusted script or for methods in
an untrusted script, as well as to limit the recursion depth or the memory to be
allocated. All those limits are managed by the interpreter for script code
and hence can be limited for untrusted code by the interpreter. This still
does not really make DoS impossible (what about C extensions? - maybe
restricting "import"?). - As I said before in this thread, making a sandbox
really secure is a hard job, and may need some serious changes in the
Python interpreter, but AFAIK from Tcl, it is possible - and would be nice
to have.
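
The step-counting idea can be approximated without modifying the interpreter, using the standard sys.settrace hook. This is only a sketch under assumptions: it counts executed source lines rather than byte code instructions, the names run_with_step_limit and StepLimitExceeded are made up for this example, hostile code could swallow the exception with a bare except, and a single expensive C-level operation (such as a huge exponentiation) counts as one step. Python 3 syntax.

```python
import sys

class StepLimitExceeded(Exception):
    """Raised when traced code runs more lines than its budget."""

def run_with_step_limit(source, max_steps):
    steps = 0

    def tracer(frame, event, arg):
        nonlocal steps
        if event == "line":
            steps += 1
            if steps > max_steps:
                raise StepLimitExceeded(max_steps)
        return tracer  # keep tracing inside this frame

    namespace = {}
    code = compile(source, "<untrusted>", "exec")
    sys.settrace(tracer)
    try:
        exec(code, namespace)
    finally:
        sys.settrace(None)  # always remove the hook
    return namespace

runaway = "i = 0\nwhile True:\n    i += 1\n"
try:
    run_with_step_limit(runaway, max_steps=1000)
    stopped = False
except StepLimitExceeded:
    stopped = True
print(stopped)  # True
```

Tracing slows the traced code down considerably, so this is better read as a demonstration of the idea than as something to deploy.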

Regards
Stephan
 
timmy

Fredrik said:
one would have thought that the phrase "apart from OS-specific stuff"
might have meant that the OP wasn't asking for Linux-specific solutions.

</F>

sorry i didn't see that.
cpu and memory limiting aren't specific to linux though, any NT system
can also do it.
the only effective way to prevent someone with access to a compiler
from performing a local dos on your system is to use each os's resource
controls. there's no cross platform way to do this, since every system
has vastly different methods of memory and cpu time handling.
This looks like a case where he will just have to accept this as a trade
off (security is always a trade off)
 
timmy

Stephan said:
timmy <"timothy at open-networks.net"> wrote:

This sub-thread starts to become a flame-war, isn't it? Calm down, both of
you... No need to fight, when only some ideas for a technical question are
requested.

i'm not fighting, sometimes i can be a little terse, for that i apologise.
The problem with Linux kernel limits is that they won't work very well
on MacOSX and Windows... OTOH the idea is the right one, but the effect can
be achieved inside of Python. Python byte-compiles the code, and
the interpreter evaluates each byte code token in one evaluation step, so the
interpreter could be extended for such use cases to count and limit the
number of evaluation steps allowed for an untrusted script or for methods in
an untrusted script, as well as to limit the recursion depth or the memory to be
allocated.

idunno, sounds like a lot of trouble to engineer a solution for something that has
already got a solution. all win NT systems have resource management and i
imagine OS X would as well??
 
Stephan Kuhagen

timmy said:
idunno, sounds like a lot of trouble to engineer a solution for something that has
already got a solution. all win NT systems have resource management and i
imagine OS X would as well??

Sounds very likely, but does not solve the problem. With resource management
on the OS level you can indeed set some important limits for untrusted
scripts, but there are at least two drawbacks which come to my mind (and
maybe more that I'm not aware of):

1. To be platform independent, which is a strong requirement IMO, an OS-level
solution can only ever rely on the lowest common denominator of all OS
resource management facilities.

2. If you want to exec an untrusted script from inside a trusted interpreter,
giving it a sandbox, then the sandbox has the same OS-level restrictions as
the first interpreter (except when started in a separate process, which makes
communication between trusted and untrusted parts much more complicated).

Also, you can't have such fine-grained control over the untrusted
execution environment at the OS level, e.g. limiting the recursion depth,
which is a very important limit for secure interpreters. Limiting the stack
on the OS level is, for example, no solution for this, because the byte code
may behave completely differently on the stack (and regarding hidden internal
recursion) from what the top-level Python code does (does anyone understand
what I'm talking about... - I think I just reached the recursion limit of my
English capabilities), which means that you can't set deterministic
restrictions for untrusted code in a platform-independent manner at the OS
level. - Managing all this in the interpreter would solve the problem, at
the cost of implementing lots of resource management code. A good sandbox
seems to be a real adventure with few survivors, as can be seen in the
JavaScript world.
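
Recursion depth is one limit the interpreter already manages itself. A minimal sketch using the standard sys.setrecursionlimit follows; the wrapper call_with_recursion_limit is a hypothetical helper for this example, and the limit is process-wide, so it also applies to the trusted code while the call runs. Python 3 syntax.

```python
import sys

def call_with_recursion_limit(func, limit):
    """Call func() under a temporary recursion limit.

    Returns (True, result) on success, (False, None) if the limit was hit.
    """
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return True, func()
    except RecursionError:
        return False, None
    finally:
        sys.setrecursionlimit(previous)  # restore the old limit

def runaway():
    return runaway()  # recurses without end

def shallow():
    return 42

print(call_with_recursion_limit(runaway, 500))  # (False, None)
print(call_with_recursion_limit(shallow, 500))  # (True, 42)
```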

Regards
Stephan
 
Paul Boddie

Stephan said:
Sounds very likely, but does not solve the problem. With resource management
on the OS level you can indeed set some important limits for untrusted
scripts, but there are at least two drawbacks which come to my mind (and
maybe more that I'm not aware of): 1. To be platform independent, which is a
strong requirement IMO, an OS-level solution can only ever rely on the lowest
common denominator of all OS resource management facilities.

I think I understand what you intend to say here: that some kind of
Python sandbox relying on operating system facilities can only depend
on facilities implemented in all of the most interesting operating
systems (which I once referred to as "the big three", accompanied by
howls of protest/derision). Yet just as people like to say that
choosing a language is all about "choosing the right tool for the job",
shouldn't the choice of operating system be significant as well? If
you're running a "Try Python" Web site, as some people were doing a few
months ago, isn't it important to choose the right operating system as
part of the right complete environment, instead of having the
theoretical possibility of running it on something like RISC OS, yet
having someone take your site down within seconds anyway? I don't know
whether it's the same people who like to promote "how well Python plays
with everything else" who also demand totally cross-platform solutions
("if it doesn't work on Windows, we won't do it"), but if so, I'd be
interested in how they manage to reconcile these views.

[...]
A good sandbox seems to be a real adventure with few survivors, as can be seen in the
JavaScript-world.

Certainly, there are interesting directions to be taken with safe
execution at the language and runtime levels, but as technologies like
Java (in particular) have shown, it's possible for a project or a
company to find itself focusing heavily on such strategies at the cost
of readily available, mature technologies which might be good enough.
The emergence of virtualisation as a commodity technology would suggest
that sandboxing language runtimes isn't as fashionable as it was ten
years ago.

Paul
 
Diez B. Roggisch

as posted before, linux kernel limit.

then you and your users can go as crazy as you want and you won't take
out your system.

maybe you should think a little more before going on the attack like that.

You should maybe read a little bit more when making bold statements about
the feasibility of a sandboxed _PYTHON_. The OP wrote:

"""
I was thinking that maybe it could be possible to load and run untrusted
Python code, simply by loading it in a module with a modified version of
__builtins__. Without any reachable functions that do unsafe operations,
code running from there shouldn't be able to do evil things.
"""

At least to me - and I presume pretty much everybody except you in this
thread - this means that he is interested in executing arbitrary pieces of
python code inside the interpreter, code which comes from e.g. players who
customize the in-game behavior of their avatars.

Now how exactly does linux (or any other resource limiting technique on any
OS) help here? Killing the whole game server surely isn't a desirable
solution when one player goes berserk, be it intentionally or not.

It is a recurring and pretty much understandable request on c.l.py to be
able to do so - sometimes it arises in the disguise of killable threads.
But unfortunately the solution doesn't seem to be as simple as one would
wish.

Diez
 
Stephan Kuhagen

Paul said:
I think I understand what you intend to say here: that some kind of
Python sandbox relying on operating system facilities can only depend
on facilities implemented in all of the most interesting operating
systems (which I once referred to as "the big three", accompanied by
howls of protest/derision

Oberon, Plan 9 and AmigaOS...? ;-)
). Yet just as people like to say that
choosing a language is all about "choosing the right tool for the job",
shouldn't the choice of operating system be significant as well?

Yes, it should. But most of the time it isn't, I think. Often you have to
run a certain app on an OS that you can't simply exchange to suit your
needs, for example the game server you mentioned, if it has to
run on an external host which is not maintained by you.

Personally I would always prefer an OS independent solution, because it
makes you more flexible. Some OS may be a good choice at a given time, but
after your app has grown up, you may come to the decision to change the OS
for some reason, but can't because your app depends on some of its specific
features. Especially for apps written in a scripting language I would try
to avoid that.
If
you're running a "Try Python" Web site, as some people were doing a few
months ago, isn't it important to choose the right operating system as
part of the right complete environment, instead of having the
theoretical possibility of running it on something like RISC OS, yet
having someone take your site down within seconds anyway? I don't know
whether it's the same people who like to promote "how well Python plays
with everything else" who also demand totally cross-platform solutions
("if it doesn't work on Windows, we won't do it"), but if so, I'd be
interested in how they manage to reconcile these views.

I'm afraid we can't have a perfect world... But as I stated in another
posting before, I think it is possible to get a secure execution
environment in a platform independent manner. The Tcl people did it, and
since I made myself already very unpopular at c.l.tcl by requesting some of
Python's goods for Tcl, I can do the same here by requesting some of Tcl's
good inventions for Python... ;-)
The emergence of virtualisation as a commodity technology would suggest
that sandboxing language runtimes isn't as fashionable as it was ten
years ago.

Virtual environments are a good choice for some of the tasks that were done
with sandboxes in the past. But I'm afraid that they are too big for many
problems. Imagine running an instance of a virtual machine on a mobile
phone, or needing to execute some hundreds of them in parallel on a game
server (or CGI) which itself runs on a virtual host at your webhoster, and
of course none of them should be able to kill its neighbours, so all of
them need their own VM... phew, that would need really big iron. So
the idea of VMs _is_ a good one for certain situations, but the need for
secure execution environments inside an interpreter remains.

Regards
Stephan
 
Paul Boddie

[Multiplayer game servers]
Now how exactly does linux (or any other resource limiting technique on any
OS) help here - killing the whole game server surely isn't a desirable
solution when one player goes berserk, might it be intentionally or not.

A significant issue is the architecture of the server itself. Is a
per-process solution acceptable or must everything happen in the same
process with lots of threads (or microthreads)? Of course, there are
games using lots of microthreads, although I'm not sure whether they
also use lots of processes, too, and it has been asserted that having
lots of operating system threads or processes is just too resource
intensive, but I think it's especially worth considering the nature of
the platform you're using and what it offers.

Presumably, the original idea with UNIX-based systems was that you'd
employ lots of processes in order to serve lots of customers, players,
and so on, and there were companies like Internet service providers
using precisely that one process per customer model in a naive fashion
(until they exceeded the limit on simultaneous process identifiers in
one particular case, I believe). Subsequent work focusing on throwing
lots of threads into a single server-side container and then trying to
isolate them from each other, all whilst running the container on a
UNIX variant - a classic Java architectural pattern - seems like a
curious distraction when one considers the strong portfolio of
appropriate and readily available technologies that are left unused in
the operating system of the deployment environment concerned.
It is a recurring and pretty much understandable request on c.l.py to be
able to do so - sometimes it arises in the disguise of killable threads.
But unfortunately the solution doesn't seem to be as simple as one would
wish.

And this is where the hot topics collide: people want performant
multitasking with lots of shared state (the global interpreter lock
controversy) together with sandboxing so that the individual threads
can't access most of that shared state (the restricted execution
controversy). But it's like using a trip to meet the neighbours to
justify a mission to the moon: you can visit the neighbours at a much
lower cost with the vehicles you already have. I hear that various
operating systems support better interprocess communication these days,
but then we meet the third hot topic: why won't it work on Windows?
Something has to give.
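
The per-process approach can be sketched with the standard subprocess module, which works on Unix and Windows alike. Each player's script runs in its own interpreter process, and a script that overruns its time budget is killed without taking the server down. The function run_player_script and the time budgets are assumptions made for this sketch, and it ignores the cost of starting one process per script, which is the resource concern mentioned above. Python 3 syntax.

```python
import subprocess
import sys

def run_player_script(source, timeout_seconds=1.0):
    """Run source in a child interpreter; return its stdout or None."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None  # the script ran away; only the child was killed
    return completed.stdout

well_behaved = run_player_script("print('hello')", timeout_seconds=10)
berserk = run_player_script("while True: pass", timeout_seconds=1.0)
```

The trade-off is that the server and the script no longer share state, so anything the script needs has to be passed in and out explicitly.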

Paul
 
Diez B. Roggisch

A significant issue is the architecture of the server itself. Is a
per-process solution acceptable or must everything happen in the same
process with lots of threads (or microthreads)? Of course, there are
games using lots of microthreads, although I'm not sure whether they
also use lots of processes, too, and it has been asserted that having
lots of operating system threads or processes is just too resource
intensive, but I think it's especially worth considering the nature of
the platform you're using and what it offers.

AFAIK most engines today are only single-threaded. A big grief for all those
dual-core owners out there.

And having thousands of players is common - spawning a process for each of
them is certainly too resource-consuming.

AFAIK stackless python was initially financially supported by a
game-company. So I guess that shows us pretty much what games (at least)
are after: low-profile in-process threads, fine-grained controllable.

Diez
 
timmy

Diez said:
You should maybe read a little bit more when making bold statements about
the feasibility of a sandboxed _PYTHON_. The OP wrote:

there is nothing preventing you putting limits on the resources each
process uses, on just about any modern day OS
At least to me - and I presume pretty much everybody except you in this
thread -

Oh no i understand perfectly what he wants, i merely suggest a simple OS
based solution.

this means that he is interested in executing arbitrary pieces of
python code inside the interpreter, which comes from e.g. players who
customize their in-game behavior of their avatars.

Now how exactly does linux (or any other resource limiting technique on any
OS) help here - killing the whole game server surely isn't a desirable
solution when one player goes berserk, might it be intentionally or not.

resource management does not kill anything, it merely prevents one process
running away and consuming the whole server. this is EXACTLY what he is
afraid of.
if he intends on running arbitrary code then i suggest he spawns each
one as a separate process with a specific name and merely sets limits on all
processes named X. that way he can run any whacky code he wants safely
inside those processes without fear of any one of them crashing the
server. I know it can be done under any of the nix's, I'm not sure how
to do so under windows, but it could probably be done.
It is a recurring and pretty much understandable request on c.l.py to be
able to do so - sometimes it arises in the disguise of killable threads.
But unfortunately the solution doesn't seem to be as simple as one would
wish.

i can understand people wanting an application based cross platform
solution to this, but i'm yet to see anything practical, hence i suggest
an OS based solution.
 
timmy

Paul said:
Diez B. Roggisch wrote:


[Multiplayer game servers]

Now how exactly does linux (or any other resource limiting technique on any
OS) help here - killing the whole game server surely isn't a desirable
solution when one player goes berserk, might it be intentionally or not.

And this is where the hot topics collide: people want performant
multitasking with lots of shared state (the global interpreter lock
controversy) together with sandboxing so that the individual threads
can't access most of that shared state (the restricted execution
controversy).

i'm not talking about sandboxing, that's a whole different kettle of
fish. i'm talking about resource management options you can set in, for
instance, the linux kernel.
you can limit the cpu and memory a process uses while still allowing it
the same access it would have outside of a sandbox. that way if any
clever monkeys try to dos you they merely consume their allotted quota.
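
On Unix-like systems, the kind of per-process quota described here is reachable from Python through the standard resource module. A minimal sketch, assuming a Linux host (the resource module does not exist on Windows, and RLIMIT_AS behaviour varies between systems); run_limited and the particular limits are illustrative choices, not a recommendation. Python 3 syntax.

```python
import resource
import subprocess
import sys

def run_limited(source, cpu_seconds=1, memory_bytes=1024 ** 3):
    """Run source in a child process under CPU and address-space limits."""
    def apply_limits():
        # Runs in the child just before the new interpreter starts
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))

    return subprocess.run(
        [sys.executable, "-c", source],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
    )

ok = run_limited("print(1 + 1)")
hog = run_limited("while True: pass")  # stopped by the kernel on CPU time
print(ok.stdout.strip(), hog.returncode != 0)
```

The parent process keeps its own, unrestricted limits; only the child that runs the submitted code is capped.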
 
