safe eval of moderately simple math expressions

Joel Hedlund

Hi all!

I'm writing a program that presents a lot of numbers to the user, and I
want to let the user apply moderately simple arithmetic to these
numbers. One possibility that comes to mind is to use the eval function,
but since that sends up all kinds of warning flags in my head, I thought
I'd put my idea out here first so you guys can tell me if I'm insane. :)

This is the gist of it:
----------------------------------------------------------
import math

globals = dict((s, getattr(math, s)) for s in dir(math) if '_' not in s)
globals.update(__builtins__=None, divmod=divmod, round=round)

def calc(expr, x):
    if '_' in expr:
        raise ValueError("expr must not contain '_' characters")
    try:
        return eval(expr, globals, dict(x=x))
    except:
        raise ValueError("bad expr or x")

print calc('cos(x*pi)', 1.33)
----------------------------------------------------------

This lets the user do stuff like "exp(-0.01*x)" or "round(100*x)" but
prevents malevolent stuff like "__import__('os').system('del *.*')" or
"(t for t in (42).__class__.__base__.__subclasses__() if t.__name__ ==
'file').next()" from messing things up.

I assume there's lots of nasty and absolutely lethal stuff that I've
missed, and I kindly request you show me the error of my ways.

Thank you for your time!
/Joel Hedlund
 
Matt Nordhoff

I'm way too dumb and lazy to provide a working example, but someone
could work around the _ restriction by obfuscating them a bit, like this:
<type 'int'>

Is that enough to show you the error of your ways? :-D Cuz seriously,
it's a bad idea.

I'm sorry, but I don't know a good solution. The simplicity of eval is
definitely very attractive, but it's just not safe.

(BTW: What if a user tries to do some ridiculously large calculation to
DoS the app? Is that a problem?)
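
The obfuscation idea can be sketched as follows. This is a minimal illustration written for Python 3 (the sessions in this thread are Python 2), and the helper name passes_underscore_check is hypothetical. It only shows that the filter inspects the source text while eval works on the decoded string literal; it is not a complete exploit of calc().

```python
def passes_underscore_check(source):
    # Hypothetical stand-in for the "'_' in expr" test in calc().
    return "_" not in source


# The source text spells each underscore as a \x5f escape ...
source = r"'\x5f\x5fclass\x5f\x5f'"
passed = passes_underscore_check(source)

# ... which only becomes a real underscore once the literal is evaluated.
value = eval(source, {"__builtins__": None}, {})

print(passed, value)
```

The check passes even though the evaluated string is a dunder name, so any code path that later turns that string into an attribute lookup is outside the filter's reach.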

Aaron Brady

Hi all!

I'm writing a program that presents a lot of numbers to the user, and I
want to let the user apply moderately simple arithmetic to these
numbers. One possibility that comes to mind is to use the eval function,
but since that sends up all kinds of warning flags in my head, I thought
I'd put my idea out here first so you guys can tell me if I'm insane. :)

This is the gist of it:
[snip]
def calc(expr, x):
    if '_' in expr:
        raise ValueError("expr must not contain '_' characters")
[snip]
I assume there's lots of nasty and absolutely lethal stuff that I've
missed, and I kindly request you show me the error of my ways.

Thank you for your time!
/Joel Hedlund

Would you be willing to examine a syntax tree to determine if there
are any class accesses? Would it work?
 
Terry Reedy

Joel said:
Hi all!

I'm writing a program that presents a lot of numbers to the user, and I
want to let the user apply moderately simple arithmetic to these
numbers. One possibility that comes to mind is to use the eval function,
but since that sends up all kinds of warning flags in my head,

Where does the program execute? If on the user's own machine, no
problem. Eval is no more dangerous than Python itself.
 
Paul McGuire

Hi all!

I'm writing a program that presents a lot of numbers to the user, and I
want to let the user apply moderately simple arithmetic to these
numbers.

Joel -

Take a look at the examples page on the pyparsing wiki
(http://pyparsing.wikispaces.com/Examples). Look at the examples fourFn.py
and simpleArith.py for some expression parsers that you could extend
to support whatever math builtins you wish. Since you would be doing
your own parsing and eval code, you could be sure that no dangerous
code was being run, just simple arithmetic.

-- Paul
 
Steven D'Aprano

Where does the program execute? If on the user's own machine, no
problem.

Until the user naively executes a code sample he downloaded from the
Internet, and discovers to his horror that his *calculator* is able to
upload his banking details to an IRC server hosted in Bulgaria.

How quickly we forget... for twenty or thirty years all malware
infections were via programs executed on the user's own machine.

Eval is no more dangerous than Python itself.

But users know Python is a Turing-complete programming language that can
do anything their computer can do. It would come as an unpleasant
surprise to discover that (say) your icon editor was also a Turing-
complete programming language capable of doing anything your C-compiler
could do. The same holds for applications written in Python.
 
Aaron Brady

Until the user naively executes a code sample he downloaded from the
Internet, and discovers to his horror that his *calculator* is able to
upload his banking details to an IRC server hosted in Bulgaria.

Mine does that anyway! ..Often without telling anyone.

How quickly we forget... for twenty or thirty years all malware
infections were via programs executed on the user's own machine.


But users know Python is a Turing-complete programming language that can
do anything their computer can do. It would come as an unpleasant
surprise to discover that (say) your icon editor was also a Turing-
complete programming language capable of doing anything your C-compiler
could do. The same holds for applications written in Python.

Don't they know that his calculator is written in Python? Do many
applications include a programming language?

Why do I get the feeling that the authors of 'pyparsing' are out of
breath?

I wonder if you could do something like copy and paste a "fork" of the
'ast' module, and just remove non-arithmetic classes; then do a normal
walk and transform of the foreign code...
 
Joel Hedlund

Aaron said:
Would you be willing to examine a syntax tree to determine if there
are any class accesses?

Sure? How do I do that? I've never done that type of thing before so I
can't really say if it would work or not.

/Joel
 
Joel Hedlund

Matt said:
<type 'int'>

Is that enough to show you the error of your ways?

No, because
True

:-D Cuz seriously, it's a bad idea.

Yes probably, but that's not why. :)

(BTW: What if a user tries to do some ridiculously large calculation to
DoS the app? Is that a problem?)

Nope. If the user wants to hang her own app that's fine with me.

/Joel
 
Peter Otten

Joel said:
No, because

True

But what you're planning to do seems more like

...     return "_" not in source
...
...     print eval(source)
...
<type 'int'>

Peter
 
Joel Hedlund

Peter said:
But what you're planning to do seems more like

... return "_" not in source
...
... print eval(source)
...
<type 'int'>

Bah. You are completely right of course.

Just as a thought experiment, would this do the trick?

def is_it_safe(source):
    return "_" not in source and r'\' not in source

I'm not asking because I'm hellbent on having eval in my app, but
because it's always useful to see what hazards you don't know about.

/Joel
 
Peter Otten

Joel said:
Peter said:
But what you're planning to do seems more like

... return "_" not in source
...
... print eval(source)
...
<type 'int'>

Bah. You are completely right of course.

Just as a thought experiment, would this do the trick?

def is_it_safe(source):
    return "_" not in source and r'\' not in source

>>> "".join(map(chr, [95, 95, 110, 111, 95, 95]))
'__no__'

By the way, a raw string may not end with a backslash:

>>> r'\'
  File "<stdin>", line 1
    r'\'
       ^
SyntaxError: EOL while scanning single-quoted string

Peter
 
Joel Hedlund

Joel said:
def is_it_safe(source):
    return "_" not in source and r'\' not in source

Peter said:
>>> "".join(map(chr, [95, 95, 110, 111, 95, 95]))
'__no__'

Joel said:
But you don't have access to either map or chr?

Peter said:
>>> '5f5f7374696c6c5f6e6f745f736166655f5f'.decode("hex")
'__still_not_safe__'

Now *that's* a thing of beauty. A horrible, horrible kind of beauty.

Thanks for blowing holes in my inflated sense of security!
/Joel
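
A further variant in the same spirit, as a hedged sketch for Python 3 (where str has no .decode("hex")): the source below contains neither an underscore nor a backslash and uses no builtin names, yet it still reads a dunder attribute through str.format. The check function mirrors the thought-experiment filter with its backslash test written as a valid literal; the payload is an illustration, and it only obtains the attribute's string form.

```python
def is_it_safe(source):
    # The thought-experiment filter, with the backslash written validly.
    return "_" not in source and "\\" not in source


# '%c' % 95 produces '_', so the format string is assembled at run time
# from pieces that the text filter never sees as underscores.
source = "('{0.%c%cclass%c%c}' % (95, 95, 95, 95)).format(42)"

passed = is_it_safe(source)
result = eval(source, {"__builtins__": None}, {})

print(passed, result)
```

String methods remain available on any literal even with __builtins__ set to None, which is why filtering the source text for particular characters does not close this door.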
 
Aaron Brady

Sure? How do I do that? I've never done that type of thing before so I
can't really say if it would work or not.

/Joel

NO PROMISES. No warranty is made, express or implied.

Of course, something this devious, a "white" list, may just make it so
your enemy finds out its weakness before you do.

It's ostensibly for Python 3, but IIRC there's a way to do it in 2.

'ast.literal_eval' appears to evaluate a literal, but won't do
expressions, which is what you are looking for. We should refer
people to it more often.

+1 ast.walk, btw.

If you want subtraction and division, you'll have to add them
yourself. You could probably compress the 'is_it_safe' function to
one line, provided that it's sound to start with: if all( x.__class__ in
safe_node_classes for x in ast.walk( ast.parse( exp ) ) ), or better
yet, if set( x.__class__ for x in ast.walk( ast.parse( exp ) ) )<=
safe_node_classes. +1!

/Source:
import ast
safe_exp= '( 2+ 4 )* 7'
unsafe_exp= '( 2+ 4 ).__class__'
unsafe_exp2= '__import__( "os" )'

safe_node_classes= set( [
ast.Module,
ast.Expr,
ast.BinOp,
ast.Mult,
ast.Add,
ast.Num
] )

def is_it_safe( exp ):
    print( 'trying %s'% exp )
    top= ast.parse( exp )
    for node in ast.walk( top ):
        print( node )
        if node.__class__ not in safe_node_classes:
            return False
    print( 'ok!' )
    return True

print( safe_exp, is_it_safe( safe_exp ) )
print( )
print( unsafe_exp, is_it_safe( unsafe_exp ) )
print( )
print( unsafe_exp2, is_it_safe( unsafe_exp2 ) )
print( )

/Output:

trying ( 2+ 4 )* 7
<_ast.Module object at 0x00BB5DF0>
<_ast.Expr object at 0x00BB5E10>
<_ast.BinOp object at 0x00BB5E30>
<_ast.BinOp object at 0x00BB5E50>
<_ast.Mult object at 0x00BAF590>
<_ast.Num object at 0x00BB5EB0>
<_ast.Num object at 0x00BB5E70>
<_ast.Add object at 0x00BAF410>
<_ast.Num object at 0x00BB5E90>
ok!
( 2+ 4 )* 7 True

trying ( 2+ 4 ).__class__
<_ast.Module object at 0x00BB5E90>
<_ast.Expr object at 0x00BB5DF0>
<_ast.Attribute object at 0x00BB5E10>
( 2+ 4 ).__class__ False

trying __import__( "os" )
<_ast.Module object at 0x00BB5E10>
<_ast.Expr object at 0x00BB5E30>
<_ast.Call object at 0x00BB5E50>
__import__( "os" ) False
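
The whitelist check above can be taken one step further by evaluating the tree directly instead of handing the text to eval afterwards. The following is a sketch under stated assumptions, not a vetted sandbox: it targets Python 3.8+ (ast.Constant instead of ast.Num), and the names safe_calc, BINARY_OPS, UNARY_OPS, FUNCTIONS and CONSTANTS are hypothetical choices for this example. Exponentiation is left out on purpose, since something like 9**9**9 would hang the process.

```python
import ast
import math
import operator

# Hypothetical whitelists, chosen for this sketch only.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
FUNCTIONS = {"cos": math.cos, "exp": math.exp, "round": round}
CONSTANTS = {"pi": math.pi}


def safe_calc(expr, x):
    """Evaluate expr by walking its syntax tree; reject anything else."""
    names = dict(CONSTANTS, x=x)

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        # Only plain int/float literals; strings, bools etc. are refused.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in names:
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](visit(node.operand))
        # Calls are allowed only to whitelisted names, never to attributes.
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in FUNCTIONS and not node.keywords):
            return FUNCTIONS[node.func.id](*[visit(arg) for arg in node.args])
        raise ValueError("disallowed element: %s" % type(node).__name__)

    return visit(ast.parse(expr, mode="eval"))


print(safe_calc("( 2+ 4 )* 7", 0))
print(safe_calc("round(100*x)", 1.33))
```

Because nothing is ever passed to eval, an Attribute node or a call to an unlisted name such as __import__ ends in ValueError rather than being executed. Malformed input still raises SyntaxError from ast.parse, which a caller would need to handle separately.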
 
Steven D'Aprano

Bah. You are completely right of course.

Just as a thought experiment, would this do the trick?

def is_it_safe(source):
    return "_" not in source and r'\' not in source

I'm not asking because I'm hellbent on having eval in my app, but
because it's always useful to see what hazards you don't know about.

Can we pass your test and still write to a file? Too easy.

Traceback (most recent call last):
...     print eval(source)
...
9'spam spam spam'


Can we pass your test and import a module and grab its docstring?
...     eval(source)
...
"OS routines for Mac, NT, or Posix depending ... "



Restricting Python is hard. No, not hard. It's *REALLY HARD*. Experts
have tried and failed. A good example is Tav's recent attempt to secure
Python code from *one* threat: writing a file on the local disk. Should
be simple, right?

If only.

http://tav.espians.com/a-challenge-to-break-python-security.html

The first exploit came an hour after Tav went public.

You can read the discussion on the Python-Dev list starting here:
http://mail.python.org/pipermail/python-dev/2009-February/086401.html


More here:
http://tav.espians.com/paving-the-way-to-securing-the-python-interpreter.html

http://tav.espians.com/update-on-securing-the-python-interpreter.html


My recommendation is that you do one of these:

(1) Give up on making your code "safe". Recognise that the threat is
relatively small, but real, and put a warning in your documentation about
the risk to the user's own system if they evaluate arbitrary code, and then
just use eval and hope for the best.

(2) Decide that you don't want your calculator to be a full-fledged
programming language, and give up on making eval safe. Write your own
mini-parser to do arithmetic expressions. It's really not that difficult:
really easy with PyParsing, and not that hard without.
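
Option (2) without PyParsing might look roughly like the recursive-descent sketch below. It is written for Python 3, the names tokenize and evaluate are hypothetical, and the grammar is deliberately tiny: numbers, the variable x, + - * /, unary minus and parentheses. Function calls such as cos(x) are not supported and would need an extra grammar rule.

```python
import re

# One token per match: either a number or any single other character.
TOKEN = re.compile(r"\s*(?:(\d+\.?\d*|\.\d+)|(.))")


def tokenize(text):
    """Turn text into a list of floats and single-character symbols."""
    tokens = []
    for number, other in TOKEN.findall(text.strip()):
        if number:
            tokens.append(float(number))
        elif other in "+-*/()x":
            tokens.append(other)
        else:
            raise ValueError("unexpected character %r" % other)
    return tokens


def evaluate(text, x):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():  # expression := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if take() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # term := factor (('*' | '/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():  # factor := number | 'x' | '-' factor | '(' expression ')'
        token = take()
        if isinstance(token, float):
            return token
        if token == "x":
            return x
        if token == "-":
            return -factor()
        if token == "(":
            value = expression()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        raise ValueError("unexpected token %r" % (token,))

    result = expression()
    if peek() is not None:
        raise ValueError("unexpected trailing token %r" % (peek(),))
    return result


print(evaluate("( 2+ 4 )* 7", 0))
```

Since the parser only understands the symbols it was written for, input like __import__("os") is rejected at the tokenizer and never reaches anything resembling eval.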
 
