Turning String into Numerical Equation

B

Brian Kazian

Here's my problem, and hopefully someone can help me figure out if there is
a good way to do this.

I am writing a program that allows the user to enter an equation in a text
field using pre-existing variables. They then enter numerical values for
these variables, or can tell the program to randomize the values used within
a certain bounds. My problem is taking in this equation they have written
in the text field and converting it into an equation Python can calculate.

The only way I know of to do this is by parsing it, which could get pretty
nasty with all of the different mathematical rules. Does anyone have a
better idea than parsing to compute an equation from a string
representation?

Thanks so much!

Brian Kazian
 
A

Artie Gold

Brian said:
Here's my problem, and hopefully someone can help me figure out if there is
a good way to do this.

I am writing a program that allows the user to enter an equation in a text
field using pre-existing variables. They then enter numerical values for
these variables, or can tell the program to randomize the values used within
a certain bounds. My problem is taking in this equation they have written
in the text field and converting it into an equation Python can calculate.

The only way I know of to do this is by parsing it, which could get pretty
nasty with all of the different mathematical rules. Does anyone have a
better idea than parsing to compute an equation from a string
representation?

Thanks so much!

Brian Kazian
eval()

See: http://docs.python.org/lib/built-in-funcs.html#l2h-23

HTH,
--ag
 
B

Brian Kazian

Thanks for the help, I didn't even think of that.

I'm guessing there's no easy way to handle exponents or logarithmic
functions? I will be running into these two types as well.
 
A

Artie Gold

Brian said:
Thanks for the help, I didn't even think of that.

I'm guessing there's no easy way to handle exponents or logarithmic
functions? I will be running into these two types as well.

Well, consider:

import math
eval("log(pow(x,2)*pow(y,3),2)",{'pow':math.pow,'log':math.log},{'x':1,'y':2})

[No, you wouldn't want to write it that way; it's merely illustrating
what you can do without doing much.]

HTH,
--ag

[BTW -- cultural question: Do we top-post here?]
 
R

Robert Kern

Artie said:
[BTW -- cultural question: Do we top-post here?]

Please don't.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
M

Michael Spencer

Brian said:
Thanks for the help, I didn't even think of that.

I'm guessing there's no easy way to handle exponents or logarithmic
functions? I will be running into these two types as well.

eval will handle exponents just fine: try eval("2**16")
in fact, it will evaluate any legal python expression*

logarithmic functions live in the math module, so you will either need to import
the functions/symbols you want from math, or give that namespace to eval:

* this means that, eval("sys.exit()") will likely stop your interpreter, and
there are various other inputs with possibly harmful consequences.

Concerns like these may send you back to your original idea of doing your own
expression parsing. The good news is that the compiler package will parse any
legal Python expression, and return an Abstract Syntax Tree. It's
straightforward to walk the tree and achieve fine-grain control over evaluation.

Here's an example of a math calculator that doesn't use eval. It evaluates any
Python scalar numeric expression (i.e., excludes container types), and only
those symbols and functions that are explicity specified. This code is barely
tested and probably not bullet-proof. But with care and testing it should be
possible to achieve a good balance of functionality and security.


import compiler
import types
import math

# create a namespace of useful funcs
mathfuncs = {"abs":abs, "min": min, "max": max}
mathfuncs.update((funcname, getattr(math,funcname)) for funcname in vars(math)
if not funcname.startswith("_"))

mathsymbols = {"pi":math.pi, "e":math.e}

# define acceptable types - others will raise an exception if
# entered as literals
mathtypes = (int, float, long, complex)

class CalcError(Exception):
def __init__(self,error,descr = None,node = None):
self.error = error
self.descr = descr
self.node = node
#self.lineno = getattr(node,"lineno",None)

def __repr__(self):
return "%s: %s" % (self.error, self.descr)
__str__ = __repr__


class EvalCalc(object):

def __init__(self):
self._cache = {} # dispatch table

def visit(self, node,**kw):
cls = node.__class__
meth = self._cache.setdefault(cls,
getattr(self,'visit'+cls.__name__,self.default))
return meth(node, **kw)

def visitExpression(self, node, **kw):
return self.visit(node.node)

def visitConst(self, node, **kw):
value = node.value
if isinstance(value, mathtypes):
return node.value
else:
raise CalcError("Not a numeric type", value)

# Binary Ops
def visitAdd(self,node,**kw):
return self.visit(node.left) + self.visit(node.right)
def visitDiv(self,node,**kw):
return self.visit(node.left) / self.visit(node.right)
def visitFloorDiv(self,node,**kw):
return self.visit(node.left) // self.visit(node.right)
def visitLeftShift(self,node,**kw):
return self.visit(node.left) << self.visit(node.right)
def visitMod(self,node,**kw):
return self.visit(node.left) % self.visit(node.right)
def visitMul(self,node,**kw):
return self.visit(node.left) * self.visit(node.right)
def visitPower(self,node,**kw):
return self.visit(node.left) ** self.visit(node.right)
def visitRightShift(self,node,**kw):
return self.visit(node.left) >> self.visit(node.right)
def visitSub(self,node,**kw):
return self.visit(node.left) - self.visit(node.right)

# Unary ops
def visitNot(self,node,*kw):
return not self.visit(node.expr)
def visitUnarySub(self,node,*kw):
return -self.visit(node.expr)
def visitInvert(self,node,*kw):
return ~self.visit(node.expr)
def visitUnaryAdd(self,node,*kw):
return +self.visit(node.expr)

# Logical Ops
def visitAnd(self,node,**kw):
return reduce(lambda a,b: a and b,[self.visit(arg) for arg in node.nodes])
def visitBitand(self,node,**kw):
return reduce(lambda a,b: a & b,[self.visit(arg) for arg in node.nodes])
def visitBitor(self,node,**kw):
return reduce(lambda a,b: a | b,[self.visit(arg) for arg in node.nodes])
def visitBitxor(self,node,**kw):
return reduce(lambda a,b: a ^ b,[self.visit(arg) for arg in node.nodes])
def visitCompare(self,node,**kw):
comparisons = {
"<": operator.lt, # strictly less than
"<=": operator.le,# less than or equal
">": operator.gt, # strictly greater than
">=": operator.ge, # greater than or equal
"==": operator.eq, # equal
"!=": operator.ne, # not equal
"<>": operator.ne, # not equal
"is": operator.is_, # object identity
"is not": operator.is_not # negated object identity
}
obj = self.visit(node.expr)
for op, compnode in node.ops:
compobj = self.visit(compnode)
if not comparisons[op](obj, compobj):
return False
obj = compobj
return True
def visitOr(self,node,**kw):
return reduce(lambda a,b: a or b,[self.visit(arg) for arg in node.nodes])

def visitCallFunc(self,node,**kw):

func = self.visit(node.node, context = "Callable")
# Handle only positional args
posargs = [self.visit(arg) for arg in node.args]

return func(*posargs)

def visitName(self, node, context = None, **kw):
name = node.name
if context == "Callable":
# Lookup the function only in mathfuncs
try:
return mathfuncs[name]
except KeyError:
raise CalcError("Undefined function", name)
else:
try:
return mathsymbols[name]
except KeyError:
raise CalcError("Undefined symbol",name)

def default(self, node, **kw):
"""Anything not expressly allowed is forbidden"""
raise CalcError("Syntax Error",
node.__class__.__name__,node)


def calc(source):
walker = EvalCalc()
try:
ast = compiler.parse(source,"eval")
except SyntaxError, err:
raise
try:
return walker.visit(ast)
except CalcError, err:
return err

Examples:


Michael
 
P

Paul McGuire

Almost this exact parser, called fourFn.py, is included in the examples
with pyparsing (at http://pyparsing.sourceforge.net). Since it is pure
Python, you can extend the grammar with whatever builtin functions you
like. But it *is* a parser, not just a short cut.

-- Paul
 
B

Brian Kazian

Wow, thanks so much guys!


Michael Spencer said:
Brian said:
Thanks for the help, I didn't even think of that.

I'm guessing there's no easy way to handle exponents or logarithmic
functions? I will be running into these two types as well.

eval will handle exponents just fine: try eval("2**16")
in fact, it will evaluate any legal python expression*

logarithmic functions live in the math module, so you will either need to
import the functions/symbols you want from math, or give that namespace to
eval:

* this means that, eval("sys.exit()") will likely stop your interpreter,
and there are various other inputs with possibly harmful consequences.

Concerns like these may send you back to your original idea of doing your
own expression parsing. The good news is that the compiler package will
parse any legal Python expression, and return an Abstract Syntax Tree.
It's straightforward to walk the tree and achieve fine-grain control over
evaluation.

Here's an example of a math calculator that doesn't use eval. It
evaluates any Python scalar numeric expression (i.e., excludes container
types), and only those symbols and functions that are explicity specified.
This code is barely tested and probably not bullet-proof. But with care
and testing it should be possible to achieve a good balance of
functionality and security.


import compiler
import types
import math

# create a namespace of useful funcs
mathfuncs = {"abs":abs, "min": min, "max": max}
mathfuncs.update((funcname, getattr(math,funcname)) for funcname in
vars(math)
if not funcname.startswith("_"))

mathsymbols = {"pi":math.pi, "e":math.e}

# define acceptable types - others will raise an exception if
# entered as literals
mathtypes = (int, float, long, complex)

class CalcError(Exception):
def __init__(self,error,descr = None,node = None):
self.error = error
self.descr = descr
self.node = node
#self.lineno = getattr(node,"lineno",None)

def __repr__(self):
return "%s: %s" % (self.error, self.descr)
__str__ = __repr__


class EvalCalc(object):

def __init__(self):
self._cache = {} # dispatch table

def visit(self, node,**kw):
cls = node.__class__
meth = self._cache.setdefault(cls,
getattr(self,'visit'+cls.__name__,self.default))
return meth(node, **kw)

def visitExpression(self, node, **kw):
return self.visit(node.node)

def visitConst(self, node, **kw):
value = node.value
if isinstance(value, mathtypes):
return node.value
else:
raise CalcError("Not a numeric type", value)

# Binary Ops
def visitAdd(self,node,**kw):
return self.visit(node.left) + self.visit(node.right)
def visitDiv(self,node,**kw):
return self.visit(node.left) / self.visit(node.right)
def visitFloorDiv(self,node,**kw):
return self.visit(node.left) // self.visit(node.right)
def visitLeftShift(self,node,**kw):
return self.visit(node.left) << self.visit(node.right)
def visitMod(self,node,**kw):
return self.visit(node.left) % self.visit(node.right)
def visitMul(self,node,**kw):
return self.visit(node.left) * self.visit(node.right)
def visitPower(self,node,**kw):
return self.visit(node.left) ** self.visit(node.right)
def visitRightShift(self,node,**kw):
return self.visit(node.left) >> self.visit(node.right)
def visitSub(self,node,**kw):
return self.visit(node.left) - self.visit(node.right)

# Unary ops
def visitNot(self,node,*kw):
return not self.visit(node.expr)
def visitUnarySub(self,node,*kw):
return -self.visit(node.expr)
def visitInvert(self,node,*kw):
return ~self.visit(node.expr)
def visitUnaryAdd(self,node,*kw):
return +self.visit(node.expr)

# Logical Ops
def visitAnd(self,node,**kw):
return reduce(lambda a,b: a and b,[self.visit(arg) for arg in
node.nodes])
def visitBitand(self,node,**kw):
return reduce(lambda a,b: a & b,[self.visit(arg) for arg in
node.nodes])
def visitBitor(self,node,**kw):
return reduce(lambda a,b: a | b,[self.visit(arg) for arg in
node.nodes])
def visitBitxor(self,node,**kw):
return reduce(lambda a,b: a ^ b,[self.visit(arg) for arg in
node.nodes])
def visitCompare(self,node,**kw):
comparisons = {
"<": operator.lt, # strictly less than
"<=": operator.le,# less than or equal
">": operator.gt, # strictly greater than
">=": operator.ge, # greater than or equal
"==": operator.eq, # equal
"!=": operator.ne, # not equal
"<>": operator.ne, # not equal
"is": operator.is_, # object identity
"is not": operator.is_not # negated object identity
}
obj = self.visit(node.expr)
for op, compnode in node.ops:
compobj = self.visit(compnode)
if not comparisons[op](obj, compobj):
return False
obj = compobj
return True
def visitOr(self,node,**kw):
return reduce(lambda a,b: a or b,[self.visit(arg) for arg in
node.nodes])

def visitCallFunc(self,node,**kw):

func = self.visit(node.node, context = "Callable")
# Handle only positional args
posargs = [self.visit(arg) for arg in node.args]

return func(*posargs)

def visitName(self, node, context = None, **kw):
name = node.name
if context == "Callable":
# Lookup the function only in mathfuncs
try:
return mathfuncs[name]
except KeyError:
raise CalcError("Undefined function", name)
else:
try:
return mathsymbols[name]
except KeyError:
raise CalcError("Undefined symbol",name)

def default(self, node, **kw):
"""Anything not expressly allowed is forbidden"""
raise CalcError("Syntax Error",
node.__class__.__name__,node)


def calc(source):
walker = EvalCalc()
try:
ast = compiler.parse(source,"eval")
except SyntaxError, err:
raise
try:
return walker.visit(ast)
except CalcError, err:
return err

Examples:


Michael
 
G

Giovanni Bajo

Michael said:
* this means that, eval("sys.exit()") will likely stop your
interpreter, and
there are various other inputs with possibly harmful consequences.

Concerns like these may send you back to your original idea of doing
your own expression parsing.

I use something along these lines:

def safe_eval(expr, symbols={}):
return eval(expr, dict(__builtins__=None, True=True, False=False), symbols)

import math
def calc(expr):
return safe_eval(expr, vars(math))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 2, in calc
File "<stdin>", line 2, in safe_eval
4352
 
S

Steven Bethard

Michael said:
Giovanni said:
I use something along these lines:

def safe_eval(expr, symbols={}):
return eval(expr, dict(__builtins__=None, True=True, False=False),
symbols)

import math
def calc(expr):
return safe_eval(expr, vars(math))
That offers only notional security:
calc("acos.__class__.__bases__[0]")
<type 'object'>

Yeah, I was concerned about the same thing, but I realized that I can't
actually access any of the func_globals attributes:

py> eval('(1).__class__.mro()[-1].__subclasses__()[17]'
.... '.substitute.func_globals', dict(__builtins__=None))
Traceback (most recent call last):
File "<interactive input>", line 2, in ?
File "<string>", line 0, in ?
RuntimeError: restricted attribute

AFAIK, you need to get to func_globals to do anything really
interesting. (You can get file through object, but you can't get
__import__ AFAIK. So you can read and write files which means you can
create a DOS attack, but I don't know how to do the eqivalent of, say,
'rm -rf /'.)

Also interesting is that an old exec trick[1] no longer works:

py> exec """\
.... global __builtins__
.... del __builtins__
.... print __builtins__""" in dict(__builtins__=None)
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "<string>", line 3, in ?
NameError: global name '__builtins__' is not defined

(It used to make __builtins__ available.)

STeVe

[1]http://mail.python.org/pipermail/python-list/2004-August/234838.html
 
S

Steven Bethard

Steven said:
Yeah, I was concerned about the same thing, but I realized that I can't
actually access any of the func_globals attributes:

py> eval('(1).__class__.mro()[-1].__subclasses__()[17]'
... '.substitute.func_globals', dict(__builtins__=None))
Traceback (most recent call last):
File "<interactive input>", line 2, in ?
File "<string>", line 0, in ?
RuntimeError: restricted attribute

AFAIK, you need to get to func_globals to do anything really
interesting. (You can get file through object, but you can't get
__import__ AFAIK. So you can read and write files which means you can
create a DOS attack, but I don't know how to do the eqivalent of, say,
'rm -rf /'.)

Hmm... I also can't access the file constructor:

py> eval("(1).__class__.mro()[-1].__subclasses__()[16]"
.... "('temp.txt', 'w').write('')", dict(__builtins__=None))
Traceback (most recent call last):
File "<interactive input>", line 2, in ?
File "<string>", line 0, in ?
IOError: file() constructor not accessible in restricted mode

STeVe
 
G

Giovanni Bajo

Steven said:
I use something along these lines:

def safe_eval(expr, symbols={}):
return eval(expr, dict(__builtins__=None, True=True,
False=False), symbols)

import math
def calc(expr):
return safe_eval(expr, vars(math))
That offers only notional security:
calc("acos.__class__.__bases__[0]")
<type 'object'>

Yeah, I was concerned about the same thing, but I realized that I
can't actually access any of the func_globals attributes:


When __builtin__ is not the standard __builtin__, Python is in restricted
execution mode. In fact, I believe my solution to be totally safe, and I
otherwise would love to be proved wrong.
 
M

Michael Spencer

Giovanni said:
Steven Bethard wrote:

I use something along these lines:

def safe_eval(expr, symbols={}):
return eval(expr, dict(__builtins__=None, True=True,
False=False), symbols)

import math
def calc(expr):
return safe_eval(expr, vars(math))


That offers only notional security:

calc("acos.__class__.__bases__[0]")
<type 'object'>

Yeah, I was concerned about the same thing, but I realized that I
can't actually access any of the func_globals attributes:
Interesting, of course I had never actually tried it
When __builtin__ is not the standard __builtin__, Python is in restricted
execution mode.

After a little experimenting, it appears to be a bit stronger than that. Once a
frame is set for restricted execution (f_restricted == 1), then even if you set
f_globals['__builtin__'] = __builtins__, you are still left in resticted
execution mode.
In fact, I believe my solution to be totally safe,

That's a bold claim! I'll readily concede that I can't access func_globals from
restricted mode eval (others may know better). But your interpreter is still be
vulnerable to DOS-style attack from rogue calculations or quasi-infinite loops.
> otherwise would love to be proved wrong.

Michael
 
S

Steven Bethard

Giovanni said:
When __builtin__ is not the standard __builtin__, Python is in restricted
execution mode.

Do you know where this is documented? I looked around, but couldn't
find anything.

STeVe
 
G

Giovanni Bajo

Steven said:
Do you know where this is documented? I looked around, but couldn't
find anything.


I found some documentation in the reference of the (now disabled) modules for
Restricted Execution (chapter 17 in the Library Reference). Quoting:

"""
The Python run-time determines whether a particular code block is executing in
restricted execution mode based on the identity of the __builtins__ object in
its global variables: if this is (the dictionary of) the standard __builtin__
module, the code is deemed to be unrestricted, else it is deemed to be
restricted.
"""

There are also some hints in the documentation for eval() itself:

"""
If the globals dictionary is present and lacks '__builtins__', the current
globals are copied into globals before expression is parsed. This means that
expression normally has full access to the standard __builtin__ module and
restricted environments are propagated
"""

In fact, the documentation for eval() could be improved to explain the benefits
of setting __builtins__ in the globals.
 
G

Giovanni Bajo

Michael said:
That's a bold claim! I'll readily concede that I can't access
func_globals from restricted mode eval (others may know better). But
your interpreter is still be vulnerable to DOS-style attack from
rogue calculations or quasi-infinite loops.


Yes, but I don't see your manually-rolled-up expression calculator being
DOS-safe. I believe DOS attacks to be a problem whenever you want to calculate
the result of an expression taken from the outside. What I was trying to show
is that my simple one-liner is no worse than a multi-page full-blown expression
parser and interpreter.
 
S

Steven Bethard

Giovanni said:
In fact, the documentation for eval() could be improved to explain the benefits
of setting __builtins__ in the globals.

Well, if you think you're pretty clear on what's happening, a patch is
always appreciated. =) I have a feeling that the docs are at least
partially vague because no one actually wants to advertise the
restricted execution features[1] since no one can guarantee that they're
really secure...

STeVe

[1] Guido say as much
http://mail.python.org/pipermail/python-dev/2002-December/031234.html
 
M

Michael Spencer

Giovanni said:
Michael Spencer wrote:





Yes, but I don't see your manually-rolled-up expression calculator being
DOS-safe. I believe DOS attacks to be a problem whenever you want to calculate
the result of an expression taken from the outside. What I was trying to show
is that my simple one-liner is no worse than a multi-page full-blown expression
parser and interpreter.

Fair point that brevity is itself valuable in achieving security. It isn't
worth using my "manually-rolled-up expression calculator" simply to deny access
to func_globals as you have demonstrated.

However, the advantage of the MRUEP is that every operation is evaluated
individually. In the example I showed, loops are disabled, attribute access is
disabled. Numeric inputs and intermediate results can be checked easily for
boundedness (though they are not in the example I gave). This sort of
fine-grain control is very much harder to do with a straight eval model.

Cheers

Michael
 
G

Giovanni Bajo

Steven said:
In fact, the documentation for eval() could be improved to explain
the benefits of setting __builtins__ in the globals.

Well, if you think you're pretty clear on what's happening, a patch is
always appreciated. =) I have a feeling that the docs are at least
partially vague because no one actually wants to advertise the
restricted execution features[1] since no one can guarantee that
they're really secure...

I am by no means clear. I found out by accident this "feature" of eval and
wondered why it is not explained in the documentation. The link you provided is
a good answer to my question. I understand Guido's concerns, in fact.

Then, I should start my usual rant about how is really sad to send patches to
Python and have them ignored for years (not even an acknowledge). Really sad.
This is why I'm not going to do that again.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,009
Latest member
GidgetGamb

Latest Threads

Top