derivation: sick Python trick of the week

Robert Brewer · Feb 23, 2004

WARNING:

The following is NOT all Pythonic. However, it is Python. This is just
fun--nothing I'm going to put into production. I don't have any
questions or problems--just thought I'd share.

BACKGROUND:

I've been playing around for a couple of days with parsing and
derivation ("unparsing"). My goal was and is to take snippets of Python
code and turn them into "equivalent" SQL statements. For example, the
Python snippet:

x.Group > 3 or x.Name == 'fumanchu'

....should equate to an SQL WHERE clause like:

Group > 3 or Name = 'fumanchu'

I tore into the compiler package, and hacked together a solution which
does just that. One invokes such a task with code like:

sqlstmt = derive.ADODeriver().evaluate("x.Group > 3 or x.Name ==
'fumanchu'")

However, it's parsing the output of a compiler.Transformer, which itself
is parsing an AST (lots of nested tuples), so it's not very quick. 1000
repetitions of the above example take over 1/2 second to run on my
laptop.

IT'S ALIVE!!!

In a fit of mad-scientist genius (get out the pitchforks and torches), I
wondered if it might be faster to let Python do *all* the parsing, and
just watch it work and take notes. That's what the code below does. The
sick and twisted part is how one invokes this technique:
['a', 3, <built-in function eq>, 'b', 1, <built-in function gt>, 'b',
[/QUOTE]
'((a == 3) and ((b > 1) or (b < -10)))'

Yes, in line two we run comparisons and boolean operations on
non-existent attributes of x, and discard the results! Sick. However, we
were keeping track (as the repr of x shows on line 3); we built a
postfix list of the comparisons we performed. When we call
Deriver().derive(x), the Deriver traverses that list and rebuilds the
Python code. I made a similar ADODeriver which builds SQL code (too
complex to include here; it also used a different Adapter for the
object-to-DB mapper).

I had to replace boolean 'and' and 'or' with binary calls in order to
override them, and 'not' with 'neg'. That makes it even sicker. And it's
*far* from obvious that 'x.a' should reduce to just 'a'.

But it's about 7 times as fast as the first solution.

So for a host of reasons, don't ever use this. I won't. But it was
interesting.

Robert Brewer
MIS
(e-mail address removed)

---- sick_derive.py ----

import operator

def containedby(a, b):
"""Return the outcome of the test a in b. Note the operand order."""
return a in b

def startswith(a, b):
"""Return True if string starts with the prefix, otherwise return
False."""
return a.startswith(b)

def endswith(a, b):
"""Return True if the string ends with the suffix, otherwise return
False."""
return a.endswith(b)

class Operand(object):
"""Push atoms onto a postfix expression stack."""
def __init__(self, expr, name=''):
self.expr = expr
self.name = name

def __neg__(self):
self.expr.append(operator.not_)
return Operand(self.expr)

def __lt__(self, other):
self.expr.extend([self.name, other, operator.lt])
return Operand(self.expr)

def __le__(self, other):
self.expr.extend([self.name, other, operator.le])
return Operand(self.expr)

def __eq__(self, other):
if self.name:
self.expr.extend([self.name, other, operator.eq])
else:
self.expr.extend([other, operator.eq])
return Operand(self.expr)

def __ne__(self, other):
self.expr.extend([self.name, other, operator.ne])
return Operand(self.expr)

def __ge__(self, other):
self.expr.extend([self.name, other, operator.ge])
return Operand(self.expr)

def __gt__(self, other):
self.expr.extend([self.name, other, operator.gt])
return Operand(self.expr)

def __and__(self, other):
self.expr.append(operator.and_)
return Operand(self.expr)

def __or__(self, other):
self.expr.append(operator.or_)
return Operand(self.expr)

def __contains__(self, other):
self.expr.extend([self.name, other, operator.contains])
return Operand(self.expr)

def contains(self, other):
self.expr.extend([self.name, other, operator.contains])
return Operand(self.expr)

def containedby(self, other):
self.expr.extend([self.name, other, containedby])
return Operand(self.expr)

def startswith(self, other):
self.expr.extend([self.name, other, startswith])
return Operand(self.expr)

def endswith(self, other):
self.expr.extend([self.name, other, endswith])
return Operand(self.expr)

class Expression(list):
"""A postfix recorder and evaluator for operations on arbitrary
objects."""

unaries = (operator.not_, )
binaries = (operator.and_,
operator.or_,
operator.lt,
operator.le,
operator.eq,
operator.ne,
operator.gt,
operator.ge,
operator.contains,
containedby,
startswith,
endswith,
)

def __getattr__(self, name):
return Operand(self, name)

def evaluate(self, obj, ifEmpty=True):
"""Evaluate an object against self.
Names will be looked up in the object's attribute dictionary.
"""
stack = self[:]
if not stack:
return ifEmpty

def operate():
op = stack.pop()
if op in self.unaries:
a = operate()
return op(a)
elif op in self.binaries:
b = operate()
a = operate()
if a not in (True, False):
a = getattr(obj, a)
return op(a, b)
else:
return op
return operate()

class Adapter(object):
"""Transform values according to their type."""

def __init__(self):
self.default_processor = repr
self.processors = {}

def process(self, value):
try:
xform = self.processors[type(value)]
except KeyError:
xform = self.default_processor
return xform(value)

class Deriver(object):
"""Derive an Expression according to a grammar."""

def __init__(self, adapter=Adapter()):
f = adapter.process
self.unaries = {operator.not_: lambda x: "(not %s)" % x}
self.binaries = {operator.and_: lambda x, y: "(%s and %s)" % (x,
y),
operator.or_: lambda x, y: "(%s or %s)" % (x,
y),
operator.lt: lambda x, y: "(%s < %s)" % (x,
f(y)),
operator.le: lambda x, y: "(%s <= %s)" % (x,
f(y)),
operator.eq: lambda x, y: "(%s == %s)" % (x,
f(y)),
operator.ne: lambda x, y: "(%s != %s)" % (x,
f(y)),
operator.gt: lambda x, y: "(%s > %s)" % (x,
f(y)),
operator.ge: lambda x, y: "(%s >= %s)" % (x,
f(y)),
operator.contains: lambda x, y: "(%s in %s)" %
(f(y), x),
containedby: lambda x, y: "(%s in %s)" % (x,
f(y)),
startswith: lambda x, y: "(%s.startswith(%s))"
% (x, f(y)),
endswith: lambda x, y: "(%s.endswith(%s))" %
(x, f(y)),
}
self.ifEmpty = ''

def derive(self, expr):
try:
stack = expr[:]
except TypeError, x:
x.args += (type(expr), )
raise x

if not stack:
return self.ifEmpty

def operate():
op = stack.pop()
if op in self.unaries:
a = operate()
return self.unaries[op](a)
elif op in self.binaries:
b = operate()
a = operate()
return self.binaries[op](a, b)
else:
return op
return operate()

Greg Ewing (using news.cis.dfn.de) · Feb 23, 2004

Robert said:
In a fit of mad-scientist genius (get out the pitchforks and torches), I
wondered if it might be faster to let Python do *all* the parsing, and
just watch it work and take notes. That's what the code below does. The
sick and twisted part is how one invokes this technique:

It's not really all that sick, or at least it's not a new
idea. I'm sure there's at least one DB module out there
somewhere that uses this technique.

Yes, in line two we run comparisons and boolean operations on
non-existent attributes of x, and discard the results! Sick.

That part is perhaps a bit too twisted. It's not really
necessary, you could just as well design it so you say

expr = (x.a == 3) & ((x.b > 1) | (x.b < -10))

> And it's
> *far* from obvious that 'x.a' should reduce to just 'a'.

In the case of database queries, you can make this seem
much more natural by having 'x' represent some meaningful
object such as a table.

db = open_database("favourite_fruits")
fruits = db.fruits
query = (fruits.kind == "apple") && (fruits.tastiness >= 3)

I had to replace boolean 'and' and 'or' with binary calls in order to
override them, and 'not' with 'neg'. That makes it even sicker.

Yes, that's the main nuisance with this technique. I have
a PEP idea floating around to make and/or/not overridable,
for this very purpose. Hmmm... does that make me sick?-)

Michael Hudson · Feb 23, 2004

Greg Ewing (using news.cis.dfn.de) said:
Yes, that's the main nuisance with this technique. I have
a PEP idea floating around to make and/or/not overridable,
for this very purpose. Hmmm... does that make me sick?-)

I'd say so.

Cheers,
mwh

has · Feb 23, 2004

Greg Ewing (using news.cis.dfn.de) said:
It's not really all that sick, or at least it's not a new
idea.

And a lot less sick that hacking the language compiler in my
opinion/experience... (I came to Python from a certain language whose
use of clever-clever compiler hackery makes it ludicrously tetchy
about how you structure your code. ("No, no! You must write it THIS
way!") Such contextual confusion and pain in the brain that you
wouldn't believe, the moment you try writing anything but the most
trivial code in the most trivial fashion.)

I'm sure there's at least one DB module out there
somewhere that uses this technique.

Can't speak for DB modules, but must confess to using a close
variation of this technique in my current project - a
MacPython-to-Apple Event Manager bridge - where it's used to assemble
AEM references and test expressions.

Main difference in my design to Robert's is that call capture objects
don't share mutable state. This means users can safely define new
references/test expressions by extending existing ones if they like,
rather than having to create a new one from scratch each time; the
original object won't be changed by this. eg:

import sick_derive

x = sick_derive.Expression()

e1 = ((x.b > 1) | (x.b < -10))
e2 = (e1 == 4)

print sick_deriver.Deriver().derive(e2.expr)
--> (((b > 1) or (b < -10)) == 4)

print sick_derive.Deriver().derive(e1.expr)
--> (((b > 1) or (b < -10)) == 4) # !!! e1 has also changed!

vs:

from appscript import its

e3 = its.name_extension.isin(['jpg', 'jpeg'])
e4 = e3 .OR (its.file_type == 'JPEG')

print e4
# --> OR(its.name_extension.isin(['jpg', 'jpeg']), its.file_type
== 'JPEG')

print e3
# --> its.name_extension.isin(['jpg', 'jpeg'])

In the case of database queries, you can make this seem
much more natural by having 'x' represent some meaningful
object such as a table.

db = open_database("favourite_fruits")
fruits = db.fruits
query = (fruits.kind == "apple") && (fruits.tastiness >= 3)

Once you solve the mutating state problem, you can simply have your
module define a single (generic) 'root' object that users can use to
build any expression. (Appscript stores this object in its global
'its' variable, allowing it to be exported.)

Yes, that's the main nuisance with this technique.

Yup. (Trouble with these sorts of exercises is sooner or later you
start knocking against the boundaries of what the language can do,
and'll just tie yourself in knots with outrageous acrobatics if you're
not careful.)

I have
a PEP idea floating around to make and/or/not overridable,
for this very purpose. Hmmm... does that make me sick?-)

Raised this idea on the PythonMac SIG last week; it died a quick death
once someone pointed out that boolean operators also perform flow
control (lazy evaluation of operands). Implementing a suitable
override mechanism would be a non-trivial exercise (and seems unlikely
the BDFL would support it).

Greg Ewing (using news.cis.dfn.de) · Feb 24, 2004

has said:
Raised this idea on the PythonMac SIG last week; it died a quick death
once someone pointed out that boolean operators also perform flow
control

I do have a plan for handling that. Yes, it'll be non-trivial
(although not too complicated, I hope), and yes, Guido probably
won't like it much, but you never know 'till you try...

has · Feb 24, 2004

Greg Ewing (using news.cis.dfn.de) said:
I do have a plan for handling that. Yes, it'll be non-trivial
(although not too complicated, I hope), and yes, Guido probably
won't like it much, but you never know 'till you try...

You certainly have my interest...

has

Python battle game help	2	Feb 23, 2023
Recursive class \| can you modify self directly?	10	Jul 9, 2013
Trouble with prediction code, for the life of me I can't figure out why it isnt running properly. Help would be appreciated.	0	Jul 8, 2023
Python point location of intersect between two lines	0	Feb 28, 2018
Oddity using sorted with key	6	Mar 11, 2014
"Laws of Form" are a notation for the SK calculus, demo in Python.	0	Mar 31, 2013
Python code problem	2	Apr 23, 2023
My attempts in playing with tail-recursion in python	1	Aug 28, 2013

derivation: sick Python trick of the week

Robert Brewer

Greg Ewing (using news.cis.dfn.de)

Michael Hudson

has

Greg Ewing (using news.cis.dfn.de)

has

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads