derivation: sick Python trick of the week

Discussion in 'Python' started by Robert Brewer, Feb 23, 2004.

  1. WARNING:

    The following is NOT all Pythonic. However, it is Python. This is just
    fun--nothing I'm going to put into production. I don't have any
    questions or problems--just thought I'd share.


    BACKGROUND:

    I've been playing around for a couple of days with parsing and
    derivation ("unparsing"). My goal was and is to take snippets of Python
    code and turn them into "equivalent" SQL statements. For example, the
    Python snippet:

    x.Group > 3 or x.Name == 'fumanchu'

    ....should equate to an SQL WHERE clause like:

    Group > 3 or Name = 'fumanchu'

    I tore into the compiler package, and hacked together a solution which
    does just that. One invokes such a task with code like:

    sqlstmt = derive.ADODeriver().evaluate("x.Group > 3 or x.Name ==
    'fumanchu'")

    However, it's parsing the output of a compiler.Transformer, which itself
    is parsing an AST (lots of nested tuples), so it's not very quick. 1000
    repetitions of the above example take over 1/2 second to run on my
    laptop.


    IT'S ALIVE!!!

    In a fit of mad-scientist genius (get out the pitchforks and torches), I
    wondered if it might be faster to let Python do *all* the parsing, and
    just watch it work and take notes. That's what the code below does. The
    sick and twisted part is how one invokes this technique:

    >>> import sick_derive
    >>> x = sick_derive.Expression()
    >>> (x.a == 3) & ((x.b > 1) | (x.b < -10))
    >>> x

    ['a', 3, <built-in function eq>, 'b', 1, <built-in function gt>, 'b',
    -10, <built-in function lt>, <built-in function or_>, <built-in function
    and_>]
    >>> sick_derive.Deriver().derive(x)

    '((a == 3) and ((b > 1) or (b < -10)))'

    Yes, in line two we run comparisons and boolean operations on
    non-existent attributes of x, and discard the results! Sick. However, we
    were keeping track (as the repr of x shows on line 3); we built a
    postfix list of the comparisons we performed. When we call
    Deriver().derive(x), the Deriver traverses that list and rebuilds the
    Python code. I made a similar ADODeriver which builds SQL code (too
    complex to include here; it also used a different Adapter for the
    object-to-DB mapper).

    I had to replace boolean 'and' and 'or' with binary calls in order to
    override them, and 'not' with 'neg'. That makes it even sicker. And it's
    *far* from obvious that 'x.a' should reduce to just 'a'.

    But it's about 7 times as fast as the first solution. ;)


    So for a host of reasons, don't ever use this. I won't. But it was
    interesting.


    Robert Brewer
    MIS



    ---- sick_derive.py ----

    import operator


    def containedby(a, b):
    """Return the outcome of the test a in b. Note the operand order."""
    return a in b

    def startswith(a, b):
    """Return True if string starts with the prefix, otherwise return
    False."""
    return a.startswith(b)

    def endswith(a, b):
    """Return True if the string ends with the suffix, otherwise return
    False."""
    return a.endswith(b)


    class Operand(object):
    """Push atoms onto a postfix expression stack."""
    def __init__(self, expr, name=''):
    self.expr = expr
    self.name = name

    def __neg__(self):
    self.expr.append(operator.not_)
    return Operand(self.expr)

    def __lt__(self, other):
    self.expr.extend([self.name, other, operator.lt])
    return Operand(self.expr)

    def __le__(self, other):
    self.expr.extend([self.name, other, operator.le])
    return Operand(self.expr)

    def __eq__(self, other):
    if self.name:
    self.expr.extend([self.name, other, operator.eq])
    else:
    self.expr.extend([other, operator.eq])
    return Operand(self.expr)

    def __ne__(self, other):
    self.expr.extend([self.name, other, operator.ne])
    return Operand(self.expr)

    def __ge__(self, other):
    self.expr.extend([self.name, other, operator.ge])
    return Operand(self.expr)

    def __gt__(self, other):
    self.expr.extend([self.name, other, operator.gt])
    return Operand(self.expr)

    def __and__(self, other):
    self.expr.append(operator.and_)
    return Operand(self.expr)

    def __or__(self, other):
    self.expr.append(operator.or_)
    return Operand(self.expr)

    def __contains__(self, other):
    self.expr.extend([self.name, other, operator.contains])
    return Operand(self.expr)

    def contains(self, other):
    self.expr.extend([self.name, other, operator.contains])
    return Operand(self.expr)

    def containedby(self, other):
    self.expr.extend([self.name, other, containedby])
    return Operand(self.expr)

    def startswith(self, other):
    self.expr.extend([self.name, other, startswith])
    return Operand(self.expr)

    def endswith(self, other):
    self.expr.extend([self.name, other, endswith])
    return Operand(self.expr)


    class Expression(list):
    """A postfix recorder and evaluator for operations on arbitrary
    objects."""

    unaries = (operator.not_, )
    binaries = (operator.and_,
    operator.or_,
    operator.lt,
    operator.le,
    operator.eq,
    operator.ne,
    operator.gt,
    operator.ge,
    operator.contains,
    containedby,
    startswith,
    endswith,
    )

    def __getattr__(self, name):
    return Operand(self, name)

    def evaluate(self, obj, ifEmpty=True):
    """Evaluate an object against self.
    Names will be looked up in the object's attribute dictionary.
    """
    stack = self[:]
    if not stack:
    return ifEmpty

    def operate():
    op = stack.pop()
    if op in self.unaries:
    a = operate()
    return op(a)
    elif op in self.binaries:
    b = operate()
    a = operate()
    if a not in (True, False):
    a = getattr(obj, a)
    return op(a, b)
    else:
    return op
    return operate()


    class Adapter(object):
    """Transform values according to their type."""

    def __init__(self):
    self.default_processor = repr
    self.processors = {}

    def process(self, value):
    try:
    xform = self.processors[type(value)]
    except KeyError:
    xform = self.default_processor
    return xform(value)


    class Deriver(object):
    """Derive an Expression according to a grammar."""

    def __init__(self, adapter=Adapter()):
    f = adapter.process
    self.unaries = {operator.not_: lambda x: "(not %s)" % x}
    self.binaries = {operator.and_: lambda x, y: "(%s and %s)" % (x,
    y),
    operator.or_: lambda x, y: "(%s or %s)" % (x,
    y),
    operator.lt: lambda x, y: "(%s < %s)" % (x,
    f(y)),
    operator.le: lambda x, y: "(%s <= %s)" % (x,
    f(y)),
    operator.eq: lambda x, y: "(%s == %s)" % (x,
    f(y)),
    operator.ne: lambda x, y: "(%s != %s)" % (x,
    f(y)),
    operator.gt: lambda x, y: "(%s > %s)" % (x,
    f(y)),
    operator.ge: lambda x, y: "(%s >= %s)" % (x,
    f(y)),
    operator.contains: lambda x, y: "(%s in %s)" %
    (f(y), x),
    containedby: lambda x, y: "(%s in %s)" % (x,
    f(y)),
    startswith: lambda x, y: "(%s.startswith(%s))"
    % (x, f(y)),
    endswith: lambda x, y: "(%s.endswith(%s))" %
    (x, f(y)),
    }
    self.ifEmpty = ''

    def derive(self, expr):
    try:
    stack = expr[:]
    except TypeError, x:
    x.args += (type(expr), )
    raise x

    if not stack:
    return self.ifEmpty

    def operate():
    op = stack.pop()
    if op in self.unaries:
    a = operate()
    return self.unaries[op](a)
    elif op in self.binaries:
    b = operate()
    a = operate()
    return self.binaries[op](a, b)
    else:
    return op
    return operate()
     
    Robert Brewer, Feb 23, 2004
    #1
    1. Advertising

  2. Robert Brewer wrote:
    > In a fit of mad-scientist genius (get out the pitchforks and torches), I
    > wondered if it might be faster to let Python do *all* the parsing, and
    > just watch it work and take notes. That's what the code below does. The
    > sick and twisted part is how one invokes this technique:


    It's not really all that sick, or at least it's not a new
    idea. I'm sure there's at least one DB module out there
    somewhere that uses this technique.

    > Yes, in line two we run comparisons and boolean operations on
    > non-existent attributes of x, and discard the results! Sick.


    That part is perhaps a bit too twisted. It's not really
    necessary, you could just as well design it so you say

    expr = (x.a == 3) & ((x.b > 1) | (x.b < -10))

    > And it's
    > *far* from obvious that 'x.a' should reduce to just 'a'.


    In the case of database queries, you can make this seem
    much more natural by having 'x' represent some meaningful
    object such as a table.

    db = open_database("favourite_fruits")
    fruits = db.fruits
    query = (fruits.kind == "apple") && (fruits.tastiness >= 3)

    > I had to replace boolean 'and' and 'or' with binary calls in order to
    > override them, and 'not' with 'neg'. That makes it even sicker.


    Yes, that's the main nuisance with this technique. I have
    a PEP idea floating around to make and/or/not overridable,
    for this very purpose. Hmmm... does that make me sick?-)

    --
    Greg Ewing, Computer Science Dept,
    University of Canterbury,
    Christchurch, New Zealand
    http://www.cosc.canterbury.ac.nz/~greg
     
    Greg Ewing (using news.cis.dfn.de), Feb 23, 2004
    #2
    1. Advertising

  3. "Greg Ewing (using news.cis.dfn.de)" <> writes:

    > Yes, that's the main nuisance with this technique. I have
    > a PEP idea floating around to make and/or/not overridable,
    > for this very purpose. Hmmm... does that make me sick?-)


    I'd say so.

    Cheers,
    mwh

    --
    . <- the point your article -> .
    |------------------------- a long way ------------------------|
    -- Cristophe Rhodes, ucam.chat
     
    Michael Hudson, Feb 23, 2004
    #3
  4. Robert Brewer

    has Guest

    "Greg Ewing (using news.cis.dfn.de)" <> wrote in message news:<c1bobu$1g1p9e$-berlin.de>...

    > Robert Brewer wrote:
    >> In a fit of mad-scientist genius (get out the pitchforks and

    torches), I
    >> wondered if it might be faster to let Python do *all* the parsing,

    and
    >> just watch it work and take notes. That's what the code below does.

    The
    >> sick and twisted part is how one invokes this technique:

    >
    > It's not really all that sick, or at least it's not a new
    > idea.


    And a lot less sick that hacking the language compiler in my
    opinion/experience... (I came to Python from a certain language whose
    use of clever-clever compiler hackery makes it ludicrously tetchy
    about how you structure your code. ("No, no! You must write it THIS
    way!") Such contextual confusion and pain in the brain that you
    wouldn't believe, the moment you try writing anything but the most
    trivial code in the most trivial fashion.)


    > I'm sure there's at least one DB module out there
    > somewhere that uses this technique.


    Can't speak for DB modules, but must confess to using a close
    variation of this technique in my current project - a
    MacPython-to-Apple Event Manager bridge - where it's used to assemble
    AEM references and test expressions.

    Main difference in my design to Robert's is that call capture objects
    don't share mutable state. This means users can safely define new
    references/test expressions by extending existing ones if they like,
    rather than having to create a new one from scratch each time; the
    original object won't be changed by this. eg:

    import sick_derive

    x = sick_derive.Expression()

    e1 = ((x.b > 1) | (x.b < -10))
    e2 = (e1 == 4)

    print sick_deriver.Deriver().derive(e2.expr)
    --> (((b > 1) or (b < -10)) == 4)

    print sick_derive.Deriver().derive(e1.expr)
    --> (((b > 1) or (b < -10)) == 4) # !!! e1 has also changed!


    vs:

    from appscript import its

    e3 = its.name_extension.isin(['jpg', 'jpeg'])
    e4 = e3 .OR (its.file_type == 'JPEG')

    print e4
    # --> OR(its.name_extension.isin(['jpg', 'jpeg']), its.file_type
    == 'JPEG')

    print e3
    # --> its.name_extension.isin(['jpg', 'jpeg'])




    > In the case of database queries, you can make this seem
    > much more natural by having 'x' represent some meaningful
    > object such as a table.
    >
    > db = open_database("favourite_fruits")
    > fruits = db.fruits
    > query = (fruits.kind == "apple") && (fruits.tastiness >= 3)


    Once you solve the mutating state problem, you can simply have your
    module define a single (generic) 'root' object that users can use to
    build any expression. (Appscript stores this object in its global
    'its' variable, allowing it to be exported.)


    >> I had to replace boolean 'and' and 'or' with binary calls in order

    to
    >> override them, and 'not' with 'neg'. That makes it even sicker.

    >
    > Yes, that's the main nuisance with this technique.


    Yup. (Trouble with these sorts of exercises is sooner or later you
    start knocking against the boundaries of what the language can do,
    and'll just tie yourself in knots with outrageous acrobatics if you're
    not careful.)


    >I have
    > a PEP idea floating around to make and/or/not overridable,
    > for this very purpose. Hmmm... does that make me sick?-)


    Raised this idea on the PythonMac SIG last week; it died a quick death
    once someone pointed out that boolean operators also perform flow
    control (lazy evaluation of operands). Implementing a suitable
    override mechanism would be a non-trivial exercise (and seems unlikely
    the BDFL would support it).
     
    has, Feb 23, 2004
    #4
  5. has wrote:
    > Raised this idea on the PythonMac SIG last week; it died a quick death
    > once someone pointed out that boolean operators also perform flow
    > control


    I do have a plan for handling that. Yes, it'll be non-trivial
    (although not too complicated, I hope), and yes, Guido probably
    won't like it much, but you never know 'till you try...

    --
    Greg Ewing, Computer Science Dept,
    University of Canterbury,
    Christchurch, New Zealand
    http://www.cosc.canterbury.ac.nz/~greg
     
    Greg Ewing (using news.cis.dfn.de), Feb 24, 2004
    #5
  6. Robert Brewer

    has Guest

    "Greg Ewing (using news.cis.dfn.de)" <> wrote in message news:<c1eddd$1h4621$-berlin.de>...
    > has wrote:
    > > Raised this idea on the PythonMac SIG last week; it died a quick death
    > > once someone pointed out that boolean operators also perform flow
    > > control

    >
    > I do have a plan for handling that. Yes, it'll be non-trivial
    > (although not too complicated, I hope), and yes, Guido probably
    > won't like it much, but you never know 'till you try...


    You certainly have my interest...

    has
     
    has, Feb 24, 2004
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Guest
    Replies:
    3
    Views:
    1,882
    Alexandre
    Dec 22, 2003
  2. Steven T. Hatton
    Replies:
    12
    Views:
    1,709
    Jonathan Turkanis
    Aug 20, 2004
  3. Replies:
    0
    Views:
    300
  4. adfgvx

    Sick (yaml library) and Python

    adfgvx, Sep 20, 2003, in forum: Python
    Replies:
    0
    Views:
    321
    adfgvx
    Sep 20, 2003
  5. Guest
    Replies:
    3
    Views:
    544
    Alexandre
    Dec 22, 2003
Loading...

Share This Page