[OT] code is data

Laurent Pointal · Jun 19, 2006

bruno at modulix wrote:

My my my... I'm not against the idea of dynamic source code
transformation, but for heaven's sake, *why* would one put XML in the
mix ???????

Because it's "à la mode", and is better from a commercial point of view,
even if inefficient for this problem.

Fredrik Lundh · Jun 19, 2006

because lots of people know how to describe XML transformations, and
there are plenty of tools that implement such transformations efficiently ?

Because it's "à la mode", and is better from a commercial point of view,
even if inefficient for this problem.

why would XML be inefficient ?

</F>

Kay Schluehr · Jun 19, 2006

Ravi said:
People have however written various language interpreters (Scheme,
Forth and yes, even Basic) in Python, just for kicks. Still does not
make it a DSL language anymore than it makes C a DSL language.

At present, the closest thing to writing a DSL in Python is Logix
http://livelogix.net/logix/
Too bad though, the project is defunct and there has never been enough
interest in it.

You might be interested in EasyExtend:

http://www.fiber-space.de/EasyExtend/doc/EE.html

Unlike Logix there are no macros defined in application code and there
are no runtime macro expansions. So extension language semantics is
always fixed at compile time.

Personally, I would like to see macros in Python (actually Logix
succeeding is good enough). But I am no language designer and the
community has no interest in it. When I absolutely need macros, I will
go elsewhere.

Although Logix was written in Python and compiled to CPython bytecode,
it was a language in its own right: a Python / Lisp hybrid. In contrast,
EasyExtend is a Python framework for language extensions and not
itself a language. A quite typical use case may not involve any new
grammar rules or terminals, but just generate code. See the coverage
fiber as an example.

bruno at modulix · Jun 19, 2006

Fredrik said:
Laurent Pointal wrote:

because lots of people know how to describe XML transformations, and
there are plenty of tools that implement such transformations efficiently ?

Efficiently enough for dynamic (runtime) use ?

Ravi Teja · Jun 19, 2006

I think this example is more a symptom of a childish need to get
things your way than of a deficiency in Python.

I thought I had enough asterisks in there to indicate that it is a
preference that I will not be defending on rational grounds. I had a
better argument before it in the same post. But you had to choose only
the trivial one to dismiss me as childish. Didn't you?

BTW, range(5) = 0..4 in Ada and Ruby.

My bad. I usually write range(1, 5 + 1) to get 1..5.
I could write range(1, 6). But I would like to see the upper bound
explicitly. Of course, I could write a function to wrap that up.
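The wrapper mentioned above could look something like this. It is only a sketch: the name irange and its step handling are made up for illustration and are not something posted in the thread.

```python
def irange(start, stop, step=1):
    """Return a range that includes `stop` (hypothetical helper name)."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Move the exclusive bound one unit past `stop` in the direction of travel.
    return range(start, stop + (1 if step > 0 else -1), step)


one_to_five = list(irange(1, 5))
```

With this, the upper bound stays visible at the call site, which is the stated preference, at the cost of one more name to remember.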

You said "when I absolutely need macros" but none of your examples
demonstrate any "absolute need." I can't see your point.

Did you miss the word - *WHEN*?
I don't need them absolutely now. And I know, that I won't get them
here. And just so you don't misinterpret, I don't call that a
"deficiency". Just a mismatch between the personal and the community
mindset.
BTW, the recent language changes - decorators, conditional expressions
and with statements are not absolute either. That did not stop them
from being welcome additions.

Ravi Teja · Jun 19, 2006

Kay said:
You might be interested in EasyExtend:

http://www.fiber-space.de/EasyExtend/doc/EE.html

Your framework does look very interesting and might just be what I am
looking for. Will give it a try.

Thanks.

Diez B. Roggisch · Jun 19, 2006

because lots of people know how to describe XML transformations, and

Efficiently enough for dynamic (runtime) use ?

Using XML-transformation for AST manipulation isn't my first choice
either - yet efficiency concerns aren't really the point here - after
all we're talking about generating code, which would be pretty useless
if the work was to be done by the transformation instead of that very
code generated ...

So the question is: do XML/XSL give an advantage here? As I said - I
personally don't think so, IMHO a standard reducer using a decent
visitor is easy enough and works well. But your (or better Fredrik's) MMV.
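To make the "reducer using a visitor" idea concrete, here is a minimal sketch using the modern standard-library ast module, which postdates this thread. The class name FoldAddition and the choice of constant folding as the reduction are assumptions made purely for illustration.

```python
import ast


class FoldAddition(ast.NodeTransformer):
    """Replace `<number> + <number>` with its computed value."""

    def visit_BinOp(self, node):
        # Visit children first so nested sums are folded bottom-up.
        self.generic_visit(node)
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            folded = ast.Constant(value=node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


tree = ast.parse("x = 1 + 2 + 3")
tree = ast.fix_missing_locations(FoldAddition().visit(tree))
folded_source = ast.unparse(tree)  # ast.unparse needs Python 3.9+
```

The visitor only has to name the node type it cares about; everything else passes through untouched.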

Diez

Ian Bicking · Jun 20, 2006

Ravi said:
You blogged on Django. Let's use that. Don't you think model creation
in Django can be represented better, given that it is done often
enough?

Actually, no, it's not done that much. Creating and managing tables
isn't something done lightly. It's essential to building a new
application, but (at least in my experience, in similar systems) the
database models stabilize early and you don't spend that much time with
them. Especially not with the DSL aspects. I add and remove methods
often, but I am loath to add and remove columns.

Now, this might seem like I'm being pedantic, but in my experience lots
of seemingly obvious DSLs end up not being that obvious. XML
generation, for instance. It's nice to have a good syntax -- and you
can get a pretty good syntax in Python (e.g., HTMLGen, stan, etc). But
efforts that go further are generally misplaced, because it's actually
not a very hard or common thing to do, even when you are slinging
around lots of XML.

Or... maybe to be more specific, the hard work later on goes into
*code*. If you are enhancing your model, you do so with methods on the
model classes, and those methods don't affect the DSL, they are just
"code". You create some raw XML in the beginning, but quickly it's
just a matter of gluing those pieces together, using functions instead
of DSLs, and that's just "code".

Let's take an example from the official tutorial
from
http://www.djangoproject.com/documentation/tutorial1/#creating-models

from django.db import models

class Poll(models.Model):
    question = models.CharField(maxlength=200)
    pub_date = models.DateTimeField('date published')

class Choice(models.Model):
    poll = models.ForeignKey(Poll)
    choice = models.CharField(maxlength=200)
    votes = models.IntegerField()

I don't use Django and I made this up quickly, so please don't pick on
subtleties.

@Poll:
    question: char length 200
    pub_date('date published'): date

@Choice:
    poll -> Poll
    choice: char length 200
    votes: int

That doesn't look that much better. How do you create it
programmatically? I know how to pass a variable to
CharField(maxlength=200); can I pass a variable to "char length 200"
just as easily? Can I use **kw? Can I read it from a CSV file and
construct the class that way? Maybe, but only by recreating all the
native patterns that I can infer easily looking at the Django class.
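As a rough illustration of the "native patterns" in question, the sketch below builds a model-like class from CSV rows using **kw and type(). The Field, CharField and IntegerField classes are plain-Python stand-ins defined here, not Django's real classes, and the CSV column names are invented for the example.

```python
import csv
import io


class Field:
    """Stand-in for a model field; NOT Django's real class."""

    def __init__(self, **options):
        self.options = options


class CharField(Field):
    pass


class IntegerField(Field):
    pass


FIELD_TYPES = {"char": CharField, "int": IntegerField}


def model_from_csv(name, csv_text):
    """Build a class whose attributes are fields described by CSV rows."""
    attrs = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        kw = {}
        if row["maxlength"]:
            kw["maxlength"] = int(row["maxlength"])
        # Ordinary keyword expansion: the same call works for any field type.
        attrs[row["name"]] = FIELD_TYPES[row["type"]](**kw)
    return type(name, (object,), attrs)


CSV_TEXT = """name,type,maxlength
choice,char,200
votes,int,
"""

Choice = model_from_csv("Choice", CSV_TEXT)
```

Nothing here needed new syntax: variables, keyword arguments and type() are enough, which is the point being argued.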

The following is my rationale. Annotated variables, symbols and code
layout visually cue more efficiently to the object nature than do
explicit text definitions. Of course, this is only sensible when there
aren't too many of any of those. In that case, the cognitive cost of
notation outweighs the representational cost of text.

Words are great. Python is light on symbols, and that is good. Python
is not perfect when it comes to expressing data structures (the more I
think about it, the more PEP 359 grows on me), but real DSLs are
questionable to me.

Even the Lisps stick to an incredibly homogeneous syntax (far more
homogeneous than Python) to make macros feel familiar.

Representational minimalism is troublesome in general code (à la Perl),
but not so in a DSL where the context is constrained.

Constrained context is a step backward! How do you add methods? How
do you do looping? How do you write *code*? If you aren't going to
allow those things, then just make a parser and build the structure
from the file, and make it a DSL implemented entirely external to
Python. That's completely okay, though in my experience it's not very
satisfying for something like a model definition (see MiddleKit for an
example of an ORM that doesn't use Python code).

Ian

K.S.Sreeram · Jun 20, 2006

Fredrik said:
because lots of people know how to describe XML transformations, and
there are plenty of tools that implement such transformations efficiently ?

why would XML be inefficient ?

XML Transformations (XSLT) would *certainly* be an overkill here.
They've invented a whole new declarative programming language, and we
certainly don't need that when we've got Python!

XML by itself feels completely out of place in this context. What we
need is, just a flexible, easy to manipulate, in-memory tree structure
(AST) for the Python source.

Regards
Sreeram


Kay Schluehr · Jun 20, 2006

Ian said:
That doesn't look that much better. How do you create it
programmatically? I know how to pass a variable to
CharField(maxlength=200); can I pass a variable to "char length 200"
just as easily? Can I use **kw? Can I read it from a CSV file and
construct the class that way? Maybe, but only by recreating all the
native patterns that I can infer easily looking at the Django class.

If it is just a different kind of representation of common data
structures, as in YAML, the answer might be a translation of these
declarative blocks into dicts/lists (or derivatives of those) at
compile time. The underlying semantics would be that of an "implicitly
embedded DSL" (there are quite a lot in Python). Enabling code
generation would just make them more explicit. For example, XML syntax
could be considered an alternate surface syntax for ElementTrees. XML
elements in Python code might be translated to the equivalent ElementTree
annotation syntax at compile time.

Under these considerations, "choice: char length 200" and
"CharField(maxlength = 200 )" are essentially the same thing. I guess
@Choice.choice would finally be represented by a property.
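A minimal sketch of such a translation into dicts follows. The grammar is guessed from the made-up notation quoted earlier in the thread, the function name parse_block is hypothetical, and the parsing happens at run time here rather than at compile time as proposed.

```python
import re

# name, optional ('label'), then ":" or "->", then the rest of the line
LINE = re.compile(
    r"^(?P<name>\w+)(\('(?P<label>[^']*)'\))?\s*(?P<sep>:|->)\s*(?P<rest>.+)$"
)


def parse_block(text):
    """Translate one '@Name:' block into a plain dict (assumed grammar)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header, body = lines[0], lines[1:]
    model = {"name": header.lstrip("@").rstrip(":"), "fields": {}}
    for ln in body:
        m = LINE.match(ln)
        if m is None:
            raise ValueError("cannot parse line: %r" % ln)
        if m["sep"] == "->":
            spec = {"type": "foreignkey", "to": m["rest"]}
        else:
            words = m["rest"].split()
            spec = {"type": words[0]}
            # Remaining words are read as option/value pairs ("length 200").
            for key, value in zip(words[1::2], words[2::2]):
                spec[key] = int(value) if value.isdigit() else value
        if m["label"]:
            spec["label"] = m["label"]
        model["fields"][m["name"]] = spec
    return model


choice_model = parse_block("""
@Choice:
    poll -> Poll
    choice: char length 200
    votes: int
""")
```

Once the block is a dict, handing its entries to CharField-style constructors is ordinary Python, which is why the two notations can be called the same thing.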

Regards,
Kay

Fredrik Lundh · Jun 20, 2006

Kay said:
If it is just a different kind of representation of common data
structures

but how do you know ?

</F>

Laurent Pointal · Jun 20, 2006

Fredrik Lundh wrote:

because lots of people know how to describe XML transformations, and
there are plenty of tools that implement such transformations efficiently ?

why would XML be inefficient ?

As a storage tool, it's nice, and as you say, there are many tools to
deal with it.

But as an internal representation for an AST, I certainly prefer
ad-hoc class definitions with well-defined members.
Just have a wrapper between both representations.

Laurent

See you.

Kay Schluehr · Jun 20, 2006

Fredrik said:
but how do you know ?

</F>

The semantics is specified by the syntax transformer, so it is actually
compile-time semantics relative to the base language Python. For any
custom statement/expression (expressed by a production rule / node in
the parse tree) one or more target statements/expressions in standard
Python are created. The specification of the with-statement in PEP 343
can be regarded as a good example of this definition practice. The
with-statement is expanded to a "protocol" that can be expressed in
Python 2.4. In a more general case this expansion might involve
additional libraries, e.g. ctypes or ElementTree.
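For readers who have not looked at PEP 343, the sketch below puts a with-statement next to a hand-written expansion that follows the translation given in the PEP. The Tracker class and both function names are invented for the example.

```python
import sys


class Tracker:
    """Tiny context manager that records what happens to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not swallow exceptions


def using_with(tracker):
    with tracker as t:
        t.events.append("body")


def expanded(tracker):
    # Hand expansion of the with-statement above, after PEP 343.
    mgr = tracker
    exit_ = type(mgr).__exit__
    value = type(mgr).__enter__(mgr)
    exc = True
    try:
        try:
            t = value
            t.events.append("body")
        except:  # bare except is part of the PEP's translation
            exc = False
            if not exit_(mgr, *sys.exc_info()):
                raise
    finally:
        if exc:
            exit_(mgr, None, None, None)


a, b = Tracker(), Tracker()
using_with(a)
expanded(b)
```

Both functions leave the same event trail, so the new statement adds no semantics that the base language could not already express.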

bruno at modulix · Jun 20, 2006

Diez said:
Using XML-transformation for AST manipulation isn't my first choice
either - yet efficiency concerns aren't really the point here - after
all we're talking about generating code,

I thought we were talking about *transforming* code - just like one uses
metaclasses to transform a class definition, or @decorators to transform
a function definition...

(snip)

Diez B. Roggisch · Jun 20, 2006

bruno said:
I thought we were talking about *transforming* code - just like one uses
metaclasses to transform a class definition, or @decorators to transform
a function definition...

Yes we were. So where does the runtime efficiency you mention come in to
play?

While the _result_ of a transformation might be a less efficient piece of
code (e.g. introducing a lock around each call to enable concurrent
access), the transformation itself is very - if not totally - static - and
usually only run once.

So except for start-up latency, it has no impact. So if for whatever
reason XSLT is someone's favorite method of AST-transformation because it
fits her mindset - perfect. As I said: it wouldn't be mine either, but I
can't see your concerns about efficiency.

And XSLT certainly is suited for tree manipulation, so it might be that it
would be good for e.g. recursively stripping type annotations of some kind
(think of e.g. type-verifying decorators that you want to get rid of for
production.)
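The decorator-stripping case can be sketched without XSLT, here with the modern standard-library ast module (which did not exist in this form when the thread was written). The decorator name accepts and the class name StripDecorators are hypothetical.

```python
import ast


class StripDecorators(ast.NodeTransformer):
    """Remove the named decorators from every function, nested ones too."""

    def __init__(self, names):
        self.names = set(names)

    def _name_of(self, decorator):
        # Handles both `@accepts` and `@accepts(int, int)`.
        target = decorator.func if isinstance(decorator, ast.Call) else decorator
        return target.id if isinstance(target, ast.Name) else None

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # recurse into nested definitions
        node.decorator_list = [
            d for d in node.decorator_list
            if self._name_of(d) not in self.names
        ]
        return node


SOURCE = '''
@accepts(int, int)
def add(a, b):
    return a + b
'''

tree = StripDecorators({"accepts"}).visit(ast.parse(SOURCE))
namespace = {}
# `accepts` is never defined, so this only runs because it was stripped.
exec(compile(tree, "<stripped>", "exec"), namespace)
```

The transformation runs once, before the module body executes, which matches the "static, run once" characterisation above.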

Diez

bruno at modulix · Jun 20, 2006

Diez said:
bruno at modulix wrote:

Yes we were. So where does the runtime efficiency you mention come in to
play?

class transformations via metaclasses and function wrapping do happen
at runtime - when the class or (decorated) def statements are eval'd.
This is not the same as having a distinct preprocessing phase that would
write a new .py file.

While the _result_ of a transformation might be a less efficient piece of
code (e.g. introducing a lock around each call to enable concurrent
access), the transformation itself is very - if not totally - static -

really ?

and
usually only run once.

Nope, it's run each time the module is loaded (with 'loaded' distinct
from 'imported') - which can make a real difference in some execution
models...

So except for start-up latency, it has no impact.

Having a high startup latency can be a problem in itself.

But the problem may not be restricted to startup latency. If for example
you use a metaclass and a function that *dynamically* creates new
classes using this metaclass, then both the class statement and the
metaclass code transformation will be executed on each call to this
function.
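A small sketch of that situation, written in Python 3 syntax (the thread itself predates it). The names Counting, make_class and Product are made up for the demonstration.

```python
class Counting(type):
    """Metaclass that counts how many classes it has built."""

    calls = 0

    def __new__(mcls, name, bases, namespace):
        Counting.calls += 1
        return super().__new__(mcls, name, bases, namespace)


def make_class():
    # The class statement is executed again on every call,
    # so the metaclass runs again every time as well.
    class Product(metaclass=Counting):
        pass

    return Product


first = make_class()
second = make_class()
```

Each call yields a distinct class object and a fresh pass through the metaclass, so any transformation done there is paid for per call, not per import.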

The whole point of a code transformation mechanism like the one Anton is
talking about is to be dynamic. Else one just needs a preprocessor...

So if for whatever
reason XSLT is someone's favorite method of AST-transformation because it
fits her mindset - perfect. As I said: it wouldn't be mine either, but I
can't see your concerns about efficiency.

cf above.

Diez B. Roggisch · Jun 20, 2006

While the _result_ of a transformation might be a less efficient piece of

really ?

See below.

Nope, it's run each time the module is loaded (with 'loaded' distinct
from 'imported') - which can make a real difference in some execution
models...

I already mentioned that latency. If it for whatever reason really becomes
important, it would be best to cache the result of the transformation.
Which would BTW eliminate any complexity-driven runtime penalty -
regardless of the tool used. So - loading time is _not_ an issue. And I'll
spare you the premature optimization babble...
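One way to picture the caching argument: memoise the transformed, compiled result keyed on the source text. This is only a sketch; compile_transformed and the run counter are hypothetical, and the "transformation" is a placeholder that just parses.

```python
import ast
import functools

transform_runs = 0  # counts how often the expensive step really happens


@functools.lru_cache(maxsize=None)
def compile_transformed(source):
    """Parse, (pretend to) transform and compile `source`, once per text."""
    global transform_runs
    transform_runs += 1
    tree = ast.parse(source)
    # An arbitrarily slow tree transformation would go here.
    return compile(tree, "<transformed>", "exec")


code_a = compile_transformed("answer = 6 * 7")
code_b = compile_transformed("answer = 6 * 7")  # served from the cache

namespace = {}
exec(code_a, namespace)
```

However slow the transformation tool is, the cost is paid once per distinct source text; later loads only pay for the lookup.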

Having a high startup latency can be a problem in itself.

See above.

But the problem may not be restricted to startup latency. If for example
you use a metaclass and a function that *dynamically* creates new
classes using this metaclass, then both the class statement and the
metaclass code transformation will be executed on each call to this
function.

This is an assumption I don't agree with. The whole point of the OP's post
was about creating DSLs or altering the syntax of Python itself. All that to
enhance expressiveness.

But we are still talking about CODE here - things that get written by
programmers. Even if that is piped through so many stages, it won't grow
endlessly.

Runtime (runtime meaning here not on a startup-phase, but constantly/later)
feeding of something that generates new code - I wouldn't say that is
unheard of, but I strongly doubt it occurs so often that it rules out tree
transformations that don't try and squeeze the latest bit of performance
out themselves. Which, BTW, would rule out python in itself as nothing
beats runtime assembly generation BY assembly. Don't you think?

The whole point of a code transformation mechanism like the one Anton is
talking about is to be dynamic. Else one just needs a preprocessor...

No, it is not the whole point. The point is

"""
The idea is that we now have a fast parser (ElementTree) with a
reasonable 'API' and a data type (XML or JSON) that can be used as an
intermediate form to store parsing trees. Especially statically typed
little languages seem to be very swallow-able. Maybe I will be able to
reimplement GFABasic (my first love computer language, although not my
first relationship) someday, just for fun.
"""

No on-the-fly code generation here. He essentially wants lisp-style-macros
with better parsing. Still a programming language. Not a data-monger.

Diez

Boris Borcic · Jun 20, 2006

bruno said:
My my my... I'm not against the idea of dynamic source code
transformation, but for heaven's sake, *why* would one put XML in the
mix ???????

If a good transform could "reveal" xml as "python's s-expression syntax", not
only would source2source transform using xml t2t transform tools be facilitated,
but generally speaking it would be easier for "xml retro-coded" python source to
find its way through xml-enabled tools&chains. And incite developers of such
tools to consider python a better candidate (eg than it currently is) whenever
the matter of scripting the tool comes up.

Anton Vredegoor · Jun 20, 2006

Diez B. Roggisch wrote:

No, it is not the whole point. The point is

"""
The idea is that we now have a fast parser (ElementTree) with a
reasonable 'API' and a data type (XML or JSON) that can be used as an
intermediate form to store parsing trees. Especially statically typed
little languages seem to be very swallow-able. Maybe I will be able to
reimplement GFABasic (my first love computer language, although not my
first relationship) someday, just for fun.
"""

No on-the-fly code generation here. He essentially wants lisp-style-macros
with better parsing. Still a programming language. Not a data-monger.

The 'problem' is that a lot of incredibly smart people are reading and
replying here who are seeing a lot more into my post than I was prepared
for

Anyway, the last few weeks I have been busy transforming MsWord
documents into XML using Open Office, and next parsing this XML and
transforming it into a special subset of HTML using ElementTree's
XMLWriter class.

Then the output of the XMLWriter was put into a Zope/Plone page but I
added special markup for footnotes, making them plone objects that could
be separately edited, and I added image tags for images that were
retrieved from a separate server using an XSLT script.

To accomplish that a special zope parser was written to recognize my
nonstandard footnote and image tags, and to create the necessary
objects, and to insert them into the page.

After that I came across some turbogears code (which is stacking code at
different levels like it were those things you put under your beer
glass) and still later I saw some JSON equivalents of XML. JSON looks a
lot like Python dicts which makes it seem likely that javascript will be
able to interface with Python more efficiently.

Remember that ElementTree comes from the same place that brought us PIL
which is a package that can transform images into different types.

So if we can transform documents, images and XML, why not sourcecode?

Especially if it's not a conversion into a 'lossy' file format, (I
consider dynamically typed code versus statically typed code the analog
thing to JPEG versus bitmaps) it would be easy to convert all datatypes
into the datatypes of another language, thereby making it possible to
exchange code between languages. Algorithms just being things that
convert sets of data-objects into other sets of data-objects.

Now if one would equate standardized code exchange between languages and
within a language with macros then I guess there is nothing left for me
to do but wait till a certain google bot comes knocking at my ip-address
port 80 and transfers me to the google equivalent of Guantanamo.

But the whole point of distinguishing macros from official language
structures *is* standardization, as some other clever poster already
pointed out, so it would be extremely unfair to equate trans-language
standardized code exchange with the guerrilla type macro activities that
are plaguing the Lisp community.

Then there are some people who keep insisting they don't understand what
I'm talking about until I simplify things enough to get them on-board,
but then simply dismiss my ideas with 'you can already do that easily
with this standard python construct'. This strategy was also eloquently
refuted by some other poster, so I don't need to repeat it

I've gotten a lot of things to think about, so thanks all for your
thoughts, but since this is getting way above my head I'll just wimp out
and leave the rest of the thread to the experts!

Regards,

Anton

Bruno Desthuilliers · Jun 20, 2006

Anton said:
Diez B. Roggisch wrote:

The 'problem' is that a lot of incredibly smart people are reading and
replying here who are seeing a lot more into my post than I was prepared
for

no comment...

(snip various transformations examples)

So if we can transform documents, images and XML, why not sourcecode?
(snip preliminary precautions)
it would be easy to convert all datatypes
into the datatypes of another language, thereby making it possible to
exchange code between languages.

You mean like 'converting' javascript to python or python to ruby (or
converting any home-grown DSL to Python, etc) ?

Algorithms just being things that
convert sets of data-objects into other sets of data-objects.

Now if one would equate standardized code exchange between languages and
within a language with macros then I guess there is nothing left for me
to do but wait till a certain google bot comes knocking at my ip-address
port 80 and transfers me to the google equivalent of Guantanamo.

Lol !-)

Well, given this quote from another of your posts:
"""
The idea is to have a way to transform a Python (.py) module into XML
and then do source code manipulations in XML-space using ElementTree.
"""
I did indeed understand something like a Python-to-Python
transformation, which of course led me to something very much like macros.
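For what the quoted idea might look like in practice, here is a one-way sketch that turns a parsed module into an ElementTree, using today's standard-library ast module rather than anything available in 2006. The function name to_element and the XML layout (one element per node, one child element per field) are assumptions; going back from XML to Python is not shown.

```python
import ast
import xml.etree.ElementTree as ET


def to_element(node):
    """Convert an ast node into an Element named after the node type."""
    elem = ET.Element(type(node).__name__)
    for field, value in ast.iter_fields(node):
        if isinstance(value, ast.AST):
            ET.SubElement(elem, field).append(to_element(value))
        elif isinstance(value, list):
            child = ET.SubElement(elem, field)
            for item in value:
                if isinstance(item, ast.AST):
                    child.append(to_element(item))
                else:
                    ET.SubElement(child, "value").text = repr(item)
        elif value is not None:
            # Plain values (identifiers, numbers) become attributes.
            elem.set(field, repr(value))
    return elem


root = to_element(ast.parse("x = 1"))
target_name = root.find("body/Assign/targets/Name").get("id")
xml_text = ET.tostring(root, encoding="unicode")
```

Once the tree is in this form, ElementTree's find and iter calls (or an XSLT processor) can be pointed at it like any other XML document.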

But the whole point of distinguishing macros from official language
structures *is* standardization, as some other clever poster already
pointed out, so it would be extremely unfair to equate trans-language
standardized code exchange with the guerrilla type macro activities that
are plaguing the Lisp community.

Then there are some people who keep insisting they don't understand what
I'm talking about until I simplify things enough to get them on-board,

count me in then

but then simply dismiss my ideas with 'you can already do that easily
with this standard python construct'. This strategy was also eloquently
refuted by some other poster, so I don't need to repeat it

I've gotten a lot of things to think about, so thanks all for your
thoughts, but since this is getting way above my head I'll just wimp out
and leave the rest of the thread to the experts!

No way you will escape from your responsibilities so easily !-)
