Looking for tips on how to implement a "Ruby-style" Domain Specific Language in Python

mark

I want to implement an internal DSL in Python. I would like the syntax
as human readable as possible. This means no disturbing '.;()\'
characters. I would also like to have the power of the host language.
That's why I want to build it as an internal DSL and NOT as an external
DSL.

I want the DSL as human readable as possible:

open_browser

navigate_to 'www.openstreetmap.org' website

search 'Von-Gumppenberg-Strasse, Schmiechen'

verify search_result

zoom in
<<<

Martin Fowler recommends "Method Chaining" to build internal DSLs:

Browser("http://www.openstreetmap.org/") \
        .search("Von-Gumppenberg-Strasse, Schmiechen") \
        .zoom_in()
<<<
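For readers unfamiliar with the pattern, here is a minimal sketch of how such a chain can be made to work: every method returns `self`. The `Browser` class here is a hypothetical stand-in that only records actions; it does not drive a real browser, and its `actions` attribute is an assumption made for illustration.

```python
class Browser:
    """Hypothetical stand-in: records actions instead of driving a browser."""

    def __init__(self, url):
        self.actions = ["open " + url]

    def search(self, query):
        self.actions.append("search " + query)
        return self  # returning self is what makes chaining possible

    def zoom_in(self):
        self.actions.append("zoom in")
        return self


# Wrapping the chain in parentheses avoids the trailing backslashes.
session = (
    Browser("http://www.openstreetmap.org/")
    .search("Von-Gumppenberg-Strasse, Schmiechen")
    .zoom_in()
)
print(session.actions)
```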

I guess that it is possible to argue that this means the same.
Nevertheless I do not like all the parentheses and punctuation
necessary to satisfy the Python interpreter.

The reason why I need this is that I want to have non technical people
review the files written in the DSL.

I already know that there are parser frameworks available but I want
to build it as internal DSL in Python (Yes, I know ANTLR, Ply, and
whatnot).

How would one approach this in Python? Do I need to build a custom
loader which compiles *.dsl files to *.pyc files? Is it possible to
switch between the custom DSL and the standard Python interpreter?
 
sturlamolden

Is it possible to
switch between the custom DSL and the standard Python interpreter?

As far as I can tell, there are three different options:

- Embed a Python and DSL interpreter in the same executable.

- Write the DSL interpreter in Python.

- Expose the DSL interpreter as a Python extension module.


I don't know which you prefer, but I would try to avoid the first.
 
sturlamolden

I want to implement an internal DSL in Python. I would like the syntax
as human readable as possible.

Also beware that Python is not Lisp. You cannot define new syntax (yes
I've seen the goto joke).
 
Jonathan Gardner

- Write the DSL interpreter in Python.

There are Python modules out there that make writing a language
interpreter almost trivial, provided you are familiar with tools like
Bison and the theories about parsing in general. I suggest PLY, but
there are other really good solutions out there.

If you are familiar enough with parsing and the syntax is simple
enough, you can write your own parser. The syntax you describe above
is really simple, so using str.split and then calling a function based
on the first item is probably enough.
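A minimal sketch of that split-and-dispatch idea follows. It swaps `str.split` for `shlex.split` so that a quoted argument containing spaces stays in one piece. The `COMMANDS` table and its handlers are hypothetical; they only record what they were asked to do.

```python
import shlex


def run(script, commands):
    """Execute each non-empty line by dispatching on its first word."""
    for line in script.splitlines():
        words = shlex.split(line)  # keeps 'quoted strings' together
        if not words:
            continue
        name, args = words[0], words[1:]
        if name not in commands:
            raise ValueError(f"unknown command: {name!r}")
        commands[name](*args)


log = []

# Hypothetical handlers; a real tool would drive a browser here.
COMMANDS = {
    "open_browser": lambda: log.append(("open_browser",)),
    "navigate_to": lambda url, *noise_words: log.append(("navigate_to", url)),
    "search": lambda query: log.append(("search", query)),
    "verify": lambda what: log.append(("verify", what)),
    "zoom": lambda direction: log.append(("zoom", direction)),
}

SCRIPT = """
open_browser
navigate_to 'www.openstreetmap.org' website
search 'Von-Gumppenberg-Strasse, Schmiechen'
verify search_result
zoom in
"""

run(SCRIPT, COMMANDS)
print(log)
```

Trailing filler words such as `website` are simply swallowed by `*noise_words` in this sketch; whether that is desirable depends on how strict the DSL should be.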
 
Jonathan Gardner

Also beware that Python is not Lisp. You cannot define new syntax (yes
I've seen the goto joke).

This isn't really true. You can, for instance, write a program (in
Python) that takes your pseudo-Python and converts it into Python.
This is what a number of templating libraries such as Mako do.
 
Kay Schluehr

How would one approach this in Python? Do I need to build a custom
loader which compiles *.dsl files to *.pyc files? Is it possible to
switch between the custom DSL and the standard Python interpreter?

Sure, but there is no way to avoid extending the Python parser and
then your DSL becomes external.

I remember having a similar discussion a while ago with Kevin
Dangoor, the original TurboGears developer, who has also written Paver
[1]. In the end, DSL syntax wasn't worth the hassle, and Kevin developed
Paver entirely in Python.

Kay

[1] http://www.blueskyonmars.com/projects/paver/
 
M.-A. Lemburg

Sure, but there is no way to avoid extending the Python parser and
then your DSL becomes external.

Try python4ply:

http://dalkescientific.com/Python/python4ply.html

....much easier to work with than extending the Python parser by hand.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Jan 06 2009)

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
 
Steve Holden

J Kenneth King wrote:
[...]
I could go on a really long rant about how the two are worlds apart, but
I'll let Google tell you if you're really interested.

a) How is Google going to know if he's really interested?

b) Put a space after the "--" in your sig, please; that way my mailer
won't try to quote your signature as part of the message.

regards
Steve
 
Jonathan Gardner

Which is not even close to being the same.

Lisp - the program source is also the data format

Python - the program source is a string

I could go on a really long rant about how the two are worlds apart, but
I'll let Google tell you if you're really interested.

I get that Lisp is special because you can hack on the reader as it is
reading the file in. This is strongly discouraged behavior, as far as
I know, despite the number of cute hacks you can accomplish with it.

But consider that this really isn't different than having a program
read in the lisp-with-modification source and spitting out pure lisp,
to be read by an honest-to-gosh lisp program later.

If that's the case, then Lisp and Python really aren't that different
in this regard, except that you don't have the option of modifying the
reader as it reads in the file.
 
J Kenneth King

Jonathan Gardner said:
I get that Lisp is special because you can hack on the reader as it is
reading the file in. This is strongly discouraged behavior, as far as
I know, despite the number of cute hacks you can accomplish with it.

It is generally discouraged unless there's a reason for it.

But consider that this really isn't different than having a program
read in the lisp-with-modification source and spitting out pure lisp,
to be read by an honest-to-gosh lisp program later.

If that's the case, then Lisp and Python really aren't that different
in this regard, except that you don't have the option of modifying the
reader as it reads in the file.

I think you are missing the distinction.

Lisp expressions are also data structures. A Lisp expression can be
passed to functions and macros to be operated on before being
executed. When you're writing Lisp source, you're basically looking at
the AST on one level and when you start writing macros for your program,
you're creating a "DSL" or interface to that AST. Lisp source is
eventually expanded to a giant list that is consed by the evaluator (as
far as I understand it. I'm just getting into the compiler stuff
myself).

Consider:

(my-func 1 2 3)

This is just a list, the "primitive" data-type in Lisp! This piece of
"data" can be operated on by other bits of Lisp code because it is just
a list as far as Lisp is concerned.

In contrast, Python source is a string that needs to be parsed into
bytecode which is then run through the interpreter. The AST is
completely hidden from the source author. Python expressions are not
data types either and hence no macros -- I can't write a python function
that generates python code at compile time. I can only write a python
program that parses some other string and generates code that can be run
by another interpreter.

Consider:

for i in range(0, 100):
    do_something_interesting(i)

That's a pretty straight forward Python expression, but I can't do
anything with it -- it's not a unit of data, it's a string.

The distinction is not subtle by any means.
 
Kay Schluehr

Python expressions are not
data types either and hence no macros -- I can't write a python function
that generates python code at compile time.

Have you ever considered there are languages providing macros other
than Lisp? Macros have nothing to do with homoiconicity.

I can only write a python
program that parses some other string and generates code that can be run
by another interpreter.

No, it is the same interpreter, and it is also possible to modify
Python parsers on the fly. This is just not possible with Python's
built-in parser.
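To illustrate the "same interpreter" point, here is a small sketch using the standard `ast` module: source text is parsed into a tree, the tree is modified by ordinary Python code, and the result is compiled and executed in the running process. The `ShrinkRange` transformer and the `collected` list are made up for the example, and `visit_Constant` assumes Python 3.8 or later.

```python
import ast

source = "for i in range(0, 100):\n    collected.append(i)\n"
tree = ast.parse(source)  # the program is now a data structure


class ShrinkRange(ast.NodeTransformer):
    """Hypothetical transformation: replace the literal 100 with 3."""

    def visit_Constant(self, node):
        if node.value == 100:
            return ast.copy_location(ast.Constant(value=3), node)
        return node


tree = ast.fix_missing_locations(ShrinkRange().visit(tree))

# Compile and run the modified tree in this very interpreter.
namespace = {"collected": []}
exec(compile(tree, "<generated>", "exec"), namespace)
print(namespace["collected"])  # [0, 1, 2]
```

This is not the same thing as Lisp macros, since the transformation has to be invoked explicitly rather than by the compiler, but it does show that no second interpreter is involved.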
 
Chris Mellon

It is generally discouraged unless there's a reason for it.


I think you are missing the distinction.

Lisp expressions are also data structures. A Lisp expression can be
passed to functions and macros to be operated on before being
executed. When you're writing Lisp source, you're basically looking at
the AST on one level and when you start writing macros for your program,
you're creating a "DSL" or interface to that AST. Lisp source is
eventually expanded to a giant list that is consed by the evaluator (as
far as I understand it. I'm just getting into the compiler stuff
myself).

Consider:

(my-func 1 2 3)

This is just a list, the "primitive" data-type in Lisp! This piece of
"data" can be operated on by other bits of Lisp code because it is just
a list as far as Lisp is concerned.

In contrast, Python source is a string that needs to be parsed into
bytecode which is then run through the interpreter. The AST is
completely hidden from the source author. Python expressions are not
data types either and hence no macros -- I can't write a python function
that generates python code at compile time. I can only write a python
program that parses some other string and generates code that can be run
by another interpreter.

Consider:

for i in range(0, 100):
    do_something_interesting(i)

That's a pretty straight forward Python expression, but I can't do
anything with it -- it's not a unit of data, it's a string.

The distinction is not subtle by any means.


Ignoring reader macros for a moment, there is no way in either lisp,
ruby, or python to change the syntax that the compiler understands,
and the ability to work with your code directly as a data structure
(which is what makes lisp macros powerful) isn't directly relevant to
the idea of an "internal" DSL.

The OP wants a Ruby-style DSL by which he means "something that lets
me write words instead of expressions". The ruby syntax is amenable to
this, python (and lisp, for that matter) syntax is not and you can't
implement that style of internal DSL in those languages.

The answer to the OP is "you can't - use Ruby or modify your requirements".
 
J Kenneth King

Kay Schluehr said:
Have you ever considered there are languages providing macros other
than Lisp?

Of course.

Macros have nothing to do with homoiconicity.

Not directly, no.

No, it is the same interpreter and it is also possible to modify
python parsers on the fly. This is just not possible with Python's
built-in parser.

PyPy is probably the best bet when/if it gets finished.
 
Carl Banks

I want to implement an internal DSL in Python. I would like the syntax
as human readable as possible. This means no disturbing '.;()\'
characters. I would also like to have the power of the host language.
That's why I want to build it as an internal DSL and NOT as an external
DSL.

I want the DSL as human readable as possible:

open_browser

navigate_to 'www.openstreetmap.org' website

search 'Von-Gumppenberg-Strasse, Schmiechen'

verify search_result

zoom in

In the Python grammar, there are no non-trivial situations where two
expressions can be separated by whitespace and not punctuation. (The
trivial exception is string literal concatenation.)

String constants like 'www.openstreetmap.org' and identifiers like
open_browser are expressions, and if you try to separate them with
whitespace you get a syntax error.

So you can't make an internal DSL like this that uses Python's
built-in grammar. You'd have to hack the parser or settle for an
external preprocessor.
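This is easy to check against the built-in parser. The sketch below, written for Python 3, asks `ast.parse` whether each candidate line is valid Python; the helper name `parses` is made up for the example.

```python
import ast


def parses(source):
    """Return True if the built-in parser accepts the source text."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# A bare identifier is a valid expression statement.
print(parses("open_browser"))  # True

# Two expressions separated only by whitespace are rejected.
print(parses("navigate_to 'www.openstreetmap.org' website"))  # False
print(parses("verify search_result"))  # False
print(parses("zoom in"))  # False

# The trivial exception: adjacent string literals are concatenated.
print(parses("'www.' 'openstreetmap.org'"))  # True

# With the punctuation added, the line becomes an ordinary call.
print(parses("navigate_to('www.openstreetmap.org')"))  # True
```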


Martin Fowler recommends "Method Chaining" to build internal DSLs:

 Browser("http://www.openstreetmap.org/") \
        .search("Von-Gumppenberg-Strasse, Schmiechen") \
        .zoom_in()
 <<<

I guess that it is possible to argue that this means the same.
Nevertheless I do not like all the parentheses and punctuation
necessary to satisfy the Python interpreter.

The reason why I need this is that I want to have non technical people
review the files written in the DSL.

I already know that there are parser frameworks available but I want
to build it as internal DSL in Python (Yes, I know ANTLR, Ply, and
whatnot).

How would one approach this in Python? Do I need to build a custom
loader which compiles *.dsl files to *.pyc files? Is it possible to
switch between the custom DSL and the standard Python interpreter?

I don't know specifically what you mean by "custom loader", or
"switching between the custom DSL and the standard Python
interpreter".

However, the gist of it seems to be that you want to be able to write
files in your DSL that can be imported just like a regular Python
module. Yes, that can be done.

See PEP 302, Import Hooks:

http://www.python.org/dev/peps/pep-0302/

Python's standard importer looks for files with *.py, *.pyc, *.pyd, or
*.so extensions. You could write an importer that looks for *.dsl
files, and, instead of loading it as a Python file, invokes your DSL
parser.
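A rough sketch of such an importer, written against the `importlib` machinery in current Python 3, which grew out of the PEP 302 hooks. Everything named here is an assumption for illustration: `dsl_to_python` is a placeholder translation that merely exposes the DSL lines as a `STEPS` list, and `osm_walkthrough` is a made-up module name.

```python
import importlib
import importlib.abc
import importlib.util
import os
import shlex
import sys
import tempfile


def dsl_to_python(dsl_text):
    """Placeholder translation: expose the DSL lines as a STEPS list."""
    steps = [tuple(shlex.split(line))
             for line in dsl_text.splitlines() if line.strip()]
    return "STEPS = " + repr(steps) + "\n"


class DslLoader(importlib.abc.Loader):
    """Loads a module by translating a *.dsl file and executing the result."""

    def __init__(self, path):
        self.path = path

    def exec_module(self, module):
        with open(self.path, encoding="utf-8") as handle:
            python_source = dsl_to_python(handle.read())
        exec(compile(python_source, self.path, "exec"), module.__dict__)


class DslFinder(importlib.abc.MetaPathFinder):
    """Looks for <name>.dsl on the search path when an import is attempted."""

    def find_spec(self, fullname, path, target=None):
        filename = fullname.rpartition(".")[2] + ".dsl"
        for directory in (path or sys.path):
            candidate = os.path.join(directory or os.getcwd(), filename)
            if os.path.isfile(candidate):
                return importlib.util.spec_from_loader(
                    fullname, DslLoader(candidate), origin=candidate
                )
        return None


# Appended last, so ordinary Python modules are still found first.
sys.meta_path.append(DslFinder())

# Demonstration with a throwaway directory and a made-up module name.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "osm_walkthrough.dsl")
with open(demo_file, "w", encoding="utf-8") as handle:
    handle.write("search 'Von-Gumppenberg-Strasse, Schmiechen'\nzoom in\n")
sys.path.insert(0, demo_dir)

osm_walkthrough = importlib.import_module("osm_walkthrough")
print(osm_walkthrough.STEPS)
```

The hook only decides how the file is found and loaded; the DSL itself is still parsed by separate code, so it remains an external DSL in the sense discussed above.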


Carl Banks
 
Jonathan Gardner

It is generally discouraged unless there's a reason for it.



I think you are missing the distinction.

Lisp expressions are also data structures. A Lisp expression can be
passed to functions and macros to be operated on before being
executed. When you're writing Lisp source, you're basically looking at
the AST on one level and when you start writing macros for your program,
you're creating a "DSL" or interface to that AST. Lisp source is
eventually expanded to a giant list that is consed by the evaluator (as
far as I understand it. I'm just getting into the compiler stuff
myself).

I think you misunderstood what I was trying to explain. Yes, you can
do those wonderful things with Lisp.

You can also do wonderful things with Python. Consider programs that
take some text written in some other language besides Python. Those
programs interpret and translate the text to Python. Then the programs
feed the translations to the Python interpreter. Tada! You have a DSL
in Python.
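As a hedged illustration of that translate-then-execute pipeline, the sketch below turns each DSL line into a Python function call and hands the generated source to the interpreter. The function name `translate` and the handlers in `namespace` are hypothetical, and `shlex` is just one convenient way to split the lines.

```python
import shlex


def translate(dsl_text):
    """Turn each 'name arg arg' line into the Python call name('arg', 'arg')."""
    lines = []
    for raw in dsl_text.splitlines():
        words = shlex.split(raw)
        if not words:
            continue
        name, args = words[0], words[1:]
        if not name.isidentifier():
            raise ValueError(f"not a valid command name: {name!r}")
        lines.append(f"{name}({', '.join(repr(arg) for arg in args)})")
    return "\n".join(lines) + "\n"


python_source = translate("search 'Von-Gumppenberg-Strasse, Schmiechen'\nzoom in\n")
print(python_source)

# Hypothetical handlers that the generated code will call.
calls = []
namespace = {
    "search": lambda query: calls.append(("search", query)),
    "zoom": lambda direction: calls.append(("zoom", direction)),
}

# Feed the translation to the Python interpreter.
exec(compile(python_source, "<dsl>", "exec"), namespace)
print(calls)
```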

No, it's not built in, nor is there any standard, but it is entirely
possible and people are doing it today. That's how the variety of
templating solutions work in the Python world. It's why I can write
${x+y} in Mako and get a Python program that will do the right thing.

Alternatively, you can skip the Python interpreter altogether, and
write your own interpreter for the language. If it's a simple language
(like the original poster hinted at), this is very easy to do.
 
Jonathan Gardner

The OP wants a Ruby-style DSL by which he means "something that lets
me write words instead of expressions". The ruby syntax is amenable to
this, python (and lisp, for that matter) syntax is not and you can't
implement that style of internal DSL in those languages.

The answer to the OP is "you can't - use Ruby or modify your requirements".

As far as putting the code into Python, yeah, you can't put it in
Python. The best you can do is store it in a string and then interpret
the string with some function later on.
 
J Kenneth King

Jonathan Gardner said:
As far as putting the code into Python, yeah, you can't put it in
Python. The best you can do is store it in a string and then interpret
the string with some function later on.

That's what I'm saying.

It seems we're defining "DSL" in two different ways.

You can't write a DSL in Python because you can't change the syntax and
you don't have macros.

You can write a compiler in Python that will compile your "DSL."

As another poster mentioned, eventually PyPy will be done and then
you'll get more of an "in-Python" DSL.
 
Kay Schluehr

As another poster mentioned, eventually PyPy will be done and then
you'll get more of an "in-Python" DSL.

May I ask why you consider it as important that the interpreter is
written in Python? I see no connection between PyPy and syntactical
Python extensions and the latter isn't an objective of PyPy. You can
write Python extensions with virtually any Python aware parser.
M.A.Lemburg already mentioned PLY and PLY is used for Cython. Then
there is ANTLR which provides a Python grammar. I also know about two
other Python aware parsers. One of them was written by myself.
 
mark

So you can't make an internal DSL like this that uses Python's
built-in grammar. You'd have to hack the parser or settle for an
external preprocessor.

This time it is really hard for me, but I am beginning to accept the fact
that I will have to build an external DSL. I experimented some weeks ago
with ANTLR and the tools work fine, but I do not like the extra effort
to learn and maintain the extra tooling. I think that in the beginning
the DSL will have to change very often as new features are
added. To implement a standardized, rock-solid language like SQL, ANTLR
might be the perfect tool, but to develop something from scratch that
will be expanded interactively, an internal DSL has huge benefits.

Please note that I really like ANTLR. It is just the first tool I used
for this task, and I want to double-check whether other tools fit my
needs better.

I will look into Ply and Pyparsing over the next weeks unless someone
points out that there is some special tool that makes growing a new
"fast evolving" language as easy as building an internal DSL. Maybe
this is all overkill and there is a recipe out there for hacking
Ruby-style DSLs with regular expressions? So far I could not find one.
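One shape such a regular-expression recipe could take is sketched here, purely as an illustration: a table of patterns, one per sentence form, with named groups for the arguments. The rule table and the action names are assumptions based on the example script in the first post; adding a new sentence form means adding one line to `RULES`.

```python
import re

# Each rule pairs a pattern for one sentence form with an action name.
RULES = [
    (re.compile(r"open_browser"), "open_browser"),
    (re.compile(r"navigate_to '(?P<url>[^']*)' website"), "navigate_to"),
    (re.compile(r"search '(?P<query>[^']*)'"), "search"),
    (re.compile(r"verify (?P<what>\w+)"), "verify"),
    (re.compile(r"zoom (?P<direction>in|out)"), "zoom"),
]


def parse_line(line):
    """Return (action, arguments) for the first rule matching the whole line."""
    for pattern, action in RULES:
        match = pattern.fullmatch(line.strip())
        if match:
            return action, match.groupdict()
    raise ValueError(f"unrecognised DSL line: {line!r}")


def parse_script(text):
    """Parse every non-empty line of a script."""
    return [parse_line(line) for line in text.splitlines() if line.strip()]


print(parse_script("navigate_to 'www.openstreetmap.org' website\nzoom in\n"))
```

Because `fullmatch` is used, a line must fit a rule exactly, which gives reviewers an early error for misspelled sentences instead of silently ignoring them.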
However, the gist of it seems to be that you want to be able to write
files in your DSL that can be imported just like a regular Python
module.  Yes, that can be done.

See PEP 302, Import Hooks:

http://www.python.org/dev/peps/pep-0302/

Python's standard importer looks for files with *.py, *.pyc, *.pyd, or
*.so extensions.  You could write an importer that looks for *.dsl
files, and, instead of loading it as a Python file, invokes your DSL
parser.

This is really helpful. Thanks for giving me directions.

Mark
 
J Kenneth King

Kay Schluehr said:
May I ask why you consider it as important that the interpreter is
written in Python?

I don't think it's important for Python to have a meta-circular
interpreter (though it can't hurt).

I see no connection between PyPy and syntactical
Python extensions and the latter isn't an objective of PyPy. You can
write Python extensions with virtually any Python aware parser.
M.A.Lemburg already mentioned PLY and PLY is used for Cython. Then
there is ANTLR which provides a Python grammar. I also know about two
other Python aware parsers. One of them was written by myself.

Because... there is no connection to see? I never mentioned any such
relation.

DSLs tend to be a natural side-effect of languages which can manipulate
their own expressions without extensive parsing.

Creating a new parser that can generate Python ASTs is certainly a
valid approach (and probably the easiest one). It's not the only one.

It depends on your definition of a DSL.

My definition isn't satisfied with creating a parser, and so my answers
reflect that.
 
