using Python's AST generator for other languages


eliben

Hello,

I'm building a parser in Python, and while pondering the design of
my ASTs I had the idea to see what Python uses. I quickly got to the
compiler.ast module, and understood that it's automatically generated.
So I went to the source, ast.txt and tools/compiler/astgen.py, where I
saw this unexpected message:

"""Generate ast module from specification

This script generates the ast module from a simple specification,
which makes it easy to accomodate changes in the grammar. This
approach would be quite reasonable if the grammar changed often.
Instead, it is rather complex to generate the appropriate code. And
the Node interface has changed more often than the grammar.
"""

Now, to me the design of the AST in Python looks quite elegant,
especially from the point of view of the AST's user (using Visitors to
walk the AST). And astgen.py looks like a nice approach to generate
tons of boilerplate code.
So, my questions:
1) Is the compiler.ast module really employed during the compilation
of Python into .pyc files?
2) What is the meaning of the comment in astgen.py? Are the Python
maintainers unhappy with the design of the AST?
3) What other approach would be recommended to generate a very
detailed AST hierarchy, if the one in astgen.py is disappointing?

Thanks in advance
Eli
 
Benjamin

> 1) Is the compiler.ast module really employed during the compilation
> of Python into .pyc files?

No, the comment refers to the builtin _ast module. The compiler
package is a compiler for Python written in Python.

> 2) What is the meaning of the comment in astgen.py? Are the Python
> maintainers unhappy with the design of the AST?

Node, I think, is talking about a node in the parse tree. (AST is
generated from another parse tree.) See PEP 339 for details.

> 3) What other approach would be recommended to generate a very
> detailed AST hierarchy, if the one in astgen.py is disappointing?

astgen.py contains things that are specific to writing Python's AST C
code. Have a look at spark.py in the Parser dir. It is what astgen.py
is based on.
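
For anyone who wants to try the spec-driven idea outside of Python's own
tooling, here is a minimal sketch of generating node classes from a small
specification and walking them with a visitor. The spec format, the names
Node, generate_nodes, Visitor and NameCollector are all hypothetical and
chosen for illustration; this is not the ast.txt syntax and not how
astgen.py itself is written.

```python
# Hypothetical spec format: "ClassName: field1, field2".
# This is an assumption for illustration, not the real ast.txt syntax.
SPEC = """
Num: value
Name: id
BinOp: left, op, right
"""


class Node:
    """Base class for all generated AST nodes."""

    _fields = ()

    def __init__(self, *args):
        if len(args) != len(self._fields):
            raise TypeError(
                f"{type(self).__name__} expects {len(self._fields)} arguments"
            )
        for name, value in zip(self._fields, args):
            setattr(self, name, value)

    def children(self):
        """Yield the fields that are themselves nodes."""
        for name in self._fields:
            value = getattr(self, name)
            if isinstance(value, Node):
                yield value


def generate_nodes(spec):
    """Build one Node subclass per non-empty line of the spec."""
    classes = {}
    for line in spec.strip().splitlines():
        name, _, fields = line.partition(":")
        name = name.strip()
        field_names = tuple(f.strip() for f in fields.split(",") if f.strip())
        classes[name] = type(name, (Node,), {"_fields": field_names})
    return classes


class Visitor:
    """Dispatch to visit_<ClassName>, falling back to generic_visit."""

    def visit(self, node):
        method = getattr(self, "visit_" + type(node).__name__, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        for child in node.children():
            self.visit(child)


class NameCollector(Visitor):
    """Example visitor: record the id of every Name node."""

    def __init__(self):
        self.names = []

    def visit_Name(self, node):
        self.names.append(node.id)


nodes = generate_nodes(SPEC)
# Tree for the expression: x + (y * 2)
tree = nodes["BinOp"](
    nodes["Name"]("x"),
    "+",
    nodes["BinOp"](nodes["Name"]("y"), "*", nodes["Num"](2)),
)
collector = NameCollector()
collector.visit(tree)
print(collector.names)
```

Building the classes at import time with type() avoids a separate code
generation step; writing the classes out to a source file, as a generator
script would, is the alternative when you want the result to be readable
and greppable.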
 
eliben

> Node, I think, is talking about a node in the parse tree. (AST is
> generated from another parse tree.) See PEP 339 for details.
<snip>

Thanks, PEP 339 clarified a lot for me. I wonder, though, about the need
for two Python compilation frameworks in the same code base. While
CPython uses the flow described in PEP 339 (parsing to an AST
generated from ASDL), the compiler module of the standard library
takes a different approach, with a custom AST description syntax in
ast.txt.
Why aren't the two methods unified? I.e., can't the compiler.ast module
also be generated from ASDL, and provide a more unified interface to
the real thing?

Eli
 
Benjamin

You are correct on all points and this is one of the main reasons that
the compiler package is going away in 3.0.
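
As a rough illustration of the unified route, the standard library's ast
module (available since Python 2.6) exposes the ASDL-generated node classes
of the built-in _ast module together with a visitor base class. The sketch
below assumes Python 3; the NameCollector class and the sample source
string are made up for the example.

```python
import ast


class NameCollector(ast.NodeVisitor):
    """Collect every identifier that appears as a Name node."""

    def __init__(self):
        self.names = []

    def visit_Name(self, node):
        self.names.append(node.id)
        # Keep walking in case the node has children of interest.
        self.generic_visit(node)


# ast.parse returns the same kind of tree the compiler itself works on.
tree = ast.parse("total = price * quantity + tax")
collector = NameCollector()
collector.visit(tree)
print(collector.names)
```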
 
