Testing complex new syntax

astromog

I have some significantly extended syntax for Python that I need to
create a reference implementation for. My new syntax includes new
keywords, statements and objects that are sort of like classes but not
really. The implementation is all possible using standard Python, but
the implementation isn't the point of what I'm doing. Speed and having
an extra step to run a program are not issues that I need to be
concerned with.
I'd like to create a preprocessor if possible, because it would
probably be easier than implementing the changes in the interpreter. I
could just drop in standard Python code that provides the functionality
when I encounter a part of my extended syntax. Modifying the
interpreter, on the other hand, sounds like it would be pretty nasty,
even though I have experience in interpreter hacking already.
So my question is: what's the easiest way to implement a preprocessor
system in Python? I understand I could use the tokenize module, but
that would still require a lot of manual parsing of the Python syntax.
Is it possible to use any of the parser module facilities to accomplish
this without them choking on the unknown syntax? Or, alternatively,
would modifying the interpreter ultimately be easier?
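To make this concrete, here is the kind of token-level pass I have in mind, sketched with the standard-library tokenize module. The `until` keyword is a made-up stand-in for my real syntax, and the pass is deliberately naive (it would also rewrite a variable that happened to be named `until`):

```python
import io
import tokenize

# Naive token-level preprocessor: rewrites the made-up keyword "until"
# into "while not". tokenize happily lexes unknown keywords as ordinary
# NAME tokens, so no real parsing of the extended syntax is needed.
def preprocess(source):
    result = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok_type, tok_string, _start, _end, _line in tokens:
        if tok_type == tokenize.NAME and tok_string == "until":
            # Emit the standard-Python replacement tokens.
            result.append((tokenize.NAME, "while"))
            result.append((tokenize.NAME, "not"))
        else:
            result.append((tok_type, tok_string))
    # Two-element tuples put untokenize in compatibility mode; the
    # output whitespace differs from the input but compiles the same.
    return tokenize.untokenize(result)
```

For example, `preprocess("until done:\n    pass\n")` produces source that compiles the same as `while not done: pass`.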
 
Carl Friedrich Bolz

Hi!

astromog said:
I have some significantly extended syntax for Python that I need to
create a reference implementation for. My new syntax includes new
keywords, statements and objects that are sort of like classes but not
really. The implementation is all possible using standard Python, but
the implementation isn't the point of what I'm doing. Speed and having
an extra step to run a program are not issues that I need to be
concerned with.
I'd like to create a preprocessor if possible, because it would
probably be easier than implementing the changes in the interpreter. I
could just drop in standard Python code that provides the functionality
when I encounter a part of my extended syntax. Modifying the
interpreter, on the other hand, sounds like it would be pretty nasty,
even though I have experience in interpreter hacking already.
So my question is: what's the easiest way to implement a preprocessor
system in Python? I understand I could use the tokenize module, but
that would still require a lot of manual parsing of the Python syntax.
Is it possible to use any of the parser module facilities to accomplish
this without them choking on the unknown syntax? Or, alternatively,
would modifying the interpreter ultimately be easier?

I cannot really say much about how easy it would be to just write a
preprocessor. However, I think what you are trying to do could be done
reasonably easily with the PyPy project:

http://codespeak.net/pypy

PyPy is an implementation of a Python interpreter written in Python.
(Disclaimer: I am a PyPy developer). It has a quite flexible
parser/bytecode compiler that could probably be tweaked to support your
new syntax (especially if the new constructs can be mapped to standard
Python). Feel free to ask questions on the PyPy developer mailing list:

http://codespeak.net/mailman/listinfo/pypy-dev

Cheers,

Carl Friedrich Bolz
 
astromog

Carl said:
I cannot really say much about how easy it would be to just write a
preprocessor. However, I think what you are trying to do could be done
reasonably easily with the PyPy project:

http://codespeak.net/pypy

PyPy is an implementation of a Python interpreter written in Python.
(Disclaimer: I am a PyPy developer). It has a quite flexible
parser/bytecode compiler that could probably be tweaked to support your
new syntax (especially if the new constructs can be mapped to standard
Python).

It looks like the lack of thread support means I can't just use PyPy by
itself, unfortunately. But the tokeniser, lexer, parser and AST builder
could do what I need with modification; I could then walk the generated
AST and produce standard Python code from that. How easy would it be to
separate these parts out from the rest of PyPy?
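Setting PyPy's internals aside for a second, the walk-and-emit step I mean has roughly this shape. This sketch uses the standard-library ast module (ast.unparse needs Python 3.9+), so it only demonstrates the back half: genuinely new keywords would still need a lowering pass before the source parses at all. `NotQuiteAClass` and `register` are hypothetical names standing in for my real constructs:

```python
import ast

# Walk the tree, lower the extended construct, emit standard Python.
# A class inheriting from the hypothetical marker NotQuiteAClass is
# rewritten into a plain class followed by a register(...) call.
class Lower(ast.NodeTransformer):
    def visit_ClassDef(self, node):
        self.generic_visit(node)
        if any(isinstance(b, ast.Name) and b.id == "NotQuiteAClass"
               for b in node.bases):
            node.bases = [b for b in node.bases
                          if not (isinstance(b, ast.Name)
                                  and b.id == "NotQuiteAClass")]
            register = ast.Expr(ast.Call(
                func=ast.Name("register", ast.Load()),
                args=[ast.Name(node.name, ast.Load())],
                keywords=[]))
            # Returning a list splices both statements into the body.
            return [node, register]
        return node

def lower(source):
    tree = Lower().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # new nodes need line numbers
    return ast.unparse(tree)
```

So `lower("class Foo(NotQuiteAClass):\n    pass\n")` yields a plain `class Foo:` plus a `register(Foo)` call.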
 
Carl Friedrich Bolz

astromog said:
It looks like the lack of thread support means I can't just use PyPy by
itself, unfortunately.

There is thread support (using a GIL), but it is indeed not perfect yet.
This will definitely be a topic in the coming months, though.

astromog also said:
But the tokeniser, lexer, parser and AST builder
could do what I need with modification; I could then walk the generated
AST and produce standard Python code from that. How easy would it be to
separate these parts out from the rest of PyPy?

It should be reasonably easy, although I am no expert in that area of
PyPy, especially since the output of the parser/compiler is regular
Python 2.4 bytecode.
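For what it's worth, the tail end of such a pipeline is just the ordinary machinery anyway: once the front end has produced plain Python source, compile() and exec() run it like any other module, with no interpreter changes needed. A trivial sketch:

```python
# Pretend this string came out of the lowering/unparsing step.
generated = "def greet(name):\n    return 'hello ' + name\n"

# The stock compiler turns it into a regular code object ...
code = compile(generated, "<generated>", "exec")

# ... and exec runs it in a fresh namespace, like any imported module.
namespace = {}
exec(code, namespace)
print(namespace["greet"]("world"))  # hello world
```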

Cheers,

Carl Friedrich Bolz
 
