ANN: Pyrr 0.1 - Lexer and LR(1)-Parser Generator for Python


Heiko Wundram

Hi list!

Not long ago I was looking for an easy-to-use but powerful parser and lexer
generating tool for Python. To my dismay, I found quite a number of Python
projects implementing an (LA)LR(1) parser generator, but none of them seemed
quite finished, or even Pythonic.

As I required a parser generator for one of my work projects, I set out to
write yet another one, and am currently at release version 0.1 for Pyrr.ltk
and Pyrr.ptk.

An example for Pyrr.ltk and ptk usage implementing a (very) simple calculator:

<<<
# -*- coding: iso-8859-15 -*-

from ltk import LexerBase, IgnoreMatch
from ptk import ParserBase
from operator import add, sub, mul, div

class NumLexer(LexerBase):

    def number(self, value):
        """number -> r/[0-9]+/"""
        return float(value)

    def ws(self, *args):
        """ws -> r/\\s+/"""
        raise IgnoreMatch

    def ops(self, op):
        """addop -> /+/
                 -> /-/
           mulop -> /*/
                 -> r/\\//"""
        return op

class NumParser(ParserBase):
    """/mulop/: left
       /addop/: left"""
    __start__ = "term"

    def term(self, value1, op, value2):
        """term -> term /addop/ term
                -> term /mulop/ term"""
        return {"+": add, "-": sub, "*": mul, "/": div}[op](value1, value2)

    def value(self, value):
        """term -> /number/"""
        return value

print NumParser.parse(NumLexer.tokenize("3 + 4 - 123 / 23"))
<<<
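For readers who want to see what that grammar computes without installing anything, here is the same little calculator written as a hand-rolled parser in modern Python. This is not Pyrr code, and the `tokenize`/`parse` names are my own; it is just a sketch of the semantics the example above encodes, assuming `/mulop/` binds tighter than `/addop/` as in a conventional calculator:

```python
import re
from operator import add, sub, mul, truediv  # operator.div is Python 2 only

TOKEN = re.compile(r'\s*(?:(\d+)|(.))')

def tokenize(text):
    # Numbers become floats; single-character operators pass through as strings.
    tokens = []
    for num, op in TOKEN.findall(text):
        tokens.append(float(num) if num else op)
    return tokens

def parse(tokens):
    pos = 0

    def term():
        # term: factor (('+'|'-') factor)*, left-associative
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ('+', '-'):
            op = {'+': add, '-': sub}[tokens[pos]]
            pos += 1
            value = op(value, factor())
        return value

    def factor():
        # factor: number (('*'|'/') number)*, left-associative
        nonlocal pos
        value = tokens[pos]
        pos += 1
        while pos < len(tokens) and tokens[pos] in ('*', '/'):
            op = {'*': mul, '/': truediv}[tokens[pos]]
            pos += 1
            value = op(value, tokens[pos])
            pos += 1
        return value

    return term()

print(parse(tokenize("3 + 4 - 123 / 23")))  # 3 + 4 - (123 / 23)
```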

Grammar rules and lexemes are specified in docstrings; lines that do not
match the definition of a rule or lexeme are ignored. The resulting lexer
and parser classes are thus very much self-documenting, which was one of my
biggest goals for the project.

I'm currently in the process of writing documentation for both packages,
especially covering the extensions to BNF grammars that Pyrr.ptk allows
(the usual RE operators ?, *, + and {x,y}, plus forward arguments) and the
stateful lexer support that Pyrr.ltk implements. But I thought I'd release
early and often, so that people interested in the project can have a look
now and suggest the extensions they'd like me to add, to make this a fully
featured Python parser-generating toolkit that might be offered as a Python
package.

Anyway, the sources can be downloaded (via subversion) from:

http://svn.modelnine.org/svn/Pyrr/trunk

where I'll check in the documentation that I've written so far and a Python
distutils distribution over the weekend, and make sure that I don't check in
broken code from now on. Pyrr.* is Python 2.4 only at the moment, and I have
no plans to make it backwards-compatible, but if you're interested in
backporting it, feel free to mail me patches.

--- Heiko.
 

Norman Shelley

FWIW: This has a similar look and feel to how sabbey wrapped dparser.
http://staff.washington.edu/sabbey/py_dparser/

 
