PEP/GSoC idea: built-in parser generator module for Python?


Peter Mawhorter

First of all, hi everyone, I'm new to this list.

I'm a grad student who's worked on and off with Python on various
projects for 8ish years now. I recently wanted to construct a parser
for another programing language in Python and was dissapointed that
Python doesn't have a built-in module for building parsers, which
seems like a common-enough task. There are plenty of different
3rd-party parsing libraries available, specialized in lots of
different ways (see e.g., [1]). I happened to pick one that seemed
suitable for my needs but didn't turn out to support the recursive
structures that I needed to parse. Rather than pick a different one I
just built my own parser generator module, and used that to build my
parser: problem solved.

It would have been much nicer if there were a fully-featured builtin
parser generator module in Python, however, and the purpose of this
email is to test the waters a bit: is this something that other people
in the Python community would be interested in? I imagine the route to
providing a built-in parser generator module would be to first canvass
the community to figure out what third-party libraries they use, and
then contact the developers of some of the top libraries to see if
they'd be happy integrating as a built-in module. At that point
someone would need to work to integrate the chosen third-party library
as a built-in module (ideally with its developers).
From what I've looked at PyParsing and PLY seem to be standout parser
generators for Python, PyParsing has a bit more Pythonic syntax from
what I've seen. One important issue would be speed though: an
implementation mostly written in C for low-level parsing tasks would
probably be much preferrable to one written in pure Python, since a
builtin module should be geared towards efficiency, but I don't
actually know exactly how that would work (I've both extended and
embedded Python with/in C before, but I'm not sure how that kind of
project relates to writing a built-in module in C).

Sorry if this is a bit rambly, but I'm interested in feedback from the
community on this idea: is a builtin parser generator module
desirable? If so, would integrating PyParsing as a builtin module be a
good solution? What 3rd-party parsing module do you think would serve
best for this purpose?

-Peter Mawhorter



Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question