Parsing library for Python?

Harry George

Viktor Rosenfeld said:
Hi,

I need to create a parser for a Python project, and I'd like to use a process
kinda like lex/yacc. I've looked at various parsing packages online, but
didn't find anything useful for me:

- PyLR seems promising but is for Python 1.5
- Yappy seems promising, but I couldn't get it to work. It doesn't even
compile the main example in its documentation
- mxTextTools is way complicated. I'd like something that I can give a BNF
grammar to handle.

Is there a good parsing module for python that I missed?

TIA,
Viktor

http://www.python.org/sigs/parser-sig/

I used Ply for a project a while ago. It felt comfortable.
http://systems.cs.uchicago.edu/ply/
 
Viktor Rosenfeld

Hi,

I need to create a parser for a Python project, and I'd like to use a process
kinda like lex/yacc. I've looked at various parsing packages online, but
didn't find anything useful for me:

- PyLR seems promising but is for Python 1.5
- Yappy seems promising, but I couldn't get it to work. It doesn't even
compile the main example in its documentation
- mxTextTools is way complicated. I'd like something that I can give a BNF
grammar to handle.

Is there a good parsing module for python that I missed?

TIA,
Viktor
 
Yermat

YAPPS: http://theory.stanford.edu/~amitp/Yapps/

and all those cited on http://www.python.org/sigs/parser-sig/
 
Christophe Delord

Have you seen this page:
http://www.python.org/sigs/parser-sig/



Christophe.
 
Diez B. Roggisch


Yes, I'd recommend SPARK, too - it's an Earley parser implementation, which is
very powerful and allows e.g. left-recursive rules. However, you can't feed
it an EBNF grammar directly; instead you do things like this (-> is EBNF,
::= is SPARK):

rule -> term?

becomes

rule ::=
rule ::= term

rule -> (term)*

becomes

rule ::= rule term
rule ::=
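
The two rewrites above are mechanical, so they can be sketched in a few lines of Python. This is only an illustration: desugar is a hypothetical helper, not part of SPARK, and it handles just the two single-term patterns shown (term? and (term)*).

```python
import re
from typing import List


def desugar(rule: str) -> List[str]:
    """Rewrite 'name -> term?' or 'name -> (term)*' into plain '::=' productions.

    Hypothetical helper for illustration; anything else is passed through
    with '->' replaced by '::='.
    """
    name, body = (part.strip() for part in rule.split("->", 1))
    optional = re.fullmatch(r"\(?\s*(\w+)\s*\)?\?", body)
    repeated = re.fullmatch(r"\(?\s*(\w+)\s*\)?\*", body)
    if optional:
        # Zero or one: an empty production plus the single-term production.
        return [f"{name} ::=", f"{name} ::= {optional.group(1)}"]
    if repeated:
        # Zero or more: left recursion plus an empty production.
        return [f"{name} ::= {name} {repeated.group(1)}", f"{name} ::="]
    return [f"{name} ::= {body}"]


if __name__ == "__main__":
    for ebnf in ("rule -> term?", "rule -> (term)*"):
        print(ebnf)
        for production in desugar(ebnf):
            print("   ", production)
```

The left-recursive form for repetition is fine here only because an Earley parser accepts left recursion; a recursive-descent tool would need the right-recursive variant instead.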
 
Edward C. Jones

When looking for a parser generator, I think it is important that full
grammars be provided for at least C and Python and preferably for C++,
Java, and FORTRAN.
 
Mike C. Fletcher

Viktor said:
- mxTextTools is way complicated. I'd like something that I can give a BNF
grammar to handle.

SimpleParse is based on mxTextTools, but is EBNF-driven. You can find
it here:

http://simpleparse.sourceforge.net/

Have fun,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/
 
M.-A. Lemburg

You should have a look at SimpleParse which converts BNF to
the tag tables used by mxTextTools:

http://simpleparse.sourceforge.net/

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Feb 23 2004)
 
Tim Roberts

Edward C. Jones said:
When looking for a parser generator, I think it is important that full
grammars be provided for at least C and Python and preferably for C++,
Java, and FORTRAN.

Are you kidding with this? I can't tell.

C, C++, and Fortran are parsing nightmares, where end-of-line and spacing
are important sometimes and ignored at other times, and so on.

I expect to find the canonical desk calculator example, and perhaps a
Pascal-based language, but any more than that is asking a bit much from all
but the most mature parser generators.
 
Harry George

Edward C. Jones said:
Not kidding. Nothing can be parsed without a grammar. I think parsing
the standard computer languages is a common need. I am sporadically
developing software to automatically generate Pyrex code for wrapping
C libraries in Python. I use ANTLR because it comes with a good C
grammar.

And then there is HTML. I wonder how Mozilla parses all the ill-formed
html that is on the web.


Yes, things can be parsed without a grammar, or at least without a
conventional CFG. Ad hoc parsers are so messy, of course, that we try
to avoid that in modern languages. But I've parsed textual documents
at times with context-sensitive RR(2) approaches and other oddities.

The point is that FORTRAN predates clear understanding of
line-independent lexing and context-free grammars (CFGs). It uses
constructs which are not handled by the classic
scanner/lexer/parser/AST tools. I don't know how the pros handle
this, but when I run into a non-std grammar, I preprocess to tag it
with additional tokens, and then run it through a std lexer/parser.
Basically a tree re-writer approach.
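
A minimal sketch of that preprocess-then-lex idea, under invented assumptions: a toy line-oriented language where a newline ends a statement unless the line ends in a backslash. The preprocessor makes statement ends explicit by inserting a ";" token, after which an ordinary whitespace-insensitive lexer is enough. tag_statement_ends and tokenize are hypothetical names, and the toy language is not FORTRAN.

```python
import re
from typing import List, Tuple

# Number, identifier, or any single non-space character (operators, ';').
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(\S))")


def tag_statement_ends(source: str) -> str:
    """Preprocess: join backslash-continued lines, then mark each statement end with ';'."""
    joined = re.sub(r"\\\n", " ", source)
    lines = [line.strip() for line in joined.splitlines()]
    return " ".join(line + " ;" for line in lines if line)


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Standard lexer: whitespace never matters once the input has been tagged."""
    tokens = []
    for number, name, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("NUM", number))
        elif name:
            tokens.append(("NAME", name))
        else:
            tokens.append(("OP", op))
    return tokens


if __name__ == "__main__":
    program = "x = 1 + \\\n    2\ny = x\n"
    tagged = tag_statement_ends(program)
    print(tagged)
    print(tokenize(tagged))
```

The point of the split is that only the small preprocessor knows about line structure; the lexer and any parser behind it can stay conventional.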

C++ is (I think) classically lexable, but the semantics are so complex
that parsing (or understanding what to do with the parse) is a pain.
I wasn't in that business, but I understand C compiler vendors bombed
out trying to just upgrade C compilers and had to start fresh with a
much richer type model. SWIG also ran into this.

For parsing of "bad html", see "tidy". Its lexer/parser is ad hoc
(not generated by parser toolkits).
 
François Pinard

[Tim Roberts]
C, C++, and Fortran are parsing nightmares, where end-of-line and
spacing are important sometimes and ignored at other times, and so on.

End-of-line processing does not look too difficult for these languages.
But spaces in FORTRAN always looked difficult to parse, at least in
the original FORTRAN where they might appear anywhere, even inside an
identifier, while not even being required between "words".

One routine which was popular, at one place I worked, was named INGMTR,
as people used to always call it this way:

CALLING MTR(... ARGUMENTS ...)

One traditional amusement was writing obscure programs, like:

DO 50 I = 3

that had nothing to do with DO loops. I wonder how FORTRAN parsers
worked to sort out such things. Did later FORTRAN use more strict (or
at least usual) rules on white space?
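
One commonly described way to sort out the DO case in fixed-form FORTRAN is to look ahead before tokenizing: with blanks removed, the statement is a DO loop only if an "=" outside parentheses is followed by a comma outside parentheses. The sketch below assumes that rule and ignores string and Hollerith constants; classify_fixed_form is a hypothetical name, not taken from any real compiler.

```python
import re


def classify_fixed_form(statement: str) -> str:
    """Return 'do-loop', 'assignment', or 'other' for one fixed-form statement.

    Assumption: blanks are insignificant, so they are removed before looking
    at the statement (character constants are not handled in this sketch).
    """
    text = statement.replace(" ", "").upper()
    depth = 0
    seen_equals = False
    comma_after_equals = False
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "=" and depth == 0:
            seen_equals = True
        elif ch == "," and depth == 0 and seen_equals:
            comma_after_equals = True
    if not seen_equals:
        return "other"
    # DO <label> <variable> = ... with a top-level comma is a loop header.
    if comma_after_equals and re.match(r"DO\d+[A-Z][A-Z0-9]*=", text):
        return "do-loop"
    return "assignment"


if __name__ == "__main__":
    for line in ("DO 50 I = 3", "DO 50 I = 3, 10", "CALLING MTR(A, B)"):
        print(repr(line), "->", classify_fixed_form(line))
```

With this rule, DO 50 I = 3 is an assignment to the variable DO50I, while DO 50 I = 3, 10 is a loop header, and CALLING MTR(A, B) collapses to a call of INGMTR once the blanks are gone.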
 
Edward C. Jones

Tim said:
Are you kidding with this? I can't tell.

C, C++, and Fortran are parsing nightmares, where end-of-line and spacing
are important sometimes and ignored at other times, and so on.

I expect to find the canonical desk calculator example, and perhaps a
Pascal-based language, but any more than that is asking a bit much from all
but the most mature parser generators.

Not kidding. Nothing can be parsed without a grammar. I think parsing
the standard computer languages is a common need. I am sporadically
developing software to automatically generate Pyrex code for wrapping C
libraries in Python. I use ANTLR because it comes with a good C grammar.

And then there is HTML. I wonder how Mozilla parses all the ill-formed
html that is on the web.
 
Paul McGuire

Edward C. Jones said:
Not kidding. Nothing can be parsed without a grammar. I think parsing
the standard computer languages is a common need. I am sporadically
developing software to automatically generate Pyrex code for wrapping C
libraries in Python. I use ANTLR because it comes with a good C grammar.

And then there is HTML. I wonder how Mozilla parses all the ill-formed
html that is on the web.

I'm looking for the C grammar in ANTLR. Do you mean the tinyC example?
That leaves out a *lot*. (There are grammars for Java and Pascal included,
and they look pretty complete.)

-- Paul
 
