Recursive descent algorithm able to parse Python?

S

seberino

I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.

Chris
 
M

Michael

I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.

Python is relatively simple to parse using a recursive descent parser. If
you're rolling your own as a learning exercise, the grammar for python is
included in one of the top level directories of the source distribution for
python.

The one area that is slightly different from usual is the emitting of INDENT
and DEDENT tokens by the lexer due to handling of whitespace. But that's
not really that complex either.

A couple of years ago I decided to see what it would be like to try to parse
a python-like language with no keywords and to see if you could do it test
first (for fun :). If you do this, you discover the grammar is very short
but you need terminator tokens to close if..elif...elif...else... like
statements (eg if..elif...elif...else...end,
try...except...except...except...endtry).

I put that code up here: http://cerenity.org/SWP/ if you're curious.
That and the python grammar file should probably help you roll your own
parser. (It emits an AST that assumes everything is a function. I was
pondering giving it a lisp backend or transforming to lisp but never
got a round tuit)

If however you're doing this because you're not aware of the compiler
module, it's worth knowing that compiler.parse is a pretty useful
function :)


Michael.
 
D

Diez B. Roggisch

I'm a compiler newbie and curious if Python grammar is able to
be parsed by a recursive descent parser or if it requires
a more powerful algorithm.

I might be mistaken, but isn't recursive descent one of the more
powerful parsing techniques - for the price of non-linearity and even
potentially non-termination?

Under that assumption: certainly!

Diez
 
L

Lawrence D'Oliveiro

I might be mistaken, but isn't recursive descent one of the more
powerful parsing techniques - for the price of non-linearity and even
potentially non-termination?

No, you're thinking of LR(k) bottom-up parsers. Recursive descent is a
top-down parser--might be the same thing as LL(1), I'm not sure. It's easy
to implement and easy to understand, to the point where there is strong
pressure on programming-language designers to make sure their languages can
be parsed with recursive descent.
 
G

Gabriel Genellina

Python is relatively simple to parse using a recursive descent parser. If
you're rolling your own as a learning exercise, the grammar for python is
included in one of the top level directories of the source distribution for
python.

And it will stay so. From PEP 3099: "The parser won't be more complex
than LL(1). Simple is better than complex. This idea extends to the
parser. Restricting Python's grammar to an LL(1) parser is a
blessing, not a curse. It puts us in handcuffs that prevent us from
going overboard and ending up with funky grammar rules like some
other dynamic languages that will go unnamed, like Perl."



Gabriel Genellina
Softlab SRL





__________________________________________________
Preguntá. Respondé. Descubrí.
Todo lo que querías saber, y lo que ni imaginabas,
está en Yahoo! Respuestas (Beta).
¡Probalo ya!
http://www.yahoo.com.ar/respuestas
 
D

Diez B. Roggisch

Lawrence said:
No, you're thinking of LR(k) bottom-up parsers. Recursive descent is a

No, I'm not.
top-down parser--might be the same thing as LL(1), I'm not sure. It's easy
to implement and easy to understand, to the point where there is strong
pressure on programming-language designers to make sure their languages
can be parsed with recursive descent.

http://en.wikipedia.org/wiki/Recursive_descent_parser

"""
Recursive descent with backup is a technique that determines which
production to use by trying each production in turn. Recursive descent with
backup is not limited to LL(k) grammars, but is not guaranteed to terminate
unless the grammar is LL(k). Even when they terminate, parsers that use
recursive descent with backup may require exponential time.
"""

I have to admit that I have difficulties to compare LR(k) to recursive
descent, but the fact that the latter contains backtracking makes it at
least more powerful than LL(k)

Diez
 
L

Lawrence D'Oliveiro

I have to admit that I have difficulties to compare LR(k) to recursive
descent, but the fact that the latter contains backtracking makes it at
least more powerful than LL(k)

LR(k) is more powerful than LL(k).
 
P

Piet van Oostrum

Diez B. Roggisch said:
DBR> """
DBR> Recursive descent with backup is a technique that determines which
DBR> production to use by trying each production in turn. Recursive
DBR> descent with backup is not limited to LL(k) grammars, but is not
DBR> guaranteed to terminate unless the grammar is LL(k). Even when they
DBR> terminate, parsers that use recursive descent with backup may require
DBR> exponential time.
DBR> """

This is the first time I see this called `backup'. I have always used and
seen the term `backtracking'.
 
H

hanumizzle

I know _that_, the point was that recursive descent is more power full
than LL(k)

Correct me if I'm wrong, but recursive descent parsers are top-down, yes?

Parse::RecDescent in Perl is a top-down parser. It can even do some
context-sensitive grammars and can almost certainly handle Python.

Is Python context-free? Not sure. Does left parenthesis opening either
a function call (x) or an empty tuple () count as context-sensitive,
for instance? Or can () be considered a terminal symbol unto itself?

I am new to most of this :)

-- Theerasak
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,007
Latest member
obedient dusk

Latest Threads

Top