OT: C++ from bison/flex.

D

DaLoverhino

Hello. I have to read about flex and bison for my co's next project.

But it got me wondering: Were/are c++ compilers written using flex/
bison? Or do you need another tool?



Searching for LALR and C++, it looks like it could be possible but not
practical. Is this true? If so, then what tools do compiler writers
use or do they write their own. (Yikes!)
 
A

Aaron Gray

DaLoverhino said:
Hello. I have to read about flex and bison for my co's next project.

But it got me wondering: Were/are c++ compilers written using flex/
bison? Or do you need another tool?



Searching for LALR and C++, it looks like it could be possible but not
practical. Is this true? If so, then what tools do compiler writers
use or do they write their own. (Yikes!)

Try comp.compilers news group.

Aaron
 
M

Me

Hello. I have to read about flex and bison for my co's next project.

But it got me wondering: Were/are c++ compilers written using flex/
bison? Or do you need another tool?



Searching for LALR and C++, it looks like it could be possible but not
practical. Is this true? If so, then what tools do compiler writers
use or do they write their own. (Yikes!)


Someone writing a compiler is going to need to do lexical analysis and
create a parse tree for a given grammar. Therefore, lex and yacc, or flex
and bison can certainly be used.

I don't know why it would be "impractical" to use those tools for
implementing the C++ language. Remember though, those are only the first
two steps in the process. For a good compiler you still have a lot of
additional steps such as intermediate code generation, optimization, etc.
 
J

James Kanze

Someone writing a compiler is going to need to do lexical analysis and
create a parse tree for a given grammar. Therefore, lex and yacc, or flex
and bison can certainly be used.
I don't know why it would be "impractical" to use those tools for
implementing the C++ language. Remember though, those are only the first
two steps in the process. For a good compiler you still have a lot of
additional steps such as intermediate code generation, optimization, etc.

Yacc generates an LRLA(1) grammar; LRLA(1) grammars can only
handle certain sub-classes of context free grammars. C++ isn't
even context free much less LRLA(1); to make it acceptable to
yacc requires a lot of hacking. Flex has an extension which
allows handling ambiguous productions in parallel, only reducing
once the ambiguity has been raised. I don't know if this is
enough to allow it to handle C++, however---I've never tried.

The few C++ compilers where I know what is used use recursive
descent. Theoretically, recursive descent can handle an even
more restricted set of grammars, but in practice, it's a lot
easier to hack, and lends itself to backtracking when necessary.
 
J

Jerry Coffin

Hello. I have to read about flex and bison for my co's next project.

But it got me wondering: Were/are c++ compilers written using flex/
bison? Or do you need another tool?

The grammar for C++ doesn't fit Bison's requirements, so writing a C++
parser with Bison requires lots of ugliness. Both cfront and older
versions of gcc were apparently written that way, but cfront is
discontiued and I believe gcc now uses a hand-written parser.

There are other parser generators that could/would probably do a better
job of coping with C++'s grammar, but I'm not sure whether anybody's
actually implemented a C++ compiler using one of them.
 
R

REH

The grammar for C++ doesn't fit Bison's requirements, so writing a C++
parser with Bison requires lots of ugliness. Both cfront and older
versions of gcc were apparently written that way, but cfront is
discontiued and I believe gcc now uses a hand-written parser.

There are other parser generators that could/would probably do a better
job of coping with C++'s grammar, but I'm not sure whether anybody's
actually implemented a C++ compiler using one of them.

Antlr can handle LL(k) grammers, and there is a C++ one available for
download (though I don't know its quality): http://www.antlr.org/grammar/list

REH
 
M

Me

The few C++ compilers where I know what is used use recursive descent.
Theoretically, recursive descent can handle an even more restricted set
of grammars, but in practice, it's a lot easier to hack, and lends
itself to backtracking when necessary.


Are you talking about hand coding a recursive descent parser or using a
parser generator that creates recursive descent parse trees? About 20
years ago (college) I wrote some recursive descent code for small grammars
to do statistical evaluation but could not fathom hand coding one for a
full scale language with a non-trival grammar. I believed an LALR type
parser generator was the de-facto standard for implementing non-trivial
projects.

And yes, it would seem that recursive descent would be easier to hack but
setting the thing up in the first place seems like a nightmare.

Anyway, tis OT but still fascinating stuff.
 
J

James Kanze

Are you talking about hand coding a recursive descent parser
or using a parser generator that creates recursive descent
parse trees?

The one I know most about (Sun CC) is hand coded. I think g++
is too. If you're going to use a parser generator, you might as
well use one which uses LR, rather that LL, since it will be
able to handle a larger set of grammars. The advantage of a
hand coded recursive descent parser is that it is easier to
"fudge" cases where the language doesn't fit the grammar.
(Also, typically, you get much better error messages and error
recovery. Colloquially expressed: with an LL parser, such as
recursive descent, you know what you're looking for, which helps
greatly in formulating error messages and in resynchronizing;
with an LR parser, you know what you got, but in the case of an
error, all you really know is that you cannot do anything with
it.)
About 20
years ago (college) I wrote some recursive descent code for small grammars
to do statistical evaluation but could not fathom hand coding one for a
full scale language with a non-trival grammar. I believed an LALR type
parser generator was the de-facto standard for implementing non-trivial
projects.

I think that the general consensus is that LALR parsers are OK
for medium sized projects---they're overkill for very small
things, and the grammar soon becomes too wieldy, especially if
you start adding error productions, for very large languages.
Recursive descent has the advantage that it modularizes very
well.

And of course, almost no real languages can be described
directly by an LR grammar. Even something as simple as Pascal
requires a few simple hacks, and C++ is almost hopeless---the
recursive descent implementations I've heard of use backtracking
(which isn't formally allowed:)).
And yes, it would seem that recursive descent would be easier
to hack but setting the thing up in the first place seems like
a nightmare.

Not really. It can be done in a very modular fashion, with each
function being fairly simple. And of course, anything that can
handle standard C++ isn't going to be simple; you need the
modularity to manage it.
 
J

James Kanze

The grammar for C++ doesn't fit Bison's requirements, so writing a C++
parser with Bison requires lots of ugliness. Both cfront and older
versions of gcc were apparently written that way, but cfront is
discontiued and I believe gcc now uses a hand-written parser.

Is this still true, now that Bison supports GLR parsing
(basically, parsing several alternatives in parallel until you
know which one to use)? CFront, of course, didn't use Bison,
but yacc, which still doesn't have this possibility, and I don't
know when Bison added it (but I'm pretty sure it wasn't present
in the early 1990's).
 
J

Jerry Coffin

[ ... ]
Is this still true, now that Bison supports GLR parsing
(basically, parsing several alternatives in parallel until you
know which one to use)?

I'm honestly not certain, but it's _probably_ not true of versions of
Bison sufficiently recent to support GLR parsing.
CFront, of course, didn't use Bison,
but yacc, which still doesn't have this possibility, and I don't
know when Bison added it (but I'm pretty sure it wasn't present
in the early 1990's).

Yes -- current versions of Bison would fit much more closely in the
group that may well support C++ parsing more cleanly, but TTBOMK hasn't
actually been used to write a C++ compiler.
 
T

Tavysor

Hello. I have to read aboutflexand bison for my co's next project.

But it got me wondering: Were/are c++ compilers written usingflex/
bison? Or do you need another tool?

Searching for LALR and C++, it looks like it could be possible but not
practical. Is this true? If so, then what tools do compiler writers
use or do they write their own. (Yikes!)

give me please some information about flex
what about it`s this program
 
O

Ole Nielsby

But it got me wondering: Were/are c++ compilers written usingflex/
bison? Or do you need another tool?

Searching for LALR and C++, it looks like it could be possible but not
practical. Is this true?

Quite the opposite. It's like the flight of the bumble-bee: impossible
in theory, yet it flies. The grammar doesn't fit the requirements of the
parser generators. Yet GCC uses (or did use?) a parser generator,
with some hacks (disambiquation and using symbol tables in the lexer).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,773
Messages
2,569,594
Members
45,119
Latest member
IrmaNorcro
Top