Grammars for C++

Jon Slaughter

Anyone know where I can get a (E)BNF like grammar for C++ to possibly use
either for analysis or for yacc/bison(++)?

On a side note, I was thinking about writing my own lexer and parser and
about ways to approach it. The way I see it, most high-level languages use
nesting structures, and if I could somehow define a grammar that brings out
this structure, then I could use it for an application I need, which is
basically a source-to-source translator for C++ (but I'd like to design the
parser and lexer to work for any grammar, sort of like bison).

Anyway, what I was thinking is that every block in, say, C++ is either part
of some structure definition or part of some grouping of executable code.

Maybe a grammar for it would be something like

program = block

block = block | header_block | structures_block | function_block |
conditional_block | loop_block | code | ....

maybe structures_block is something like

structures_block = structure_block | struct_block | class_block |
template_class_block | ....

class_block = 'class' name [ ':' base_classes ] '{' { structures_block |
private_block | public_block | protected_block } '}'

private_block = 'private' ':' { structures_block | code }

......

So the grammar is kind of defined as a hierarchy of blocks. (Note that I'm
just starting to learn grammars, so the stuff above isn't going to be
perfect or make complete sense. It's just meant to get an idea across, and I
know one can derive some stupid things from it, such as a conditional block
that isn't inside any kind of code block. It's only an example off the top
of my head.)


Now, it may turn out that it doesn't matter how one writes the grammar: as
long as two grammars describe the same language, parsing might always return
the same "parse tree" (which is kind of what I'm thinking, but I'm not sure).
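
On that last point, two grammars that accept exactly the same strings can still produce differently shaped trees. The sketch below is an illustration only: the function names are invented, the input is assumed to be already split into tokens (operands at even positions, "-" at odd ones), and the trees are rendered as parenthesized strings rather than real node objects.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Grammar A:  expr = expr '-' num | num
// Left-recursive, implemented as a loop that keeps wrapping the tree built
// so far, so the earliest subtraction ends up deepest in the tree.
std::string parse_left(const std::vector<std::string>& toks) {
    std::string tree = toks.at(0);
    for (std::size_t i = 1; i + 1 < toks.size(); i += 2)
        tree = "(" + tree + " " + toks[i] + " " + toks[i + 1] + ")";
    return tree;
}

// Grammar B:  expr = num '-' expr | num
// Right-recursive, so the latest subtraction ends up deepest in the tree.
std::string parse_right(const std::vector<std::string>& toks,
                        std::size_t pos = 0) {
    if (pos + 2 >= toks.size()) return toks.at(pos);
    return "(" + toks[pos] + " " + toks[pos + 1] + " " +
           parse_right(toks, pos + 2) + ")";
}
```

For the tokens `8 - 3 - 2`, `parse_left` yields `((8 - 3) - 2)` and `parse_right` yields `(8 - (3 - 2))`. Both grammars accept the same inputs, but the trees group the operands differently, so the way the grammar is written does affect what the parser hands back.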

The reason I need the "block" idea is that I want to reference the code
structure sort of like an object.

Let's say I have the following program code:

#include <iostream>
#include <string>

class test
{
private:
    int i;
public:
    test()
    {
        i = 0;
    }
};

void main()
{
    return 0;
}

Then I want to represent that code sort of like an "object":

program.includes[0] = "iostream";
program.includes[1] = "string";
program.classes[0].name = "test";
program.classes[0].private.data[0].type = "int";
program.classes[0].private.data[0].name = "i";
program.classes[0].constructor[0].arg[0] = "";
program.classes[0].constructor[0].code.statement[0] = "i = 0";
program.functions[0].return_type = "void";
program.functions[0].name = "main";
program.functions[0].code.statement[0] = "return 0;";
etc....

or something like that.
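
One possible way to back that "object" with real types is a set of plain structs, sketched below. All of the type and field names are assumptions chosen to mirror the listing above. Since `private` is a reserved word in C++, the sections are called `private_section` and so on instead.

```cpp
#include <string>
#include <vector>

// Hypothetical data model mirroring the program.* listing above.
struct Member {
    std::string type;
    std::string name;
};

struct Function {
    std::string return_type;              // empty for constructors
    std::string name;
    std::vector<std::string> args;
    std::vector<std::string> statements;  // one entry per statement
};

struct Section {
    std::vector<Member> data;
    std::vector<Function> methods;
};

struct Class {
    std::string name;
    Section private_section;
    Section public_section;
    Section protected_section;
    std::vector<Function> constructors;
};

struct Program {
    std::vector<std::string> includes;
    std::vector<Class> classes;
    std::vector<Function> functions;
};

// Build the model for the sample program by hand. A parser would fill
// these fields in while walking its parse tree.
Program make_example() {
    Program program;
    program.includes = {"iostream", "string"};

    Class test;
    test.name = "test";
    test.private_section.data.push_back(Member{"int", "i"});
    test.constructors.push_back(Function{"", "test", {}, {"i = 0;"}});
    program.classes.push_back(test);

    program.functions.push_back(Function{"void", "main", {}, {"return 0;"}});
    return program;
}
```

With this in place, `make_example().classes[0].private_section.data[0].name` is `"i"`, which matches the kind of access the listing is reaching for. Storing statements as strings keeps the sketch short; a real translator would more likely keep them as subtrees.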

So, say I wanted to add some code to the "main" function. I could then do
something like

int i = find_block("main", program.functions);

Then program.functions[i].name == "main", and I could do something like

insert_code(program.functions[i].code, "test t;", 0);

and that would insert the test t; line as

void main()
{
    test t;
    return 0;
}
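
The two helpers used above could look roughly like the following. This is a sketch under assumptions: `Function` is a cut-down stand-in type, `insert_code` takes the statement list directly rather than a separate code object, and `render` is an extra hypothetical helper that prints the function back out (without indentation).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of program.functions.
struct Function {
    std::string return_type;
    std::string name;
    std::vector<std::string> statements;
};

// Return the index of the function called `name`, or -1 if there is none.
int find_block(const std::string& name,
               const std::vector<Function>& functions) {
    for (std::size_t i = 0; i < functions.size(); ++i)
        if (functions[i].name == name) return static_cast<int>(i);
    return -1;
}

// Insert `statement` before the statement at `position`.
// Returns false (and changes nothing) if the position is out of range.
bool insert_code(std::vector<std::string>& statements,
                 const std::string& statement, std::size_t position) {
    if (position > statements.size()) return false;
    statements.insert(
        statements.begin() + static_cast<std::ptrdiff_t>(position),
        statement);
    return true;
}

// Turn the model back into source text, one statement per line.
std::string render(const Function& f) {
    std::string out = f.return_type + " " + f.name + "()\n{\n";
    for (const std::string& s : f.statements) out += s + "\n";
    out += "}\n";
    return out;
}
```

Looking up "main", inserting `test t;` at position 0 and rendering the result gives the function shown above, minus the indentation. Regenerating source that also preserves comments and the original layout is a much harder problem than this suggests.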


Does anyone have any ideas about doing this? Can I simply use flex and bison
to parse the original C++ code and easily turn it into this kind of
"object"-like view of the different blocks?

Thanks,
Jon
 
AbdulMunaf

You can find the complete grammar of C++ in the book The C++ Programming
Language by Bjarne Stroustrup (Addison Wesley). It is given in
Appendix A of the book.
 
AbdulMunaf

The author says this about the grammar in the introduction:

This summary of C++ syntax is intended to be an aid to comprehension. It
is not an exact statement of the language. In particular, the grammar
described here accepts a superset of valid C++ constructs.
Disambiguation rules (§A.5, §A.7) must be applied to distinguish
expressions from declarations. Moreover, access control, ambiguity, and
type rules must be used to weed out syntactically valid but meaningless
constructs.
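
A small illustration of why such disambiguation rules are needed: the same token sequence `a * b` is an expression in one scope and part of a declaration in another, depending only on what `a` names. The function names below are made up for the example.

```cpp
// Here `a` and `b` are variables, so `a * b` is a multiplication.
int expression_case() {
    int a = 6, b = 7;
    int product = a * b;
    return product;
}

// Here `a` names a type, so the statement starting with `a * b` declares
// `b` as a pointer to int instead of multiplying anything.
bool declaration_case() {
    typedef int a;
    a * b = nullptr;
    return b == nullptr;
}
```

A context-free grammar on its own cannot tell these two apart; the parser has to know whether `a` is a type name at that point, which is the kind of feedback from name lookup into parsing that makes C++ awkward for plain yacc/bison grammars.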
 
Ira Baxter

Jon Slaughter said:
Anyone know where I can get a (E)BNF like grammar for C++ to possibly use
either for analysis or for yacc/bison(++)?

Most people that try this think the issue is the grammar.
If you get past the "hills" of troubles with getting
a trustworthy grammar, let alone an LALR(1)
grammar, you'll discover
the semantics of C++ on the other side are rather
like the Himalayas in comparison: ambiguous
rules, preprocessor, ambiguous
include files, name/type resolution, templates.
If you succeed there, you then get to think
about building machinery to actually carry
out analyses of interest, make changes to
the code, and then regenerate it all without
making any mistakes. Finally, you get to fight
with the fact that C++ comes in a bunch of dialects...

We've spent the better part of an elapsed decade
and/or a man-century depending on your perspective,
building transformational machinery to carry out general
parsing/analysis/transformation, and a significant
chunk of the last 5 years building robust C++
parser front ends for that machinery, using
extremely experienced computer science language
experts.

I don't want to rain on your parade,
but kids, please don't try this at home,
unless your goals are extremely limited.

See www.semdesigns.com/Products/FrontEnds/CppFrontEnd.html
 
Markus.Elfring

Jon said:
Anyone know where I can get a (E)BNF like grammar for C++ to possibly use
either for analysis or for yacc/bison(++)?

On a side note, I was thinking about writing my own lexer and parser and
about ways to approach it. The way I see it, most high-level languages use
nesting structures, and if I could somehow define a grammar that brings out
this structure, then I could use it for an application I need, which is
basically a source-to-source translator for C++ (but I'd like to design the
parser and lexer to work for any grammar, sort of like bison).

Please look at the following information sources and tools.
1. Edward D. Willink: Meta-Compilation for C++ - transformation and
filtering of a superset to the target language
http://www.computing.surrey.ac.uk/research/dsrg/fog/
http://citeseer.ist.psu.edu/251920.html

2. James F. Power, Tanton H. Gibbs and Brian A. Malloy: Keystone -
token decoration
http://keystone.sourceforge.net/research.shtml

3. Article "Parsing C++"
http://www.nobugs.org/developer/parsingcpp/

4. http://en.wikipedia.org/wiki/OpenC_Plus_Plus

5. http://synopsis.fresco.org/

6. http://doxygen.org/

Regards,
Markus
 
