Jon Slaughter
Does anyone know where I can get an (E)BNF-like grammar for C++, to use
either for analysis or with yacc/bison(++)?
On a side note, I was thinking about writing my own lexer and parser and
about ways to approach it. The way I see it, most high-level languages use
nested structures, and if I could define a grammar that brings out this
structure, I could use it for an application I need, which is basically a
source-to-source translator for C++ (though I'd like to design the parser
and lexer to work for any grammar, sort of like bison).
Anyway, what I was thinking is that every block in, say, C++ is either part
of some structure definition or part of some code execution grouping.
Maybe a grammar for it would be something like
program = { block }
block = header_block | structures_block | function_block |
        conditional_block | loop_block | code | ....
maybe structures_block is something like
structures_block = structure_block | struct_block | class_block |
                   template_class_block | ....
class_block = 'class' name [ ':' base_classes ] '{' { structures_block |
              private_block | public_block | protected_block } '}'
private_block = 'private' ':' { structures_block | code }
......
So the grammar is kind of defined as a hierarchy of blocks. (Note that I'm
just starting to learn grammars, so the stuff above isn't going to be
perfect or make complete sense; it's just to get an idea across. I know one
can derive some silly things from it, such as a conditional block that isn't
inside any kind of code block, but it's just an example off the top of my
head.)
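To make the "hierarchy of blocks" idea concrete, here is a minimal sketch that recovers only the brace nesting of a source string. It is not a C++ parser: `Block`, `trim`, and `parse_blocks` are hypothetical names invented for this illustration, and it assumes there are no braces or semicolons inside strings, comments, or character literals. Access labels such as `public:` stay attached to the header of the block that follows them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node type: the text in front of a '{' plus the blocks nested inside it.
struct Block {
    std::string header;           // trimmed text preceding the opening brace
    std::vector<Block> children;  // blocks nested directly inside this one
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Collect sibling blocks starting at `pos`, stopping at the matching '}'
// or at the end of the input. `pos` is advanced past everything consumed.
std::vector<Block> parse_blocks(const std::string& src, std::size_t& pos) {
    std::vector<Block> blocks;
    std::string pending;  // text seen since the last ';', '{' or '}'
    while (pos < src.size()) {
        char c = src[pos++];
        if (c == '{') {
            Block b;
            b.header = trim(pending);
            pending.clear();
            b.children = parse_blocks(src, pos);  // recurse into the nested level
            blocks.push_back(b);
        } else if (c == '}') {
            return blocks;      // end of the enclosing block
        } else if (c == ';') {
            pending.clear();    // a plain statement, not a block header
        } else {
            pending += c;
        }
    }
    return blocks;
}
```

Calling `parse_blocks` on the text of a small class followed by a function yields one top-level `Block` per construct, with the constructor body appearing as a child of the class block. A real grammar would replace the single `header` string with properly typed nodes.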
Now, it may turn out that it doesn't matter how one writes the grammar: if
two grammars represent the same language, parsing will always return the
same "parse tree" (which is what I'm kind of thinking, but I'm not sure).
The reason I need the "block" idea is that I want to reference the code
structure sort of like an object.
Let's say I have the program code as follows:
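A quick way to test that hunch is to write two grammars for the same language and compare the trees. The sketch below uses single digits separated by '-' and prints each tree as a parenthesized string. `parse_left` and `parse_right` are hypothetical names, and both assume well-formed input such as "9-3-2". It suggests the tree does depend on how the grammar is written: the left-recursive form groups to the left, the right-recursive form to the right.

```cpp
#include <cstddef>
#include <string>

// Grammar A:  expr = expr '-' digit | digit
// Left recursion is written as a loop: each "- digit" wraps what was parsed so far.
std::string parse_left(const std::string& s) {
    if (s.empty()) return "";
    std::string tree(1, s[0]);
    for (std::size_t i = 1; i + 1 < s.size() && s[i] == '-'; i += 2) {
        tree = "(" + tree + " - " + s[i + 1] + ")";
    }
    return tree;
}

// Grammar B:  expr = digit '-' expr | digit
// Right recursion: the rest of the input becomes the right-hand subtree.
std::string parse_right(const std::string& s, std::size_t pos = 0) {
    if (pos >= s.size()) return "";
    std::string digit(1, s[pos]);
    if (pos + 2 < s.size() && s[pos + 1] == '-') {
        return "(" + digit + " - " + parse_right(s, pos + 2) + ")";
    }
    return digit;
}
```

For "9-3-2", `parse_left` gives `((9 - 3) - 2)` and `parse_right` gives `(9 - (3 - 2))`. Both grammars accept exactly the same strings, but the trees differ, and so would the value if the tree were evaluated.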
#include <iostream>
#include <string>
class test
{
private:
    int i;
public:
    test()
    {
        i = 0;
    }
};
void main()
{
    return 0;
}
Then I want to represent that code sort of like an "object":
program.includes[0] = "iostream";
program.includes[1] = "string";
program.classes[0].name = "test";
program.classes[0].private.data[0].type = "int";
program.classes[0].private.data[0].name = "i";
program.classes[0].constructor[0].arg[0] = "";
program.classes[0].constructor[0].code.statement[0] = "i = 0;";
program.functions[0].return_type = "void";
program.functions[0].name = "main";
program.functions[0].code.statement[0] = "return 0;";
etc....
or something like that.
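The "object" view above could be modelled with a few plain structs. This is only a sketch under assumptions: every type name (`DataMember`, `CodeBlock`, `Constructor`, `Section`, `ClassBlock`, `FunctionBlock`, `Program`) and the helper `make_example_program` are hypothetical, and the section members are spelled `private_` and `public_` because `private` and `public` are C++ keywords and cannot be used as member names.

```cpp
#include <string>
#include <vector>

struct DataMember  { std::string type; std::string name; };
struct CodeBlock   { std::vector<std::string> statement; };
struct Constructor { std::vector<std::string> arg; CodeBlock code; };
struct Section     { std::vector<DataMember> data; };

struct ClassBlock {
    std::string name;
    Section private_;                      // members declared under "private:"
    Section public_;                       // members declared under "public:"
    std::vector<Constructor> constructor;
};

struct FunctionBlock {
    std::string return_type;
    std::string name;
    CodeBlock code;
};

struct Program {
    std::vector<std::string> includes;
    std::vector<ClassBlock> classes;
    std::vector<FunctionBlock> functions;
};

// Builds by hand the structure a parser would be expected to produce
// for the example program (two includes, class test, function main).
Program make_example_program() {
    Program program;
    program.includes = {"iostream", "string"};

    ClassBlock test;
    test.name = "test";
    test.private_.data.push_back(DataMember{"int", "i"});
    Constructor ctor;                      // test() takes no arguments, so arg stays empty
    ctor.code.statement.push_back("i = 0;");
    test.constructor.push_back(ctor);
    program.classes.push_back(test);

    FunctionBlock main_fn;
    main_fn.return_type = "void";
    main_fn.name = "main";
    main_fn.code.statement.push_back("return 0;");
    program.functions.push_back(main_fn);
    return program;
}
```

With this layout, `make_example_program().classes[0].private_.data[0].name` is `"i"`, which mirrors the dotted paths listed above. Storing statements as strings keeps the sketch short; a real translator would more likely keep a tree for each statement.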
So, say I wanted to add some code to the "main" function. I could then do
something like
int i = find_block("main", program.functions);
so that program.functions[i].name == "main", and then I could do something
like
insert_code(program.functions[i].code, "test t;", 0);
and that would insert the test t; line, giving
void main()
{
    test t;
    return 0;
}
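A possible shape for `find_block` and `insert_code` is sketched below. The signatures are guesses based on the calls in the post, `to_source` is a hypothetical helper added to show the result, and the minimal `CodeBlock` and `FunctionBlock` types are assumptions repeated here so the block stands alone.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct CodeBlock     { std::vector<std::string> statement; };
struct FunctionBlock { std::string return_type; std::string name; CodeBlock code; };

// Returns the index of the first function called `name`, or -1 if there is none.
int find_block(const std::string& name, const std::vector<FunctionBlock>& functions) {
    for (std::size_t i = 0; i < functions.size(); ++i) {
        if (functions[i].name == name) return static_cast<int>(i);
    }
    return -1;
}

// Inserts `line` in front of statement number `index`.
// An index past the end is clamped, so the line is appended instead.
void insert_code(CodeBlock& code, const std::string& line, std::size_t index) {
    if (index > code.statement.size()) index = code.statement.size();
    code.statement.insert(code.statement.begin() + static_cast<std::ptrdiff_t>(index), line);
}

// Renders a function back to text, one statement per line (parameters omitted).
std::string to_source(const FunctionBlock& f) {
    std::string out = f.return_type + " " + f.name + "()\n{\n";
    for (const std::string& s : f.code.statement) out += s + "\n";
    out += "}\n";
    return out;
}
```

Given a `functions` vector holding the `main` from the example, `find_block("main", functions)` returns 0, and after `insert_code(functions[0].code, "test t;", 0)` the text from `to_source(functions[0])` has `test t;` as the first line of the body. Checking for the -1 result before indexing is left to the caller.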
Does anyone have any ideas about doing this? Can I simply use flex and bison
to parse the original C++ code and easily turn it into this "object"-like
way of referencing the different blocks?
Thanks,
Jon