C++ language grammar?

D

Divick

Hi all,
can anyone tell me what kind of parser is used to parse C++
language i.e. LALR or LL (k?) ? Or is it implementation dependent?

The reason for this question is that I was trying to figure out what
the output of the following program be:

int main()
{
int a = 10;
printf("%d %d %d\n",++a,a++,a);
return 0;
}

and I am getting two different results on two different compilers. Can
one definitely answer based on how the laguage is parsed (lef to right
or right to left) to tell the output of the above program?

Thanks,
Divick
 
S

Sumit RAJAN

Divick said:
Hi all,
can anyone tell me what kind of parser is used to parse C++
language i.e. LALR or LL (k?) ? Or is it implementation dependent?

The reason for this question is that I was trying to figure out what
the output of the following program be:

int main()
{
int a = 10;
printf("%d %d %d\n",++a,a++,a);
return 0;
}

Please see:
http://c-faq.com/expr/evalorder2.html
http://www.parashift.com/c++-faq-lite/misc-technical-issues.html#faq-39.15
http://www.parashift.com/c++-faq-lite/misc-technical-issues.html#faq-39.16
http://www.angelikalanger.com/Articles/VSJ/SequencePoints/SequencePoints.html
 
K

Kaz Kylheku

Divick said:
Hi all,
can anyone tell me what kind of parser is used to parse C++
language i.e. LALR or LL (k?) ? Or is it implementation dependent?

The C and the C++ grammars are not context-free, so they cannot be
handled by these parsing techniques alone.

The lexical category of tokens depends on their semantics (how they are
declared).

In order to decide how to categorize a lexeme, the lexical analyzer
must sometimes make use of semantic information gathered up to that
point.
The reason for this question is that I was trying to figure out what
the output of the following program be:
int main()
{
int a = 10;
printf("%d %d %d\n",++a,a++,a);
return 0;
}

and I am getting two different results on two different compilers.

This is because of semantics, not grammar. In C, the grammar for a
function call is:

postfix-expression:
...
postfix-expression ( argument-expression-list_opt )
...

argument-expression-list:
assignment-expression
argument-expression-list , assignment-expression

As you can see, this is left recursion which generates a nice, linear
parse tree sloping to the left.

The shape of that tree has precisely nothing to do with the evaluation
order of those arguments, which is unspecified.
one definitely answer based on how the laguage is parsed (lef to right
or right to left) to tell the output of the above program?

What does /parsing/ have to do with /evaluation/ semantics?

You are overlooking the basic fact that control flow can jump around
arbitrarily within a program, in spite of that program being produced
from source code that is read from top to bottom.

If parsing determined evaluation order, then programs would execute
from top to bottom, and wouldn't have any loops or function calls.
 
D

Default User

Divick said:
Hi all,
can anyone tell me what kind of parser is used to parse C++
language i.e. LALR or LL (k?) ? Or is it implementation dependent?

The reason for this question is that I was trying to figure out what
the output of the following program be:

int main()
{
int a = 10;
printf("%d %d %d\n",++a,a++,a);
return 0;
}

and I am getting two different results on two different compilers. Can
one definitely answer based on how the laguage is parsed (lef to right
or right to left) to tell the output of the above program?


What you are doing is undefined behavior. You have modified the same
variable twice without an intervening sequence point.




Brian
 
J

James Bannon

Divick said:
Hi all,
can anyone tell me what kind of parser is used to parse C++
language i.e. LALR or LL (k?) ? Or is it implementation dependent?

The reason for this question is that I was trying to figure out what
the output of the following program be:

int main()
{
int a = 10;
printf("%d %d %d\n",++a,a++,a);
return 0;
}

and I am getting two different results on two different compilers. Can
one definitely answer based on how the laguage is parsed (lef to right
or right to left) to tell the output of the above program?

Thanks,
Divick
AFAIK the C++ grammar is not LR(k) for any k and that would certainly
eliminate LALR or LL as possible parsing engines that could
unambiguously parse the entire C++ grammar. To properly define the
grammar I think you would need something like Wingarten grammars as used
to define the Algol68 language but you really don't want to go there
unless you really want your head to hurt - even the C++ Standard doesn't
try that one.
 
I

Ira Baxter

James Bannon said:
AFAIK the C++ grammar is not LR(k) for any k and that would certainly
eliminate LALR or LL as possible parsing engines that could
unambiguously parse the entire C++ grammar. To properly define the
grammar I think you would need something like Wingarten grammars as used
to define the Algol68 language but you really don't want to go there
unless you really want your head to hurt - even the C++ Standard doesn't
try that one.

The answer is, parsing is implementation dependent.
GNU uses a hand coded parsers, and in fact almost all
the C++ compilers I know about do so, because of
the perceived nasty interactions of types and "legal syntax".

A GLR parser, however, can parse C++ just fine,
from a very clean (well, at least it matches the standard
closely) grammar, although it produces a number of ambiguous parses,
later resolved by knowing types. The end result
is the same after parsing and name/type resolution
is complete: a tree without ambiguities.

We use a GLR parser in our tools.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,813
Messages
2,569,699
Members
45,488
Latest member
MohammedHa

Latest Threads

Top