shift/reduce conflicts in the YACC grammar of C99

E

eliben

The "old standard" ANSI/ISO C (C89) had a known shift-reduce conflict
in its YACC grammar, because of the uncertainty of where to hang an
else in nested if statements. This conflict was resolved to "shift" by
YACC, which is fine for C's semantics.

It appears that C99 introduced two new shift/reduce conflicts with
these new derivations for postfix-expression:

( type-name ) { initializer-list }
( type-name ) { initializer-list , }

This is called compound literals and allows things like:

struct POINT p;
p = (struct POINT) {x, y};

int *ap;
ap = (int []) {1, 2, 3};

Can someone confirm that this indeed creates 2 new shift/reduce
conflicts in the grammar? Or am I defining it incorrectly?

This is the conflict report I'm getting from PLY (A particular
implementation of YACC, in Python):

(206) unary_expression -> SIZEOF LPAREN type_name RPAREN .
(221) postfix_expression -> LPAREN type_name RPAREN . LBRACE
initializer_list LBRACE
(222) postfix_expression -> LPAREN type_name RPAREN . LBRACE
initializer_list COMMA LBRACE

! shift/reduce conflict for LBRACE resolved as shift

Thanks in advance
 
B

Ben Bacarisse

eliben said:
The "old standard" ANSI/ISO C (C89) had a known shift-reduce conflict
in its YACC grammar, because of the uncertainty of where to hang an
else in nested if statements. This conflict was resolved to "shift" by
YACC, which is fine for C's semantics.

It appears that C99 introduced two new shift/reduce conflicts with
these new derivations for postfix-expression:

( type-name ) { initializer-list }
( type-name ) { initializer-list , }

This is called compound literals and allows things like:

struct POINT p;
p = (struct POINT) {x, y};

int *ap;
ap = (int []) {1, 2, 3};

Can someone confirm that this indeed creates 2 new shift/reduce
conflicts in the grammar? Or am I defining it incorrectly?

This is the conflict report I'm getting from PLY (A particular
implementation of YACC, in Python):

(206) unary_expression -> SIZEOF LPAREN type_name RPAREN .
(221) postfix_expression -> LPAREN type_name RPAREN . LBRACE
initializer_list LBRACE
(222) postfix_expression -> LPAREN type_name RPAREN . LBRACE
initializer_list COMMA LBRACE

! shift/reduce conflict for LBRACE resolved as shift

Yes, that looks right. '( type-name )' can appear in several places but
when it denotes a cast, the grammar can tell because the next token
can't be '{'. A sizeof expression can end after the ')' or the
expression may continue with '{' ... '}'.
 
E

eliben

This is the conflict report I'm getting from PLY (A particular
Yes, that looks right.  '( type-name )' can appear in several places but
when it denotes a cast, the grammar can tell because the next token
can't be '{'.  A sizeof expression can end after the ')' or the
expression may continue with '{' ... '}'.

I thought about it, but what sense does a full sizeof expression
followed by { make? Taking the longest derivation is supposed to be
built-in for LALR parsers like YACC, otherwise a lot of things would
not work, such as:

x = y + z;

Stop at 'y' or continue to '+ z'?

Eli
 
B

Ben Bacarisse

eliben said:
I thought about it, but what sense does a full sizeof expression
followed by { make?

What sense would ruling it out make? If you think that the result of

sizeof ( type-name ) { initializer-list }

is always the same as sizeof ( type-name ) then consider

sizeof (char *[]){"abc", "def", "ghi"}

Of course, the grammar could exclude compound literals from sizeof while
permitting a work around by adding parentheses:

sizeof ((char *[]){"abc", "def", "ghi"})

but, again, what is the advantage of adding all the complexity?
Taking the longest derivation is supposed to be
built-in for LALR parsers like YACC, otherwise a lot of things would
not work, such as:

x = y + z;

Stop at 'y' or continue to '+ z'?

Yes, but YACC switches to another mechanism for expressions and
deliberately does not report the myriad shift/reduce conflicts that
would otherwise be shown.
 
E

eliben

What sense would ruling it out make?  If you think that the result of

  sizeof ( type-name ) { initializer-list }

is always the same as sizeof ( type-name ) then consider

 sizeof (char *[]){"abc", "def", "ghi"}

Of course, the grammar could exclude compound literals from sizeof while
permitting a work around by adding parentheses:

  sizeof ((char *[]){"abc", "def", "ghi"})

but, again, what is the advantage of adding all the complexity?
Taking the longest derivation is supposed to be
built-in for LALR parsers like YACC, otherwise a lot of things would
not work, such as:
x = y + z;
Stop at 'y' or continue to '+ z'?

Yes, but YACC switches to another mechanism for expressions and
deliberately does not report the myriad shift/reduce conflicts that
would otherwise be shown.

Eventually it turned out to be a typo in the grammar definition that
was causing the trouble (*blush*). Fixed, it analyzes the full C99
grammar with only the single if-else conflict.

I apologize for the confusion, but at least there's now a fully
functional C99 grammar for YACC (although a Python version thereof -
PLY) available :)

Eli
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top