Operator precedence.

S

somenath

Hi All,

While browsing through the old article in this group I have come
across one article by Chrish Torek.
http://groups.google.com/group/gnu....ler+munching++++chrish+torek#d97a0f655e96ce16


In this article he said.

"Finally, let me say one last time that the Standard C grammar DOES
NOT use operator precedence. (At most, one can say that the grammar
"implies" a precedence. I believe the word "specifies" is too
strong, despite its appearance in a non-normative footnote."

So the following expression

2+3*5;
is equal to 17. For human * has higher precedence than + so *
operator will be first operated on 3*5 and then the resultant will be
added to 2. So the result will be 2+(3*5) .

Then my question is how C compiler with out bothering about precedence
produce the same result as 2+(3*5).

I tried to explain myself the outcome of 2+3*5 the way Chris Torek
has explained the expression -- - x but I am not able to construes
how the precedence of * will be enforced by the C grammar with out
using operator precedence.

Regards,
Somenath
 
S

Stefan Ram

somenath said:
2+3*5;
Then my question is how C compiler with out bothering about precedence
produce the same result as 2+(3*5).

That whole expression cannot be parsed as a multiplicative expression,
because »2+3« is not a multiplicative expression (see N1570 6.5.5p1),
so it has to be an additive expression (6.5.6p1) with »3*5« being a
multiplicative expression (6.5.5p1).
 
J

James Kuyper

Hi All,

While browsing through the old article in this group I have come
across one article by Chrish Torek.
http://groups.google.com/group/gnu....ler+munching++++chrish+torek#d97a0f655e96ce16


In this article he said.

"Finally, let me say one last time that the Standard C grammar DOES
NOT use operator precedence. (At most, one can say that the grammar
"implies" a precedence. I believe the word "specifies" is too
strong, despite its appearance in a non-normative footnote."

So the following expression

2+3*5;
is equal to 17. For human * has higher precedence than + so *
operator will be first operated on 3*5 and then the resultant will be
added to 2. So the result will be 2+(3*5) .

Then my question is how C compiler with out bothering about precedence
produce the same result as 2+(3*5).

Without going into the details, let me state that 2, 3, and 5 all
qualify as "cast-expressions".

The relevant grammar rules are:

multiplicative-expression:
cast-expression
multiplicative-expression * cast-expression

additive-expression:
multiplicative-expression
additive-expression + multiplicative-expression

The parse tree is as follows (monospaced font required):

2 + 3 * 5
cast-expression + cast-expression * cast-expression
cast-expression +(multiplicative-expression * cast-expression)
multiplicative-expression + multiplicative-expression
(additive-expression + multiplicative-expression)
additive-expression

These grammar rules encode the equivalent of operator precedence in the
fact that the grammar rules for a additive expression require
multiplicative expressions as operands, whereas multiplicative
expressions do not allow additive expressions as operators. If the C
grammar followed a similar pattern for all of the operators, it would be
equally accurate to describe it in terms of precedence or grammar rules.

However, the grammar rules for the conditional expressions do not allow
for a simple precedence-based interpretation:

conditional-expression:
logical-OR-expression
logical-OR-expression ? expression : conditional-express

Any kind of expression can occur between the ? and the :, which would
imply that conditional expressions have the lowest possible precedence.
However, neither the first nor the third operand of a conditional
expression can be an assignment expression or a comma expression, which
would imply that it has higher precedence than assignment operators or
the comma operator.
 
K

Kaz Kylheku

has explained the expression -- - x but I am not able to construes
how the precedence of * will be enforced by the C grammar with out
using operator precedence.

Like this example grammar:

expression := term
| expression + term
| expression - term


term := factor
| term * factor
| term / factor


factor := ( expression )
| number
| identifier

No associativity or precedence is expressed directly by the grammar.

The precedence of * and / is implicity higher than that of + and -
because * and / occur only inside a term, and an expression is made
of terms separated by + or -. The rules are implicitly hierarchical.

Associativity is implicitly left to right because the rules are
left recursive. There is no ambiguity because we do not have ambiguous
rules like this:

expression := expression + expression

but rather

expression := expression + term

So if you have this:

1 + 2 + 3

the only way it can match the pattern "expression + term" is if "1 + 2"
is "expression" and "3" is "term". So it means that 3 is added to the
sum of 1 + 2.
 
J

James Kuyper

somenath wrote: ....

I can't do it either.
I use the precedence table in chapter 2 of K&R instead.

What does that table say about the relative precedence of ||, ?:, =, and
the comma operator? Using that table and the corresponding information
about the associativity of the operators, how do you parse the two
following expressions:

a || b ? d , e : f ? g : h
a ? b : c ? d , e : f = g

Try to determine the correct parsings before reading any farther. The
correct parsings according to the C grammar cannot both be reproduced by
applying a single consistent precedence order to all of the relevant
operators. They are:

(a || b) ? (d , e) : (f ? g : h)
(a ? b : (c ? (d , e) : f)) = g

The second one is a constraint violation - the C standard requires both
that it be parsed that way, and that having been so parsed, it must be
diagnosed as a constraint violation.

Those particular expressions were chosen for a message I sent out
several months ago explaining the differences between C and C++ in the
parsing of such expressions. If you're interested, try the C++ examples:

a || b ? d , e : f = g
a ? b : c ? d , e : f , g

The peculiar spacing was chosen to allow enough room to insert the
parentheses needed to indicate the correct parse, and to line up the
corresponding variables when viewed in a monospaced font.
 
B

Ben Bacarisse

James Kuyper said:
What does that table say about the relative precedence of ||, ?:, =, and
the comma operator? Using that table and the corresponding information
about the associativity of the operators, how do you parse the two
following expressions:

a || b ? d , e : f ? g : h
a ? b : c ? d , e : f = g

Try to determine the correct parsings before reading any farther. The
correct parsings according to the C grammar cannot both be reproduced by
applying a single consistent precedence order to all of the relevant
operators. They are:

(a || b) ? (d , e) : (f ? g : h)
(a ? b : (c ? (d , e) : f)) = g

How do you get this second parse? It's a complex expression but it's
only the top level of it that's bugging me.
The second one is a constraint violation - the C standard requires both
that it be parsed that way, and that having been so parsed, it must be
diagnosed as a constraint violation.

<snip>
 
J

James Kuyper

How do you get this second parse? It's a complex expression but it's
only the top level of it that's bugging me.

I should have reviewed the entire discussion before posting that part of
it. I'd forgotten that you'd already corrected me on that issue the last
time I discussed it. The left operand of an assignment operator must be
a unary expression, so that's a syntax error, not a constraint
violation. That's not fatal to what I was trying to say, but it does
mean that this isn't the correct way to say it.
I need to modify the expression to make the point I was trying to make.
I don't have time to think about it right now, but I'll see what I can
put together later.
 
B

Ben Bacarisse

James Kuyper said:
I should have reviewed the entire discussion before posting that part of
it. I'd forgotten that you'd already corrected me on that issue the last
time I discussed it.

I don't recall that. In fact I seem to recall making a similar mistake
about what the syntax says because I took what gcc reports (that the LHS
of and assignment must be an lvalue) too literally. Anyway, this has
certainly come up before.
The left operand of an assignment operator must be
a unary expression, so that's a syntax error, not a constraint
violation. That's not fatal to what I was trying to say, but it does
mean that this isn't the correct way to say it.
I need to modify the expression to make the point I was trying to make.
I don't have time to think about it right now, but I'll see what I can
put together later.

I think the fact that assignment does not follow the same pattern as the
other operators is all you need to make the point that a single
precedence for each operator can't capture the formal syntax of C.

The conditional operator is tricky, but you can fix the problem by
treating the "?<expression>:" part as the operator. Then it does behave
like a right-associative operator with a precedence between that of ||
and the assignment operators.

gcc's error reports suggest that it treats C expressions as an operator
precedence grammar and that this cases no problems because everything
that is formally a syntax error gets picked up later. As far as the
standard is concerned, a diagnostic is a diagnostic.

Putting these two together -- treating "?<exp>:" as an operator and
giving assignment the expected priority -- means that, as programmers,
we won't go wrong just as gcc does not go wrong. Problems will only
occur in newsgroups where the reason *why* an expression is incorrect
might matter.

I've not checked all the detail, but I don't recall every being misled
in practice by thinking of C expressions as having an operator
precedence grammar.
 
B

BartC

Then my question is how C compiler with out bothering about precedence
produce the same result as 2+(3*5).

I tried to explain myself the outcome of 2+3*5 the way Chris Torek
has explained the expression -- - x but I am not able to construes
how the precedence of * will be enforced by the C grammar with out
using operator precedence.

It defines the syntax so that the "+" and "*" have to be processed in a
particular order. Then that has the same effect as using operator
priorities, parsing to: (+ (2 (* 3 5) instead of (* (+2 3) 5)

Apparently the C syntax is complicated that way (using well over a dozen
different levels of expressions) because of a few cases that need special
treatment. But that didn't really work because these cases are still
confusing even for the experts. And there are too many precedence levels
anyway...
 
J

James Kuyper

I don't recall that. In fact I seem to recall making a similar mistake
about what the syntax says

You did, initially. But you got it right in your message with the
following headers:

Subject: In how many ways should this fail?
Date: Wed, 01 Feb 2012 16:40:09 +0000

....
I was trying to demonstrate the precise limits of the use of precedence
and associativity to describe the C grammar. I provided one example
where every operand of the ?: operator had the lowest precedence allowed
by the grammar, and one example where what would otherwise qualify as
the left and right operands of the ?: operator were expressions with the
highest precedence that could not so qualify. However, I forgot to make
sure that the second expression had a valid parse. Here's the corrected
version of the second expression:

a ? b : c ? d , e : f , g

and the parse required by the C standard:

(a ? b : (c ? (d , e) : f)) , g

I was really trying to make several points:

1. The grammar of C expressions that contain no syntax errors or
conditional operators can be correctly described using precedence and
associativity.
2. Many possible expressions are syntax errors for reasons that cannot
be explained using precedence and associativity; they can only be
explained by reference to the grammar.
3. Some expressions (mainly involving the middle operand of the ?:
operator) are allowed by the C grammar which cannot be explained purely
in terms of precedence and associativity.

Insofar as item 1 is true, precedence and associativity are redundant
with the grammar; because of item 2, the grammar is NOT redundant with
precedence and associativity. Those two facts alone are sufficient to
justify relying upon the grammar, rather than precedence and
associativity. The standard should NOT use both methods of description;
every key point in the standard should be expressed exactly once in the
normative text. If a requirement is expressed in two different places,
the two different expressions might conflict with each other.

Item 3 makes it clear that precedence and associativity are not merely
redundant with the grammar, but are (in rare circumstances) in conflict
with it.

I'm not opposed to thinking in terms of precedence and associativity;
they're useful concepts for most C operators. It's just important to
learn the limits of applicability of those concepts.
 
J

Joe keane

And there are too many precedence levels anyway...

Precedence in C is pretty logical.

**
* /
+ - (unary)
+ - (binary)
//
FROM TO
.EQ. .GE. .GT. .LE. .LT. .NE.
.NOT.
.AND.
.OR.
.EQV. .NEQV.
,
( )

This in turn is taken from math.
 
B

BartC

Joe keane said:
Precedence in C is pretty logical.

**
* /
+ - (unary)
+ - (binary)
//
FROM TO
.EQ. .GE. .GT. .LE. .LT. .NE.
.NOT.
.AND.
.OR.
.EQV. .NEQV.
,
( )

This in turn is taken from math.

Well they don't look like C. And I'm not sure why unary ops are included in
that lot. And whether they are logical or not, too many precedences are
difficult to remember. I tend to use a pattern like this:

6 LOGICAL OR
5 LOGICAL AND
4 COMPARE/LOGICAL XOR [not sure what this one is doing here]
3 ADD/SUB/BITWISE AND/OR/XOR
2 MUL/DIV/MOD/SHIFT
1 POWER (right-to-left)

Since C doesn't have a power op, that leaves just 5 levels; only two of
which are needed for normal arithmetic. A bit easier or not?
 
K

Keith Thompson

Precedence in C is pretty logical.

**
* /
+ - (unary)
+ - (binary)
//
FROM TO
.EQ. .GE. .GT. .LE. .LT. .NE.
.NOT.
.AND.
.OR.
.EQV. .NEQV.
,
( )

This in turn is taken from math.

Did you mean to say "Precedence in Fortran is pretty logical"?

I don't think anyone would argue that C's precedence levels are entirely
logical.
 
J

James Kuyper

Well they don't look like C. ...

It looks like Fortran to me; I'm not quite sure what he meant by that.
... And I'm not sure why unary ops are included in
that lot. ...

I'm not sure why you consider that odd.
The precedence of unary operators matters; otherwise *a++ could be
either (*a)++ or *(a++), and *a-b could be either *(a-b) or (*a)-b. C
gives postfix operators the highest precedence, and unary prefix
operators second highest, a distinction that neither his list nor your
list addresses. As a result, *a->b is *(a->b), not (*a)->b.

.... And whether they are logical or not, too many precedences are
difficult to remember. I tend to use a pattern like this:

6 LOGICAL OR
5 LOGICAL AND
4 COMPARE/LOGICAL XOR [not sure what this one is doing here]
3 ADD/SUB/BITWISE AND/OR/XOR
2 MUL/DIV/MOD/SHIFT
1 POWER (right-to-left)

Where would =, op=, and the comma operator fit into the above scheme?
Since C doesn't have a power op, that leaves just 5 levels; only two of
which are needed for normal arithmetic. A bit easier or not?

C has 16 levels, but that's partly because it covers 21 operators not
present in your list. However, the items covered by your list span 10 of
C's precedence levels, so your point still holds.

There's always a trade-off. For instance, your approach makes no
distinction between (a&b) | (c&d) and (a|b) & (c|d); they both need
parentheses in order to work as intended. C's approach is based upon the
reasonable assumption that expressions of the first form are
substantially more common than the second, so it gives binary & higher
precedence than |, rendering the parentheses unnecessary in the first form.

Of course, for people who have trouble remembering C's order of
precedence, the parenthesis are, effectively, mandatory either way, so
it's not much of a difference.
 
B

Ben Bacarisse

BartC said:
Well they don't look like C. And I'm not sure why unary ops are
included in that lot. And whether they are logical or not, too many
precedences are difficult to remember. I tend to use a pattern like
this:

6 LOGICAL OR
5 LOGICAL AND
4 COMPARE/LOGICAL XOR [not sure what this one is doing here]
3 ADD/SUB/BITWISE AND/OR/XOR
2 MUL/DIV/MOD/SHIFT
1 POWER (right-to-left)

Since C doesn't have a power op, that leaves just 5 levels; only two
of which are needed for normal arithmetic. A bit easier or not?

At least one component of "easier" is "logical" so it may help to look
at C's operators in that light.

Because of the strong mathematical the connection between &/| and */+ it
is natural to separate these pairs by precedence. Since C also has
&&/|| that gives us 6 levels right away. You could argue that these
need not all be distinct, but logical connections (&& and ||) are often
made between arithmetic and bit-wise expressions and almost never the
other way round, and although the case for separating &/| from */+ is
more pragmatic it still strong.

Order relationships are most often made between arithmetic expressions
and they are often joined by logical connectives, so another level seems
natural for these. This gives us 7 levels.

Likewise, assignment should allow almost anything to be assigned without
parentheses, but sequencing (,) should allow assignment without any
because it's main purpose, after all, is to permit a explicit
side-effect, to occur in sequence. That takes us to up to 9.

Unary operators need a 10th level just to be consistent with normal
mathematical notation, but because C has postfix unary operators as well
we should ask if they can share the level with their prefix cousins.
Expressions like -a[10] and *f(0) are common, so it seems impractical to
require parentheses for them and, because the expected meaning is not
easily expressed using associativity, this leads to an 11th level.

C has 15 levels -- the other 4 coming from bit-wise XOR, the shift
operators, the conditional operator, and from separating equality from
the other comparison operators.

Once you have 11, it seems to me you may as well have 15. Putting ^ in
with, say, |; or putting the shift operators in with * would probably be
as confusing as giving them their own level. And separating == and !=
from < and friends has strong arguments in its favour, which there is no
reason to ignore once you already have so many levels.

In short, I don't think C would be logical with fewer than 11 levels,
and once you have 11, does it matter if the distinctions between
operators are emphasized by having 15? Would the degree of
arbitrariness required to reduce the number eliminate the benefit of
having fewer?
 
B

BartC

James Kuyper said:
It looks like Fortran to me; I'm not quite sure what he meant by that.

But what are those funny ones like // and FROM? If // is integer division,
then why does it need to be different from /?
I'm not sure why you consider that odd.
The precedence of unary operators matters; otherwise *a++ could be
either (*a)++ or *(a++), and *a-b could be either *(a-b) or (*a)-b. C
gives postfix operators the highest precedence, and unary prefix
operators second highest, a distinction that neither his list nor your
list addresses. As a result, *a->b is *(a->b), not (*a)->b.

Unary ops are executed before binary ones, so they shouldn't even be in the
same list. I'd've thought that was pretty standard (in the same way that you
don't need to add parentheses to such a list.

And they would naturally be executed from the inside outwards, so precedence
doesn't matter there.

It's only an issue where unary ops can appear on either side of a term, as
in your examples. The precedence table in K&R2 suggests that a series of
unary ops such as op1 op2 A op3 op4 should be calculated as op1(op2((A op4)
op3)) (with 3 and 4 interchanged). They are confusing.
... And whether they are logical or not, too many precedences are
difficult to remember. I tend to use a pattern like this:

6 LOGICAL OR
5 LOGICAL AND
4 COMPARE/LOGICAL XOR [not sure what this one is doing here]
3 ADD/SUB/BITWISE AND/OR/XOR
2 MUL/DIV/MOD/SHIFT
1 POWER (right-to-left)

Where would =, op=, and the comma operator fit into the above scheme?

If you write the following:

A = B op C;

you would naturally expect B op C, what op happens to be, to be done before
the assignment. This suggests assignment should always be done last, and
belongs (with assign-op combinations) at the top of my list).
C has 16 levels, but that's partly because it covers 21 operators not
present in your list. However, the items covered by your list span 10 of
C's precedence levels, so your point still holds.

16 is too many (so is 15, the number listed by K&R). Why is == different
from <= for example? While <<,>> belong with multiply and divide. And if I
read the table properly, then a & b == c & d actually means a & (b==c) & d;
that obviously doesn't look right, not for bitwise & anyway.
There's always a trade-off. For instance, your approach makes no
distinction between (a&b) | (c&d) and (a|b) & (c|d); they both need
parentheses in order to work as intended.

Those bitwise and/or levels may be a mistake; & should be the same as
multiply, and | the same as add (because & and * can be both be used for set
intersection, and | and + can both be used for set union; and yes I know C
doesn't have sets...).

This all means only 2-3 levels to remember for normal arithmetic, and
another 3 (compare, and, or) for conditional expressions.

Stuff such as assignments, () and 'comma' I would consider language
constructs not operators. The ?: op I would also implement as a construct,
but with mandatory parentheses.
 
J

James Kuyper

But what are those funny ones like // and FROM? If // is integer division,
then why does it need to be different from /?

I've no idea - the last time I did any significant Fortran programming
was 2 decades ago, in VMS Fortran. There have been two or three?
revisions to the Fortran standard since then, including a lot of new
features that make it a very different language. I can't vouch for
whether // and FROM might be two of those new features, but the rest of
list looked like Fortran to me.
Unary ops are executed before binary ones, so they shouldn't even be in the
same list.

This is about hierarchy within the parse tree, not execution order. The
parse tree imposes some constraints on the execution order, but does not
determine it. In the expression *a + ~b * !c, *a need not be evaluated
until after the multiplication.

The features of the grammar that give postfix and prefix operators
higher precedence than the other operators are precisely why they they
go before the other ones.
... I'd've thought that was pretty standard (in the same way that you
don't need to add parentheses to such a list.

And they would naturally be executed from the inside outwards, so precedence
doesn't matter there.
It's only an issue where unary ops can appear on either side of a term, as

Which is precisely why unary ops in C occupy two different precedence
levels, one for postfix and one for prefix operators.
in your examples. The precedence table in K&R2 suggests that a series of
unary ops such as op1 op2 A op3 op4 should be calculated as op1(op2((A op4)
op3)) (with 3 and 4 interchanged). They are confusing.

That's a rather confusing statement. Why did you write "with 3 and 4
interchanged" rather than simply interchanging 3 and 4?
... And whether they are logical or not, too many precedences are
difficult to remember. I tend to use a pattern like this:

6 LOGICAL OR
5 LOGICAL AND
4 COMPARE/LOGICAL XOR [not sure what this one is doing here]
3 ADD/SUB/BITWISE AND/OR/XOR
2 MUL/DIV/MOD/SHIFT
1 POWER (right-to-left)

Where would =, op=, and the comma operator fit into the above scheme?

If you write the following:

A = B op C;

you would naturally expect B op C, what op happens to be, to be done before
the assignment. This suggests assignment should always be done last, and
belongs (with assign-op combinations) at the top of my list).

So you need an additional level for that, too. In C, it's not the lowest
precedence, so a , b = c is parsed as a, (b = c), not (a,b)=c (which
would be a constraint violation).

Because C uses grammar productions rather than precedence tables, the
expression a ? b = c : d is not the syntax error (a?b) = (c:d), but is
instead a ? (b=c) : d.
16 is too many (so is 15, the number listed by K&R). Why is == different
from <= for example? While <<,>> belong with multiply and divide. And if I
read the table properly, then a & b == c & d actually means a & (b==c) & d;
that obviously doesn't look right, not for bitwise & anyway.

Not all of the choices made for operator precedence were good ones;
IIRC, Ritchie himself acknowledged that fact. In a new language where
backwards compatibility was not an issue, they should be done
differently. But I don't think that fewer precedence levels is the way
to go.
 
B

BartC

James Kuyper said:
On 05/11/2012 11:42 AM, BartC wrote:

This is about hierarchy within the parse tree, not execution order. The
parse tree imposes some constraints on the execution order, but does not
determine it. In the expression *a + ~b * !c, *a need not be evaluated
until after the multiplication.

For unary operators, I think hierarchy matches execution order.
The features of the grammar that give postfix and prefix operators
higher precedence than the other operators are precisely why they they
go before the other ones.


Which is precisely why unary ops in C occupy two different precedence
levels, one for postfix and one for prefix operators.

Do they? In the K&R table, they seem to occupy one level.
That's a rather confusing statement. Why did you write "with 3 and 4
interchanged" rather than simply interchanging 3 and 4?

Because you might have thought it a typo otherwise. K&R says unary ops
associate from right to left. So op4 first.
So you need an additional level for that, too. In C, it's not the lowest
precedence, so a , b = c is parsed as a, (b = c), not (a,b)=c (which
would be a constraint violation).

Yes, you can think of it as an extra level, but it can just be the grammar.
The main thing you don't need to stop and think about it.
Because C uses grammar productions rather than precedence tables, the
expression a ? b = c : d is not the syntax error (a?b) = (c:d), but is
instead a ? (b=c) : d.

The three-way ?: is too weird to even have as operator. How would you even
parenthesise it so that ?: is done before the assignment? Besides, according
to the precedence table, ?: comes before assignment!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,067
Latest member
HunterTere

Latest Threads

Top