Stephen Sprunk said:
My compiler accepts x&(&y), though it warns about converting a pointer
to integer without a cast.
It's a constraint violation, so the only requirement is that the
compiler must issue a diagnostic. Either printing a fatal error
message and failing to translate, or printing a warning (and probably
performing a rather silly implicit pointer-to-integer conversion) is
valid behavior, allowed by the standard.
To all the comments about the lexing/parsing/tokens/etc., I'll admit I
don't know enough about compiler design to understand the
arguments. However, I would expect that the compiler would be able to
distinguish the ** in "x**y" from the ** in "**p" due to the preceding
object/value (or lack thereof). Perhaps that expectation is not
correct, in which case I'll grant there may be technical difficulties
in implementing a binary ** operator.
Take a look at the translation phases defined in C99 5.1.1.2.
Preprocessing tokens are formed in phase 3. In that phase, there is
no "preceding object/value", there's only a stream of tokens, and no
distinction between
    x ** y
where y is the name of a pointer object, and
    x ** y
where y is the name of an integer object. That information doesn't
exist until phase 7. (Preprocessing tokens are converted into tokens
in phase 7, but each preprocessing token is always converted to a
single token.)
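For what it's worth, the pointer case is already valid C today, which is easy to check. The following is only a small sketch; the function name times_deref and the values are mine, chosen for illustration:

```c
#include <assert.h>

/* Hypothetical helper: the expression below is lexed as x, *, *, y
   and parsed as x * (*y), i.e. a multiplication by a dereference. */
int times_deref(int x, const int *y)
{
    return x ** y;
}

/* Small self-check with arbitrary values. */
void times_deref_demo(void)
{
    int v = 7;
    assert(times_deref(6, &v) == 42);
}
```

If y were declared as an int instead, the same four tokens would be a constraint violation (unary * applied to a non-pointer), but that is only discovered in phase 7, long after the tokens were formed.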
Compilers don't necessarily implement each phase as a separate program
(for example, the first several phases are typically combined into the
preprocessor), but each phase does reflect the information that's
available at that point in translation.
In summary, && is *always* treated as a single token (<OT>even in C++,
where x&(&y) can be legal due to operator overloading</OT>); treating
** as either one or two tokens depending on the context would require
information that's not available at the point where the decision needs
to be made.
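To make the longest-match behavior concrete, here is a toy tokenizer. It is a sketch under heavy assumptions, not how any real compiler does it: the name tokenize and the '|' separator are made up, and it only knows identifiers, "&&", and single-character punctuators.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits src into tokens by longest match and
   writes them to out separated by '|'. Returns the token count. */
size_t tokenize(const char *src, char *out, size_t outsize)
{
    size_t count = 0, used = 0;

    if (outsize == 0)
        return 0;
    out[0] = '\0';
    while (*src != '\0') {
        size_t len = 1;

        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        if (isalpha((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else if (src[0] == '&' && src[1] == '&') {
            len = 2;    /* "&&" is a punctuator, so it beats "&" */
        }
        /* Anything else, including '*', is a one-character token:
           there is no "**" punctuator to match. */
        if (used + len + 2 > outsize)
            break;      /* no room for separator, token, and NUL */
        if (count > 0)
            out[used++] = '|';
        memcpy(out + used, src, len);
        used += len;
        out[used] = '\0';
        src += len;
        count++;
    }
    return count;
}

/* Self-check of the spellings discussed in this thread. */
void tokenize_demo(void)
{
    char buf[32];

    assert(tokenize("x&&y", buf, sizeof buf) == 3);
    assert(strcmp(buf, "x|&&|y") == 0);
    assert(tokenize("x&(&y)", buf, sizeof buf) == 6);
    assert(strcmp(buf, "x|&|(|&|y|)") == 0);
    assert(tokenize("x**y", buf, sizeof buf) == 4);
    assert(strcmp(buf, "x|*|*|y") == 0);
}
```

Note that the function never looks at what x or y are declared as. The choice between one token and two depends only on the characters and on which punctuators the language defines.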
Note that typedefs complicate this model a bit. This expression:
    (x)-1
might be either a cast applied to a unary minus, or a subtraction
whose left operand is a parenthesized expression. Which one it is
depends on whether x is a currently visible typedef -- which means
that the parser requires feedback from the symbol table. (This gave
me a few gray hairs some years ago, when I was working on a C parser.)
But parsing and symbol table management both take place during phase
7, so in principle at least it's not as much of an issue. The C
grammar *could* have been designed, and some language grammars are
designed, so that a translation unit can be parsed without any
semantic analysis (knowledge of declarations, etc.), but since
typedefs were added relatively late that wasn't practical.
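The (x)-1 case can be seen in a few lines of C. This is a minimal sketch; the function names and the values are invented for illustration:

```c
#include <assert.h>

typedef long x;     /* at file scope, x names a type */

/* Here x is the typedef, so (x)-1 is a cast applied to -1. */
long as_cast(void)
{
    return (x)-1;
}

/* Here a local variable hides the typedef, so the same tokens
   form a subtraction: the value of x minus 1. */
long as_subtraction(void)
{
    long x = 5;
    return (x)-1;
}

/* Self-check: identical token sequences, different parses. */
void typedef_demo(void)
{
    assert(as_cast() == -1);
    assert(as_subtraction() == 4);
}
```

Both return statements contain exactly the same tokens. The parser can only pick the right parse by asking the symbol table what x currently means.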