If I do
p = (char *)p2;
Is this an "explicit" cast? Or just a cast?
I would call it just "a cast". Specifically, the cast (or
"cast expression" or "cast-expression") is the part on the right
hand side of the assignment. From a C99 draft:
6.3.4 Cast operators
Syntax
[#1]
cast-expr:
unary-expr
( type-name ) cast-expr
...
Semantics
[#4] Preceding an expression by a parenthesized type name
converts the value of the expression to the named type.
This construction is called a cast.[73] A cast that specifies
no conversion has no effect on the type or value of an
expression.[74]
__________
73. A cast does not yield an lvalue. Thus, a cast to a
qualified type has the same effect as a cast to the
unqualified version of the type.
74. If the value of the expression is represented with
greater precision or range than required by the type
named by the cast (6.2.1.7), then the cast specifies a
conversion even if the type of the expression is the
same as the named type.
I will note that the phrase "explicit cast" *does* appear in the
(draft) Standard, right in the section I elided above:
Constraints
[#2] Unless the type name specifies a void type, the type
name shall specify qualified or unqualified scalar type and
the operand shall have scalar type.
[#3] Conversions that involve pointers, other than where
permitted by the constraints of 6.3.16.1, shall be specified
by means of an explicit cast.
However, this is the only occurrence of that phrase, and it appears
after the following:
6.2 Conversions
[#1] Several operators convert operand values from one type
to another automatically. This subclause specifies the
result required from such an implicit conversion, as well as
those that result from a cast operation (an explicit
conversion). The list in 6.2.1.7 summarizes the conversions
performed by most ordinary operators; it is supplemented as
required by the discussion of each operator in 6.3.
[#2] Conversion of an operand value to a compatible type
causes no change to the value or the representation.
Paragraph #1 here makes it clear that a cast is an explicit
conversion, and there are other kinds of conversions that are
implicit.
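As a small illustration of that distinction (my own sketch, not text from the draft; the function names are made up for the example), here are the same double-to-int conversion written once with a cast and once without. Both assume the value fits in an int; otherwise the behavior is undefined.

```c
#include <assert.h>

/* Explicit conversion: the cast "(int)" is written in the source. */
int convert_with_cast(double d)
{
    return (int)d;
}

/* Implicit conversion: no cast appears; initializing an int from a
   double converts the value automatically. */
int convert_without_cast(double d)
{
    int n = d;

    /* The conversion performed is the same one the cast would request. */
    assert(n == (int)d);
    return n;
}
```

Either way the fractional part is discarded, so 3.9 becomes 3. The only difference is whether the conversion is spelled out in the source.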
p = p2;
Here p & p2 are pointers but the compiler "implicitly" converts
one to the other since they are different base pointer types.
You need to show declarations (or at least those base types) first,
here.
To make it "work right" let me supply them now, along with the rest
of the original code:
char *p;
void *p2;
... /* presumably this code sets p2 */
p = (char *)p2;
...
p = p2;
Both assignments are now correct. There is only one cast, on the
right hand side of the first ordinary assignment operator.
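To make that concrete, here is a minimal sketch using the same two declarations. The function names are hypothetical, chosen only for this example. It assumes a C compiler; C++ does not allow the implicit "void *" to "char *" conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if the cast and the implicit conversion produce
   the same pointer value (they should, in C). */
int same_pointer_either_way(void *p2)
{
    char *with_cast = (char *)p2;   /* explicit conversion: a cast */
    char *without_cast = p2;        /* implicit conversion: no cast */

    return with_cast == without_cast;
}

/* Reads the first byte through a pointer obtained without any cast. */
char first_char(void *p2)
{
    char *p = p2;

    assert(p != NULL);
    return *p;
}
```

Calling same_pointer_either_way("hello") should give a nonzero result, and first_char("hello") should give 'h'. The cast changes what is written in the source, not what conversion happens.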
Now to my really confused mind, it would be ok to call [the
second expression] an "implicit cast"?
I think it is much better to use the phrase "implicit conversion"
(in this case, from "void *" to "char *"). Reserve the word "cast"
to refer only to the syntactic (source-code) construct.
(I do not know if you have ever worked on the "compiler geek" side
of the computer, but if you have -- or if you can recall the
appropriate college courses -- you might note that modern compilers
generally comprise a number of loosely, or at least not overly
tightly, coupled pieces. One or two front end parts break up the
input into "lexemes" or "tokens" that can be recognized by simple
regular expressions. These feed into a "parser" that turns
syntactically-correct "sentences" into, typically, a "parse tree".
In a toy compiler, this parse tree is almost unconnected to the
original source, although in real compilers the tree is highly
decorated so that each piece can be tracked back to the appropriate
source line or even character, both for error messages -- which
need to refer you back to the source code -- and for associating
final machine instructions with source code for the debuggers.
The parse tree is then manipulated by a semantic analyzer that
replaces the original tree with something equivalent. Semantic
analysis is where most optimization occurs. Consider the following
C parse tree fragment:
(LIST
  (ASSIGN (VAR sum) (DOUBLECONST 0.0))
  (ASSIGN (VAR i) (INTCONST 0))
  (LOOP-WHILE (LESS-THAN (VAR i) (INTCONST 10))
    (ASSIGN (VAR sum)
      (PLUS (VAR sum)
        (DEREF
          (PLUS (ADDR arr)
            (TIMES (VAR i) (INTCONST 8))))))
    (ASSIGN (VAR i) (PLUS (VAR i) (INTCONST 1))))
)
This parse tree might arise out of the source code:
sum = 0.0;
for (i = 0; i < 10; i++)
    sum += arr[i];
where "sum" is a double, "i" is an int, and "arr" is an array of
double. The "INTCONST 8" is simply the size of a double (8 bytes
here), and the DEREF node occurs because arr[i] "means" *(arr + i),
so the syntactic front-end handler simply generates a
(DEREF (PLUS ...)) node in every case.
The semantic analyzer's job includes discovering that the variable
"i" is not used after the loop, so the loop can -- if this is
profitable -- be strength-reduced to eliminate the embedded
multiply:
(LOOP-WHILE (LESS-THAN (VAR i) (INTCONST 80))
  (ASSIGN (VAR sum)
    (PLUS (VAR sum)
      (DEREF
        (PLUS (ADDR arr) (VAR i)))))
  (ASSIGN (VAR i) (PLUS (VAR i) (INTCONST 8))))
This loop is equivalent if and only if "i" is dead after this point.
If not, the strength-reduction can still be done by rewriting each
(VAR i) with (VAR i2), where i2 is created by the compiler, and
adding a final (ASSIGN (VAR i) (INTCONST 10)) after the loop. [Depending on
the compiler, the strength-reducer may simply automatically insert
a new variable, and let later dead-code optimization remove the
unneeded one(s).]
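For those who would rather read C than parse trees, here is a hand-written sketch of the same transformation. It is only an approximation of what a compiler does internally: the function names and ARR_LEN are made up for the example, and sizeof(double) stands in for the "INTCONST 8".

```c
#include <assert.h>
#include <stddef.h>

#define ARR_LEN 10

/* Original form: each arr[i] implies a multiply by sizeof(double). */
double sum_indexed(const double arr[ARR_LEN])
{
    double sum = 0.0;
    int i;

    for (i = 0; i < ARR_LEN; i++)
        sum += arr[i];
    return sum;
}

/* Strength-reduced form: the counter is a byte offset that steps by
   sizeof(double), so no multiply remains inside the loop. */
double sum_strength_reduced(const double arr[ARR_LEN])
{
    const unsigned char *base = (const unsigned char *)arr;
    double sum = 0.0;
    size_t off;

    for (off = 0; off < ARR_LEN * sizeof(double); off += sizeof(double))
        sum += *(const double *)(base + off);
    return sum;
}

/* Checks that both loops agree on a small test array. */
int sums_agree(void)
{
    double arr[ARR_LEN] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    assert(sum_indexed(arr) == 55.0);
    return sum_indexed(arr) == sum_strength_reduced(arr);
}
```

Note that the second version needs explicit casts to step through the array by bytes, and that "off" plays the role of the compiler-created i2: the original "i" never appears, which is fine only because nothing uses it after the loop.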
Following -- or, in some hairier compilers, intermingled with --
semantic analysis and overall high-level optimization, the compiler
must do instruction selection/generation and scheduling. On
modern CPUs, instruction scheduling is just as hard a problem
as semantic analysis, and the selection and generation do affect
which optimizations are appropriate, so this is basically where
all the money is.
On the compiler-geek side, syntax is "mere" syntax: the Lisp-like
parse tree above is just as good as the C fragment, and in many
cases quite preferable. Of course, not all humans feel the same
way about "mere" syntax. A spoonful of "syntactic sugar" often
seems to help the medicine go down.)