sizeof takes more than lvalues ...
Yes; and it also has more than one syntax:
sizeof expr
sizeof ( type-name )
An expression can, but need not, include outer parentheses; but
to use sizeof on a type-name you must use the parentheses.
consider...
sizeof(int *)
This one requires the parentheses.
sizeof('A')
sizeof(33.029e-3LD)
These two do not. (But "e-3LD" is syntactically wrong; I assume
you mean "e-3L", to make it a long-double constant.)
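As a quick check of the two syntaxes, here is a minimal sketch. The function name sizeof_forms_agree is my own invention for illustration, and the code assumes a C (not C++) compiler, since in C a character constant has type int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if each parenthesized form gives the
   same result as its unparenthesized counterpart. */
int sizeof_forms_agree(void)
{
    int *p = 0;

    /* A character constant has type int in C (unlike C++). */
    assert(sizeof 'A' == sizeof(int));

    return sizeof(int *) == sizeof p              /* type-name vs. expression */
        && sizeof('A') == sizeof 'A'              /* parentheses optional */
        && sizeof(33.029e-3L) == sizeof 33.029e-3L
        && sizeof 33.029e-3L == sizeof(long double);
}
```

Writing "sizeof int *" without the parentheses would be a syntax error, which is the one case where they are not optional.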
so are types and constants objects too (note, sizeof is directly taking
the objects as input above, not just lvalues that refer to the objects)
No, but they *are* expressions.
here is another example
sizeof("String Literal")
here, sizeof is receiving only a pointer to the first element of the
string ('S'), so it's equivalent to: sizeof(char *)
but that's not what we get, sizeof actually returns the size of the
whole string literal
I don't think sizeof fits cleanly with the theory of lvalues/rvalues.
This is a more interesting case, because of an earlier comp.lang.c
discussion about string literals as initializers:
char s1[] = "this is OK";
char s2[] = ("but this is not");
A string literal -- which is a source code construct, rather than
something you might see at runtime -- can be used as an initializer
for an object of type "array N_opt of char", but if it is to be
used this way, it *must not* be enclosed in parentheses. A number
of compilers allow the parentheses anyway, no doubt because their
parsers have stripped them off by the time the partially-digested
token-sequence is delivered to the part of the compiler front-end
that finishes decorating the parse tree (adjusting types, adding
conversions where implied, and so on).
All of this is something of an aside, though, because given:
char buf[20];
we know that:
sizeof(buf) == sizeof buf
and both operands of the equality operator are (size_t)20. The
implication here is that, although an array may be surrounded by
parentheses in an expression, it remains an array: it does not
undergo the "degeneration" or "decay", as some like to call it,
that converts "array N of T" to "pointer to T" merely because it
is parenthesized. (It merely happens that some compilers do this
parentheses-stripping a bit "overzealously", as it were, so that
the string-literal-as-initializer works even when a diagnostic is
required.)
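The array-stays-an-array point can be checked directly. This is only a sketch; the function name array_sizes_ok is hypothetical, and the numeric sizes follow from counting characters plus the terminating '\0'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if arrays keep their array type
   as operands of sizeof, with or without parentheses. */
int array_sizes_ok(void)
{
    char buf[20];
    char s1[] = "this is OK";    /* 10 characters plus '\0' */

    /* Explicitly taking the address of the first element does
       produce a pointer, so its size is that of a char *. */
    assert(sizeof(&buf[0]) == sizeof(char *));

    return sizeof(buf) == 20
        && sizeof buf == 20
        && sizeof s1 == 11
        && sizeof("String Literal") == 15;   /* 14 characters plus '\0' */
}
```

So sizeof("String Literal") is the size of the whole array of char, not sizeof(char *), which answers the question quoted above.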
The whole point of the "object context" vs "value context" that
Luke Wu brings up is to maintain, within the compiler's parse-tree
code, the notion of whether we want to convert array-objects to
pointer-values by computing &arr[0]. (In addition, we must also
remember whether we need to fetch the value of an ordinary object,
so that in:
int a = 3, b = 5;
... any (or no) code that does not change a or b ...
a = b;
we put the value/"rvalue" of b -- 5 -- into the ["lvalue"] object
a, rather than fetching a's previous value of 3, and trying to put
b's value into 3.) Inside the compiler, this context is generally
implicit: we know, based on the operator(s), whether we want to
find the actual *value* of "a" (3, in this case), or simply remember
the *name*:
      =
     / \
    a   b
can be optimized to:
      =
     / \
    a   5
(because b is still known to be 5), but not to:
      =
     / \
    3   5
which is nonsensical. This property of "I want a value on the
right, but an object on the left" is associated with the ordinary
assignment operator "=".
Now, there *is* a significant difference between the sizeof and
assignment operators here, in that sizeof permits any expression
of any (legal) type, object-name or not, while "=" demands a
modifiable "lvalue" (most simply, an object-name) on the left:
"3 = 5;" is an error, but "sizeof 3" is OK.
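A small sketch of that difference at the source level follows. The function name lvalue_demo is made up for this example, and the rejected statement is left in a comment so that the code still compiles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the assignments and sizeof
   expressions below behave as described. */
int lvalue_demo(void)
{
    int a = 3, b = 5;
    int *p = &a;

    a = b;          /* a names an object; b supplies a value */
    assert(a == 5);
    *p = b + 1;     /* *p is an lvalue too, though not a plain name */
    /* 3 = 5; */    /* would be rejected: 3 is a value, not an object */

    return a == 6
        && sizeof 3 == sizeof(int)         /* a value is fine for sizeof */
        && sizeof a == sizeof(a + b);      /* object or value, same type */
}
```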
All this means is that, in the part of the compiler that deals
with an "=" operator, we have:
    /* assume "struct tree *tree" and tree->op is the op, tree->left
       is the LHS and tree->right is the RHS, with tree->monad #defined
       as either tree->left or tree->right for monadic (unary)
       operators */
    switch (tree->op) {
    ...
    case ASSIGN:
        if (!is_lvalue(tree->left))
            error("assignment operator requires an lvalue");
        tree->right = rvalue_convert(tree->right, get_typecode(tree->left));
        /* rvalue_convert produces the error if the conversion is invalid */
        break;
while in the code for "sizeof" we have:
    case SIZEOF:
        typecode = get_typecode(tree->monad);
        if (is_incomplete_type(typecode)) /* includes sizeof(void) */
            error("sizeof incomplete type");
        if (is_function_type(typecode))
            error("cannot take size of function");
        tree->type = TYPE_SIZE_T;
        /* the division is to get the size in bytes rather than bits */
        tree->value = type_size(typecode) / type_size(TYPE_CHAR);
        tree->is_constant = 1;
        tree_releasenode(tree->monad); /* no longer needed */
        break;
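For anyone who wants to play with the idea, here is a heavily simplified, self-contained toy version of the two cases. Every name in it (struct tree's layout, the enum values, check_node, demo_checks) is hypothetical and invented for this sketch; it is not taken from any real compiler, and it handles only char and int operands.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node kinds and type codes, invented for this sketch. */
enum op { OP_NAME, OP_CONST, OP_ASSIGN, OP_SIZEOF };
enum typecode { TY_CHAR, TY_INT, TY_VOID, TY_FUNC, TY_SIZE_T };

struct tree {
    enum op op;
    enum typecode type;
    struct tree *left, *right;   /* unary operators use left only */
    size_t value;
    int is_constant;
};

/* In this toy, only a plain name counts as an lvalue. */
static int is_lvalue(const struct tree *t) { return t->op == OP_NAME; }

/* Returns NULL if the node is acceptable, else a diagnostic message. */
const char *check_node(struct tree *t)
{
    assert(t != NULL);
    switch (t->op) {
    case OP_ASSIGN:
        if (!is_lvalue(t->left))
            return "assignment operator requires an lvalue";
        t->type = t->left->type;
        return NULL;
    case OP_SIZEOF:
        /* No lvalue test here: only the operand's type matters. */
        if (t->left->type == TY_VOID)
            return "sizeof incomplete type";
        if (t->left->type == TY_FUNC)
            return "cannot take size of function";
        t->value = (t->left->type == TY_CHAR) ? sizeof(char) : sizeof(int);
        t->type = TY_SIZE_T;
        t->is_constant = 1;
        return NULL;
    default:
        return NULL;
    }
}

/* Builds the trees for "3 = 5" and "sizeof 3" and checks both. */
int demo_checks(void)
{
    struct tree three  = { OP_CONST, TY_INT, NULL, NULL, 3, 1 };
    struct tree five   = { OP_CONST, TY_INT, NULL, NULL, 5, 1 };
    struct tree assign = { OP_ASSIGN, TY_INT, &three, &five, 0, 0 };
    struct tree size   = { OP_SIZEOF, TY_INT, &three, NULL, 0, 0 };

    return check_node(&assign) != NULL    /* "3 = 5" is rejected */
        && check_node(&size) == NULL      /* "sizeof 3" is accepted */
        && size.is_constant
        && size.value == sizeof(int);
}
```

The SIZEOF case never asks whether its operand is an lvalue, while the ASSIGN case asks exactly that about its left operand.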
In other words, we do not need to *check* whether the argument to
sizeof is an object or a value, nor do we have to pass it into the
part of the compiler that extracts an rvalue from an lvalue (which
I called "rvalue_convert" here) if necessary, because all we care
about, in evaluating the "sizeof" operator, is the *type* of the
argument to sizeof. (This is no longer true in C99, where we have
to check whether the argument is a VLA and perhaps generate code
at runtime rather than just marking the result as "is a constant".
But C99 is a much more complicated language than C89.)
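Both halves of that can be seen in a short sketch. It assumes a compiler that supports variable length arrays (required in C99, optional from C11 on), and the function name sizeof_evaluation_demo is my own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if sizeof leaves a non-VLA operand
   unevaluated and computes a VLA's size from its run-time length. */
int sizeof_evaluation_demo(int n)
{
    assert(n > 0);                  /* a VLA needs a positive length */

    int i = 0;
    size_t fixed = sizeof(i++);     /* only the type is used: i stays 0 */
    int vla[n];                     /* C99 variable length array */
    size_t dynamic = sizeof vla;    /* computed at run time */

    return i == 0
        && fixed == sizeof(int)
        && dynamic == (size_t)n * sizeof(int);
}
```

Some compilers warn about the i++ inside sizeof, precisely because the side effect never happens.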