lvalues and rvalues

  • Thread starter Nicklas Karlsson

Keith Thompson

Isn't that addition redundant? I can't think of *any* context where an
lvalue that doesn't designate an object can be safely evaluated.

Yes, you're right. (I get the feeling this isn't the first time I've
made that mistake.)

The current standard says (C99 6.3.2.1p1):

An lvalue is an expression with an object type or an incomplete
type other than void;

if an lvalue does not designate an object when it is evaluated,
the behavior is undefined.

The second part of the sentence is ok. For example:

int i;
int *ptr = NULL;

i = *ptr;

*ptr is an lvalue, it's being evaluated in a context that doesn't
require an lvalue, and the behavior is undefined because it doesn't
currently designate an object. <OT>(I think the way C++ expresses it
is that it's evaluated as an lvalue, and then an lvalue-to-rvalue
conversion is performed.)</OT>

The real problem is that the first part of the sentence is too broad;
it makes 42 an lvalue, which it clearly is not intended to be.

The C90 standard's definition (an lvalue is an expression ... that
designates an object) was too narrow; it implied that *ptr is an
lvalue if and only if ptr currently contains a valid non-null pointer
value, whereas lvalue-ness needs to be determinable at compile time.

Either an lvalue needs to be defined as an expression that
*potentially* designates an object (defining "potentially" is tricky),
or the definition needs to enumerate the set of expressions that are
lvalues (I think they can be defined syntactically), or it can just
say that an lvalue is an expression whose description in section 6.5
says it's an lvalue.
 

Nicklas Karlsson

"I'd say "address" rather than "starting address". In C, an address
is typed. It doesn't point to the beginning of an object, it points to
the object."

So since an address is typed, int* for example, one says that it
points to the entire object? It points to an "int", but nevertheless
it still points to the first allocated byte of the sizeof(int) bytes
allocated for an int.

"No, an lvalue, like any expression, has a type."

I thought that the result yielded by an expression has a type, not the
expression itself. Could you give an example of how an expression has
a type?

"Too many words. An rvalue is the value of an expression, that's all.
It's also a term that the Standard doesn't use, and I see little point
in using it while discussing C. <OT>(C++ is another matter.)</OT>"

OK, but even though the term is avoided, could I rephrase it along
these lines: an rvalue is the result (value) yielded by any
expression, except when the expression is an lvalue appearing on the
LHS?
 

Keith Thompson

Nicklas Karlsson said:
"I'd say "address" rather than "starting address". In C, an address
is typed. It doesn't point to the beginning of an object, it points
to the object."

So since an address is typed, int* for example, one says that it
points to the entire object? It points to an "int", but nevertheless
it still points to the first allocated byte of the sizeof(int) bytes
allocated for an int.

Well, that's one way to look at it, but I find it clearer to think of
an address (pointer value) as pointing to the entire object. It's one
way in which a C pointer value is conceptually different, and at a
higher level, than a machine address.

"No, an lvalue, like any expression, has a type."

I thought that the result yielded by an expression has a type, not the
expression itself. Could you give an example of how an expression has
a type?

An expression has the type of the value that it yields. For example,
the expression ``2 + 2'' is of type int, as is the value, 4, that it
yields.

I'm not 100% certain that the standard expresses it this way, but it
seems reasonable.

"Too many words. An rvalue is the value of an expression, that's all.
It's also a term that the Standard doesn't use, and I see little point
in using it while discussing C. <OT>(C++ is another matter.)</OT>"

OK, but even though the term is avoided, could I rephrase it along
these lines: an rvalue is the result (value) yielded by any
expression, except when the expression is an lvalue appearing on the
LHS?

I suppose so.
 

Stefan Ram

Nicklas Karlsson said:
"I'd say "address" rather than "starting address". In C, an
address is typed. It doesn't point to the beginning of an
object, it points to the object."

int main( void ){ char a[ 9 ]; char * p = a; return !p; }

Above, p »points« to the beginning of the object a.

int main( void ){ char a[ 9 ]; char( *p )[ 9 ]= &a; return !p; }

Above, p indeed points to a.
 

Eric Sosman

Well, that's one way to look at it, but I find it clearer to think of
an address (pointer value) as pointing to the entire object. It's one
way in which a C pointer value is conceptually different, and at a
higher level, than a machine address.

It seems to me that an int* could address any byte of an
int object (maybe even no byte at all), so long as the platform
knows how to get at the entire int given the int* value.
Converting to a char* must produce a pointer to the [0] element
of a sizeof(int)-byte array that overlays the int object, but
that's no obstacle: conversion need not be a bit-for-bit copy.

Here's a challenge: If an int* does *not* address the first
byte of an int object, can a portable[*] program detect the fact?

[*] "Portable" instead of "strictly conforming," because the
output of an S.C. program cannot depend on implementation-defined,
unspecified, or undefined behavior. Since the representation of
an int* value is unspecified, an S.C. program that detected something
about it could not report what it discovered. But let's take an
informal notion of "portable" to mean that we're not allowed to
inspect the bits of the int* itself and try to find the bits of
a char* somewhere inside it. What can a "portable" program do to
show or refute "An int* holds the address of the byte just after
the int object," for example?
 

Keith Thompson

Keith Thompson said:
An expression has the type of the value that it yields. For example,
the expression ``2 + 2'' is of type int, as is the value, 4, that it
yields.

I'm not 100% certain that the standard expresses it this way, but it
seems reasonable.
[...]

In fact the standard does talk about the type of an expression.
Just a few examples:

C99 6.5.1p3:
A constant is a primary expression. Its type depends on its form
and value, as detailed in 6.4.4.

p4:
A string literal is a primary expression. It is an lvalue with
type as detailed in 6.4.5.

p5:
A parenthesized expression is a primary expression. Its type and
value are identical to those of the unparenthesized expression.

6.5.2.1p1:
One of the expressions shall have type ‘‘pointer to object
_type_’’, the other expression shall have integer type, and the
result has type ‘‘_type_’’.
(Underscores denote italics.)
 

Phil Carmody

Keith Thompson said:
Well, that's one way to look at it, but I find it clearer to think of
an address (pointer value) as pointing to the entire object. It's one
way in which a C pointer value is conceptually different, and at a
higher level, than a machine address.

Yeah, but no, but yeah, but no, but most architectures have some
kind of indexed indirect addressing mode; so an address register
could be numerically equal to the address of the start of an object,
but used to access the whole object.

And in fact on my alpha, the machine address points to *more*
than the object, if the object is less than 64 bits - char access
requires post-processing to get the single octet desired from the
whole word.

Phil
 

Keith Thompson

Phil Carmody said:
Yeah, but no, but yeah, but no, but most architectures have some
kind of indexed indirect addressing mode; so an address register
could be numerically equal to the address of the start of an object,
but used to access the whole object.

And an address register might very well contain something that looks
like, and can be operated on as, an integer.

A C pointer value, conceptually at least, does not.
 

Phil Carmody

Eric Sosman said:
Well, that's one way to look at it, but I find it clearer to think of
an address (pointer value) as pointing to the entire object. It's one
way in which a C pointer value is conceptually different, and at a
higher level, than a machine address.

It seems to me that an int* could address any byte of an
int object (maybe even no byte at all), so long as the platform
knows how to get at the entire int given the int* value.
Converting to a char* must produce a pointer to the [0] element
of a sizeof(int)-byte array that overlays the int object, but
that's no obstacle: conversion need not be a bit-for-bit copy.

Here's a challenge: If an int* does *not* address the first
byte of an int object, can a portable[*] program detect the fact?

Sometimes. If the int* compares equal to 0, then it clearly
does not address the first byte of an int object. The answer
can not be 'always', IMHO, as I can't see any way a portable
program can know if a pointer's been freed or not.

Phil
 

Ike Naar

"No, an lvalue, like any expression, has a type."

I thought that the result yielded by an expression has a type, not the
expression itself. Could you give an example of how an expression has
a type?

If pd has type pointer-to-double, then the expression ``*pd''
has type double, even when pd does not point to an object.
 

Phil Carmody

Keith Thompson said:
And an address register might very well contain something that looks
like, and can be operated on as, an integer.

A C pointer value, conceptually at least, does not.

I don't see how that addresses my point. A machine's address
register can be used to refer to more than just an atomic
entity at a single address.

Phil
 

Keith Thompson

Nicklas Karlsson said:
"I'd say "address" rather than "starting address". In C, an
address is typed. It doesn't point to the beginning of an
object, it points to the object."

int main( void ){ char a[ 9 ]; char * p = a; return !p; }

Above, p »points« to the beginning of the object a.

It points to the first element of the object a. It does so because
the expression ``a'' (of type char[9]) was implicitly converted to
type char*.

Another example:

struct big_struct {
/* a bunch of member declarations */
};
struct big_struct arr[10];
struct big_struct *ptr = arr; /* another implicit conversion */

Now the expression ``ptr'' points to the first element of arr,
which is of type struct big_struct. It doesn't, I would say,
point to *the beginning* of the structure; it points to the entire
structure. On the other hand, ``(char*)ptr'' points to the first
byte of the structure.

On the other hand, it's a bit difficult to say what ``(void*)ptr''
points to.

I'm not claiming that my way of looking at it (pointers point to
entire objects) is right and yours (pointers point to the beginning of
objects) is wrong. I just find my way clearer and more consistent
with C semantics. It more clearly reflects the ways in which C
pointers are not the same as machine-level addresses.

[...]
 

Keith Thompson

Phil Carmody said:
I don't see how that addresses my point. A machine's address
register can be used to refer to more than just an atomic
entity at a single address.

I'm not sure exactly what your point is. Can you clarify? C doesn't
describe pointers in terms of the behavior of a machine's address
register.

A machine address register typically contains, say, 32 or 64
bits of data that usually refer to some location in memory.
The meaning of those 32 or 64 (or whatever) bits of data depends
on what instructions are used to manipulate them. (Some machines
don't have address registers; they just store addresses in general
purpose registers.)

A C address / pointer value, on the other hand, has a type associated
with it (though this association probably exists only in the source
code and at compile time).

If you have an array of structures, then a pointer to the entire
array, a pointer to the first element of the array, and a pointer
to the first member of the first element of the array are likely
to have the same representation, but I don't think that implies
that a pointer doesn't (conceptually) point to the entire object.
 

Seebs

If you have an array of structures, then a pointer to the entire
array, a pointer to the first element of the array, and a pointer
to the first member of the first element of the array are likely
to have the same representation, but I don't think that implies
that a pointer doesn't (conceptually) point to the entire object.

In particular, you can have three pointers which, converted to (void *),
compare equal, but which point to different objects. One could point
to the array, one to the first structure in that array, and one to the
first element in that structure.

More interestingly, so far as I can tell, the three pointers could have
different logical bounds, such that a bounds-checking implementation could
react differently to attempts to memcpy a large number of bytes from or
to them, even though the pointers compare equal.

-s
 

Stephen Sprunk

A machine address register typically contains, say, 32 or 64
bits of data that usually refer to some location in memory.
The meaning of those 32 or 64 (or whatever) bits of data depends
on what instructions are used to manipulate them. (Some machines
don't have address registers; they just store addresses in general
purpose registers.)

A C address / pointer value, on the other hand, has a type associated
with it (though this association probably exists only in the source
code and at compile time).

A good point. A C pointer has a type, which implies a size; a machine
address does not have a type, so the particular instruction(s) that
use/manipulate that (typeless) address must imply the size of the
object. The compiler uses pointer type information to choose those
instructions, though...

If you have an array of structures, then a pointer to the entire
array, a pointer to the first element of the array, and a pointer
to the first member of the first element of the array are likely
to have the same representation, but I don't think that implies
that a pointer doesn't (conceptually) point to the entire object.

True, at a C level, but is there really a difference at the machine
level on real-world implementations?

More importantly, assuming a T is a multi-byte object and pointers are
simple memory addresses, is it possible for &T to be the address of a
byte _other than_ the first (i.e. lowest) byte of the object?

S
 

Stephen Sprunk

"I'd say "address" rather than "starting address". In C, an address
is typed. It doesn't point to the beginning of an object, it points to
the object."

So since an address is typed, int* for example, one says that it
points to the entire object?

Yes.

It points to an "int", but nevertheless it still points to the first
allocated byte of the sizeof(int) bytes allocated for an int.

I think that would fall into the "implementation detail" category. In
theory, the representation _could_ be the address of the _last_ byte of
the object, and conversion to (char *) could subtract sizeof(int)-1
bytes; a conforming program couldn't tell the difference. I doubt any
implementation (other than the famed DS9k) is actually that evil, though.

"No, an lvalue, like any expression, has a type."

I thought that the result yielded by an expression has a type, not the
expression itself. Could you give an example of how an expression has
a type?

Consider sizeof(1+1). The expression is not evaluated, so there is no
result and therefore no type to the result. Where would sizeof get the
type information it needs if not from the expression itself?

The type of the result of evaluating an expression is the same as the
type of the expression itself, though, so it's generally not an
important distinction.

"Too many words. An rvalue is the value of an expression, that's all.
It's also a term that the Standard doesn't use, and I see little point
in using it while discussing C. <OT>(C++ is another matter.)</OT>"

OK, but even though the term is avoided, could I rephrase it along
these lines: an rvalue is the result (value) yielded by any
expression, except when the expression is an lvalue appearing on the
LHS?

The LHS of an assignment operator is just the most obvious place one
needs an lvalue; there are others (e.g. the operand of ++).

S
 

Keith Thompson

Stephen Sprunk said:
A good point. A C pointer has a type, which implies a size; a machine
address does not have a type, so the particular instruction(s) that
use/manipulate that (typeless) address must imply the size of the
object. The compiler uses pointer type information to choose those
instructions, though...

Note that the size is just one of many attributes of the type. The
type also affects which operations can be performed on the pointed-to
object. I suggest that the emphasis on size implies a concentration
on the machine-level behavior rather than on C semantics.

True, at a C level, but is there really a difference at the machine
level on real-world implementations?

Probably not, but there's no fundamental reason why there couldn't be.

Except that there are systems where a pointer to char is represented
differently than a pointer to int. On word-addressed machines, char*
pointers need to store offset information as well as the machine
address.

More importantly, assuming a T is a multi-byte object and pointers are
simple memory addresses, is it possible for &T to be the address of a
byte _other than_ the first (i.e. lowest) byte of the object?

No, because it isn't the address of a byte at all. It's the address
of an object of type T.

I know that's not what you meant to ask, so here are a couple of ways
to say what I think you meant.

Is it possible for the address of T to have a different representation
than the address of T's first byte? Alternatively, is it possible for
the result of converting the address of T to, say, uintptr_t to
differ from the result of converting the address of T's first byte to
uintptr_t?

I think the answer is yes. I think it would be possible (but perverse)
for an implementation to represent the address of T the same way as
the address of the *last* byte of T, or just past the end of T. If
so, then converting from some_big_type* to void* or to char* would
have to internally subtract sizeof T (or sizeof T - 1) from the
machine address.

And think about "fat pointers". Imagine a system where a C pointer
object is more than just a machine address. It might, for example,
contain the base address of the largest containing object, the size of
that object, the offset of the pointed-to object, and the size of the
pointed-to object. Such a representation is useful for run-time
bounds checking.
 

Keith Thompson

Nicklas Karlsson said:
"I'd say "address" rather than "starting address". In C, an address
is typed. It doesn't point to the beginning of an object, it points to
the object."

So since an address is typed, int* for example, one says that it
points to the entire object? It points to an "int", but nevertheless
it still points to the first allocated byte of the sizeof(int) bytes
allocated for an int.
[...]

Can you please use the conventional "> " prefix for quoted text,
and leave the attribution line in place? It makes it much easier to
follow the discussion. I see you're posting through Google Groups;
it should handle this for you automatically.
 

frank

Close.  Evaluation happens at run time; the compiler doesn't
necessarily know which object is being designated.  It does know
the type of the object -- or rather the type imposed by the lvalue.
This might not be the declared type of the object itself.  In fact,
the object might not even have a declared type.

Some examples (I haven't taken the time to test these):

    int x, y, *ptr;
    switch (rand() % 3) {
        case 0: ptr = NULL; break;
        case 1: ptr = &x;   break;
        case 2: ptr = &y;   break;
    }
    *ptr; /* an lvalue, but the compiler has no idea whether it
             designates x, y, or no object at all */

    void *vptr = malloc(sizeof(int));
    assert(vptr != NULL);
    /* We've just created an object, but it has no inherent type. */
    *(int*)vptr;      /* the type is imposed by the lvalue */
    *(unsigned*)vptr; /* same object, different lvalue, different type */


Too many words.  An rvalue is the value of an expression, that's all.
It's also a term that the Standard doesn't use, and I see little point


I'd say "address" rather than "starting address".  In C, an address is
typed.  It doesn't point to the beginning of an object, it points to
the object.


No, an lvalue, like any expression, has a type.

At least 2 good things happened in this thread. OP summarized what he
had gathered. You included referent source to illustrate. Makes for
good reading.