When shorts are longer than longs !

Keith Thompson

I'm not so sure about the second part of that -- it seems to me that
evaluating an lvalue expression that doesn't designate an object is bad
news no matter what the context.

You're right. I think I was still stuck on the 42 example when I wrote
that, but of course that's irrelevant once the definition is fixed.

So this:

int *ptr = NULL;
int i = *ptr;

is undefined behavior; ``*ptr'' appears in a context that doesn't
require an lvalue, but the behavior is undefined anyway.
 
Tim Rentsch

Richard Heathfield said:
Tim Rentsch said:


Although the definition I posted was indeed flawed (as I knew it had
to be, because it's too short to be right!), it is a reasonable
reflection of the mental model I have of lvalues. I suspect that
this might be true for other people, too. Therefore, whilst it is
inadequate from a technical perspective, from a common-sense
perspective it more or less works, and cuts away a lot of confusing
side-issues that are rarely relevant to the working programmer. I'm
not suggesting that those side-issues aren't sometimes important,
but I think they're rare enough that, to all intents and purposes,
my definition - though flawed - is a reasonable rule of thumb. And
I think that's the point I was trying to make, although it was some
time ago now that I made it and a lot of water has passed under
this particular bridge since then.

If that's clear, I'm glad. If it's not, I'm sorry, but it's the best
I can do!

Interesting. This didn't come through at all (at least not to
me) out of the earlier posting. So the followups produced
some real value in this case.

Something I find interesting is that you seem to think of
"lvalue" in terms of run-time notions. For me "lvalue"
is associated with compile-time notions. I might say it
this way (done very fast, and off the top of my head):

The term lvalue is a compile-time notion that refers to
expressions of a certain form (identified in later sections).
An lvalue expression is meant to designate an object at
run-time. If evaluated, an lvalue expression that does
not designate an object is undefined behavior.

(Footnote) At run-time, an lvalue is an "address"; it
might be an actual machine address, or it might be some
sort of special memory address, such as a register number.
The "address" of the lvalue serves to "locate" the object
that is associated with the lvalue expression for this
evaluation.

Even ignoring the run-time/compile-time differences, I would
never say an lvalue is an object; clearly our mental models
differ significantly in this respect. This difference explains
why I had trouble thinking your earlier posting was meant
seriously. (And on that point, sorry for the confusion,
although something good has consequenced as a result.)
 
Tim Rentsch

Keith Thompson said:
Tim Rentsch said:
[snip]
My biggest pet peeve is the use of "definitions" that aren't really
definitions as I understand the word. A definition of a "foobar"
should allow me to determine unambiguously, for any given entity,
whether that entity is a foobar or not. The determination needn't be
trivial; it can depend on other definitions (and ultimately it has
to). But an arbitrary statement *about* foobars, isn't necessarily a
definition of the word "foobar".

I'm with you on this one. The Standard needs to distinguish (and
distinguish clearly) between "definitions" that (a) exactly
define a term, (b) define some constituent elements of a term but
leave others out, and (c) impose some requirements on what items
qualify to be put under the heading of some term (but don't
necessarily tell the whole story). There are numerous examples,
I'm pretty sure, in each of the three categories, appearing in
the Standard. [snip]
For lvalue, I don't know whether it's just poor wording, or
if there was some other more conceptual difficulty. Certainly
the definition of lvalue could stand some improvement.

The term "lvalue" is actually difficult to define. [snip C90 case]

C99 replaced this flaw with another one:

An _lvalue_ is an expression with an object type or an incomplete
type other than void; if an lvalue does not designate an object
when it is evaluated, the behavior is undefined.

As a definition, the notion of lvalue is flawed not just slightly
but horribly. Taken literally, this sentence means an expression like

i + 1

would be undefined behavior.

Considering that, I take back my earlier statement. The problem
with the "lvalue" definition is not just poor wording; it's
inherently flawed, and should be redone from the ground up.
 
Tim Rentsch

Richard Heathfield said:
Tim Rentsch said:



Yes, although I tend to think of it more as being in terms of the
"abstract machine".

In this context (i.e., run-time vs compile-time) I would say
that's a distinction without a difference. Or to say that
another way, I wasn't meaning to differentiate (and it
doesn't seem especially helpful to differentiate) between
abstract machine and physical machine in this particular
regard.
<I've read your mental model of lvalue with interest, but snipped it
here because I have nothing specific to say about it.>

Yes. I suppose I think of a computer program mostly in terms of its
runtime behaviour rather than its lexical aspects. Obviously there
are times where lexicography(?!?) is important - sizeof *p being an
obvious contender - but I'm talking generally here.

Certainly there's nothing wrong with thinking about programs in
terms of their execution. What I found confusing is that the
Standard clearly uses "lvalue" to indicate a compile-time notion.
Given that it does, it was surprising (or perhaps even astonishing)
to find it used to indicate a run-time notion, especially so since
there was no remark explaining the shift. It might be helpful to
reflect on that, if you're interested in giving better explanations
to the newsgroup. I know I've learned something here.
 
James Kuyper

Joe Wright wrote:
....
OK, I'll start. In order to be an lvalue the expression must designate
an object and must be on the LHS of an assignment. It makes no sense (to
me) to define:

static int i;

and later declare i an lvalue in a value context. For example
'printf("%d\n",i)' will print 0 and treats i as an rvalue. Again the
expression i becomes an lvalue only on the LHS of an assignment.

How about increment and decrement operations?
 
Tim Rentsch

Joe Wright said:
OK, I'll start. In order to be an lvalue the expression must designate
an object and must be on the LHS of an assignment. It makes no sense (to
me) to define:

static int i;

and later declare i an lvalue in a value context. For example
'printf("%d\n",i)' will print 0 and treats i as an rvalue. Again the
expression i becomes an lvalue only on the LHS of an assignment.

Seems like a good attempt, but I think it goes a bit in the wrong
direction. As the Standard uses the term, expressions can be
lvalues even on the right hand side of an assignment, or as
operands of other operators such as +. (In Algol 68 terms, an
lvalue is an expression that is of some 'ref' type.) I do think
the definition for lvalue should be redone, but without having to
make wholesale changes to all the uses. For example, there is
6.3.2.1 p 2

Except when it is the operand of the sizeof operator, the
unary & operator, the ++ operator, the -- operator, or the
left operand of the . operator or an assignment operator, an
lvalue that does not have array type is converted to the
value stored in the designated object (and is no longer an
lvalue).

It seems like a good idea to keep this meaning of lvalue even
while re-doing the definition.

Also, I don't think lvalue-ness should depend on any run-time
condition (such as whether an expression designates an actual
object). Whether an expression is an lvalue or not should be
a static property of the code; if that expression is evaluated
and needs to designate an object, fine, make that undefined
behavior, but that shouldn't change whether the expression is
an lvalue. Now that I think about it, if lvalue-ness is allowed
to depend on run-time behavior, it could be an "lvalue" one
moment and not an "lvalue" the next. Such consequences seem
fraught with difficulty.
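
A minimal sketch of the point that lvalue-ness is settled at translation time, assuming a C99 (or later) compiler. The function name static_lvalue_demo is made up for illustration. The expression *ptr is accepted as an lvalue whatever ptr holds at run time; an object is needed only when the lvalue is actually evaluated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: returns 1 if it stored through ptr, 0 if ptr was null. */
int static_lvalue_demo(int *ptr)
{
    size_t n = sizeof *ptr;  /* *ptr is an lvalue here but is not evaluated */
    int *q = &*ptr;          /* C99 6.5.3.2p3: & and * cancel, neither is evaluated */

    assert(n == sizeof(int));

    if (q != NULL) {
        *q = 42;             /* only here is the lvalue evaluated; it must designate an object */
        return 1;
    }
    return 0;                /* *ptr was still an lvalue; it was just never evaluated */
}
```

Calling static_lvalue_demo(NULL) is fine under C99 rules because neither sizeof nor &* evaluates its operand; dereferencing for real would be undefined behavior.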
 
Tim Rentsch

Richard Heathfield said:
Tim Rentsch said:


Um, I meant the abstract machine as opposed to the abstract
language, I think. Help! How can I put this? Okay... I see a
program in terms of control flow and runtime data structures,
rather than nouns and verbs and stuff. Is that better?

I think I see. You mean the same thing I do ("run-time"
as opposed to "compile-time") but are just using a different
label.

As I have pointed out upthread, I know it's a flawed notion, but
it's still one that I find useful.

Oh, I didn't mean to say the notion was flawed. (Perhaps it is
flawed, but I certainly didn't mean to say that.) What confused
me was putting the word in a different universe of discourse, as
for example defining the word "green" to mean "sounds of
frequency higher than middle C". That might be a useful piece of
terminology, but I expect it would catch most people pretty off
guard.

I'm not sure that I can explain it any clearer than I already have.
That doesn't mean I think I've been terribly clear - it just means
I'm kind of struggling to find the right words to describe how I
think of lvalues and why.

I think what I'm suggesting is that your model for what sorts of
things the term applies to is different from the model of many
(most?) other people who read the newsgroup. If the models
really are different, then first it's good to explain that; once
the domain of discourse is identified and understood on both
sides, the explanation of what is meant becomes much easier.

Perhaps not the deepest thought I've ever had, but hey, it's
Saturday morning. :)
 
Keith Thompson

Joe Wright said:
OK, I'll start. In order to be an lvalue the expression must designate
an object and must be on the LHS of an assignment. It makes no sense
(to me) to define:

static int i;

and later declare i an lvalue in a value context. For example
'printf("%d\n",i)' will print 0 and treats i as an rvalue. Again the
expression i becomes an lvalue only on the LHS of an assignment.

I think by "on the LHS of an assignment", you mean (or you should
mean) any context that requires an lvalue. The LHS of an assignment
is the most obvious such context, but there are a number of others,
such as the operand of a unary "&".

And your definition has the same problem that the C90 definition had:
it requires an lvalue to actually designate an object, something that
cannot always be determined until execution time. Given

int *ptr = NULL;

the expression ``*ptr'' doesn't designate an object, but it is an
lvalue (and evaluating it invokes UB).

But I don't really see the need to restrict lvalue-ness to contexts
that require an lvalue. For example:

int x, y = 0;
x = y;

In the assignment, x is clearly an lvalue, and it needs to be. y is
also an lvalue, even though a non-lvalue would be valid in that
context, so its lvalue-ness is irrelevant. In effect the lvalue is
converted to a non-lvalue by taking the value of the object y.
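
The x = y case above can be sketched as follows. This is only an illustration; the function name lvalue_conversion_demo and the pointer variables are assumptions added here, not part of the original post.

```c
#include <assert.h>

/* Hypothetical demo: returns 1 if x received a copy of y's value. */
int lvalue_conversion_demo(void)
{
    int x, y = 0;
    int *px = &x;  /* unary & requires an lvalue: x qualifies */
    int *py = &y;  /* y is an lvalue too, so &y is just as valid */

    x = y;         /* left operand: x stays an lvalue (the object to store into);
                      right operand: the lvalue y is converted to the value stored in y */
    y = 7;         /* x is unaffected: it received a value, not the object y */

    assert(px != py);
    return x == 0 && *py == 7;
}
```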
 
Tim Rentsch

Joe Wright said:
I am trying to keep it simple. Does that quote from 6.3.2.1 really make
sense to you? I think it says that not only is i an lvalue, ++i is as
well because it has object type. Not for me. If I can't do '++i = 9' I
don't have an lvalue. As pointed out previously, 3 has object type.

I think it's better for lvalue-ness to be a property of just an
expression, independent of the context in which the expression is
used. The paragraph in 6.3.2.1 (and yes it does make sense to
me) is one example of that, but the principle is more general
than the one example.

What purpose is served by imputing lvalue status to an expression not used
in an assignment? All such expressions yield the value stored in the
designated object in that case. An rvalue.

I think you haven't thought this through as thoroughly as
it could be. Let me try an example and see if it helps.

Consider two expressions: 'i' and '3'. The first is an lvalue,
the second is not. We compare some uses of these expressions:

    (i) = 0        (3) = 0
    &(i)           &(3)
    ++(i)          ++(3)

The expressions on the left are legal; the expressions on the
right are not. How are the two different? They are different
because on the left the expressions inside the parentheses are
lvalues, and on the right they aren't.

Now, you might think that "lvalue" isn't a good term for the
property needed here, and maybe that's right. However, I think
you'll agree that this property is an important property to
distinguish, and the current standard uses "lvalue" as the
term that identifies it[*]. So I think it's better to keep
using "lvalue" to identify those expressions that are legal
in contexts like the ones (on the left) shown above.

[*] Not counting some minor glitches in the current definition
of lvalue, which doesn't affect the basic point.

In a footnote on 'lvalue' the Standard explains the 'l' stands for left
operand of an assignment and might well stand for 'locator value'. It
mentions that 'rvalue' for Standard purposes is the 'value of an
expression'.

Footnote 53 in n1256 says:

The name ``lvalue'' comes originally from the assignment
expression E1 = E2, in which the left operand E1 is required to
be a (modifiable) lvalue. It is perhaps better considered as
representing an object ``locator value''. What is sometimes
called ``rvalue'' is in this International Standard described as
the ``value of an expression''.

The term "lvalue" predates the C Standard (indeed, it predates C
itself), and the footnote is explaining that historical origin,
not what the term means in the Standard. The second sentence
(starting "It is perhaps better considered...") makes a comment
about how "lvalue" should be understood as it is now used in
the Standard document.
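
The two columns of expressions above can be tried out with a sketch like the following. The function name left_column_demo is hypothetical. The right-column forms are left in a comment because each one is a constraint violation that a conforming compiler has to diagnose.

```c
#include <assert.h>

/* Hypothetical demo of the "left column": every use needs an lvalue operand. */
int left_column_demo(void)
{
    int i = 5;
    int *p;

    (i) = 0;    /* assignment: a parenthesized lvalue is still an lvalue */
    p = &(i);   /* address-of */
    ++(i);      /* increment: i goes from 0 to 1 */

    /* Right column: the operand 3 is not an lvalue, so none of these compile.
       (3) = 0;
       p = &(3);
       ++(3);
    */

    assert(p == &i);
    return i;
}
```

The same reasoning covers Joe Wright's '++i = 9': in C the result of ++i is a value, not an lvalue, so it cannot appear on the left of an assignment.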
 
Keith Thompson

Joe Wright said:
In my opinion, ++i having object type (int) does not make it an lvalue.

No, of course not. The point is that the *operand* of "++" (``i'' in
this case) must be an lvalue. Your proposed definition refers to the
LHS of an assignment; that's only one of the many contexts in which
an lvalue is required. Other such contexts include the operand of
"++" or "--", the operand of the unary "&", and probably some others.
(Assignment already covers the compound assignment operators.)

(Another issue is whether, for example, 42 is an lvalue. It clearly
isn't, but a literal reading of C99 6.3.2.1p1 implies that it is,
since it's an expression with an object type. The only sensible
conclusion is that C99 6.3.2.1p1 needs to be fixed, both because
it doesn't express the intent and because a literal reading pretty
much breaks the language. There's more to lvalue-ness than being
an expression with an object type (or an incomplete type other
than void).)
 
Keith Thompson

[snip good examples]

It's conceptually simpler, IMHO, for lvalue-ness to depend on the
expression itself, not on the context in which it appears. An lvalue
is a kind of expression. ``2+2'' is an additive-expression,
and it continues to be an additive-expression even when used in
a context that doesn't require an additive-expression. ``obj''
is an lvalue, and it continues to be an lvalue even when used in a
context that doesn't require an lvalue. (And similarly, ``0'' is
both an octal-constant and a null pointer constant, regardless of
where it appears, even though some might find that counterintuitive.)

As I understand it, the C standard's definition of lvalue, even
setting aside the glitches, is quite different from the original
meaning. We're stuck with the standard's meaning, but it's worth
exploring the original meaning.

In C, an lvalue is a particular kind of expression. Thus it's
something that exists in a program, and is recognized during
translation phase 7.

In the original meaning, as I understand it an lvalue is, as the name
implies, a kind of *value*, and an rvalue is another kind of value.
Note that expressions exist in a program, and values exist during
the execution of a program, so they're very different (but closely
related) things. A value is what you get during execution as
the result of evaluating an expression (something that exists in a
program and is recognized during TP7). The *expression* 42 consists
of a token, which in turn consists of the two source characters '4'
and '2'. When the expression is evaluated, it yields a *value*
that might be stored in, say, a 16-bit word as the bit pattern
0000000000101010; the '4' and the '2' no longer exist.

Using the non-C meanings of the terms, an expression can be
evaluated in one of two ways, depending on the context in which it
appears. In some contexts, particularly the LHS of an assignment,
an expression can be evaluated for its lvalue. The resulting value
is the identity of an object. In other contexts, particularly the
RHS of an assignment, an expression can be evaluated for its rvalue.
An rvalue is a value of some type; it might, for example, be the
result of retrieving the contents of an object, or of applying an
operator to one or more operands.

So given:

x = y;

the expression ``x'' is evaluated for its lvalue (in the non-C
sense), and the result is the identity of the object x; this is
unrelated to whatever value is currently stored in x. The expression
``y'' is evaluated for its rvalue, and the result is whatever value
is retrieved from the object y.

It's tempting to think of this non-C lvalue as an address, but it's
a different thing. An address, such as the result of ``&obj''
is an rvalue, not an lvalue. An lvalue isn't an address; it's
the identity of an object. An lvalue might identify a register
object or a bit field, neither of which has an address -- but they
do have identities.

If C had stuck with these meanings, then we'd say, not that some
expressions *are* lvalues, but that some expressions *have* lvalues.
An expression in certain contexts, such as the LHS of an assignment,
is evaluated for its lvalue. If the expression doesn't have an
lvalue, such as 42, then it's a constraint violation. If the
expression can be evaluated for its lvalue but the result doesn't
identify any object (such as *ptr where ptr==NULL), then the behavior
is undefined.

Given this formulation, it would be very handy to have a simple
term to refer to an expression that has an lvalue, so we could make
statements about what kind of expression is required on the LHS of
an assignment. Perhaps "l-expression".

But the term the authors of the C standard chose for this kind of
expression is "lvalue". And the standard keeps "rvalue" with its
original meaning ("the value of an expression"), but relegates it
to a footnote and doesn't use it.
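
To illustrate the earlier remark that an lvalue identifies an object without necessarily having an address, here is a small sketch using a bit-field and a register variable. The struct flags type and the function name identity_demo are assumptions made up for this example.

```c
#include <assert.h>

/* Hypothetical type: two bit-fields, which cannot have their address taken. */
struct flags {
    unsigned ready : 1;
    unsigned count : 3;
};

int identity_demo(void)
{
    struct flags f = {0, 0};
    register int r = 0;

    f.count = 5;  /* f.count is an lvalue: it identifies the bit-field to store into */
    ++f.count;    /* also an lvalue context; 6 still fits in 3 bits */
    r = 2;        /* r is an lvalue even though &r is not permitted */

    /* Both of these are constraint violations, so they stay commented out:
       &f.count;
       &r;
    */

    assert(f.ready == 0);
    return (int)f.count + r;
}
```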
 
Keith Thompson

Joe Wright said:
Indeed, I must be more inclusive. An expression that designates an
object perhaps rather than just a value? Done.

I don't see the problem. ``*ptr'' is indeed an lvalue expression. It
is up to the programmer to assign a proper value to ptr before
attempting to dereference it.

As you wrote, "In order to be an lvalue the expression must designate
an object ...". But ``*ptr'' doesn't designate an object (because
ptr==NULL).

But we're all in agreement, I think, that ``*ptr'' is an lvalue,
regardless of the current value of ptr. In fact, ptr has no current
value in the context of determining whether ``*ptr'' is an lvalue,
since that determination takes place during compilation.

As I said, that's the same problem the C90 definition of "lvalue" had.

My proposed fix is to say that an lvalue is an expression that
"potentially designates" an object. This leaves the problem of
defining "potentially". The intent is that it can be resolved at
compile time.
Agreed. An lvalue expression designates an object. In cases where its
lvalue-ness is irrelevant it yields the value of the object, an rvalue.

Right. This is covered in C99 6.3.2.1p2:

Except when it is the operand of the sizeof operator, the unary &
operator, the ++ operator, the -- operator, or the left operand of
the . operator or an assignment operator, an lvalue that does not
have array type is converted to the value stored in the designated
object (and is no longer an lvalue). If the lvalue has qualified
type, the value has the unqualified version of the type of the
lvalue; otherwise, the value has the type of the lvalue. If the
lvalue has an incomplete type and does not have array type, the
behavior is undefined.

Briefly, if an lvalue appears in a context that doesn't require an
lvalue, an lvalue-to-rvalue conversion occurs; the conversion (which
occurs at run-time) grabs the value stored in the designated object.

This also, I think, gives us a complete list of contexts in which an
lvalue is required: the operand of unary "&", "++", or "--", the LHS
of ".", and the LHS of any assignment operator, simple or compound.

"sizeof" is an odd case; it doesn't require an lvalue (``sizeof 42''
is valid), but if you give it one the lvalue-to-rvalue conversion
doesn't happen. (I might have to think about what that implies for
``sizeof vla''. Later.)
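
A rough sketch of the sizeof case, setting variable length arrays aside. The function name sizeof_demo is hypothetical. It shows that sizeof accepts both lvalue and non-lvalue operands, and that an lvalue operand is neither converted nor evaluated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: returns the final value of i, which sizeof must not change. */
int sizeof_demo(void)
{
    int i = 1;
    int arr[4];

    size_t a = sizeof 42;     /* a non-lvalue operand is fine */
    size_t b = sizeof i;      /* lvalue operand: no conversion to the stored value */
    size_t c = sizeof (i++);  /* operand is not evaluated, so i is not incremented */
    size_t d = sizeof arr;    /* array lvalue is not converted to a pointer */

    assert(a == sizeof(int) && b == sizeof(int) && c == sizeof(int));
    assert(d == 4 * sizeof(int));
    return i;
}
```

Some compilers warn about the side effect inside sizeof (i++); the warning is the point here, since the increment never happens.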
 
Ben Bacarisse

If C had stuck with these meanings, then we'd say, not that some
expressions *are* lvalues, but that some expressions *have* lvalues.
An expression in certain contexts, such as the LHS of an assignment,
is evaluated for its lvalue. If the expression doesn't have an
lvalue, such as 42, then it's a constraint violation. If the
expression can be evaluated for its lvalue but the result doesn't
identify any object (such as *ptr where ptr==NULL), then the behavior
is undefined.

Given this formulation, it would be very handy to have a simple
term to refer to an expression that has an lvalue, so we could make
statements about what kind of expression is required on the LHS of
an assignment. Perhaps "l-expression".

But the term the authors of the C standard chose for this kind of
expression is "lvalue". And the standard keeps "rvalue" with its
original meaning ("the value of an expression"), but relegates it
to a footnote and doesn't use it.

There are a few remnants of the classical meaning still in the
standard. The description of the * operator includes:

"If the operand points to a function, the result is a function
designator; if it points to an object, the result is an lvalue
designating the object."

and for those expression forms that are not lvalues, like the cast,
conditional, and comma operators, a footnote says that it "... does not
yield an lvalue". This is not, of course, the classical usage, but it
is also not quite the wording I'd expect to see when lvalue is used to
describe an essentially syntactic property.
 
Richard Bos

Joe Wright said:
In my opinion, ++i having object type (int) does not make it an lvalue.

No, but how about the i _in_ the ++i? That's not an assignment
(syntactically, that is), but it does modify i. Therefore, the operand
of ++ and -- must be an lvalue.

Richard
 
Tim Rentsch

Keith Thompson said:
Tim Rentsch said:
Now, you might think that "lvalue" isn't a good term for the
property needed here, and maybe that's right. However, I think
you'll agree that this property is an important property to
distinguish, and the current standard uses "lvalue" as the
term that identifies it[*]. So I think it's better to keep
using "lvalue" to identify those expressions that are legal
in contexts like the ones (on the left) shown above.

[*] Not counting some minor glitches in the current definition
of lvalue, which don't affect the basic point.
In a footnote on 'lvalue' the Standard explains the 'l' stands for left
operand of an assignment and might well stand for 'locator value'. It
mentions that 'rvalue' for Standard purposes is the 'value of an
expression'.

Footnote 53 in n1256 says:

The name ``lvalue'' comes originally from the assignment
expression E1 = E2, in which the left operand E1 is required to
be a (modifiable) lvalue. It is perhaps better considered as
representing an object ``locator value''. What is sometimes
called ``rvalue'' is in this International Standard described as
the ``value of an expression''.

The term "lvalue" predates the C Standard (indeed, it predates C
itself), and the footnote is explaining that historical origin,
not what the term means in the Standard. The second sentence
(starting "It is perhaps better considered...") makes a comment
about how "lvalue" should be understood as it is now used in
the Standard document.

As I understand it, the C standard's definition of lvalue, even
setting aside the glitches, is quite different from the original
meaning. We're stuck with the standard's meaning, but it's worth
exploring the original meaning.

In C, an lvalue is a particular kind of expression. Thus it's
something that exists in a program, and is recognized during
translation phase 7.

In the original meaning, as I understand it, an lvalue is, as the name
implies, a kind of *value*, and an rvalue is another kind of value.
Note that expressions exist in a program, and values exist during
the execution of a program, so they're very different (but closely
related) things. A value is what you get during execution as
the result of evaluating an expression (something that exists in a
program and is recognized during TP7). The *expression* 42 consists
of a token, which in turn consists of the two source characters '4'
and '2'. When the expression is evaluated, it yields a *value*
that might be stored in, say, a 16-bit word as the bit pattern
0000000000101010; the '4' and the '2' no longer exist.

I'm used to thinking of the original "LVALUE" as a kind of evaluation
(and similarly "RVALUE"). So, for example, LVALUE(I) is the address
of I, and RVALUE(I) is the contents of I. To relate LVALUE and
C's lvalue, an expression E is an lvalue in C if LVALUE(E) is
meaningful.

Using the non-C meanings of the terms, an expression can be
evaluated in one of two ways, depending on the context in which it
appears. In some contexts, particularly the LHS of an assignment,
an expression can be evaluated for its lvalue. The resulting value
is the identity of an object. In other contexts, particularly the
RHS of an assignment, an expression can be evaluated for its rvalue.
An rvalue is a value of some type; it might, for example, be the
result of retrieving the contents of an object, or of applying an
operator to one or more operands.

So given:
x = y;
the expression ``x'' is evaluated for its lvalue (in the non-C
sense), and the result is the identity of the object x; this is
unrelated to whatever value is currently stored in x. The expression
``y'' is evaluated for its rvalue, and the result is whatever value
is retrieved from the object y.

It's tempting to think of this non-C lvalue as an address, but it's
a different thing. An address, such as the result of ``&obj''
is an rvalue, not an lvalue. An lvalue isn't an address; it's
the identity of an object. An lvalue might identify a register
object or a bit field, neither of which has an address -- but they
do have identities.
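
A small C sketch of the "identity, not address" point. The struct flags type and the function name are hypothetical. Both a bit-field and a register variable can sit on the left of an assignment, yet applying & to either is something a conforming compiler must diagnose, so those lines are left commented out.

```c
#include <assert.h>

struct flags {
    unsigned int ready : 1;
    unsigned int count : 4;   /* holds 0..15 */
};

/* Hypothetical helper: assigns to two things that have no address. */
int identity_without_address(void)
{
    struct flags f = { 0, 0 };
    register int r = 0;

    f.count = 9;         /* f.count identifies a bit-field */
    r = f.count + 1;     /* r identifies a register object */

    /* &f.count;   rejected: cannot take the address of a bit-field      */
    /* &r;         rejected: cannot take the address of a register object */

    assert(f.count == 9);
    return r;
}
```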

To say again, I believe LVALUE and RVALUE were originally different
kinds of evaluation, not different kinds of values. Not that it
matters really, just might be helpful to some folks to think of
it this way.

If C had stuck with these meanings, then we'd say, not that some
expressions *are* lvalues, but that some expressions *have* lvalues.
An expression in certain contexts, such as the LHS of an assignment,
is evaluated for its lvalue. If the expression doesn't have an
lvalue, such as 42, then it's a constraint violation. If the
expression can be evaluated for its lvalue but the result doesn't
identify any object (such as *ptr where ptr==NULL), then the behavior
is undefined.
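
The two failure modes can be put side by side in C. This is only a sketch; store_if_valid is a hypothetical helper, and the NULL check stands in for "the lvalue designates no object" (it cannot catch other invalid pointers).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores v through p only if p is non-null.
   Returns 1 if the store happened, 0 otherwise. */
int store_if_valid(int *p, int v)
{
    /* 42 = v;   constraint violation: 42 has no lvalue at all,
                 so the compiler must issue a diagnostic */

    if (p == NULL)
        return 0;   /* *p = v here would compile, since *p is an lvalue,
                       but its behavior would be undefined */

    *p = v;         /* *p designates an object: well defined */
    return 1;
}
```

The first case is caught during translation; the second depends on what ptr holds when the program runs.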

Given this formulation, it would be very handy to have a simple
term to refer to an expression that has an lvalue, so we could make
statements about what kind of expression is required on the LHS of
an assignment. Perhaps "l-expression".

But the term the authors of the C standard chose for this kind of
expression is "lvalue". And the standard keeps "rvalue" with its
original meaning ("the value of an expression"), but relegates it
to a footnote and doesn't use it.

An lvalue is an expression that has an LVALUE; an lvalue expression
used in a context that doesn't require lvalue-ness produces the
RVALUE(that expression). Actually all expressions used in a context
that doesn't require lvalue-ness produce the RVALUE(that expression),
except operands of sizeof, which we might say produce the TVALUE (for
Type Value) of the operand expression. (For RVALUE to work this way
we would have to define RVALUE(something with array type) suitably,
to match how C deals with arrays, and similarly for function types.)
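
A rough C illustration of the three contexts, with a hypothetical function name. It assumes no variable-length arrays are involved, since sizeof does evaluate a VLA operand. Some compilers warn about the side effect inside sizeof; that warning is the point of the line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the same kinds of expression in different contexts. */
int contexts_demo(void)
{
    int i = 1;
    int a[4] = { 10, 20, 30, 40 };
    int *p;
    size_t s;

    i = i + 1;        /* left i: used for its lvalue; right i: its stored value */
    s = sizeof i++;   /* operand is not evaluated, only its type is used */
    p = a;            /* array lvalue converts to a pointer to a[0] */

    assert(s == sizeof(int));
    assert(sizeof a == 4 * sizeof(int));   /* no array conversion under sizeof */
    assert(p == &a[0]);
    return i;         /* still 2: the i++ above never ran */
}
```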

At least, that's how I think of it; hopefully some other people
will find that view helpful.
 
Tim Rentsch

Ben Bacarisse said:
There are a few remnants of the classical meaning still in the
standard. The description of the * operator includes:

"If the operand points to a function, the result is a function
designator; if it points to an object, the result is an lvalue
designating the object."

and for those expression forms that are not lvalues, like the cast,
conditional, and comma operators, a footnote says that it "... does not
yield an lvalue". This is not, of course, the classical usage but it
is also not quite the wording I'd expect to see when lvalue is used to
describe an essentially syntactic property.

Aren't the cases you named just expressing which syntax
productions can be used as lvalues? ISTM these statements
serve simply to define the syntactic property in question
(for those kinds of expressions).
 
Ben Bacarisse

Tim Rentsch said:
Aren't the cases you named just expressing which syntax
productions can be used as lvalues? ISTM these statements
serve simply to define the syntactic property in question
(for those kinds of expressions).

(Assuming you mean *can't* be used as lvalues.) Yes, that is all the
wording does, but I find the phrase an odd one, that's all. I would
have expected to see simply that (for example) "a comma expression is
not an lvalue" just as the standard tells us that some other
expression forms are lvalues. E.g. in 6.5.1 p 2 an identifier
designating an object "is an lvalue" not "yields an lvalue"; a
parenthesised expression "is an lvalue" (under the right
circumstances) and so on.
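
For concreteness, a short C sketch of which forms are and are not lvalues. The function name is hypothetical. The commented-out assignments are the ones a C compiler is expected to reject (C++ treats some of them differently).

```c
#include <assert.h>

/* Hypothetical helper: assigns through each lvalue form mentioned above. */
int lvalue_forms(void)
{
    int x = 0, y = 0;
    int *p = &y;

    x = 1;       /* identifier designating an object: is an lvalue */
    (x) = 2;     /* parenthesized lvalue: is an lvalue */
    *p = 3;      /* unary *: result is an lvalue designating y */

    /* (x, y) = 4;        comma operator does not yield an lvalue       */
    /* (x ? x : y) = 4;   conditional operator does not yield an lvalue */
    /* (int)x = 4;        a cast does not yield an lvalue               */

    assert(x == 2 && y == 3);
    return x + y;
}
```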
 
Tim Rentsch

Ben Bacarisse said:
(Assuming you mean *can't* be used as lvalues.)

Sorta. It's expressing which syntax productions can be
used as lvalues by identifying some syntax productions
that /can't/ be used as lvalues. But I think the meaning
was understood as I meant it.

Yes, that is all the
wording does, but I find the phrase an odd one, that's all. I would
have expected to see simply that (for example) "a comma expression is
not an lvalue" just as the standard tells us that some other
expression forms are lvalues. E.g. in 6.5.1 p 2 an identifier
designating an object "is an lvalue" not "yields an lvalue"; a
parenthesised expression "is an lvalue" (under the right
circumstances) and so on.

Okay, now I see what you're getting at.

I agree the language is a little weird considering how "lvalue"
is defined, but I attribute the funny language to something else.
Partly it's stylistic -- the sections are talking about the
operators, and generally the (sometimes implied) subject is
the operator in question. Hence, "the result (of the * operator)
is an lvalue." To say something about a whole * expression would
be changing the subject. This style is followed pretty consistently
in 6.5.N talking about operators, even in the footnotes: "A
comma operator does not yield an lvalue" -- the operator has to be
the subject, not an expression with the operator in it.

Also, the Standard is often not very good at distinguishing between
compile-time and run-time. For example, 6.5.2.3 p 3 says

If the first expression has qualified type, the result has the
so-qualified version of the type of the designated member.

This statement is either sloppy language, or it's just nonsense.
The result of an expression is a run-time notion, but type is a
compile-time notion -- it's expressions that have type, not the
results of expressions. (There is a run-time notion in that
objects have an /effective type/, but this is not the same as
/type/ (even though it uses the same space to identify items in
the two notions).) For compile-time notions like type or lvalue,
saying "the result (some predicate about the compile-time notion)"
is simply sloppy language arising from not sharply distinguishing
between compile-time and run-time. I don't think "lvalue" had an
original meaning that was a run-time notion; rather, like "type",
I think there was confusion about whether it was a compile-time
notion or a run-time notion (most people implicitly assumed that
the two notions were the same). The language used with lvalue
reflects a confusion in the original meaning, not a change in
meaning.
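
One way to see that the qualified type in 6.5.2.3 p 3 is settled at compile time is to check it with no run-time evaluation at all. The sketch below assumes a C11 compiler (_Generic and _Static_assert are later than the n1256 text quoted in this thread); struct pair, the macro, and the function name are all hypothetical. It inspects &s.first rather than s.first, because qualifiers on the controlling expression itself are dropped by _Generic.

```c
#include <assert.h>

struct pair { int first; int second; };

/* Hypothetical macro: 1 if e has type "pointer to const int", else 0.
   The controlling expression is never evaluated. */
#define IS_PTR_TO_CONST_INT(e) _Generic((e), const int *: 1, default: 0)

int qualified_member_demo(void)
{
    struct pair plain = { 1, 2 };
    const struct pair frozen = { 3, 4 };

    /* Both checks are resolved during translation. */
    _Static_assert(IS_PTR_TO_CONST_INT(&frozen.first) == 1,
                   "member of a const struct is const-qualified");
    _Static_assert(IS_PTR_TO_CONST_INT(&plain.first) == 0,
                   "member of a plain struct is not");

    return plain.first + frozen.first;
}
```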
 
Tim Rentsch

Joe Wright said:
I don't distinguish between compile-time and run-time notions.

Too bad. It's a useful distinction.

I think
the original meaning of lvalue has been extended beyond assignment to
include all notation or expression which designates an object.

The original meaning of lvalue wasn't limited to assignment.
The term itself was suggested by where an expression appeared
in an assignment statement, but the notion was understood to
be a generally useful notion, even before C came on the scene.

K&R said "An lvalue is an expression referring to an object." and
suggests the expression is one kind of thing and the object is another
kind of thing. This may be where the confusion started. Better stated,
"An lvalue is an expression designating an object." or "An lvalue is an
expression of an object." or eventually "An lvalue is an object."

int i;

Is i an expression or an object? The answer is yes.

No, i is an identifier. The identifier i may be used in
a context where it also qualifies as an expression, but
it is never an object. The expression (i) refers to an
object at run-time, but the expression is not the object,
and the identifier i is not the object. Clearly this is
true, because if we have a function like

unsigned int
fact( unsigned int n ){
    return n < 2 ? 1 : n * fact( n-1 );
}

the identifier n (and each place n appears as an expression)
does refer to an object, but the same expression refers
to more than one object (namely, one object per level of
recursion). The function above has exactly three expressions
containing just the identifier n; yet those three expressions
may refer to a thousand different objects.
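
This can be checked with an instrumented variant of the function. It is a sketch only: fact_traced, MAX_DEPTH, and the bookkeeping variables are hypothetical additions, and each comparison is made while every object being compared is still alive.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16   /* hypothetical bound for this sketch */

static const unsigned int *live[MAX_DEPTH];  /* the n of each active call */
static size_t depth;                         /* number of active calls */
static size_t distinct;                      /* objects unlike any live ancestor */

unsigned int fact_traced(unsigned int n)
{
    unsigned int result;
    size_t i;
    int is_new = 1;

    assert(depth < MAX_DEPTH);
    for (i = 0; i < depth; i++)      /* compare against the callers' n objects */
        if (live[i] == &n)
            is_new = 0;
    distinct += is_new;
    live[depth++] = &n;              /* the expression n, a new object each call */

    result = n < 2 ? 1 : n * fact_traced(n - 1);

    depth--;                         /* this call's n is about to go away */
    return result;
}

/* Accessor for the number of distinct objects observed so far. */
size_t distinct_objects_seen(void)
{
    return distinct;
}
```

Calling fact_traced(5) makes five nested calls, and the single expression &n yields five different objects.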
 
Keith Thompson

Joe Wright said:
int i;

Is i an expression or an object? The answer is yes.

As Tim said, i is an identifier.

As an identifier, i is the name of an object. We refer to that object
as i, but there are at least three very different things we can mean
when we say i: the identifier, the object, and an expression that
refers to that object (and the latter can mean very different things
depending on whether it's used as an lvalue or not).

Usually these distinctions are sufficiently clear from context, but
when we're talking about these distinctions it's important not to blur
them.
 
