assigning a pointer the address of local variable

Guest

Harald said:
Flash said:
(e-mail address removed) wrote:
Harald van Dijk wrote:
Keith Thompson wrote:
[...]
No, it isn't. When s reaches the end of its lifetime (at the end of
foo()), the value of p becomes indeterminate. Dereferencing p, or even
looking at its value, invokes undefined behavior. (The latter isn't
likely to cause any visible problems on most systems, but you should
still avoid it.)
The word "looking" is vague. Does it mean that I can't do something
like this?

#include <stdio.h>

static int* foo(void);

int main(void)
{
    int *p;
    p = foo(); /* legal? */
    printf("Pointer was %p\n", (void*)p); /* legal? */
    return 0;
}

static int* foo(void)
{
    int i;
    return &i;
}
Any reference to the value of p after foo() returns invokes undefined
behavior. For that matter, I think assigning the result of foo() to p
in the first place invokes UB.
A question: what about just "foo();" (without an assignment)? Is the value
allowed to be read before discarding it?

No, I would say it isn't; otherwise the implementation would
be doing something not done by the abstract machine, violating
the as-if rule. To say it another way, the implementation is
allowed to read the value, but must behave as if it doesn't.

I agree with ena8t8si and would add in support that in C89 at least if
you fell off the end of the function without returning a value the
program would still be strictly conforming *if* you did not use the
value. So if "foo();" is fine if nothing is returned it must surely
still be fine if a value you are not allowed to use is returned.

Thanks. Now for my followup question: how does this affect an
expression statement consisting of a single volatile variable? Is a
read allowed?

Not just allowed, but required.

So the result of an expression-statement must not be read if it is a
function call, and must be read if it is a variable (and all exceptions
are results of the as-if rule)? Where does the standard make this
distinction?
 
Chris Torek

[Originally, we had code equivalent to the following:

int *foo(void) { int i; return &i; }
int main(void) { foo(); return 0; }

I myself do not know whether this code is well-defined. We can
say for sure that "i" does exist at the point of the "return"
statement that takes its address, and no longer exists by the
time foo() has returned to main(), so the "value" returned by
the function has undefined behavior if it is used -- but it is
not "used".]

So the result of an expression-statement must not be read if it is a
function call, and must be read if it is a variable (and all exceptions
are results of the as-if rule)? Where does the standard make this
distinction?

This is an interesting point. Here is Program P:

extern volatile int hardware_reg;

int main(void) {
    volatile int *p = &hardware_reg;

    *p; /* or, equivalently, "hardware_reg;" (without quotes) */
    return 0;
}

When run (after compiling with special Linker Magic to make the
external variable "hardware_reg" correspond to a hardware register
on the system), this program clears the register -- it is one of
those "self-clearing upon read" registers -- and is thus useful
after some particular kinds of errors. (Perhaps the error it clears
has to do with the audio subsystem, so it makes the sound work on
the machine again.)

Now we modify Program P a bit, giving Program Q:

extern volatile int hardware_reg;
volatile int *f(void) { return &hardware_reg; }
int main(void) { *f(); return 0; }

This obviously does the same as Program P. But note that we used
the unary "*" operator in main(). If we had done that with the
original program, now modified to become Program R:

int *foo(void) { int i; return &i; }
int main(void) { *foo(); return 0; }

we would definitely (I believe) have undefined behavior. So now
we might write Program S:

volatile int *foo(void) { int i; return &i; }
int main(void) { foo(); return 0; }

The key seems, in this case, to be the unary "*" operator. It
is OK to skip "reading" the value returned by some function,
even if it is a "pointer to volatile", as long as we do not
actually follow the pointer.

If a function could return a volatile-qualified type (and it cannot),
and if the volatile-qualified type were a C++-like reference (which
C lacks), we could rewrite one of the above to produce Program T:

volatile int &f(void) { return hardware_reg; }
int main(void) { f(); return 0; }

In this case the actual read of the hardware register would occur
(logically at least) in main(), rather than in f(), giving rise to
the potential conflict that Harald is asking about. Fortunately,
we cannot express the problem in C in the first place. :)
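
A self-contained sketch of Programs P and Q may help here. It assumes an ordinary volatile int object in place of the real hardware register (no linker magic), and the helper names touch_register and read_via_f are hypothetical, not taken from the posts above.

```c
/* Stand-in for the hardware register: an ordinary volatile object
   (an assumption; the real one would be mapped by the linker). */
volatile int hardware_reg = 7;

/* As in Program Q: hand back the register's address without reading it. */
volatile int *f(void) { return &hardware_reg; }

/* Expression statement with unary "*": the abstract machine reads the
   volatile object here, then discards the value. */
void touch_register(void) { *f(); }

/* Same access, but the value that was read is kept. */
int read_via_f(void) { return *f(); }
```

Calling f() alone never touches the register; only the unary "*" in touch_register() or read_via_f() does.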
 
ena8t8si

Chris said:
[Originally, we had code equivalent to the following:

int *foo(void) { int i; return &i; }
int main(void) { foo(); return 0; }

I myself do not know whether this code is well-defined. We can
say for sure that "i" does exist at the point of the "return"
statement that takes its address, and no longer exists by the
time foo() has returned to main(), so the "value" returned by
the function has undefined behavior if it is used -- but it is
not "used".]

Here's why it's defined:

The value &i is valid when the expression is evaluated
before doing the return.

What the return does with the value is defined in
6.8.6.4 paragraph 3 - "the value is returned to the
caller".

An indeterminate value isn't illegal in and of itself;
it's only certain uses of indeterminate values that
cause undefined behavior:

6.2.6.1#5 - being read out of an object
6.2.6.1#5 - being stored into an object (perhaps indirectly)
6.2.6.2#4 - logical operators that would produce one
6.5.3.2#4 - operand of * operator

What a return does isn't any of these things; in
particular, a value of an expression isn't an object,
since it doesn't have an address. The implicit
conversion to (void) then discards the value, which
also doesn't violate any of the UB's.

Since the semantics are defined, and don't transgress any
provisions that would make them undefined, the result is defined.
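
For contrast with the list above, here is a minimal sketch of two ways to return a pointer whose value stays valid after the function returns. The function names are hypothetical; nothing below comes from the posts.

```c
#include <stddef.h>

/* Static storage duration: the object lives for the whole program run,
   so the returned pointer never becomes indeterminate. */
int *next_count(void)
{
    static int count = 0;
    count++;
    return &count;
}

/* Caller-provided storage: the object's lifetime is the caller's
   responsibility, and the pointer is simply handed back. */
int *store_value(int *out, int value)
{
    if (out != NULL)
        *out = value;
    return out;
}
```

In both cases the caller may store, print, compare, or dereference the result, which is exactly what the automatic-variable version does not allow.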
 
ena8t8si

Harald said:
Harald said:
Flash Gordon wrote:
(e-mail address removed) wrote:
Harald van Dijk wrote:
Keith Thompson wrote:
[...]
No, it isn't. When s reaches the end of its lifetime (at the end of
foo()), the value of p becomes indeterminate. Dereferencing p, or even
looking at its value, invokes undefined behavior. (The latter isn't
likely to cause any visible problems on most systems, but you should
still avoid it.)
The word "looking" is vague. Does it mean that I can't do something
like this?

#include <stdio.h>

static int* foo(void);

int main(void)
{
    int *p;
    p = foo(); /* legal? */
    printf("Pointer was %p\n", (void*)p); /* legal? */
    return 0;
}

static int* foo(void)
{
    int i;
    return &i;
}
Any reference to the value of p after foo() returns invokes undefined
behavior. For that matter, I think assigning the result of foo() to p
in the first place invokes UB.
A question: what about just "foo();" (without an assignment)? Is the value
allowed to be read before discarding it?

No, I would say it isn't; otherwise the implementation would
be doing something not done by the abstract machine, violating
the as-if rule. To say it another way, the implementation is
allowed to read the value, but must behave as if it doesn't.

I agree with ena8t8si and would add in support that in C89 at least if
you fell off the end of the function without returning a value the
program would still be strictly conforming *if* you did not use the
value. So if "foo();" is fine if nothing is returned it must surely
still be fine if a value you are not allowed to use is returned.

Thanks. Now for my followup question: how does this affect an
expression statement consisting of a single volatile variable? Is a
read allowed?

Not just allowed, but required.

So the result of an expression-statement must not be read if it is a
function call, and must be read if it is a variable (and all exceptions
are results of the as-if rule)? Where does the standard make this
distinction?

Chris Torek has already given a good answer to this, but let
me see if I can add to it.

My earlier answer was misleading, because "read" was being
used in two different senses. In both cases:

foo();
volatile_variable;

a value is produced, and in both cases a value is discarded.
The difference is, in the case of foo(), producing the
value is ok because the value is valid at the time it's
produced, but in the case of volatile_variable, the value
in the variable is already invalid when the variable is
accessed.

To say that another way, in neither case is the value of
the expression "read", but in the variable case an object
with an invalid value must be accessed, and it is this
access ("read" in the other sense) that causes the
undefined behavior. It isn't the value of the expression
that's read, but reading the variable to produce the
value of the expression, that causes the undefined
behavior; the value is subsequently discarded, but by
then it's too late, the UB has already happened.

Incidentally, the variable doesn't have to be volatile
to get undefined behavior.

See sections 6.3.2.1 #2, 6.2.6.1 #5 in the Standard.
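
To make the object-versus-value distinction concrete, here is a small sketch of my own (the name use_then_clear is hypothetical). The pointer object is overwritten with a null pointer before the local it points to reaches the end of its lifetime, so a later read of that pointer object yields a valid value.

```c
#include <stddef.h>

/* *pp briefly points at a block-scope object. It is reset to NULL before
   that object's lifetime ends, so the caller never reads an
   indeterminate pointer value out of *pp. */
int use_then_clear(int **pp)
{
    int total;
    {
        int local = 41;
        *pp = &local;      /* valid: local is alive */
        total = **pp + 1;  /* valid dereference */
        *pp = NULL;        /* clear before local's lifetime ends */
    }
    return total;
}
```

Without the assignment of NULL, the object *pp would hold an indeterminate value once the inner block ends, and reading it would be the access described above.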
 
Guest

Chris said:
[Originally, we had code equivalent to the following:

int *foo(void) { int i; return &i; }
int main(void) { foo(); return 0; }

I myself do not know whether this code is well-defined. We can
say for sure that "i" does exist at the point of the "return"
statement that takes its address, and no longer exists by the
time foo() has returned to main(), so the "value" returned by
the function has undefined behavior if it is used -- but it is
not "used".]

Here's why it's defined:

The value &i is valid when the expression is evaluated
before doing the return.

What the return does with the value is defined in
6.8.6.4 paragraph 3 - "the value is returned to the
caller".

An indeterminate value isn't illegal in and of itself;
it's only certain uses of indeterminate values that
cause undefined behavior:

6.2.6.1#5 - being read out of an object
6.2.6.1#5 - being stored into an object (perhaps indirectly)
6.2.6.2#4 - logical operators that would produce one
6.5.3.2#4 - operand of * operator

What a return does isn't any of these things; in
particular, a value of an expression isn't an object,
since it doesn't have an address. The implicit
conversion to (void) then discards the value, which
also doesn't violate any of the UB's.

Since the semantics are defined, and don't transgress any
provisions that would make them undefined, the result is defined.

Thanks for the clear explanation, I think you're right. Now here's yet
another question:

int *foo(void) { int i; return &i; }
int *bar(void) { int i; return &i; }
int main(void) { return foo() == bar(); }

The invalid pointer values are not read by an lvalue expression, so
6.2.6.1p5 doesn't apply, and 6.5.9p6 says:

"Two pointers compare equal if and only if both are null pointers, both
are pointers to the same object (including a pointer to an object and a
subobject at its beginning) or function, both are pointers to one past
the last element of the same array object, or one is a pointer to one
past the end of one array object and the other is a pointer to the
start of a different array object that happens to immediately follow
the first array object in the address space.92)"

Neither value is a null pointer, neither value points to an object, and
neither value points to one past the last element of an array. Because
of the "if and only if", it seems that the above program is strictly
conforming and is guaranteed to return 0. That can't possibly be
intended, and at least three different compilers can all be convinced
to make the program return 1 in at least one mode which aims to conform
to C89 or C99.
 
ena8t8si

Harald said:
Chris said:
[Originally, we had code equivalent to the following:

int *foo(void) { int i; return &i; }
int main(void) { foo(); return 0; }

I myself do not know whether this code is well-defined. We can
say for sure that "i" does exist at the point of the "return"
statement that takes its address, and no longer exists by the
time foo() has returned to main(), so the "value" returned by
the function has undefined behavior if it is used -- but it is
not "used".]

Here's why it's defined:

The value &i is valid when the expression is evaluated
before doing the return.

What the return does with the value is defined in
6.8.6.4 paragraph 3 - "the value is returned to the
caller".

An indeterminate value isn't illegal in and of itself;
it's only certain uses of indeterminate values that
cause undefined behavior:

6.2.6.1#5 - being read out of an object
6.2.6.1#5 - being stored into an object (perhaps indirectly)
6.2.6.2#4 - logical operators that would produce one
6.5.3.2#4 - operand of * operator

What a return does isn't any of these things; in
particular, a value of an expression isn't an object,
since it doesn't have an address. The implicit
conversion to (void) then discards the value, which
also doesn't violate any of the UB's.

Since the semantics are defined, and don't transgress any
provisions that would make them undefined, the result is defined.

Thanks for the clear explanation, I think you're right. Now here's yet
another question:

int *foo(void) { int i; return &i; }
int *bar(void) { int i; return &i; }
int main(void) { return foo() == bar(); }

The invalid pointer values are not read by an lvalue expression, so
6.2.6.1p5 doesn't apply, and 6.5.9p6 says:

"Two pointers compare equal if and only if both are null pointers, both
are pointers to the same object (including a pointer to an object and a
subobject at its beginning) or function, both are pointers to one past
the last element of the same array object, or one is a pointer to one
past the end of one array object and the other is a pointer to the
start of a different array object that happens to immediately follow
the first array object in the address space.92)"

Neither value is a null pointer, neither value points to an object, and
neither value points to one past the last element of an array. Because
of the "if and only if", it seems that the above program is strictly
conforming and is guaranteed to return 0. That can't possibly be
intended, and at least three different compilers can all be convinced
to make the program return 1 in at least one mode which aims to conform
to C89 or C99.

Yes, I think your analysis is right, and this
shows a weakness in the existing wording. You
might want to post the question in comp.std.c
in a new thread and see what other people think.
"If any pointer value is a trap representation
the behavior is undefined," or words to that
effect, should be added somewhere in 6.5.9,
and probably a few other places.
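
The cases that 6.5.9p6 does cover can be sketched with pointers that are all valid at the point of comparison. The function names are hypothetical, and the return values noted in the comments follow from the quoted paragraph.

```c
#include <stddef.h>

/* Both pointers refer to the same live object: they compare equal (1). */
int same_object(void)
{
    int a = 0;
    int *p = &a, *q = &a;
    return p == q;
}

/* Two distinct live objects, neither pointer one-past-the-end:
   they compare unequal (0). */
int distinct_objects(void)
{
    int a = 0, b = 0;
    int *p = &a, *q = &b;
    return p == q;
}

/* Two null pointers compare equal (1). */
int both_null(void)
{
    int *p = NULL, *q = NULL;
    return p == q;
}

/* Two pointers one past the last element of the same array: equal (1). */
int one_past_end(void)
{
    int arr[4] = {0};
    int *p = arr + 4, *q = &arr[4];
    return p == q;
}
```

None of these functions lets a pointer outlive the object it refers to, which is what separates them from the foo() == bar() program.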
 
