Dereference an array pointer... UB?

  • Thread starter Tomás Ó hÉilidhe

Kaz Kylheku

Keith Thompson:


    Yes but an array type isn't a value -- which is the very reason why
arrays decay to a pointer to their first element, so that we can actually
get a value out of them.

And what value will you get from a nonexistent, imaginary array, after
decaying it to a pointer to a nonexistent first element?
 
Martin

Presumably what you meant is that there's no such thing as an
array value.  I think the standard is vague on this point, but I
disagree; there *is* such a thing as an array value.  The
language just provides very few contexts in which array values
become visible.

In its explanation of a string the C89 Standard explains the term
'value' as "the sequence of the values of the contained characters, in
order."

I wonder if this provides a clue as to the meaning of the value of an
array (or at least a character array in this instance).
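One context in which a whole array's contents are copied as a unit is assignment of a struct that has an array member. The sketch below illustrates only that; the struct and function names are hypothetical, and it does not settle what the Standard means by the "value" of an array.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper type: a struct whose only member is an array. */
struct wrapper {
    char text[8];
};

/* Returns 1 if struct assignment copied every element of the array
   member into a distinct object, 0 otherwise. */
int array_value_copied(void)
{
    struct wrapper a = { "abc" };   /* remaining elements are zero */
    struct wrapper b;

    b = a;                          /* copies the array member as a whole */
    assert(sizeof a.text == sizeof b.text);
    return memcmp(a.text, b.text, sizeof a.text) == 0
        && a.text != b.text;        /* same contents, different objects */
}
```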
 
CBFalconer

Kaz said:
.... snip ...


We could also argue that ``nothing'' happens when you merely
increment a pointer out of bounds.

Piggybacking. Nonsense. Dereferencing an invalid pointer means
attempting to access memory that is not available to you. A system
that detects all errors should crash. Many won't.
 
Ben Bacarisse

Kaz Kylheku said:
And what value will you get from a nonexistent, imaginary array, after
decaying it to a pointer to a nonexistent first element?

You would get exactly the pointer the OP wanted -- to the int
immediately following the 1D array. I say "would" because I think the
example is UB, though only because the utility of applying * to an
array pointer (pointing "one past" a whole array) was missed when
drawing up the rule about applying * to these "one past" pointers.
 
Ben Bacarisse

Kaz Kylheku said:
    Do you think we can reach any kind of consensus on whether the
following code's behaviour is undefined by the Standard?

    int my_array[5];

    int const *const pend = *(&my_array + 1);

You may have a pointer one element past the last element of an array
object. However, my_array as whole is not an element of an array. So
&my_array + 1 is invalid.

That is not a problem. There is explicit permission to do this.
Anything that is not an array element is to be treated as if it were
an array of length one.

What you are doing is similar to computing p below:

int i, j[1];
int *p = &i + 1; // not right, i is not an array object

Expressly permitted. You can't apply * to this pointer, but you may
calculate the pointer value and store it.

int *q = &j + 1; // okay, since j is an array object

<snip>
 
Keith Thompson

Kaz Kylheku said:
    Do you think we can reach any kind of consensus on whether the
following code's behaviour is undefined by the Standard?

    int my_array[5];

    int const *const pend = *(&my_array + 1);

You may have a pointer one element past the last element of an array
object. However, my_array as whole is not an element of an array. So
&my_array + 1 is invalid.

No, &my_array + 1 is valid. C99 6.5.6p7 (Additive operators):

For the purposes of these operators, a pointer to an object that
is not an element of an array behaves the same as a pointer to the
first element of an array of length one with the type of the
object as its element type.

&my_array + 1 is a valid pointer value of type int(*)[5], pointing
just past the end of my_array.

Since this pointer value doesn't point to an object, attempting to
dereference it invokes UB, so *(&my_array + 1) is invalid.

What you are doing is similar to computing p below:

int i, j[1];
int *p = &i + 1; // not right, i is not an array object
int *q = &j + 1; // okay, since j is an array object

Again, &i + 1 is valid.
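
A minimal sketch of the distinction drawn here, assuming the reading of 6.5.6p7 quoted above: both functions only form and subtract "one past" pointers and never apply * to them. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Distance from a scalar to the pointer one past it.
   For pointer arithmetic, i behaves like an array of length one. */
ptrdiff_t one_past_scalar(void)
{
    int i = 0;
    int *p = &i + 1;                  /* formed but never dereferenced */

    assert(p > &i);                   /* comparison within the same "array" */
    return p - &i;
}

/* Distance from a whole array object to the pointer one past it. */
ptrdiff_t one_past_array(void)
{
    int my_array[5] = {0};
    int (*past)[5] = &my_array + 1;   /* type int (*)[5], never dereferenced */

    return past - &my_array;
}
```

Both functions return 1, since each pointer is exactly one object past its starting point.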

[snip]
 
Kaz Kylheku

You would get exactly the pointer the OP wanted -- to the int
immediately following the 1D array.

Of course, you would only get that if the implementation didn't throw
a diagnostic in your face and stop the program first. :)

I say "would" because I think the
example is UB, though only because the utility of applying * to an
array pointer (pointing "one past" a whole array) was missed when
drawing up the rule about applying * to these "one past" pointers.

Exactly. Correctness is not just about getting the right value, but
about how you got it. What is 64/16? Ah, numerator 6 cancels the
denominator 6 so we get 4/1 = 4. :)
 
Ben Bacarisse

Kaz Kylheku said:
Of course, you would only get that if the implementation didn't throw
a diagnostic in your face and stop the program first. :)

Do you get one? gcc is silent about

int array[5] = {0};
int *pe1 = *(&array + 1);

even with

-std=c89 -pedantic -pedantic -W -Wall -Wextra -Wformat-nonliteral
-Wcast-align -Wpointer-arith -Wbad-function-cast -Wstrict-prototypes
-Winline -Wundef -Wnested-externs -Wcast-qual -Wshadow -Wconversion
-Wwrite-strings -ffloat-store

in effect!

Exactly. Correctness is not just about getting the right value, but
about how you got it. What is 64/16? Ah, numerator 6 cancels the
denominator 6 so we get 4/1 = 4. :)

I agree, but do you really think the method _is_ flawed? Both your
remarks have :) appended, which makes me think you are not very
serious about your objections.

If the standard says "X is UB" (as it does in this case, I think) we
are entitled to ask "why?". Is it because the construct might be
unimplementable on some architectures (that would be my reason for
outlawing the construction of a "one before" pointer), or is it
because it constrains the implementation too much (the reason for
having an unspecified evaluation order)? What is the reasoning behind
outlawing the above?
 
Old Wolf

I would say that the reason that the behavior is undefined is that the
committee didn't realize (or appreciate) the potential utility of defining
the meaning of the unary * operator on pointer values derived from pointers
to objects, but not themselves a pointer to an object.

The rule is currently very simple. Only pointers to valid
objects can be dereferenced. Or, in standardese, the
behaviour is undefined if an lvalue does not designate
an object.

Why you would want to start adding exceptions to this
simple rule in order to support some IOCCC-esque syntax
for doing something that there are already at least two
correct ways of doing, is anybody's guess.

T x; (&x)[1] is invalid for all T. Why make an exception
for T being array type? How would you feel if you were
maintaining code and you saw that?
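
The post does not name the two correct ways. The sketch below assumes they are `my_array + N` and `&my_array[N]` (the latter relying on the C99 rule that `&a[N]` is evaluated as `a + N` without a dereference). The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

static int my_array[5] = {1, 2, 3, 4, 5};

/* End pointer by adding the element count to the decayed array. */
const int *end_by_length(void)
{
    return my_array + ARRAY_LEN(my_array);
}

/* End pointer by taking the address of the one-past element. */
const int *end_by_index(void)
{
    return &my_array[ARRAY_LEN(my_array)];
}

/* Walks the array up to the end pointer; the end itself is never read. */
int sum_all(void)
{
    const int *p;
    const int *end = end_by_length();
    int s = 0;

    assert(end == end_by_index());
    for (p = my_array; p != end; ++p)
        s += *p;
    return s;
}
```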
 
Peter Nilsson

Old Wolf said:
Why you would want to start adding exceptions to this
simple rule

You have it wrong. It's the standard that unnecessarily adds
an exception to a simple rule.

in order to support some IOCCC-esque syntax
for doing something that there are already at least two
correct ways of doing, is anybody's guess.

You don't have to guess, indeed I thought it was clear:
There's no obvious reason for making it UB, particularly
as the Committee went out of its way to bed down
what &x means when x is an array.

T x; (&x)[1] is invalid for all T. Why make an exception
for T being array type? How would you feel if you were
maintaining code and you saw that?

As quickly as you can, please tell me which of the
following functions has UB. Then tell me how you so
_easily_ identified which function was ioccc-esque
and/or difficult to maintain.

double sum_last_row_v1(const double m[3][3])
{
    const double *p;
    double s;
    for (s = 0, p = &m[2][0]; p < &m[2][3]; p++)
        s += *p;
    return s;
}

double sum_last_row_v2(const double m[3][3])
{
    const double *p;
    double s;
    for (s = 0, p = &m[2][0]; p < &m[3][0]; p++)
        s += *p;
    return s;
}

I submit that the UB case has some elegance and that it's
a pity it's double UB!
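
For comparison, a hedged sketch of a third variant that forms the end pointer by arithmetic on m[2], so no expression of the form m[3] appears. The name sum_last_row_v3 and the sample matrix are hypothetical and were not part of the original post.

```c
#include <assert.h>

/* Hypothetical sample matrix, used only for illustration. */
static const double sample[3][3] = {
    {1.0, 2.0, 3.0},
    {4.0, 5.0, 6.0},
    {7.0, 8.0, 9.0}
};

/* Sums the last row. The end pointer is m[2] + 3: one past the last
   element of row 2, reached without evaluating m[3]. */
double sum_last_row_v3(const double m[3][3])
{
    const double *p;
    const double *end = m[2] + 3;
    double s = 0;

    assert(m != 0);
    for (p = m[2]; p != end; p++)
        s += *p;
    return s;
}

/* Convenience wrapper applying the function to the sample matrix. */
double sum_sample_last_row(void)
{
    return sum_last_row_v3(sample);
}
```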
 
Richard Heathfield

Peter Nilsson said:

As quickly as you can, please tell me which of the
following functions has UB.

Both of them. (Both do illegal pointer comparisons.)
 
Peter Nilsson

Richard Heathfield said:
Peter Nilsson said:



Both of them. (Both do illegal pointer comparisons.)

Are you saying that because that's your "quickly as
you can" response, or because you genuinely think this
to be true?

If the latter, c&v for version 1 would be appreciated.
 
Richard Heathfield

Peter Nilsson said:
Are you saying that because that's your "quickly as
you can" response, or because you genuinely think this
to be true?

No, you're right - I read the article too quickly. Apologies.
 
Chris Torek

Here's something to chew on. It probably says something about the
original question, but I'm not sure what.

int main(void)
{
    struct s {
        int x;
        int y[2];
    };
    volatile struct s obj = { 10, { 20, 30 } };

    obj; /* Computes and discards the value of obj.
            Must access obj.x, obj.y[0], and obj.y[1]. */

This seems reasonable, although I would be unsurprised to find
compilers that did not in fact access the three "int"s.

    obj.x; /* Computes and discards the value of obj.x.
              Must access obj.x. */

And must not access obj.y[0] and obj.y[1] (I believe).

    obj.y; /* Computes and discards the address of obj.y[0].
              Must this access obj.y[0] and obj.y[1]?
              *May* it do so?
              C&V? */

I think the answer to this is "no and no" but I cannot prove it.

If the answer *is* "no and no", I think this guarantees that the
OP's construct (not included in this follow-up) is strictly conforming.
 
