Undefined behaviour with Non-static, non-polymorphic + null pointer?

Alan Woodland

Hi,

I'm fairly sure this is undefined behaviour, despite the fact that
it compiles and 'runs' (prints "this doesn't exist") on all my platforms:

#include <iostream>

class foo {
public:
    void bar() {
        std::cout << "hello evil world!" << std::endl;
        if (this) {
            std::cout << "this exists" << std::endl;
        }
        else {
            std::cout << "this doesn't exist!" << std::endl;
        }
    }
};

int main() {
    foo *inst = 0;
    inst->bar();

    return 0;
}

Can someone please quote chapter and verse on this one and help me win
my current "this is a very bad idea" argument I'm having? I'd have
expected it to be forbidden under some general rule, and no exceptions
to have been made for it? Or is it actually really legal and defined
because nothing ever dereferences the this pointer?

Thanks,
Alan
 
Victor Bazarov

Alan said:
I'm fairly sure this is undefined behaviour, despite the fact that
it compiles and 'runs' (prints "this doesn't exist") on all my
platforms:

#include <iostream>

class foo {
public:
    void bar() {
        std::cout << "hello evil world!" << std::endl;
        if (this) {
            std::cout << "this exists" << std::endl;
        }
        else {
            std::cout << "this doesn't exist!" << std::endl;
        }
    }
};

int main() {
    foo *inst = 0;
    inst->bar();

Here you're "using" the pointer that has an invalid value (does not
point to any object). That's undefined behaviour. I could not quickly
locate the exact passage in the Standard that says that it is, but I
am sure you can find a mention of it in the archives, just search for
"dereference null pointer".

    return 0;
}

Can someone please quote chapter and verse on this one and help me win
my current "this is a very bad idea" argument I'm having? I'd have
expected it to be forbidden under some general rule, and no exceptions
to have been made for it? Or is it actually really legal and defined
because nothing ever dereferences the this pointer?

The expression

inst->bar()

is in fact

(*inst).bar()

which already dereferences the null pointer 'inst'.

V
 
Alan Woodland

Victor said:
Here you're "using" the pointer that has an invalid value (does not
point to any object). That's undefined behaviour. I could not quickly
locate the exact passage in the Standard that says that it is, but I
am sure you can find a mention of it in the archives, just search for
"dereference null pointer".


The expression

inst->bar()

is in fact

(*inst).bar()

which already dereferences the null pointer 'inst'.

Thanks. It's funny, I'd never actually thought about the implications of
that in this context before. Just found the following quote which ought
to convince certain people:

The Standard says that "p->" is
converted to "(*p)." (see section 5.2.5) and no matter how you slice it,
*p is a dereference. Dereferencing a null pointer results in undefined
behaviour.

Some compilers may ignore the conversion, but that's part of the
"undefined" part of the behaviour. You cannot rely on it happening on
all compilers - not even future releases of your current compiler.

Alan
 
Marco Manfredini

Alan said:
Thanks. It's funny, I'd never actually thought about the implications
of that in this context before. Just found the following quote which
ought to convince certain people:

Even funnier: According to 5.2.5, this code:
struct X { static const int x=0; };
int main() {
    X*x=0;
    x->n;
}

invokes UB.

And if I were not too lazy to look it up, I could tell you whether

struct X { enum {x=0}; };
int main() {
    X*x=0;
    x->n;
}

invokes UB or not.

(I mean, they could really make an appendix "Authoritative List of
UB's", because it's really a nuisance to find these only scattered
around in the Standard)
 
Victor Bazarov

Marco said:
Even funnier: According to 5.2.5, this code:
struct X { static const int x=0; };
int main() {
    X*x=0;
    x->n;
}

invokes UB.

And if I were not too lazy to look it up, I could tell you whether

struct X { enum {x=0}; };
int main() {
    X*x=0;
    x->n;
}

invokes UB or not.

(I mean, they could really make an appendix "Authoritative List of
UB's", because it's really a nuisance to find these only scattered
around in the Standard)

I am not sure how such a list would help. You would still have to
understand that the postfix expression (x->) dereferences the pointer
regardless what's following it. How would mentioning that if one
dereferences a null pointer it's UB help understanding that x->n
does in fact dereference 'x' (if 'n' is a static member)?

V
 
werasm

The Standard says that "p->" is
converted to "(*p)." (see section 5.2.5) and no matter how you slice it,
*p is a dereference. Dereferencing a null pointer results in undefined
behaviour.

Yes, but doing this:

sizeof( static_cast<P*>(0)->member ); //or
sizeof( *static_cast<P*>(0)->member )

would not invoke undefined behavior (for interest's sake), as
this dereference is "sliced" at compile time.

W
 
Marco Manfredini

Victor said:
I am not sure how such a list would help. You would still have to
understand that the postfix expression (x->) dereferences the pointer
regardless what's following it. How would mentioning that if one
dereferences a null pointer it's UB help understanding that x->n
does in fact dereference 'x' (if 'n' is a static member)?

Well, for example, 5.2.5 just says that x->y is dereferenced during
evaluation. So glancing over the paragraph I might remember that
"dereference" can invoke UB, but what are the details? If *what* is
dereferenced? And then there is sizeof (and soon decltype) which do not
evaluate their argument - so am I getting this right that sizeof(x->y)
should always be defined? I remember that there was a debate about that
question some time ago on clmc++.

So I think, that it would be nice, if an (effectual) Appendix would turn
the UBs inside out and list all UBs with pointers back to the context
of their premises, like:

Dereferencing
If t is of pointer type T and *t(1) is evaluated(2) and t does not point
to an object of type T (3), it's UB

(1) When is *t implicitly formed?: see "->"
(2) When is *t not evaluated? see: sizeof, decltype
(3) How can t not point to an object of its declared type: see union,
reinterpret_cast, null pointer etc.

I bet that was shocking!
 
Old Wolf

Even funnier: According to 5.2.5, this code:
struct X { static const int x=0; };
int main() {
    X*x=0;
    x->n;
}

invokes UB.

It requires a diagnostic, as X has no member 'n'.

struct X { enum {x=0}; };
int main() {
    X*x=0;
    x->n;
}

invokes UB or not.

Also requires a diagnostic, as X has no member 'n'.
 
James Kanze

Well, for example, 5.2.5 just says that x->y is dereferenced
during evaluation. So glancing over the paragraph I might
remember that "dereference" can invoke UB, but what are the
details? If *what* is dereferenced?

The pointer. Dereferencing is a run-time action, the result of
the * operator.

And then there is sizeof (and soon decltype) which do not
evaluate their argument - so am I getting this right that
sizeof(x->y) should always be defined?

Yes. The standard explicitly says that the arguments to sizeof
are not evaluated. No run-time behavior.

I remember that there was a debate about that question some
time ago on clmc++.
So I think, that it would be nice, if an (effectual) Appendix
would turn the UBs inside out and list all UBs with pointers
back to the context of their premises,

There's not much to say about pointers: dereferencing a null
pointer, or a pointer to one past the end of an array, is
undefined behavior (in C++---in C, there are certain special
cases where one past the end of an array is allowed).

Dereferencing
If t is of pointer type T and *t(1) is evaluated(2) and t does not point
to an object of type T (3), it's UB
(1) When is *t implicitly formed?: see "->"

Implicit or explicit has nothing to do with it. If the standard
says (and it does) that p->f() has the semantics of (*p).f(),
then it has the semantics of (*p).f(). I don't see what more
needs to be said.

(2) When is *t not evaluated? see: sizeof, decltype

Again, the standard is fairly explicit, although perhaps not
where you'd expect. §3.2/1: "An expression is potentially
evaluated unless it is either the operand of the sizeof
operator, or the operand of the typeid operator and does not
designate an lvalue of polymorphic class type."
(3) How can t not point to an object of its declared type:
see union, reinterpret_cast, null pointer etc.

A pointer value can be considered as having one of four
categories:

-- it points to an object (no problem there),

-- it points to one past the end of an array (dereference
illegal, but pointer arithmetic still allowed).

-- it is null (no dereference, and I think, no pointer
arithmetic---but I'm not sure about p+0), and

-- anything else (nothing allowed, even lvalue to rvalue
conversion is undefined behavior)

With regards to unions, nothing changes. A union contains one
(and only one) of its members at a time. Any attempt to access
any other member is undefined behavior.
 
James Kanze

Yes, but doing this:

sizeof( static_cast<P*>(0)->member ); //or
sizeof( *static_cast<P*>(0)->member )

would not invoke undefined behavior (for interest's sake), as
this dereference is "sliced" at compile time.

There's no slicing involved, but the standard explicitly says
that the arguments of sizeof are not evaluated, so no runtime
undefined behavior can result.

Note that the fact that they are not evaluated has other
implications as well. For example, if you write "sizeof(f())",
you're not required to provide an implementation of f. And if
you write "sizeof(f<int>())", the template function f is not
instantiated for int.
 
werasm

There's no slicing involved, but the standard explicitly says
that the arguments of sizeof are not evaluated, so no runtime
undefined behavior can result.

I was not referring to slicing in C++, but to Alan's use of it (i.e.
"no matter how you cut/slice/look at it").

Note that the fact that they are not evaluated has other
implications as well. For example, if you write "sizeof(f())",
you're not required to provide an implementation of f. And if
you write "sizeof(f<int>())", the template function f is not
instantiated for int.

Yes, this is typically used in SFINAE. Good to mention.

Regards,

Werner
 
Marco Manfredini

James said:
There's not much to say about pointers: dereferencing a null
pointer, or a pointer to one past the end of an array, is
undefined behavior (in C++---in C, there are certain special
cases where one past the end of an array is allowed).

Not computer-pointers. Text pointers, links.

Implicit or explicit has nothing to do with it. If the standard
says (and it does) that p->f() has the semantics of (*p).f(),
then it has the semantics of (*p).f(). I don't see what more
needs to be said.

Sure, but the definition of what "*t" means is in 5.3.1/1, the
semantic equivalence of x->y and (*x).y is in 5.2.5/3 and the effect
of dereferencing a null pointer is mentioned in 1.9/4 (and if the user
is lucky enough, he may even find 4.10/1 and learn that the null
pointer is not address 0). This is like the German Tax Code.

Again, the standard is fairly explicit, although perhaps not
where you'd expect. §3.2/1: "An expression is potentially
evaluated unless it is either the operand of the sizeof
operator, or the operand of the typeid operator and does not
designate an lvalue of polymorphic class type."

And so on. This is what I mean. To understand something rather
important, namely when and how "->" invokes undefined behavior I have
to go through the whole document. To make my point clear, I am not
interested in a discussion whether a specific construct has UB or not,
but how the references to UB in the Standard could be organized to make
it easier or just feasible to identify UB in code and optionally how to
avoid that certain case of UB.

If I search "undefined" in the Standard, I get 195 hits. This includes
totally arcane specifications, like that if a source file does not end
in a new-line character then UB arises[1], to ones really worth
knowing, such that even forming a pointer to a non-static member of a
non-POD object X is UB if the constructor for X hasn't started yet. Of
course, this is a case of the "non-POD class types don't have a static
layout" principle, but the example in 12.7 may surprise, because it adds
the salt of undefined initialization order between TUs.
 
