Undefined behaviour

marbac

Hi,

I have heard a lot about "undefined behaviour" in this and other newsgroups
dealing with C/C++.

Is there a list where all cases with undefined behaviour in C++ are listed?

regards marbac
 
Sharad Kala

marbac said:
Hi,

I have heard a lot about "undefined behaviour" in this and other newsgroups
dealing with C/C++.

Is there a list where all cases with undefined behaviour in C++ are
listed?

I don't know of any compiled as a list.
But read the Holy Standard, it speaks a lot about undefined behaviors.

-Sharad
 
John Harrison

marbac said:
Hi,

I have heard a lot about "undefined behaviour" in this and other newsgroups
dealing with C/C++.

Is there a list where all cases with undefined behaviour in C++ are listed?

regards marbac

It's called the C++ standard, and it's several hundred pages long.

Off the top of my head here are a few common causes of undefined behaviour

1) dereferencing a null pointer
2) accessing outside the bounds of an array
3) deleting the same memory twice
4) dereferencing a pointer after it has been deleted
5) dereferencing a pointer which points to a destroyed object
6) accessing an uninitialised variable
7) signed integer overflow
8) modifying a const object

No doubt I've missed many others

As you can see several of the common causes of undefined behaviour involve
pointers. So the moral is don't use pointers, prefer STL classes instead,
they are somewhat safer.
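
As an illustration of item 7 in the list above, here is a minimal sketch of testing for signed overflow before the addition is performed, so the overflowing operation never runs. The name checked_add is made up for this example; it is not a standard function.

```cpp
#include <climits>

// Hypothetical helper (not a standard function): stores a + b in result and
// returns true when the sum is representable as an int, otherwise returns
// false and leaves result untouched.
bool checked_add(int a, int b, int& result)
{
    // Compare against the limits first; the addition itself is only
    // performed once we know it cannot overflow.
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    result = a + b;
    return true;
}
```

A caller would check the return value before using result, for instance treating a false return as an input error.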

john
 
Sharad Kala

John Harrison said:
It's called the C++ standard, and it's several hundred pages long.

Off the top of my head here are a few common causes of undefined behaviour

1) dereferencing a null pointer
2) accessing outside the bounds of an array
3) deleting the same memory twice
4) dereferencing a pointer after it has been deleted
5) dereferencing a pointer which points to a destroyed object
6) accessing an uninitialised variable
7) signed integer overflow
8) modifying a const object

Some more strike me -
1) main returning void
2) Copying a pointer after it has been deleted
3) Using mismatched forms of new/delete for arrays
4) Modifying a string literal
5) Changing a variable twice without a sequence point
6) Instantiating an STL container with auto_ptr
7) Adding declarations/definitions to std namespace
8) Playing around with reinterpret_cast

many more...

-Sharad
 
Ivan Vecerina

John Harrison said:
....
Off the top of my head here are a few common causes of undefined behaviour ....
1) dereferencing a null pointer
or any pointer to an address that has not been properly obtained (e.g.
*(int*)454 = 0;)
2) accessing outside the bounds of an array
3) deleting the same memory twice
or calling delete on an object address that was not allocated with
new.
3b) Using delete[] to release memory allocated with new,
or delete on the address returned by new[].
4) dereferencing a pointer after it has been deleted
5) dereferencing a pointer which points to a destroyed object
6) accessing an uninitialised variable
or a variable that has been destroyed.
(NB: this also applies to global variables).
7) signed integer overflow
8) modifying a const object
Two important additions I can think of:
9) modifying a variable twice between sequence points (or accessing the
value being modified).
e.g. a = ++i + ++i; or a = i + ++i;
10) deleting a derived class through a pointer to a base class whose
destructor is not virtual.
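
For item 10, a minimal sketch of the well-defined variant, in which the base class declares a virtual destructor so that deleting through a base pointer also runs the derived destructor. The names Base, Derived and destroy_through_base are invented for illustration.

```cpp
// Illustrative classes only; the names are assumptions for this sketch.
struct Base
{
    // Virtual destructor: delete through a Base* runs the derived destructor.
    virtual ~Base() {}
};

struct Derived : Base
{
    explicit Derived(int& destroyed) : destroyed_(destroyed) {}
    ~Derived() { ++destroyed_; }  // records that the derived part was destroyed
    int& destroyed_;
};

// Creates a Derived, deletes it through a Base*, and reports how many
// Derived destructors ran.
int destroy_through_base()
{
    int destroyed = 0;
    Base* p = new Derived(destroyed);
    delete p;  // well-defined because ~Base() is virtual
    return destroyed;
}
```

Without the virtual keyword on ~Base(), the delete p line would be the undefined case described in item 10.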

Off the top of my head too, I think that these would be the most common
causes, but I'm sure the list can be extended.

Furthermore, UB may be triggered by causing library functions
to perform one of the above actions, for example by passing an
insufficiently large or invalid output buffer to functions such as sprintf
or strcpy.
Some standard library functions have explicit restrictions on the parameters
they can receive (e.g. calling memcpy with overlapping memory ranges).
So it is important, for writing correct code, to understand the behavior
and restrictions of the functions you are calling. And it's not an obvious
thing.
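
To make the two library examples concrete, here is a rough sketch. The helper names copy_string and shift_left are hypothetical; only strlen, memcpy and memmove are real library calls. The first helper refuses to copy when the destination is too small, and the second uses memmove, which is specified to handle overlapping ranges.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src (including its terminator) into dst only
// if it fits in dst_size bytes. Assumes src and dst do not overlap.
bool copy_string(char* dst, std::size_t dst_size, const char* src)
{
    if (dst == 0 || src == 0 || dst_size == 0)
        return false;
    std::size_t len = std::strlen(src);
    if (len >= dst_size)
        return false;  // strcpy would overrun the buffer here
    std::memcpy(dst, src, len + 1);
    return true;
}

// Hypothetical helper: moves the last (len - by) bytes of buf to its start.
// Source and destination overlap, so memmove is used rather than memcpy.
void shift_left(char* buf, std::size_t len, std::size_t by)
{
    if (by < len)
        std::memmove(buf, buf + by, len - by);
}
```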
No doubt I've missed many others
So do I...
Each of the C and C++ standards uses the term "undefined behavior"
close to 200 times, and an exhaustive list is impossible to provide.
As you can see several of the common causes of undefined behaviour involve
pointers. So the moral is don't use pointers, prefer STL classes instead,
they are somewhat safer.
Overall, the C++ standard library does a better job than C's at trying to
prevent UB. What helps even more is if you are using an STL implementation
that supports a 'debug' mode where all container iterators are checked at
runtime.
Some caveats I can think of include:
- initializing an std::string with a NULL char pointer.
- using [..] on standard containers (e.g. vector) does not verify range.
( vector::at() may be used instead, and will throw an exception ).
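
A small sketch of both caveats, with made-up helper names (at_throws and safe_string are not library functions): vector::at() reports an out-of-range index by throwing std::out_of_range, and a null check before constructing a std::string avoids the first caveat.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns true if v.at(i) throws std::out_of_range.
bool at_throws(const std::vector<int>& v, std::size_t i)
{
    try
    {
        (void)v.at(i);  // checked access; v[i] would not be checked
        return false;
    }
    catch (const std::out_of_range&)
    {
        return true;
    }
}

// Hypothetical helper: builds a string from p, treating a null pointer as
// an empty string instead of passing it to the std::string constructor.
std::string safe_string(const char* p)
{
    return p ? std::string(p) : std::string();
}
```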


That's just adding my two cents, obviously my list is also partial and
incomplete...
Ivan
 
Peter Koch Larsen

Ivan Vecerina wrote in a message:
[snip]
Some caveats I can think of include:
- initializing an std::string with a NULL char pointer.
- using [..] on standard containers (e.g. vector) does not verify range.
( vector::at() may be used instead, and will throw an exception ).

To be pedantic, nothing prevents vector::operator[] from being implemented as
vector::at. At least this is how I read the standard.
No doubt most libraries will not do so for performance reasons, of course.

/Peter
 
Rolf Magnus

Sharad said:
Some more strike me -
1) main returning void
2) Copying a pointer after it has been deleted

You could just combine a lot of the pointer stuff to:

Using the value of a pointer that doesn't point to a valid object.
3) Using mismatched forms of new/delete for arrays
4) Modifying a string literal
5) Changing a variable twice without a sequence point

Or changing and reading it
6) Instantiating an STL container with auto_ptr
7) Adding declarations/definitions to std namespace
8) Playing around with reinterpret_cast

many more...

1) Dividing by zero
2) Pointer arithmetic that crosses array bounds
3) Returning a reference or pointer to a local variable
4) Writing to a member of a union and then reading another one
5) Deleting a derived class object through a pointer to a base class that
has no virtual destructor
6) Using offsetof on a non-POD class/struct
7) Using a map/set with a comparison function that is not a strict weak
ordering for the key type
8) Accessing vector members that don't exist
9) Using a container iterator after it has become invalid
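
As a sketch of the comparison-function item above, here is a comparator that does form a strict weak ordering, so it is safe to use with std::set or std::map. Person and ByAgeThenName are invented names for this example.

```cpp
#include <set>
#include <string>

// Illustrative key type; not taken from the thread.
struct Person
{
    std::string name;
    int age;
};

// Orders by age, then by name. It is irreflexive (an element never compares
// less than itself) and transitive, as a strict weak ordering requires.
// Using <= instead of < here would break that requirement.
struct ByAgeThenName
{
    bool operator()(const Person& a, const Person& b) const
    {
        if (a.age != b.age)
            return a.age < b.age;
        return a.name < b.name;
    }
};

typedef std::set<Person, ByAgeThenName> PersonSet;
```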
 
tom_usenet

Ivan Vecerina wrote in a message:
[snip]
Some caveats I can think of include:
- initializing an std::string with a NULL char pointer.
- using [..] on standard containers (e.g. vector) does not verify range.
( vector::at() may be used instead, and will throw an exception ).

To be pedantic, nothing prevents vector::operator[] from being implemented as
vector::at. At least this is how I read the standard.
No doubt most libraries will not do so for performance reasons, of course.

I wouldn't like a library that implemented operator[] as at. I don't
want exceptions from undefined behaviour, since the throwing context
is lost.

IMHO, any decent operator[] should at the very least have an assert in
it.
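
A rough sketch of what such an assert could look like, written as a free function rather than a real library's operator[]. The name checked_index is hypothetical. Note that assert compiles to nothing when NDEBUG is defined, so this only helps in debug builds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: unchecked access guarded by a debug-only assert.
// On failure the program stops at the faulty call site instead of
// unwinding through an exception.
template <typename T>
T& checked_index(std::vector<T>& v, std::size_t i)
{
    assert(i < v.size() && "index out of range");
    return v[i];
}
```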

Tom
 
Aguilar, James

marbac said:
Hi,

I have heard a lot about "undefined behaviour" in this and other newsgroups
dealing with C/C++.

Is there a list where all cases with undefined behaviour in C++ are listed?

regards marbac

I don't think it's really possible to compile a list. Think about it this
way: everything you could possibly do wrong that your compiler won't catch
is undefined behavior. Hence, what you're asking for is really a
compilation of all errors which could possibly be made in writing code,
which, unfortunately, is unlikely to exist anywhere, and would probably be
useless to you even if it did.
 
DaKoadMunky

Writing to a member of a union and then reading another one

Are you sure?

Certainly it seems as though it is a logic error which could possibly result in
undefined behavior, but is it guaranteed undefined behavior?

<CODE>

union FooBar
{
    int Foo;
    char Bar;
};

int main()
{
    FooBar fooBar;

    fooBar.Foo = 32; // Write FooBar::Foo

    char bar = fooBar.Bar; // Oops! Read FooBar::Bar

    return 0;
}

</CODE>

Could this code result in the "rewrite your harddrive and sleep with your
girlfriend" kind of undefined behavior?

For that to happen wouldn't the compiler have to keep track of the last member
written to for each instance of a union so that it would be able to recognize
mismatched reads?

Just curious.
 
JKop

DaKoadMunky posted:
Writing to a member of a union and then reading another one

Are you sure?

Certainly it seems as though it is a logic error which could possibly
result in undefined behavior, but is it guaranteed undefined behavior?

<CODE>

union FooBar
{
    int Foo;
    char Bar;
};

int main()
{
    FooBar fooBar;

    fooBar.Foo = 32; // Write FooBar::Foo

    char bar = fooBar.Bar; // Oops! Read FooBar::Bar

    return 0;
}

</CODE>

Could this code result in the "rewrite your harddrive and sleep with
your girlfriend" kind of undefined behavior?

For that to happen wouldn't the compiler have to keep track of the last
member written to for each instance of a union so that it would be able
to recognize mismatched reads?

Just curious.

Well the only thing *I* can think of that could make that
do anything weird is if by editing that particular byte of
the int, that the value you're left with is invalid; but
then can you even have an invalid bit pattern for an int?
Common sense says no, but maybe the almighty Standard
doesn't give any guarantees that there isn't.

-JKop
 
Ron Natalie

JKop said:
DaKoadMunky posted:

Conversion by union can be a disaster. Been there, done that. The BSD kernel used
to have a union that essentially looked like this:

union u {
    char* c;
    short* s;
    int* i;
    long* l;
};

and, in loosey-goosey VAX fashion, used to store into one field and read back from another.

This was fine until we were porting to a machine that encodes the operand size in
the low order bits of the pointer. That led to some fun hunting.
 
Ioannis Vranos

marbac said:
Hi,

I have heard a lot about "undefined behaviour" in this and other newsgroups
dealing with C/C++.

Is there a list where all cases with undefined behaviour in C++ are listed?




"1.3.12 undefined behavior

behavior, such as might arise upon use of an erroneous program construct
or erroneous data, for which this International Standard imposes no
requirements. Undefined behavior may also be expected when this
International Standard omits the description of any explicit definition
of behavior. [Note: permissible undefined behavior ranges from ignoring
the situation completely with unpredictable results, to behaving during
translation or program execution in a documented manner characteristic
of the environment (with or without the issuance of a diagnostic
message), to terminating a translation or execution (with the issuance
of a diagnostic message). Many erroneous program constructs do not
engender undefined behavior; they are required to be diagnosed. ]"






Regards,

Ioannis Vranos

http://www23.brinkster.com/noicys
 
Default User

Rolf said:
4) Writing to a member of a union and then reading another one


This is implementation-defined in C. I'd be surprised if it was
different in C++.




Brian Rodenborn
 
Ron Natalie

Default User said:
This is implementation-defined in C. I'd be surprised if it was
different in C++.
C says it's unspecified, there is no need for the implementation to have a specific
behavior. In the general case, it's undefined behavior for C++ (there are some
specific outs).
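
For the common case of wanting to look at the bytes of an object, copying the object representation with memcpy sidesteps the union question altogether. This is only a sketch, and first_byte is a made-up name; which byte of the int comes first still depends on the platform's byte order.

```cpp
#include <cstring>

// Hypothetical helper: returns the first byte of value's object
// representation without reading an inactive union member.
unsigned char first_byte(int value)
{
    unsigned char byte;
    std::memcpy(&byte, &value, 1);  // copy one byte of the representation
    return byte;
}
```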
 
Default User

Ron said:
C says it's unspecified, there is no need for the implementation to have a specific
behavior.

Ah, no. From the 89 standard:

With one exception, if a member of a union object is accessed after
a value has been stored in a different member of the object, the
behavior is implementation-defined./33/ One special guarantee is made
in order to simplify the use of unions: If a union contains several
structures that share a common initial sequence, and if the union
object currently contains one of these structures, it is permitted to
inspect the common initial part of any of them. Two structures share
a common initial sequence if corresponding members have compatible
types for a sequence of one or more initial members.

33. The ``byte orders'' for scalar types are invisible to isolated
programs that do not indulge in type punning (for example, by
assigning to one member of a union and inspecting the storage by
accessing another member that is an appropriately sized array of
character type), but must be accounted for when conforming to
externally-imposed storage layouts.


The C99 standard says virtually the same thing.

In the general case, it's undefined behavior for C++ (there are some
specific outs).


Could you quote the standard on that?



Brian Rodenborn
 
Old Wolf

Sharad Kala said:
Some more strike me -
2) Copying a pointer after it has been deleted

Exactly the same as John's #6 (the standard-ese term is "indeterminate")
4) Modifying a string literal

Same as John's #8 (string literals have type "char const []")
 
Peter Koch Larsen

tom_usenet said:
Ivan Vecerina wrote in a message:
[snip]
Some caveats I can think of include:
- initializing an std::string with a NULL char pointer.
- using [..] on standard containers (e.g. vector) does not verify range.
( vector::at() may be used instead, and will throw an exception ).

To be pedantic, nothing prevents vector::operator[] from being implemented as
vector::at. At least this is how I read the standard.
No doubt most libraries will not do so for performance reasons, of
course.

I wouldn't like a library that implemented operator[] as at. I don't
want exceptions from undefined behaviour, since the throwing context
is lost.

IMHO, any decent operator[] should at the very least have an assert in
it.

Tom

You're right of course. My point was that vector::operator[] is allowed to
verify the range. The assert is an excellent solution.

/Peter
 
Ioannis Vranos

Peter said:
You're right of course. My point was that vector::operator[] is allowed to
verify the range. The assert is an excellent solution.



If an implementation of vector's operator[]() threw an out_of_range
exception or anything else, it would be a system-specific extension, and
code assuming that this operator is index-checked, cannot be considered
portable.

Furthermore, since at() is provided for this, it is reasonable for
operator[] to be defined to give efficient access to the data.






Regards,

Ioannis Vranos

http://www23.brinkster.com/noicys
 
Rolf Magnus

Ioannis said:
Peter said:
You're right of course. My point was that vector::operator[] is
allowed to verify the range. The assert is an excellent solution.



If an implementation of vector's operator[]() threw an out_of_range
exception or anything else, it would be a system-specific extension,

It would be an instance of undefined behaviour, just like a crash would
be. Still you would probably not call a crash a "system-specific
extension".
and code assuming that this operator is index-checked, cannot be
considered portable.

That's of course true, and this is exactly the reason why there is at().
Furthermore, since at() is provided for this, it is reasonable for
operator[] to be defined to give efficient access to the data.

Yes.
 
