undefined and unspecified behavior

Rajesh S R

Can anyone tell me what is the difference between undefined behavior
and unspecified behavior?
Though I've read what is given about them, in ISO standards, I'm still
not able to get the difference.

For example:
Consider the following code:
a[i++] = i;

We say that the above expression statement produces undefined
behavior. Why can't we call it as unspecified behavior, because
nothing is said about the order of completion of side effects, which
affects the output?

Furthermore, I've found this expression "demons out of your nose",
used by many, while talking about undefined behavior. Please tell me
what that expression means, in this context.

Thanks in advance for the reply.
 
Barry Schwarz

Undefined behavior requires the use of a "nonportable or erroneous
program construct or of erroneous data" (paragraph 3.4.3). Your
example above meets this condition because while syntactically correct
it violates the restriction in paragraph 6.5-2 and is therefore
erroneous.

Unspecified behavior is not erroneous. It is a slightly more general
case of implementation-defined behavior. In both cases, the standard
provides two or more possibilities. For implementation-defined
behavior, the chosen possibility must be documented by every
(conforming) implementation. For unspecified behavior, there are no
further requirements so it need not be documented.

 
santosh

Rajesh S R wrote:

Furthermore, I've found this expression "demons out of your nose",
used by many, while talking about undefined behavior. Please tell me
what that expression means, in this context.

It's just a humorous way of mentioning that anything can happen after
undefined behaviour has been invoked in a program. I've heard that
that particular phrase was coined in clc by a former regular called
Kaz Kylheku. Apparently it's a common response when undefined
behaviour is invoked in C programs running on the mythical DS9000.
 
christian.bau

Unspecified behavior: The C Standard gives more than one possibility,
and it is not specified which of these possibilities is chosen, but
you know that one of them will be picked.

Undefined behavior: Anything can happen, and I mean absolutely
anything. Don't even think about trying to figure out what could
happen and what couldn't, because _anything_ can happen. In your
example, if the unspecified order in which side effects happen were
the only problem, then it would be "unspecified" behavior. But it is
worse: _Anything_ can happen, including formatting your hard disk,
your computer starting to send spam all over the place, or reporting
your credit card numbers to some hackers.
 
Keith Thompson

Rajesh S R said:
Can anyone tell me what is the difference between undefined behavior
and unspecified behavior?
Though I've read what is given about them, in ISO standards, I'm still
not able to get the difference.

For example:
Consider the following code:
a[i++] = i;

We say that the above expression statement produces undefined
behavior. Why can't we call it as unspecified behavior, because
nothing is said about the order of completion of side effects, which
affects the output?

In this case, you might assume that

a[i++] = i;

can only behave in one of a limited number of ways, simply because the
standard doesn't specify the order of evaluation; the "i" on the right
hand side of the assignment might be evaluated either before or after
"i" is incremented by the "++". (If that were the case, it would be
unspecified behavior.)

But in fact, the standard explicitly says that the behavior is
*undefined*, which means there are *no* constraints on the behavior.
It could quietly do exactly what you expect it to do, it could
increase i by 42 rather than by 1, it could clobber some unrelated
variable, it could crash the program, or it could crash your operating
system. Some of these outcomes are unlikely; the point is that the
standard doesn't preclude them.

The reason the standard allows this much freedom to the implementation
is basically to allow for optimizations. The compiler is allowed to
rearrange expressions to improve performance. To do so, it has to do
some compile-time analysis to make sure the rearrangement doesn't
break anything. The standard permits this analysis to *assume* that
no undefined behavior occurs. If that assumption is incorrect, it's
considered the programmer's fault that the code is broken.

Now in this particular case, the undefined behavior is something that
could fairly easily be detected at compile time. But pointers make
things more interesting. Rather than

a[i++] = i;

consider

a[(*p1)++] = (*p2)++;

The latter invokes undefined behavior only if p1 and p2 happen to
point to the same object, something the compiler can't necessarily
determine.
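
As a minimal sketch (an illustration added here, not code from the standard or from the original post), this is one way to write the pointer version so that it stays well defined even when p1 and p2 point to the same object. The function name store_and_advance and the length parameter n are hypothetical, and the order chosen here (first increment, then second read) is only one of the orders the programmer might have meant.

```c
#include <assert.h>
#include <stddef.h>

/* Well-defined counterpart of  a[(*p1)++] = (*p2)++;
   Every read and write is its own statement, so each one is
   sequenced before the next, even if p1 == p2. */
static void store_and_advance(int *a, size_t n, int *p1, int *p2)
{
    int index = *p1;        /* read the index first            */
    *p1 = index + 1;        /* first increment completes here  */
    int value = *p2;        /* sees the increment if p1 == p2  */
    *p2 = value + 1;        /* second increment                */

    assert(index >= 0 && (size_t)index < n);  /* stay inside a[] */
    a[index] = value;
}
```

With distinct objects this stores the old *p2 at the old *p1 and bumps both. With p1 == p2 the object is simply incremented twice, and the stored value is the one read between the two increments.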

Furthermore, I've found this expression "demons out of your nose",
used by many, while talking about undefined behavior. Please tell me
what that expression means, in this context.

It's a humorous expression of the idea that the standard "places no
requirements" on the results of undefined behavior. Obviously a C
implementation can't literally make demons fly out of your nose -- but
if executing "a[i++] = i;" actually *did* make demons fly out of your
nose, that wouldn't tell you that the implementation is
non-conforming.
 
CBFalconer

santosh said:
Rajesh S R wrote:



It's just a humorous way of mentioning that anything can happen after
undefined behaviour has been invoked in a program. I've heard that
that particular phrase was coined in clc by a former regular called
Kaz Kylheku. Apparently it's a common response when undefined
behaviour is invoked in C programs running on the mythical DS9000.

What mythical? Go here:

http://dialspace.dial.pipex.com/town/green/gfd34/art/
 
Thad Smith

Rajesh said:
Can anyone tell me what is the difference between undefined behavior
and unspecified behavior?
Though I've read what is given about them, in ISO standards, I'm still
not able to get the difference.

From N869:
3.19
[#1] unspecified behavior
behavior where this International Standard provides two or
more possibilities and imposes no requirements on which is
chosen in any instance

The standard bounds the results of unspecified behavior: the outcome is
always one of the listed possibilities. It might be, say, an integer with
some valid but unspecified value.

Undefined behavior has no specified bounds, so may cause the computer to
crash, for example, which would not be permitted for unspecified behavior.

For example:
Consider the following code:
a[i++] = i;

We say that the above expression statement produces undefined
behavior. Why can't we call it as unspecified behavior, because
nothing is said about the order of completion of side effects, which
affects the output?

I can't give a good reason.

Furthermore, I've found this expression "demons out of your nose",
used by many, while talking about undefined behavior. Please tell me
what that expression means, in this context.

It is a fanciful image of something very unlikely. The point is that
since /that/ is permitted, any conceivable failure is permitted by the
standard (such as crashing, clobbering variables, restarting, you name
it). The shorthand phrase, then, covers all those and more!
 
Default User

santosh said:
Rajesh S R wrote:



It's just a humorous way of mentioning that anything can happen after
undefined behaviour has been invoked in a program. I've heard that
that particular phrase was coined in clc by a former regular called
Kaz Kylheku. Apparently it's a common response when undefined
behaviour is invoked in C programs running on the mythical DS9000.


I've usually seen a fellow named John Woods (who I don't recall
personally) credited with coining it. This may be it:

<http://groups.google.com/group/comp.std.c/msg/dfe1ef367547684b>



Kaz famously said something like, "I ran it on the Deathstation
9000, and demons flew out of my nose." That became a .sig for one or
more regulars over the years.

I can't find Kaz's original post (hey, it's Saturday) only follow-ups.

<http://groups.google.com/group/comp.arch/msg/3095a0d47c235c86>


The more you dig, the more you realize how many messages aren't in the
archives.



Brian
 
CBFalconer

Default said:
.... snip ...

The more you dig, the more you realize how many messages aren't
in the archives.

Assuming you mean Google by archives, that may be because of
anti-social x-noarchive headers, or because the originator
specifically asked that it be removed. The latter is a perfectly
respectable means of retracting foolishness.

 
Default User

CBFalconer said:
Assuming you mean Google by archives,

Yes. I know there are other archives, but I don't know of any that are
publicly searchable.

that may be because of
anti-social x-noarchive headers, or because the originator
specifically asked that it be removed.

Kaz has many posts in Google, including many with that email address. I
don't know much about Mr. Woods.

There are posts of mine that are missing, that neither of the above
apply to.

The latter is a perfectly
respectable means of retracting foolishness.

Neither post that I found referenced seems to fit that category, but
you never know.



Brian
 
Malcolm McLean

Rajesh S R said:
Can anyone tell me what is the difference between undefined behavior
and unspecified behavior?
Though I've read what is given about them, in ISO standards, I'm still
not able to get the difference.

For example:
Consider the following code:
a[i++] = i;

We say that the above expression statement produces undefined
behavior. Why can't we call it as unspecified behavior, because
nothing is said about the order of completion of side effects, which
affects the output?

Furthermore, I've found this expression "demons out of your nose",
used by many, while talking about undefined behavior. Please tell me
what that expression means, in this context.

Let's say i is ten.
In reality this code will do one of three things: set a[10] to 10, set a[10]
to 11, or terminate the program with an error message.
Had ANSI chosen they could have eliminated option three and required the
compiler vendors to document the behaviour, making the code
implementation-defined. However they didn't. Rather than spell everything
out, we just say "this code can do anything".
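
If one of the first two outcomes is what the programmer actually wants, it can be spelled out so that nothing is left undefined. The following is a minimal sketch; the function names store_old_value and store_new_value and the length parameter n are made up for illustration and do not come from the standard.

```c
#include <assert.h>
#include <stddef.h>

/* Intent 1: a[old i] = old i, then increment i. */
static void store_old_value(int *a, size_t n, int *i)
{
    assert(*i >= 0 && (size_t)*i < n);  /* index must be inside a[] */
    a[*i] = *i;                         /* only reads of *i here    */
    ++*i;                               /* increment in its own statement */
}

/* Intent 2: a[old i] = new i. */
static void store_new_value(int *a, size_t n, int *i)
{
    int index = *i;                     /* remember the old value   */
    assert(index >= 0 && (size_t)index < n);
    ++*i;                               /* increment completes first */
    a[index] = *i;                      /* then store the new value  */
}
```

Starting from i equal to ten, the first function sets a[10] to 10 and the second sets a[10] to 11; both leave i at eleven.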

Many illegal operations have the potential to corrupt the computer's memory.
Since C does not require intelligent guarding of array bounds, illegal
writes can be difficult to catch. Once you set random variables to random
values, anything can happen, including the program behaving in very funny
ways, such as incrementing Blogg's bank account by a million dollars.
"Demons fly out of your nose" is a bad example, since no program can make a
computer do anything it is physically incapable of performing.
 
Eric Sosman

Rajesh said:
Can anyone tell me what is the difference between undefined behavior
and unspecified behavior?
Though I've read what is given about them, in ISO standards, I'm still
not able to get the difference.

Sometimes the Standard gives a list of possible behaviors
in some situation: "The behavior shall be A or B or ..." As
long as the implementation behaves in any of those ways, it
obeys the Standard. That is "unspecified behavior," meaning
that the Standard does not specify which of the possibilities
will be chosen. As an example, the Standard points out the
order of function argument evaluation: in `f( u(x), v(x) )'
it is "unspecified" whether u() is called before or after v().
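
A small sketch of that example, assuming hypothetical bodies for u(), v() and f() that are not part of the original text: each call records a letter, so the log shows which order the implementation happened to pick. Either order is conforming, so a program must not depend on it.

```c
#include <assert.h>
#include <string.h>

static char call_log[8];       /* letters recorded in call order */
static size_t call_count;

static void record(char c)
{
    assert(call_count < sizeof call_log - 1);  /* keep room for '\0' */
    call_log[call_count++] = c;
}

static int u(int x) { record('u'); return x + 1; }
static int v(int x) { record('v'); return x + 2; }
static int f(int a, int b) { return a + b; }

/* Clears the log, then evaluates f(u(x), v(x)).
   The calls to u and v do not overlap, but their order is unspecified. */
static int run_once(int x)
{
    call_count = 0;
    memset(call_log, 0, sizeof call_log);
    return f(u(x), v(x));
}
```

The return value is the same whichever order is chosen; only call_log differs, and it may hold either "uv" or "vu".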

Sometimes the Standard requires that the implementation
document the choice of some particular unspecified behavior.
For example, the maximum value of an `int' is unspecified, but
the implementation must "publish" that value as `INT_MAX'.
This subset of unspecified behavior is called "implementation-
defined behavior."
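
Because the implementation has to publish these limits in <limits.h>, portable code can consult them rather than guess. A minimal sketch, with hypothetical helper names add_would_overflow and checked_add:

```c
#include <assert.h>
#include <limits.h>

/* Returns nonzero if a + b would not fit in an int.
   Uses only the published limits, and never performs the
   overflowing addition itself. */
static int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return 0;
}

/* Adds two ints, treating overflow as a programming error. */
static int checked_add(int a, int b)
{
    assert(!add_would_overflow(a, b));
    return a + b;
}
```

The same test gives the right answer whether int is 16, 32 or 64 bits wide, since the limits come from the implementation rather than from the source code.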

Sometimes the Standard imposes no requirement at all: If
the program does thus-and-such, the Standard simply washes its
hands of the matter and disclaims jurisdiction. That's what
"undefined behavior" is: When the Standard no longer rules the
course of events. For example, the effect of dividing by zero
is undefined: You might get a result like "infinity" or you
might get a completely bogus result or you might get a crash.
In principle, since the program has left the country where the
Standard is law and has entered a lawless territory, anything
can happen. You have no Constitutional rights on a desert island.
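
The practical answer is to stay out of the lawless territory in the first place. A minimal sketch for integer division, where safe_div is a hypothetical helper name: it refuses the two operand combinations for which the int result cannot be represented or does not exist.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stores num / den in *out and returns true when the division is
   well defined; returns false and leaves *out untouched otherwise. */
static bool safe_div(int num, int den, int *out)
{
    assert(out != NULL);
    if (den == 0)
        return false;                   /* division by zero         */
    if (num == INT_MIN && den == -1)
        return false;                   /* quotient exceeds INT_MAX */
    *out = num / den;
    return true;
}
```

The caller decides what to do when false comes back; the point is only that the undefined operation is never executed.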

For example:
Consider the following code:
a[i++] = i;

We say that the above expression statement produces undefined
behavior. Why can't we call it as unspecified behavior, because
nothing is said about the order of completion of side effects, which
affects the output?

One intent of the Standard is to promote implementations of
C on widely differing platforms, which it does (in part) by trying
to avoid raising unnecessary barriers. Every time the Standard
requires some particular behavior, it constrains implementations
and puts a burden of obedience on them. Instead of forcing the
implementations to assign some arbitrary meaning to the dubious
statement above, the Standard leaves it undefined: Why pass a
largely useless law that might make trouble by hobbling the
(software and hardware) optimizers?

Every requirement in the Standard represents a balancing act
between usefulness and restrictiveness.

Furthermore, I've found this expression "demons out of your nose",
used by many, while talking about undefined behavior. Please tell me
what that expression means, in this context.

It was the original 1989 ANSI Standard that gave the phrase
"undefined behavior" wide currency, and since it was a new thing
people liked to play with it. A little bit of a game developed
in c.l.c. about just how "undefined" the behavior could be, and
when somebody posted code like `a[i++] = i' people would have fun
cooking up outrageous descriptions of the possible consequences.

Things started fairly tamely with descriptions of system
malfunctions: "Your hard drive will be reformatted," "Your
CPU will melt to silicon slag," and that sort of thing. The
descriptions started becoming more imaginative: "Your SPACE bar
will be charged to six million volts and electrocute you on
the spot," "Chocolate pudding will ooze from your floppy drive."
The warning that "Demons will fly out of your nose" was popular
at about the time the game of describing undefined behaviors
started to lose its appeal, and survived in memory in somewhat
the way the last Tsar remains a special figure (quick: who was
the next-to-last Tsar?). The shorthand "nasal demons" was widely
used for a while, but nowadays these fanciful descriptions of
undefined behavior are largely things of the past.

Perhaps reality has overtaken them.
 
Kenneth Brody

Thad said:
Rajesh S R wrote: [...]
For example:
Consider the following code:
a[i++] = i;

We say that the above expression statement produces undefined
behavior. Why can't we call it as unspecified behavior, because
nothing is said about the order of completion of side effects, which
affects the output?

I can't give a good reason.
[...]

Consider a different example, using the same UB from the Standard:

i = i++;

Some CPUs can execute more than one instruction simultaneously.
Why should the compiler be forbidden to take advantage of this by
having both the "store this value into i" and "increment i" run in
parallel? (If this were "j = i++", it would be perfectly legal.)
You now have two instructions modifying the same memory location
running in parallel. At a minimum, this will cause "bad things"[tm]
to happen. At worst, it can lock the computer up. (Most likely,
the hardware will have been designed to take this into account,
and trap the error, allowing the O/S to terminate the application.
However, this is not guaranteed, and the behavior is outside the
Standard's scope.)

Now, why should the Standard give special-case exceptions to this
particular UB in cases where "unspecified behavior" might make
sense?

As someone else pointed out, the purpose of UB is to give the
compiler writers leeway in implementation and optimization,
given the assumption that UB won't be invoked by the source
code. Why should the compiler writer have to attempt to detect
such scenarios, or not implement some optimizations which would
be useful in non-UB scenarios, when the UB is forbidden by the
Standard in the first place?

 