Undefined but reasonable

sandeep

We often read that an undefined behavior can make nasal demons appear or
reformat your hard disk. I would like to see the ISO Standard address
this problem by speaking of "undefined but reasonable" behavior.
Reasonable would mean effects are restricted to the relevant subsystem.

For example, an undefined behavior in the memory subsystem could corrupt
the value of a variable or raise a signal or abort the program, but it
cannot affect disk files.

An undefined behavior in the file subsystem could corrupt any open file
but not reformat the whole disk or send data through a network connection.

Thanks for your attention.
 
Ian Collins

We often read that an undefined behavior can make nasal demons appear or
reformat your hard disk. I would like to see the ISO Standard address
this problem by speaking of "undefined but reasonable" behavior.
Reasonable would mean effects are restricted to the relevant subsystem.

For example, an undefined behavior in the memory subsystem could corrupt
the value of a variable or raise a signal or abort the program, but it
cannot affect disk files.

A standard for a programming language can't dictate the behaviour of the
environment used to run a programme written in that language unless that
language happens to be something like Java with its own sandbox.
 
crisgoogle

We often read that an undefined behavior can make nasal demons appear or
reformat your hard disk. I would like to see the ISO Standard address
this problem by speaking of "undefined but reasonable" behavior.
Reasonable would mean effects are restricted to the relevant subsystem.

For example, an undefined behavior in the memory subsystem could corrupt
the value of a variable or raise a signal or abort the program, but it
cannot affect disk files.

An undefined behavior in the file subsystem could corrupt any open file
but not reformat the whole disk or send data through a network connection.

Thanks for your attention.

The main problem with this approach is that it is completely impractical.

First of all, how do you decide which subsystems are "relevant" across
all platforms? For example, particularly on any real-mode type processor,
undefined behaviour in the memory subsystem, as you phrase it, could
affect darn near anything, including files on disk (e.g., maybe your
undefined behaviour actually wrote to a FILE structure that belongs to a
currently open file).

Secondly, what's the point? Once you do something undefined, unless
you very carefully, and in a very restrictive manner, define the
"relevant" systems as above, what exactly can you be sure your
program will actually do? Okay, the undefined behaviour is restricted
to the memory subsystem ... but that means that anything you do after
that which may possibly use memory (i.e., just about _anything_) will
also still be undefined. Going the other way, if you do something
undefined that affects the file system (let's say you overwrite part
of the CRT), then anything else your program does subsequently
is again undefined.

Just think about how you might word the Standard to do what you want.
I think you will very quickly realize that it's not going to fly.
 
Walter Banks

sandeep said:
We often read that an undefined behavior can make nasal demons appear or
reformat your hard disk. I would like to see the ISO Standard address
this problem by speaking of "undefined but reasonable" behavior.
Reasonable would mean effects are restricted to the relevant subsystem.

As reasonable as your idea is, it has a fundamental problem. We
have implemented a handful of C compilers on very unusual
architectures. In every case we attempt to make a reasonable
documented choice for undefined behaviour. Most of the
problems we see are not easily foreseen inside the C standards
meetings.

Any attempt to restrict the undefined behaviour would by definition
define the developer's choices, so the behaviour would in effect be defined.

The good news, if there is any, is that compiler developers in general
tend to make rational choices in their implementation. This is one
case where trusting the developer rather than depending on the
C committee to prevent problems in the implementation of undefined
behaviour is probably the best choice.

Regards,


walter..
 
Kelsey Bjarnason

We often read that an undefined behavior can make nasal demons appear or
reformat your hard disk. I would like to see the ISO Standard address
this problem by speaking of "undefined but reasonable" behavior.
Reasonable would mean effects are restricted to the relevant subsystem.

For example, an undefined behavior in the memory subsystem could corrupt
the value of a variable or raise a signal or abort the program, but it
cannot affect disk files.

An undefined behavior in the file subsystem could corrupt any open file
but not reformat the whole disk or send data through a network
connection.

Thanks for your attention.

Suppose you have a system where instead of writing to ports to control,
oh, disk access, you do so by writing values to specific memory locations.

Along comes the code with the UB, in this case, writing past the end of
an object (eg writing the 20th element of char a[10]). The result? You
could very well be writing to the disk controller memory address, with
something akin to "format disk cylinder 3". Which in turn could quite
possibly trash your entire OS, never mind just causing app bugs.
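
The scenario can be modelled without actually invoking undefined
behaviour. What follows is only a minimal sketch, assuming a hypothetical
machine with a 32-byte address space in which a 10-byte array is followed,
at address 19, by a disk-controller command register. The names
machine, DISK_CMD_ADDR and CMD_FORMAT are invented for illustration and do
not describe any real device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout (an assumption, not a real device): bytes 0..9
   belong to the program's array, and byte 19 is a memory-mapped
   disk-controller command register. */
enum { MEM_SIZE = 32, ARRAY_LEN = 10, DISK_CMD_ADDR = 19 };
enum { CMD_NONE = 0, CMD_FORMAT = 0xF3 };

struct machine {
    unsigned char mem[MEM_SIZE];
};

static void machine_reset(struct machine *m)
{
    memset(m->mem, CMD_NONE, sizeof m->mem);
}

/* Models the raw store the hardware would perform for a[index] = value.
   There is no check against ARRAY_LEN, only against the simulated
   address space, so that the model itself stays well defined. */
static void raw_store(struct machine *m, size_t index, unsigned char value)
{
    assert(index < (size_t)MEM_SIZE);
    m->mem[index] = value;
}

/* Returns 1 if the controller register now holds a format command. */
static int disk_would_format(const struct machine *m)
{
    return m->mem[DISK_CMD_ADDR] == CMD_FORMAT;
}
```

In this model, a store to index 5 stays inside the array, while a store of
CMD_FORMAT to index 19 (the "20th element") lands in the register and
disk_would_format() starts returning 1.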
 
Jens Thoms Toerring

Suppose you have a system where instead of writing to ports to control,
oh, disk access, you do so by writing values to specific memory locations.
Along comes the code with the UB, in this case, writing past the end of
an object (eg writing the 20th element of char a[10]). The result? You
could very well be writing to the disk controller memory address, with
something akin to "format disk cylinder 3". Which in turn could quite
possibly trash your entire OS, never mind just causing app bugs.

And to make that impossible the compiler would have to know
*everything* about all the details of the system, and it would
have to check each and every read from and write to memory at
run time to keep a program from doing that. The result would be
that writing a compiler would be a task probably orders of
magnitude more difficult, and every program would run as slow as
molasses.
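
To give a rough idea of what checking every write at run time means, here
is a minimal sketch. It assumes a hypothetical "fat" reference type,
checked_buf, that carries its own length; the type and the function
checked_store are invented for illustration and are not part of any
compiler or library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat" reference: a base pointer that carries its length,
   so that every access can be validated at run time. */
struct checked_buf {
    unsigned char *base;
    size_t len;
    unsigned long checks;   /* number of bounds checks performed so far */
};

/* Stores value at index if it is in range. Returns 0 on success and -1
   if the store was rejected. The comparison runs on every call. */
static int checked_store(struct checked_buf *b, size_t index,
                         unsigned char value)
{
    assert(b != NULL && b->base != NULL);
    b->checks++;
    if (index >= b->len)
        return -1;
    b->base[index] = value;
    return 0;
}
```

Each store pays for one extra comparison, and the check only protects
accesses that go through the wrapper. A compiler would have to apply
something equivalent to every pointer dereference in the program.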

And, of course, the compiler would also have to have an extra
mode where all these checks are disabled in order to use C for
what it was initially developed for, i.e. writing operating
systems, drivers etc. All these things actually only work
*because* there's undefined behaviour: when the behaviour is
not defined by the C standard, the system itself can define
the behaviour and thus allow you to do a lot of useful things
with C that the standard can't reasonably be expected to
prescribe.

Finally, the whole problem doesn't exist on modern systems
with virtual memory - there you can't accidentally write to
memory registers that would reformat the hard drive from a
"normal" program since they aren't within the address space
of the program. So the introduction of virtual memory (with
the associated hardware) reduced the ecosystem for the worst
kind of nasal demons to make them nearly extinct - in a way
that's vastly more efficient than anything one could try to
put into a compiler. For the systems without virtual memory
(e.g. embedded systems) poisoning the biotope to get rid of
the nasal demons would destroy the whole thing.

Regards, Jens
 
Eric Sosman

We often read that an undefined behavior can make nasal demons appear or
reformat your hard disk. I would like to see the ISO Standard address
this problem by speaking of "undefined but reasonable" behavior.
Reasonable would mean effects are restricted to the relevant subsystem.

Have you looked at the Standard's definition of "unspecified behavior?"

For example, an undefined behavior in the memory subsystem could corrupt
the value of a variable or raise a signal or abort the program, but it
cannot affect disk files.

Have you ever encountered a system where this is the case, or
could even possibly be the case? If I engage in memory-related
undefined behavior that clobbers some bytes in a buffer which is
subsequently written to disk, have I not "affected disk files?" If
I clobber a FILE (not FILE*) and unluckily change its notion of the
current file offset so my next I/O operation occurs at an unexpected
place, have I not "affected disk files?" Face it: Your program's
accessible memory holds the data that will be written to disk *and*
metadata describing where *and* how it will be written (how could it
be otherwise, since all such are arguments to the I/O functions?),
so corruption of your process' memory is always capable of affecting
anything else your process can touch.
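
The file-offset case can be sketched in well-defined C. This is only an
illustration under stated assumptions: the struct pending_write and the
functions stray_write and flush_pending are hypothetical, and the stray
write is modelled as a memcpy over the whole struct rather than as a real
buffer overrun.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bookkeeping a program might keep for one pending write:
   the bytes to write and the file position to write them at. */
struct pending_write {
    char payload[4];
    long offset;
};

/* Models a stray write that runs past payload into offset. The copy
   targets the struct as a whole and n is capped at its size, so the
   model itself stays within defined behaviour. */
static void stray_write(struct pending_write *pw, const void *src, size_t n)
{
    assert(n <= sizeof *pw);
    memcpy(pw, src, n);
}

/* Writes payload at pw->offset. Returns 0 on success, -1 on failure. */
static int flush_pending(FILE *f, const struct pending_write *pw)
{
    if (fseek(f, pw->offset, SEEK_SET) != 0)
        return -1;
    if (fwrite(pw->payload, 1, sizeof pw->payload, f) != sizeof pw->payload)
        return -1;
    return fflush(f) == 0 ? 0 : -1;
}
```

If the program meant to write at offset 12 but stray_write() replaced the
record with one whose offset is 2, flush_pending() puts the payload at
offset 2. The I/O call itself is perfectly valid; only its arguments were
corrupted in memory.
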

An undefined behavior in the file subsystem could corrupt any open file
but not reformat the whole disk or send data through a network connection.

Same difficulties.
 
Seebs

Same difficulties.

This kind of stuff is why I plonked sandeep a while back; I don't even get
the impression that he's making an effort.

There's a couple of basic problems here. One is that it's totally non-obvious
how you could define undefined behavior as being "in the file subsystem"; I
guess maybe he's referring to stuff like flushing an input stream?

But consider: On many modern systems (possibly a majority by now), network
connections ARE files -- or at least, they are objects which have the same
semantics as open disk files.

So I have, somewhere, a set of "file descriptors" which refer to the files
I have open. File descriptor #1 is the output stream, file descriptor #7
is the file I'm writing logs to, and file descriptor #9 is the socket I use
to talk to a server.

The user does something "undefined" which causes a higher-level stream object
to have its internal data structure changed so that, instead of referring to
file descriptor 7, it refers to file descriptor 9. I try to write to that
file. The data goes out over the network socket.
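
A minimal sketch of that redirection, assuming a toy model in which
descriptors are indices into a table of in-memory sinks. The types
fd_table, sink and stream and the function stream_puts are hypothetical
stand-ins, not the real FILE or any operating system interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_FD = 16, SINK_CAP = 64 };

/* Hypothetical stand-in for the system's view: each descriptor number
   maps to a sink that records what was written to it. */
struct sink {
    char data[SINK_CAP];
    size_t used;
};

struct fd_table {
    struct sink sinks[MAX_FD];
};

/* Hypothetical higher-level stream object: it only remembers which
   descriptor it is bound to. */
struct stream {
    int fd;
};

/* Appends text to whatever sink the stream's descriptor names.
   Returns 0 on success, -1 if the descriptor is invalid or the sink
   is full. */
static int stream_puts(struct fd_table *t, const struct stream *s,
                       const char *text)
{
    size_t n;
    struct sink *k;

    assert(t != NULL && s != NULL && text != NULL);
    if (s->fd < 0 || s->fd >= MAX_FD)
        return -1;
    k = &t->sinks[s->fd];
    n = strlen(text);
    if (n > (size_t)SINK_CAP - 1 - k->used)
        return -1;
    memcpy(k->data + k->used, text, n);
    k->used += n;
    k->data[k->used] = '\0';
    return 0;
}
```

Nothing in stream_puts() can tell a log file from a socket. Once the fd
member changes from 7 to 9, the same call delivers the data to sink 9.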

It's simply not reasonable or sane to try to declare that no screwup can
ever have that effect.

-s
 
Nobody

Finally, the whole problem doesn't exist on modern systems
with virtual memory - there you can't accidentally write to
memory registers that would reformat the hard drive from a
"normal" program since they aren't within the address space
of the program.

That doesn't mean that you can't end up e.g. corrupting files.

If you overwrite a function's return address, you can end up executing
arbitrary data (i.e. pseudo-random code), which could do anything.

Most mechanisms which can be used to prevent this are heavily dependent
upon support from the CPU (e.g. the presence of an MMU). Pure software
approaches can impose entirely unacceptable overheads.

If you want this kind of protection, use a high-level language.
 
Keith Thompson

Nobody said:
That doesn't mean that you can't end up e.g. corrupting files.

If you overwrite a function's return address, you can end up executing
arbitrary data (i.e. pseudo-random code), which could do anything.

Well, barring OS bugs it can only do anything that the current
process has permission to do. So if I write past the end of an
array in a program that I'm running under my own user account,
I might corrupt my own files, and I might even bring down the
system by consuming excessive resources (thrashing the CPU or
filling the disk), but in theory I can't corrupt *your* files or
the operating system.

On the other hand, OS bugs that allow unprivileged processes
to exceed their privileges are hardly unknown, nor is code that
deliberately exploits such bugs.

And of course all this applies only to "modern systems with virtual
memory". There are plenty of C implementations on systems that don't
qualify for that description.

One of the biggest problems with sandeep's original proposal is
defining just what these "subsystems" are and exactly where the
boundaries between them lie. The other biggest problem is enforcing
those boundaries.
 
Seebs

Well, barring OS bugs it can only do anything that the current
process has permission to do.

... But such OS bugs are discovered on a pretty much weekly basis.
:) That really is the issue -- anything a hostile user can do could
happen by accident sooner or later.

-s
 
Nick

Seebs said:
... But such OS bugs are discovered on a pretty much weekly basis.
:) That really is the issue -- anything a hostile user can do could
happen by accident sooner or later.

When you consider what
printf("\t\b\b");
has been known to do, it's hard to see what poor old C can do about it.
 
Nick

Seebs said:
This kind of stuff is why I plonked sandeep a while back; I don't even get
the impression that he's making an effort.

There's a couple of basic problems here. One is that it's totally non-obvious
how you could define undefined behavior as being "in the file subsystem"; I
guess maybe he's referring to stuff like flushing an input stream?

But consider: On many modern systems (possibly a majority by now), network
connections ARE files -- or at least, they are objects which have the same
semantics as open disk files.

So I have, somewhere, a set of "file descriptors" which refer to the files
I have open. File descriptor #1 is the output stream, file descriptor #7
is the file I'm writing logs to, and file descriptor #9 is the socket I use
to talk to a server.

The user does something "undefined" which causes a higher-level stream object
to have its internal data structure changed so that, instead of referring to
file descriptor 7, it refers to file descriptor 9. I try to write to that
file. The data goes out over the network socket.

I don't think you even need to do that. If you just crash the system
stone dead instantly, while another process (the sort of thing we pretend
doesn't exist!) is half way through writing to disk, you can corrupt a
file.

It's simply not reasonable or sane to try to declare that no screwup can
ever have that effect.

Absolutely.
 
