Spiros Bousbouras said:
void foo(void)
{
    int a;     /* no UB so far */
    int b = a; /* boom */
}
I think you're missing the OP's point. If it's OK for a to have
whatever value it has, why is it problematic for b to also have that
value?
Consider a and b as floating point variables. Suppose the CPU has
some special hardware to handle floating point: either several
floating point wegisters[1] or an ack-stay[1] of wegisters[1] that
are supposed to cause a trap if a signalling NaN is loaded into
them. IEEE goes to considerable effort to detail what signalling
NaNs are supposed to do.
If a and b are of different size, it is likely that the CPU will
load the value of a into a floating point wegister[1] and then store
it into b, if that's the fastest way of converting the size of
floating point types. If this CPU loads the uninitialized value
of a into a floating point wegister[1], KABOOM!
Unfortunately I don't know what a wegister or ack-stay is and
Wikipedia doesn't know either. Furthermore in your scenario is
it not possible that by accident variable a will have a
legitimate value ?
Certainly.
But let's say that it contains a trap
representation ; is it the idea that your imaginary processor
automatically checks every floating value loaded onto a
wregister for having a trap representation and does something
special if it does ?
The imaginary processor I have in mind (let's call it the "Pintel 387")
does precisely that. If a signaling NaN is loaded into a floating-point
register, er, wegister, a CPU exception occurs, which if it is not
masked or handled will usually cause the OS to kill the program.
Now in the case of assignment, a compiler could do it by copying the
bytes directly (via a general integer register) and not needing to load
the value into a floating point register. In fact it probably would,
since this is generally faster. But a dumb compiler might not do so,
and a trap would happen.
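
To make the "copy the bytes directly" case concrete, here is a minimal sketch in C. The helper names (copy_float_bytes, float_bits, set_float_bits) are made up for illustration, and it assumes float is a 32-bit IEEE 754 type. Everything goes through memcpy, so the program never reads the object as a float value; whether a given compiler really avoids a floating-point register for this is up to the implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy a float object as raw bytes; the value is never read as a float. */
static void copy_float_bytes(float *dst, const float *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Return the raw bit pattern stored in a float object.
   Assumption: float is 32 bits wide. */
static uint32_t float_bits(const float *f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof *f);
    memcpy(&bits, f, sizeof bits);
    return bits;
}

/* Store a raw bit pattern into a float object, again as bytes only. */
static void set_float_bits(float *f, uint32_t bits)
{
    assert(sizeof bits == sizeof *f);
    memcpy(f, &bits, sizeof bits);
}
```

For example, storing 0x7FA00000 (a signaling NaN pattern on common IEEE 754 implementations) with set_float_bits and then calling copy_float_bytes should leave the same bit pattern in the destination. Writing "b = a;" instead gives the compiler the option of going through the FPU.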
If you assign your variable containing a signaling NaN to a variable of
a different type, so that a conversion is needed, this is usually done
through the FPU and would probably cause a register load.
SNIP
A similar issue exists with pointers. Load a ([3456]86 architecture)
segment register with the segment portion of an invalid pointer,
and it traps. Leave the pointer in memory, and it's harmless.
Great , a real world example. So what characterises an invalid
pointer and what does the processor do ?
A "far" pointer on the 386 consists of a 16-bit selector and a 32-bit
offset. In order to dereference such a pointer, you must load the
selector into a special segment register, and then reference that
register in instructions that will use the pointer.
The selector is an index into a table of segments. A segment refers to
some region of memory, and the table contains information (a
"descriptor") about the segment, such as its base address, size, and
permissions. When the selector is loaded, the CPU checks that the
segment referenced is valid and accessible, and if so it caches the
descriptor information for future use. If you load a selector which
isn't a valid index into the table, or refers to an entry which is
marked as invalid, or which your program doesn't have permission to
access, the CPU raises a "general protection exception", which is
handled by the OS and ordinarily would cause it to kill your program.
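
The checks described above can be sketched as a toy model in C. This is not how the hardware is programmed: struct descriptor, load_selector and the result codes are hypothetical names, the real 386 descriptor is a packed 8-byte entry, and details such as the table-indicator bit and the null selector are ignored here. It only shows the order of events: index the table, validate the entry, then cache it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor (not the real packed layout). */
struct descriptor {
    uint32_t base;     /* start of the segment */
    uint32_t limit;    /* size information */
    bool     present;  /* false: entry is marked invalid */
    unsigned dpl;      /* privilege needed; 0 is most privileged */
};

enum load_result { LOAD_OK, LOAD_FAULT };

/* Toy model of loading a selector into a segment register.
   On success the descriptor is copied into *cache, standing in for
   the CPU's hidden descriptor cache. LOAD_FAULT stands in for the
   exception the OS would handle. */
static enum load_result load_selector(const struct descriptor *table,
                                      size_t entries,
                                      uint16_t selector,
                                      unsigned cpl,
                                      struct descriptor *cache)
{
    size_t index = selector >> 3;      /* upper 13 bits: table index */
    unsigned rpl = selector & 3u;      /* low 2 bits: requested privilege */
    unsigned effective = cpl > rpl ? cpl : rpl;

    assert(table != NULL && cache != NULL);

    if (index >= entries)              /* not a valid index into the table */
        return LOAD_FAULT;
    if (!table[index].present)         /* entry marked as invalid */
        return LOAD_FAULT;
    if (effective > table[index].dpl)  /* no permission to access */
        return LOAD_FAULT;

    *cache = table[index];             /* cache descriptor for later use */
    return LOAD_OK;
}
```

Note that nothing here touches the memory the pointer refers to; the fault, if any, comes from the load of the selector alone.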
As before, the compiler would not have any reason to load the selector
part of a far pointer into a segment register unless it is going to
dereference the pointer, and would probably avoid doing so because this
is an expensive operation due to all the checks that take place. So
this is maybe not the best example in the world. I don't offhand know
of a better one, though.
I'll note that this mechanism is mostly obsolete these days; it still
exists, but is generally not used by modern operating systems because it
is inconvenient. They set up a single segment which encompasses all of
memory, and from then on treat the address space as flat.