Confusion about undefined behaviour

Phil Carmody

Stephen Sprunk said:
(* The IDT must live at physical addresses 0-1023 on x86, which is a

False assumptions may lead to false conclusions. You seem to be
unaware of the LIDT instruction which has only existed in the
instruction set for 26 years. Yes, even I'm surprised it's 26,
but it is.

Phil
 
James Kuyper

Hallvard said:
Then we are not having the same discussion.

Quite possibly - but until you are more specific about what you found
objectionable in my statement, that's the best response I can give you.

> Yes. Then what are you disagreeing with me about? Why are you denying
> that the kernel can define what happens when you follow a null pointer?

Because the kernel designers can define what they want to have happen,
and on the basis of that definition rule out the use of certain
compilers, but it's the compiler that they chose which defines what
actually happens.

> There's nothing to correct if it's not a bug. It's not a bug to depend
> on a compiler extension. If that's what they were doing (and specifying
> that they were doing).

It's a bug if dependence on the compiler extension was unintentional,
which seems to be what happened in this case. The code looks very much
like the possibility that 'tun' might be null was given only pro-forma
consideration, and as a result the test for that possibility was misplaced.

For what possible reason should they NOT replace:


struct sock *sk = tun->sk;
if (!tun)
        return POLLERR;

with

struct sock *sk;
if (!tun)
        return POLLERR;
sk = tun->sk;

?
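
For anyone who wants to try the reordered version outside the kernel, here is a minimal user-space sketch. The types `demo_sock` and `demo_tun`, the constant `DEMO_POLLERR` and the function names are hypothetical stand-ins invented for this illustration; they are not the real kernel definitions. The only point is the ordering: the test comes before the first dereference, so the compiler has no earlier access from which to infer that `tun` is non-null.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct demo_sock { int id; };
struct demo_tun  { struct demo_sock *sk; };

enum { DEMO_POLLERR = 0x0008 };   /* assumed value, for illustration only */

/* Test the pointer first, dereference it second. */
unsigned int demo_poll(struct demo_tun *tun)
{
    struct demo_sock *sk;

    if (!tun)
        return DEMO_POLLERR;
    sk = tun->sk;                 /* reached only when tun is non-null */
    return sk ? 0u : DEMO_POLLERR;
}

/* Small self-check showing the null case is handled without a dereference. */
void demo_poll_check(void)
{
    struct demo_sock s = { 1 };
    struct demo_tun t = { &s };

    assert(demo_poll(NULL) == DEMO_POLLERR);
    assert(demo_poll(&t) == 0u);
}
```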
 
Keith Thompson

Hallvard B Furuseth said:
Yes, this particular case was a bug. But this is about the general case
- quoting Stephen Sprunk upthread:

"However, the problem is that in Linux/x86 _kernel_ code, dereferencing
NULL is defined (by the implementation) to _not_ trap"

Ok -- but does it *ever* make sense for kernel code to depend on this?

Hypothesis: Attempting to dereference a null pointer in Linux kernel
code always indicates a bug. The environment arranges not to trap
on such a dereference only for the purpose of avoiding drastic
consequences for such bugs, not to enable any useful behavior.
 
Keith Thompson

Phil Carmody said:
The implementation may specify that a pointer with all zero bits
in its internal representation is dereferencable, the C standard
permits that (though one has to work for it). That's different
from a NULL pointer though.

I've worked on a system where a NULL pointer, e.g. (void*)0,
was represented by 0x8000000 (and any address with the top bit
set would cause a fault).

Absolutely correct in the general case. In this specific case,
however (Linux on x86 compiled with gcc), a null pointer happens to be
all-bits-zero, and the implementation may rely on that guarantee.
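
The distinction between a null pointer and the all-bits-zero pattern can be probed on a given implementation. The sketch below is only an illustration with made-up function names: the first function reports whether this particular implementation happens to store a null pointer as all-bits-zero (either answer is permitted by the standard), while the second shows the source-level guarantee that holds everywhere, namely that a pointer initialised from the constant 0 compares equal to NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if this implementation represents a null pointer as
   all-bits-zero, 0 otherwise. The C standard allows either result. */
int null_is_all_bits_zero(void)
{
    void *p = NULL;
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}

/* Guaranteed on every implementation, whatever the representation:
   the constant 0 converts to a null pointer. */
void check_null_constant(void)
{
    void *p = 0;

    assert(p == NULL);
    assert(!p);
}
```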
 
James Kuyper

Hallvard said:
Keith Thompson writes: ....

Yes, this particular case was a bug. But this is about the general case
- quoting Stephen Sprunk upthread:

"However, the problem is that in Linux/x86 _kernel_ code, dereferencing
NULL is defined (by the implementation) to _not_ trap"

The code in question was compiled by gcc with
-fdelete-null-pointer-checks turned on, either explicitly or as a
side-effect of the other options chosen. That implementation of C does
not define the behavior that results from dereferencing NULL in the
manner he described, so his statement was incorrect. It's the C compiler
which defines such things, not the kernel. The kernel might come with
documentation that specifies something else, but that documentation has
no control over what actually happens; only the compiler has that control.
 
Phil Carmody

Oh, I do. The C Standard Committee can walk away without blame, and so
can the gcc crowd. The fault lies entirely with the kernel jocks.

But they never ran into the situation anyway, so they didn't need
to walk away. They can stand back and feel smug, if they so desire,
as they have the right to.

I spotted some more UB in the core kernel yesterday, and I noticed that
the maintainer of the component it was part of was the same guy who
contributed the UB above. (Un)fortunately it's not exploitable ;-)

I was trying to find C&V in the standard yesterday, but failing -
can someone give me a pointer or reference to where it says the
following is baaaaaaaad.

struct { int x; int foo[32]; int bar[32]; } *p;
//...
p->foo[32] = thing();
p->bar[0] = IdontcarewhatIjustdidasImrepairingitnow();

Much obliged,
Phil
 
Phil Carmody

Moi said:
Strictly speaking it is UB.

But, given the fact that the kernel-source must be considered
freestanding code, IMHO different rules _could_ apply.
( I don't know what GCC's -ffreestanding flag actually does; it possibly
just avoids library dependencies)

IIRC in freestanding environments the compiler *may* define UB to
anything it wants. (correct me if I am wrong)

The implementation may define anything it wants in cases of UB.
It may define a dog to have 5 tails, for example.

> I can imagine the compiler user _wants_ the "intended behaviour" (IB) to
> be "If I dereference null pointers, I know what I'm doing. Trust me ..."
>
> How else could a kernel ever address memory at location=0 ?

I can think of at least two methods.
1) it can use an implementation which doesn't use the all-bits-zero value
as the null pointer. (I've seen this on some embedded systems where address 0
was necessary to access, for example.)
2) it can use an implementation which supplies a
writeAtAddress0(char const*,size_t) function.

You appear to be confusing NULL with the all-bits-zero address. I admit,
the ability to use ``0'' as a null pointer constant invites such
misinterpretation, but it's still a misinterpretation.

> Seen in this light, the compiler should *not* be allowed to optimize out

Too late to apply your rules, it already is allowed according to the
rules of the language.

Phil
 
Phil Carmody

Moi said:
I know. This is c.l.c ;-)

I know as well that on the x86 a NULL pointer is represented as "all bits
zero".

Processors do not define what address value the NULL pointer will
represent, C implementations do.

Phil
 
Phil Carmody

Dik T. Winter said:
Where in that article is a 'fix patch'?

If you're trying to imply that I claimed the fix patch was *in* the
article, you're up for a slapping. (But only a gentle one, as English
isn't your first language, though judging by your normal usage that's
hard to believe.)

However, if you're genuinely interested to see the most succinct demonstration
of what's UB in the buggy code, then the fix patch is two clicks away, via
the exploit. Didn't your mother teach you about the wonders of hypertext?!

Phil
 
Boon

Phil said:
I was trying to find C&V in the standard yesterday, but failing -
can someone give me a pointer or reference to where it says the
following is baaaaaaaad.

struct { int x; int foo[32]; int bar[32]; } *p;
//...
p->foo[32] = thing();
p->bar[0] = IdontcarewhatIjustdidasImrepairingitnow();

In C89, 3.3.2.1 (Array subscripting) and 3.3.6 (Additive operators)

"""
A postfix expression followed by an expression in square brackets [] is a
subscripted designation of a member of an array object. The definition of
the subscript operator [] is that E1[E2] is identical to (*(E1+(E2)))
"""

and

"""
Thus if P points to a member of an array object, the expression P+1 points to
the next member of the array object. Unless both the pointer operand and the
result point to a member of the same array object, or one past the last member
of the array object, the behavior is undefined. Unless both the pointer operand
and the result point to a member of the same array object, or the pointer
operand points one past the last member of an array object and the result points
to a member of the same array object, the behavior is undefined if the result is
used as the operand of a unary * operator.
"""

p->foo[32] invokes UB.
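
To make the boundary concrete: forming the one-past-the-end pointer `p->foo + 32` and comparing against it is fine, while applying `*` to it (which is what `p->foo[32]` does) is not. The sketch below uses a hypothetical named struct `demo_rec` with the same layout as the anonymous one in the question; the function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct demo_rec { int x; int foo[32]; int bar[32]; };

/* Sum foo[] using the one-past-the-end pointer purely as a loop bound. */
int sum_foo(const struct demo_rec *p)
{
    const int *end = p->foo + 32;   /* valid to form and to compare against */
    int total = 0;

    for (const int *it = p->foo; it != end; ++it)
        total += *it;               /* *end itself is never evaluated */
    return total;
}

/* A write meant for bar[0] should name bar[0], not foo[32]. */
void set_first_bar(struct demo_rec *p, int value)
{
    p->bar[0] = value;
}

void demo_rec_check(void)
{
    struct demo_rec d = { 0 };

    d.foo[31] = 5;
    set_first_bar(&d, 7);
    assert(sum_foo(&d) == 5);
    assert(d.bar[0] == 7);
}
```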
 
FatPhil

Phil said:
I was trying to find C&V in the standard yesterday, but failing -
....
In C89, 3.3.2.1 (Array subscripting) and 3.3.6 (Additive operators) ....
p->foo[32] invokes UB.

Many many thanks - I didn't think of looking there!
Bringing into C99-land (in particular n1256):

6.5.6 Additive operators
8 When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
In other words, if the expression P points to the i-th element of an
array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N
(where N has the value n) point to, respectively, the i+n-th and
i-n-th elements of the array object, provided they exist. Moreover, if
the expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array object,
and if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element
of the array object, it shall not be used as the operand of a unary *
operator that is evaluated.

I don't think that's worded perfectly, but the meat is all there.

Thanks again,
Phil
 
Nobody

Yes, this particular case was a bug. But this is about the general case
- quoting Stephen Sprunk upthread:

"However, the problem is that in Linux/x86 _kernel_ code, dereferencing
NULL is defined (by the implementation) to _not_ trap"

I don't agree with that assessment.

Several people are claiming that the fact that the kernel maps RAM into
the first page means that the kernel developers are trying to *define* C's
null pointer semantics. But in the absence of actual evidence, that's pure
conjecture.


 
Dik T. Winter

>
> If you're trying to imply that I claimed the fix patch was *in* the
> article, you're up for a slapping.

Perhaps, but my article was based on that article, and then you stated that
I had to read it again. I can find nothing more in it than what I have
already written about it.

> However, if you're genuinely interested to see the most succinct demonstration
> of what's UB in the buggy code, then the fix patch is two clicks away, via
> the exploit. Didn't your mother teach you about the wonders of hypertext?!

Not entirely two clicks with the configuration I am using. First download
a gzipped tar-file containing the exploit. Then look for links in the
downloaded source files and follow them... However, why should I expect
to find a fix in the exploit?
 
FatPhil

Just an observation: In this case we saw code that dereferenced a null
pointer - not only is this undefined behaviour, but also something
that would cause a crash in many current C environments, for example in
a typical Linux, MacOS X or Windows application program. And
surprisingly for many, Linux kernel programmers apparently don't
expect it to crash.

However, there are other dereferences that will _not_ cause a crash in
a typical environment, and that still cause undefined behaviour if the
pointer p is a null pointer:

1. char* q = p->x; // When x has type "array of char"
2. int* q = &p->x; // When x has type "int"
3. int b = (p >= p);
4. T* q = p + 0; // When p has type "T*"
5. ptrdiff_t d = p - p;

Each of these causes undefined behaviour, and could cause the compiler
to assume that p != NULL.

For example, a compiler can always replace (p + 0 != NULL) with TRUE.

I'm not doubting you for a minute, but I'd like someone else to
independently confirm each of the above.

Doesn't gcc use (2) for its offsetof() macro? (Answering self - yes,
if there isn't a __compiler_offsetof macro, but I don't know when
that's the case, it seems that if it's defined, it's defined to be
__builtin_offsetof.)

Cheers,
Phil
 
Flash Gordon

(1) Obviously a dereference of a null pointer, so undefined.

(2) Equivalent to
int *q = &((*p).x)
So the & is not directly applied to the result of the *, so the rule that
& and * can cancel out does not apply; it is still a dereference of a null
pointer and undefined behaviour.

(3) A comparison other than for (in)equality when the pointers do not point
to (or one past the end of) the same object (as they do not point to
*any* object). So undefined behaviour.

(4) Addition to a pointer is only defined if the pointer points to an object
(or one past the end of an object), so undefined behaviour.

(5) Subtraction of pointers from each other when the pointers do not point
to (or one past the end of) the same object (as they do not point to
*any* object). So undefined behaviour.

> I'm not doubting you for a minute, but I'd like someone else to
> independently confirm each of the above.
>
> Doesn't gcc use (2) for its offsetof() macro? (Answering self - yes,
> if there isn't a __compiler_offsetof macro, but I don't know when
> that's the case, it seems that if it's defined, it's defined to be
> __builtin_offsetof.)


The implementation is allowed to use anything it wants in its headers to
achieve the desired result, including things which are undefined
behaviour if *you* do them. It would be legal for an implementation to
provide the following definition of the offsetof macro
#define offsetof(type,field) (size_t)(*0)
Just as long as it implemented some magic to make it work (for instance,
completely ignoring the definition and just doing the right thing). I
doubt you will see it done, but it would not make the implementation
non-conforming as far as I can see.

In fact, one reason for some of the macros (including offsetof) that the
implementation is required to provide is because *you* cannot implement
them without invoking undefined behaviour!
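
In user code the portable route is simply to use the `offsetof` macro that `<stddef.h>` provides, and leave the how to the implementation. A minimal sketch follows, with a hypothetical struct `demo_layout` and invented function names. The two checked properties follow from the standard's layout rules: there is no padding before the first member, and members are laid out in declaration order without overlapping.

```c
#include <assert.h>
#include <stddef.h>

struct demo_layout { int x; int foo[32]; int bar[32]; };

/* Let the implementation's own offsetof do the work. */
size_t offset_of_bar(void)
{
    return offsetof(struct demo_layout, bar);
}

void demo_layout_check(void)
{
    /* The first member always sits at offset 0. */
    assert(offsetof(struct demo_layout, x) == 0);

    /* bar starts no earlier than the end of foo. */
    assert(offset_of_bar() >=
           offsetof(struct demo_layout, foo) + sizeof(int[32]));
}
```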
 
Phil Carmody

[SNIP - 5 expositions on why they're UB]

Many thanks. I agree with all the reasoning, so we're all on the same
page. I still think that section 8 of additive operators needs a rewrite
though.

I must say that I'm a bit disappointed by #4. If any merit could be
squeezed out of the use of any of them, that was the only drop I
could detect.

I notice that gcc-4.3 is smart enough to fold the explicit
int cmp(void*p,void*q)
{
        return p==q || p>q;
}
into a single test:
cmp:
        movl 8(%esp), %eax
        cmpl %eax, 4(%esp)
        setae %al
        movzbl %al, %eax
        ret
So at least there's no apparent cost for (one instance of) going out
of your way if you know that NULLs are possible pointer values (that
you wish to compare as if they are less than any real pointer to your
array - this is probably a bad design, but is at least workable).
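
A rough sketch of what "going out of your way" could look like when nulls should sort below every real element. The function name `ge_null_low` is made up for this illustration, and it assumes that any two non-null arguments point into (or one past the end of) the same array, since the relational comparison is only defined in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero when p >= q, ordering a null pointer below every element.
   Assumption: non-null p and q point into (or one past) the same array. */
int ge_null_low(const int *p, const int *q)
{
    if (q == NULL)
        return 1;          /* everything, including null, is >= null */
    if (p == NULL)
        return 0;          /* null sorts below any real element */
    return p >= q;         /* relational compare only on same-array pointers */
}

void ge_null_low_check(void)
{
    int a[4] = { 0 };

    assert(ge_null_low(&a[2], &a[1]));
    assert(!ge_null_low(&a[1], &a[2]));
    assert(!ge_null_low(NULL, &a[0]));
    assert(ge_null_low(&a[0], NULL));
    assert(ge_null_low(NULL, NULL));
}
```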

Phil
 
Flash Gordon

Phil said:
[SNIP - 5 expositions on why they're UB]

Many thanks.

Glad to be of help :)

> I agree with all the reasoning, so we're all on the same
> page.

Well, I've had some training as a pedantic git...

> I still think that section 8 of additive operators needs a rewrite
> though.
>
> I must say that I'm a bit disappointed by #4. If any merit could be
> squeezed out of the use of any of them, that was the only drop I
> could detect.

I can't think of any real situation where knowing it was valid would
help. After all, for the real world the 0 would be in a value in a
variable (why would you add a constant 0?) so you would have to know
that in all situations where p was null the value added would be 0, and
I can't imagine when that would be true! Oh, and you could always do a
test if it was possible!

> I notice that gcc-4.3 is smart enough to fold the explicit
> int cmp(void*p,void*q)
> {
>         return p==q || p>q;
> }
> into a single test:
> cmp:
>         movl 8(%esp), %eax
>         cmpl %eax, 4(%esp)
>         setae %al
>         movzbl %al, %eax
>         ret

I'm too out of practice (and out of date) on x86 assembler to read that,
so I'll take your word it is sensible...

> So at least there's no apparent cost for (one instance of) going out
> of your way if you know that NULLs are possible pointer values (that
> you wish to compare as if they are less than any real pointer to your
> array - this is probably a bad design, but is at least workable).

Well, compilers are pretty good at the easy optimisations.
 
David Thompson

I would have thought that this didn't count as dereferencing the
pointer. Dereferencing gives you a value; you can't apply & to a value
to ask "where is this stored?". Not that I'm an expert.

In C semantics dereferencing a valid (data) pointer gives you an
*lvalue*, which you can take the & of; any sane compiler collapses
these to a no-op (just keeps the pointer). The C99 change makes it
official even for a null pointer, which doesn't point to a valid
object. Most other uses -- not all -- of the lvalue coerce it to an
rvalue, which is what many people would call the actual dereference.
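
A small sketch of the C99 rule mentioned above (6.5.3.2 paragraph 3): when `&` is applied directly to the result of a unary `*`, neither operator is evaluated, so `&*p` just yields `p`, even when `p` is null. The function names are invented for illustration. Note that this covers only the direct `&*p` form; `&p->x` is a different expression and does not get this treatment.

```c
#include <assert.h>
#include <stddef.h>

/* In C99 and later, &*p is evaluated as if both operators were omitted. */
int *same_pointer(int *p)
{
    return &*p;
}

void same_pointer_check(void)
{
    int v = 42;

    assert(same_pointer(&v) == &v);
    assert(same_pointer(NULL) == NULL);   /* no dereference takes place */
}
```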
 
