Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

Discussion in 'C Programming' started by Michael Tsang, Feb 14, 2010.

  1. Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
    program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
    pointer is defined to "crash the program with SIGSEGV".

    Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
    simply wraps around so we can say that the behaviour is defined to round on
    x86 CPUs.
     
    Michael Tsang, Feb 14, 2010
    #1

  2. Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    * Michael Tsang:
    > Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
    > program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
    > pointer is defined to "crash the program with SIGSEGV".
    >
    > Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
    > simply wraps around so we can say that the behaviour is defined to round on
    > x86 CPUs.


    Your question, from the subject line, is

    "Knowing the implementation, are all undefined behaviours become
    implementation-defined behaviours?"

    And it's cross-posted to [comp.lang.c] and [comp.lang.c++].

    At least for C++ the answer is a definite maybe: theoretically it depends on the
    implementation.

    In practice the answer is a clearer "no", because it's practically impossible
    for an implementation to clearly define all behaviors, in particular pointer
    operations and use of external libraries.



    Cheers & hth.,

    - Alf
     
    Alf P. Steinbach, Feb 14, 2010
    #2

  3. Michael Tsang

    Seebs Guest

    On 2010-02-14, Michael Tsang <> wrote:
    > Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
    > program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
    > pointer is defined to "crash the program with SIGSEGV".


    Not necessarily.

    > Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
    > simply wraps around so we can say that the behaviour is defined to round on
    > x86 CPUs.


    That's not rounding, that's wrapping.

    But no, it's not the case. These are not necessarily *defined* -- they may
    merely be typical side-effects that are not guaranteed or supported.

    Modern gcc can do some VERY strange things if you write code which might
    dereference a null pointer. (For instance, loops which check whether a
    pointer is null may have the test removed because, if it were null, it
    would have invoked undefined behavior to dereference it...)

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
     
    Seebs, Feb 14, 2010
    #3
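
    A minimal sketch related to the signed-overflow point above, not something posted in the thread: instead of relying on wraparound that the hardware happens to provide, the overflow can be tested before the addition, so the undefined operation is never executed. The names add_overflows and saturating_add are made up for illustration.

```c
#include <limits.h>

/* Returns 1 if a + b would overflow int, 0 otherwise.
   Only well-defined subtractions are performed; the overflowing
   addition itself is never evaluated. */
int add_overflows(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow when b > 0 */
    if (b < 0)
        return a < INT_MIN - b;   /* INT_MIN - b cannot overflow when b < 0 */
    return 0;
}

/* Adds a and b, clamping to INT_MAX or INT_MIN instead of overflowing. */
int saturating_add(int a, int b)
{
    if (add_overflows(a, b))
        return b > 0 ? INT_MAX : INT_MIN;
    return a + b;
}
```

    Clamping is only one possible policy; a caller could equally report an error. The point is that the check does not depend on what any particular CPU does on overflow.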
  4. Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    On Feb 14, 8:03 am, Michael Tsang <> wrote:
    >

    "Undefined behaviour" doesn't mean "exists in some metaphysical state
    of indefiniteness" but "the C standard imposes no requirements on the
    program's behaviour (and therefore the program is incorrect)". There
    was a huge thread about this a few years back on gets.

    So typically dereferencing null will have the same effect each time any
    particular program is run, probably the same effect on any particular
    platform. Dereferencing a wild pointer may have different effects,
    particularly on a multi-tasking machine where exact pointer values
    vary from run to run.
     
    Malcolm McLean, Feb 14, 2010
    #4
  5. Michael Tsang

    Robert Fendt Guest

    Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    And thus spake Seebs <>
    14 Feb 2010 07:03:57 GMT:

    > dereference a null pointer. (For instance, loops which check whether a
    > pointer is null may have the test removed because, if it were null, it
    > would have invoked undefined behavior to dereference it...)


    Sorry to interrupt, but since when is checking a pointer value
    for 0 the same as dereferencing it? Checking a pointer treats the
    pointer itself as a value, and comparison against 0 is one of
    the few things that are _guaranteed_ to work with a pointer
    value. So if GCC really would remove a check of the form

    if(!pointer)
    do_something(*pointer);

    or even

    if(pointer == 0)
    throw NullPointerException;

    then GCC would be very much in violation of the standard. And
    produce absolutely useless code, as well. What's the point of
    having pointers in a language if you wouldn't even be able to
    perform basic operations on them?

    Regards,
    Robert
     
    Robert Fendt, Feb 14, 2010
    #5
  6. Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    * Richard Heathfield:
    > Michael Tsang wrote:
    >> Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
    >> program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
    >> pointer is defined to "crash the program with SIGSEGV".

    >
    > Thread's subject line: Knowing the implementation, are all undefined
    > behaviours become implementation-defined behaviours?
    >
    > No. For example, consider a stack exploit on gets(). There are systems
    > on which the behaviour could be absolutely anything at all, depending on
    > user input!6\b$10be5c39no carrier


    :)


    Cheers,

    - Alf
     
    Alf P. Steinbach, Feb 14, 2010
    #6
  7. Michael Tsang

    Bo Persson Guest

    Robert Fendt wrote:
    > And thus spake Seebs <>
    > 14 Feb 2010 07:03:57 GMT:
    >
    >> dereference a null pointer. (For instance, loops which check
    >> whether a pointer is null may have the test removed because, if it
    >> were null, it would have invoked undefined behavior to dereference
    >> it...)

    >
    > Sorry to interrupt, but since when is checking a pointer value
    > for 0 the same as dereferencing it? Checking a pointer treats the
    > pointer itself as a value, and comparison against 0 is one of
    > the few things that are _guaranteed_ to work with a pointer
    > value. So if GCC really would remove a check of the form
    >
    > if(!pointer)
    > do_something(*pointer);
    >
    > or even
    >
    > if(pointer == 0)
    > throw NullPointerException;
    >
    > then GCC would be very much in violation of the standard. And
    > produce absolutely useless code, as well. What's the point of
    > having pointers in a language if you wouldn't even be able to
    > perform basic operations on them?
    >


    Yes, but there are cases where the compiler can determine that the
    pointer is ALWAYS null or not-null, and remove code that would execute
    otherwise. For example:

    *pointer = 42;
    if(pointer == 0)
    throw NullPointerException;

    is known never to throw the exception!


    Bo Persson
     
    Bo Persson, Feb 14, 2010
    #7
  8. Re: Knowing the implementation, are all undefined behaviours become

    In article <>, Robert Fendt <> writes:

    > Checking a pointer treats the
    > pointer itself as a value, and comparison against 0 is one of
    > the few things that are _guaranteed_ to work with a pointer
    > value.


    No, evaluating an invalid pointer is undefined behavior.

    {
    void *p;

    p = malloc(1);
    free(p);
    p; /* UB */
    !p; /* UB */
    0 != p; /* UB */
    }

    See the C99 Rationale 6.3.2.3 Pointers for an informative (not
    normative) description.

    I believe that in this paragraph:

    ----v----
    Regardless how an invalid pointer is created, any use of it yields
    undefined behavior. Even assignment, comparison with a null pointer
    constant, or comparison with itself, might on some systems result in an
    exception.
    ----^----

    "any use" denotes "any evaluation", and "assignment" means "assignment
    FROM the invalid pointer". I'm fairly sure the following is valid:

    {
    int *ip;

    ip = malloc(sizeof *ip);
    free(ip);
    sizeof ip;
    sizeof *ip;
    ip = 0;
    ip;
    !ip;
    0 != ip;
    }

    Cheers,
    lacos
     
    Ersek, Laszlo, Feb 14, 2010
    #8
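
    A small sketch following on from the invalid-pointer discussion above, under the assumption that the goal is simply never to evaluate a freed pointer: assigning a null pointer TO the variable right after free() is fine, and afterwards comparisons against 0 are well defined again. The helper name free_and_clear is hypothetical.

```c
#include <stdlib.h>

/* Frees the object *pp points to and overwrites *pp with a null pointer.
   *pp is read once, before the free; afterwards it is only assigned to,
   so the invalid pointer value is never evaluated. */
void free_and_clear(int **pp)
{
    if (pp != NULL) {
        free(*pp);      /* free(NULL) is a no-op */
        *pp = NULL;
    }
}
```

    After free_and_clear(&p), expressions such as !p or 0 != p are valid because p now holds a null pointer rather than an indeterminate value.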
  9. Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    In article <>,
    Malcolm McLean <> wrote:

    >Dereferencing a wild pointer may have different effects,
    >particularly on a multi-tasking machine where exact pointer values
    >vary from run to run.


    It's not a general characteristic of multi-tasking systems that
    pointer values vary from run to run. Virtual memory has traditionally
    been used to give all instances of a program indistinguishable address
    spaces, and addresses will usually be the same.

    Recently for security reasons some operating systems have started to
    deliberately randomise the locations of, for example, shared
    libraries, so pointers are now more likely to vary. (Fortunately this
    can usually be disabled for debugging.)

    -- Richard
    --
    Please remember to mention me / in tapes you leave behind.
     
    Richard Tobin, Feb 14, 2010
    #9
  10. Michael Tsang

    Robert Fendt Guest

    Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    And thus spake "Bo Persson" <>
    Sun, 14 Feb 2010 11:24:48 +0100:

    > Yes, but there are cases where the compiler can determine that the
    > pointer is ALWAYS null or not-null, and remove code that would execute
    > otherwise. For example:
    >
    > *pointer = 42;
    > if(pointer == 0)
    > throw NullPointerException;
    >
    > is known never to throw the exception!


    Yes, that's static optimisation. Nothing wrong with that.
    However, the posting I was commenting on explicitly described
    something different:

    >> dereference a null pointer. (For instance, loops which check
    >> whether a pointer is null may have the test removed because, if it
    >> were null, it would have invoked undefined behavior to dereference
    >> it...)


    This would mean nothing else than the compiler removing
    nullpointer checks solely on the grounds that a nullpointer
    cannot be de-referenced legally. So the compiler would see a
    pointer dereference, and decide "then it can't be null anyway,
    since it's used later". And that's just bull, sorry.

    Yes, if there's an unconditional pointer dereference and
    _afterwards_ a check for null, the compiler could take this as a
    hint that said pointer has been checked for null before the first
    dereference and thus remove the superfluous check. So if you had
    something like this:

    MyType& obj = *pointer;
    if (!pointer)
    throw NullPointerException;

    Since the dereference happens _before_ the check, the program
    has already entered the domain of undefined behaviour, and the
    check is moot (even if one has not 'used' the object reference
    in any other way). If the author of the previous posting meant
    that, then I agree (though I have doubts whether GCC really
    optimises this aggressively). But in that case his comment was at
    least not very clear.

    Regards,
    Robert
     
    Robert Fendt, Feb 14, 2010
    #10
  11. Robert Fendt <> writes:
    <snip>
    > Yes, if there's an unconditional pointer dereference and
    > _afterwards_ a check for null, the compiler could take this as a
    > hint that said pointer has been checked for null before the first
    > dereference and thus remove the superfluous check. So if you had
    > something like this:
    >
    > MyType& obj = *pointer;
    > if (!pointer)
    > throw NullPointerException;
    >
    > Since the dereference happens _before_ the check, the program
    > has already entered the domain of undefined behaviour, and the
    > check is moot (even if one has not 'used' the object reference
    > in any other way). If the author of the previous posting meant
    > that, then I agree (though I have doubts whether GCC really
    > optimises this aggressively).


    gcc does exactly that (with certain options). I think this is the
    nature of a recent Linux kernel bug: http://lkml.org/lkml/2009/7/6/19

    The pointer use was ever so slightly less obvious but it led gcc to
    conclude that the following test could be removed.

    Given the cross-post, I should say that I have no idea if gcc does
    this for the exact case you cite (which is C++) but I wanted to point
    out that similar things are done.

    <snip>
    --
    Ben.
     
    Ben Bacarisse, Feb 14, 2010
    #11
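
    To illustrate the ordering issue discussed above in plain C (a sketch only, not the kernel code in question): if a member is read through the pointer before the null test, the compiler may treat the test as dead code. Reading the member only after the test keeps the check meaningful. The struct and function names are invented for the example.

```c
#include <stddef.h>

struct device {        /* hypothetical type for illustration */
    int flags;
};

/* Returns dev->flags, or fallback when dev is a null pointer.
   The null test comes first, so no dereference is ever executed
   on a null pointer and the test cannot be optimised away. */
int device_flags_or(const struct device *dev, int fallback)
{
    if (dev == NULL)
        return fallback;
    return dev->flags;
}
```

    The problematic shape would be the reverse: initialising a local from dev->flags at the top of the function and testing dev afterwards.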
  12. Michael Tsang

    Robert Fendt Guest

    Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    And thus spake Ben Bacarisse <>
    Sun, 14 Feb 2010 13:41:23 +0000:

    > gcc does exactly that (with certain options). I think this is the
    > nature of a recent Linux kernel bug: http://lkml.org/lkml/2009/7/6/19


    It certainly looks that way. That's a nasty bugger to spot.

    > Given the cross-post, I should say that I have no idea if gcc does
    > this for the exact case you cite (which is C++) but I wanted to point
    > out that similar things are done.


    Yes, I did not notice this whole thread had been crossposted to
    comp.lang.c; a more appropriate example would then have been a
    sizeof(*pointer) or something. Since sizeof in that case relies
    only on static type information, one could assume it should work
    whether the pointer is null or not. But the dereference itself
    already makes the whole program ill-formed (in case of a
    null pointer).

    Regards,
    Robert
     
    Robert Fendt, Feb 14, 2010
    #12
  13. Michael Tsang

    James Kanze Guest

    Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    On Feb 14, 1:54 pm, Robert Fendt <> wrote:
    > And thus spake Ben Bacarisse <>
    > Sun, 14 Feb 2010 13:41:23 +0000:


    > > gcc does exactly that (with certain options). I think this
    > > is the nature of a recent Linux kernel
    > > bug: http://lkml.org/lkml/2009/7/6/19


    > It certainly looks that way. That's a nasty bugger to spot.


    Either the pointer can be null, or it cannot. If it can be
    null, the first unit test which tests it with null should cause
    a crash. If it cannot, then the test that g++ would have
    removed is superfluous, and removing it shouldn't change
    anything.

    There are many other cases of undefined behavior which do affect
    optimizations, however. Consider an expression like: f((*p)++,
    (*q)++). Given this, the compiler "knows" that p and q do not
    reference the same memory (since if they did, it would be
    undefined behavior), which means that in other code in the
    function, the compiler might have cached *p, and knows that it
    doesn't have to update or purge its cached value if there is a
    write through *q.

    > > Given the cross-post, I should say that I have no idea if
    > > gcc does this for the exact case you cite (which is C++) but
    > > I wanted to point out that similar things are done.


    > Yes, I did not notice this whole thread had been crossposted
    > to comp.lang.c; a more appropriate example would then have
    > been a sizeof(*pointer) or something. Since sizeof in that
    > case relies only on static type information, one could assume
    > it should work whether the pointer is null or not. But the
    > dereference itself already makes the whole program ill-formed
    > (in case of a null pointer).


    Dereferencing a null pointer is only undefined behavior if the
    code is actually executed. Something like sizeof(
    f(*(MyType*)0) ) is perfectly legal, and widely used in some
    template idioms (although I can't think of a reasonable use for
    it in C).

    --
    James Kanze
     
    James Kanze, Feb 14, 2010
    #13
  14. Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    On Feb 14, 4:11 pm, James Kanze <> wrote:
    >
    > Dereferencing a null pointer is only undefined behavior if the
    > code is actually executed.  Something like sizeof(
    > f(*(MyType*)0) ) is perfectly legal, and widely used in some
    > template idioms (although I can't think of a reasonable use for
    > it in C).
    >

    Nulls are dereferenced to produce the offsetof macro hack in C.
     
    Malcolm McLean, Feb 14, 2010
    #14
  15. Re: Knowing the implementation, are all undefined behaviours become

    In article <>, Malcolm McLean <> writes:
    > On Feb 14, 4:11 pm, James Kanze <> wrote:
    >>
    >> Dereferencing a null pointer is only undefined behavior if the
    >> code is actually executed.  Something like sizeof(
    >> f(*(MyType*)0) ) is perfectly legal, and widely used in some
    >> template idioms (although I can't think of a reasonable use for
    >> it in C).
    >>

    > Nulls are dereferenced to produce the offsetof macro hack in C.


    No, they are not.

    I guess you mean something like this:

    #define offsetof(type, member_designator) \
    ((size_t)&((type *)0)->member_designator)

    Let's deal first with the conversion of the final pointer to size_t:

    C99 6.3.2.3 Pointers, p6: "Any pointer type may be converted to an
    integer type. Except as previously specified, the result is
    implementation-defined. If the result cannot be represented in the
    integer type, the behavior is undefined. The result need not be in the
    range of values of any integer type."

    Then wrt. dereferencing the null pointer:

    C99 6.6 Constant expressions, p9: "An address constant is a null
    pointer, [...]; it shall be created explicitly using the unary &
    operator or an integer constant cast to pointer type, or [...]. The
    [...] member-access . and -> operators, the address & and indirection *
    unary operators, and pointer casts may be used in the creation of an
    address constant, but the value of an object shall not be accessed by
    use of these operators."

    Perhaps this is relevant too:

    C99 6.5.3.2 Address and indirection operators, p3: "[...] If the operand
    is the result of a unary * operator, neither that operator nor the &
    operator is evaluated and the result is as if both were omitted, except
    that the constraints on the operators still apply and the result is not
    an lvalue. [...]"

    Cheers,
    lacos
     
    Ersek, Laszlo, Feb 14, 2010
    #15
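
    For comparison with the hand-written macro discussed above, a short sketch using the standard offsetof from <stddef.h>, which leaves the question of how the offset is computed to the implementation. The struct record and the function value_offset are made up for illustration.

```c
#include <stddef.h>

struct record {        /* hypothetical layout for illustration */
    char tag;
    double value;
};

/* Byte offset of the member value within struct record, obtained
   through the standard macro rather than a cast of 0 to a pointer. */
size_t value_offset(void)
{
    return offsetof(struct record, value);
}
```

    The exact value returned depends on the implementation's padding rules; only the offset of the first member (zero) is fixed by the language.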
  16. Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    James Kanze <> writes:

    > On Feb 14, 1:54 pm, Robert Fendt <> wrote:

    <snip>
    >> Yes, I did not notice this whole thread had been crossposted
    >> to comp.lang.c; a more appropriate example would then have
    >> been a sizeof(*pointer) or something. Since sizeof in that
    >> case relies only on static type information, one could assume
    >> it should work whether the pointer is null or not. But the
    >> dereference itself already makes the whole program ill-formed
    >> (in case of a nullpointer).

    >
    > Dereferencing a null pointer is only undefined behavior if the
    > code is actually executed. Something like sizeof(
    > f(*(MyType*)0) ) is perfectly legal, and widely used in some
    > template idioms (although I can't think of a reasonable use for
    > it in C).


    For a non-literal null, it is quite common:

    new_ptr = realloc(old_ptr, new_length * sizeof *new_ptr);

    will work regardless of the state of new_ptr (null, well-defined or
    indeterminate).

    [I know you know this: I am simply illustrating the point with a
    common idiom.]

    --
    Ben.
     
    Ben Bacarisse, Feb 14, 2010
    #16
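
    The realloc idiom above can be fleshed out as follows. This is a sketch under assumptions: the function name resize_ints and its error convention (return a null pointer and leave the old block untouched) are choices made for the example. Note that tmp is still uninitialised when sizeof *tmp is used; that is fine because the operand of sizeof is not evaluated here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Resizes an int array to hold count elements.
   Returns the new block, or a null pointer on failure, in which
   case the block at old is left untouched and still valid. */
int *resize_ints(int *old, size_t count)
{
    int *tmp;   /* indeterminate for now; only its type is used below */

    /* Reject zero sizes and multiplications that would wrap around. */
    if (count == 0 || count > SIZE_MAX / sizeof *tmp)
        return NULL;

    tmp = realloc(old, count * sizeof *tmp);
    return tmp;
}
```

    A caller would assign the result to a separate variable first and replace its own pointer only when the result is non-null, so the original block is not leaked on failure.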
  17. Michael Tsang

    Seebs Guest

    On 2010-02-14, Robert Fendt <> wrote:
    > And thus spake Seebs <>
    > 14 Feb 2010 07:03:57 GMT:
    >> dereference a null pointer. (For instance, loops which check whether a
    >> pointer is null may have the test removed because, if it were null, it
    >> would have invoked undefined behavior to dereference it...)


    > Sorry to interrupt, but since when is checking a pointer value
    > for 0 the same as dereferencing it?


    It's not.

    But if you dereference a pointer at some point, a check against it can
    be omitted. If, that is, that dereference can happen without the check.

    So imagine something like:

    ptr = get_ptr();

    while (ptr != 0) {
    /* blah blah blah */
    ptr = get_ptr();
    x = *ptr;
    }

    gcc might turn the while into an if followed by an infinite loop, because
    it *knows* that ptr can't become null during the loop, because if it did,
    that would have invoked undefined behavior.

    And there are contexts where you can actually dereference a null and not
    get a crash, which means that some hunks of kernel code can become infinite
    loops unexpectedly with modern gcc. Until the kernel is fixed, which I
    believe it has been.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
     
    Seebs, Feb 14, 2010
    #17
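
    A hedged sketch of how a loop of the shape above could be arranged so that the null test always sits between fetching the pointer and dereferencing it. Here the hypothetical get_ptr() is replaced by walking an array of pointers that ends with a null pointer; sum_until_null is an invented name.

```c
#include <stddef.h>

/* Sums the ints referenced by an array of pointers that is terminated
   by a null pointer. items itself must not be null. Each pointer is
   compared against NULL before it is dereferenced, so the loop test
   is never redundant. */
int sum_until_null(int *const *items)
{
    int total = 0;
    const int *ptr = *items++;      /* fetch the first pointer */

    while (ptr != NULL) {
        total += *ptr;              /* dereference only after the test */
        ptr = *items++;             /* fetch the next pointer, no dereference yet */
    }
    return total;
}
```

    In the loop quoted above, the dereference followed the fetch inside the body, before the test at the top of the next iteration; moving it to the start of the body is the whole difference.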
  18. Michael Tsang

    Seebs Guest

    Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    On 2010-02-14, James Kanze <> wrote:
    > Either the pointer can be null, or it cannot. If it can be
    > null, the first unit test which tests it with null should cause
    > a crash. If it cannot, then the test that g++ would have
    > removed is superfluous, and removing it shouldn't change
    > anything.


    Unless you're in a context where dereferencing null exhibits the undefined
    behavior of giving you access to a block of memory.

    > Dereferencing a null pointer is only undefined behavior if the
    > code is actually executed. Something like sizeof(
    > f(*(MyType*)0) ) is perfectly legal, and widely used in some
    > template idioms (although I can't think of a reasonable use for
    > it in C).


    Implementation of offsetof(), too, although that's not exactly safe.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
     
    Seebs, Feb 14, 2010
    #18
  19. Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    Malcolm McLean <> writes:

    > On Feb 14, 4:11 pm, James Kanze <> wrote:
    >>
    >> Dereferencing a null pointer is only undefined behavior if the
    >> code is actually executed.  Something like sizeof(
    >> f(*(MyType*)0) ) is perfectly legal, and widely used in some
    >> template idioms (although I can't think of a reasonable use for
    >> it in C).
    >>

    > Nulls are dereferenced to produce the offsetof macro hack in C.


    Then I would say that it is not an example of what James was talking
    about. In his C++ example, no null pointer is dereferenced.

    Obviously there is a terminology issue here in that you might want to
    say that sizeof *(int *)0 is a dereference of a null pointer because,
    structurally, it applies * to such a pointer; but I would rather
    reserve the word dereference for an /evaluated/ application of * (or []
    or ->). I'd go so far as to say that any other use is wrong.

    --
    Ben.
     
    Ben Bacarisse, Feb 14, 2010
    #19
  20. Michael Tsang

    Thad Smith Guest

    Re: Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

    Michael Tsang wrote:
    >
    > Dereferencing a NULL pointer is undefined behaviour,


    Actually, dereferencing a null pointer _results in_ behavior undefined by
    Standard C.

    In answer to your subject line question "Knowing the implementation, are all
    undefined behaviours become implementation-defined behaviours?", no.

    In Standard C "implementation-defined behavior" means that the implementation
    documents the behavior. Even if the behavior is consistent for a particular
    implementation, it may not be documented.

    --
    Thad
     
    Thad Smith, Feb 14, 2010
    #20
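
    As a contrast to the undefined cases in this thread, a tiny sketch of something that really is implementation-defined: whether plain char is signed. The implementation must document its choice, and <limits.h> exposes it through CHAR_MIN, so a program can query it without guessing. The function name plain_char_is_signed is made up for the example.

```c
#include <limits.h>

/* Returns 1 if plain char is a signed type on this implementation,
   0 if it is unsigned. The answer is implementation-defined, but it
   is documented and visible through CHAR_MIN. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

    Nothing comparable exists for dereferencing a null pointer or overflowing a signed int: the standard requires no documentation there, which is the distinction drawn in the post above.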