Null pointers in C---spec (N869) trouble, was What computer language is used a lot in the IT industr

Discussion in 'C Programming' started by Thomas G. Marshall, Jul 2, 2004.

  1. Arthur J. O'Dwyer <> coughed up the following:
    > On Thu, 1 Jul 2004, Thomas G. Marshall wrote:
    >>
    >> Aside: I've looked repeatedly in google and for some reason cannot
    >> find what is considered to be the latest ansi/iso C spec. I cannot
    >> even find C99 in its final draft. Where in ansi.org or the like do
    >> I find it?

    >
    > The official C99 specification is copyright ISO and distributed by
    > various national member bodies as well as by ISO itself. Someone
    > (ANSI?) sells it in PDF form for $18. Google.
    > The N869 draft standard (a preliminary version) is publicly
    > available. Official distributions of the C90 standard are
    > apparently no longer available, but neither is it free. Which is a
    > pity. :)
    >
    >> (N869) 3.15 object
    >> region of data storage in the execution environment, the contents
    >> of which can represent values
    >>
    >> Ok, I'm assuming that this covers everything from device driver
    >> access to memory mapped video buffers to returns from malloc().
    >> Correct?

    >
    > Returns from malloc(), yes. Those other things you mentioned, C has
    > no conception of them. You get "regions of data storage" in C by
    > allocating them, either with object definitions (register, automatic,
    > and static objects) or with calls to malloc, calloc or realloc
    > (dynamic objects). That's it, as far as standard C is concerned.


    How is malloc() itself written then when it needs to coordinate throughout
    the heap? Can it be called C?

    If not then we have a problem: I want to quiz someone on their knowledge of
    what a C program will do, not on what's called defined behavior in the spec.

    It occurred to me while driving today one of the reasons why the C
    specification might want to be hands off of incrementing an address that
    happens to be 0. It is conceivable to me that the C spec. would have to
    account for machines where the address 0 is of particular meaning, and that
    even loading an address register with the value can cause a maskable
    interrupt (not accessing what's at location 0 in any way, just loading the
    address itself).

    Furthermore, they would also have to account for regions of address space
    that are not memory mapped, nor handled by the address translator.


    > Many implementations will let you write whatever you like, wherever
    > you like, as long as you call a magic function first, or put a magic
    > value in a pointer, or whatever, but all that falls under "undefined
    > behavior" (or "implementation-defined behavior") in the Standard, and
    > is never portable.
    >
    > To answer your other post, and to give a slightly more sophisticated
    > "quiz" (since I'm feeling nice today:), no, null pointers never point
    > to objects according to the Standard (that's part of the definition of
    > "null pointer"; consider --- how else would you make "if (p != NULL)"
    > useful, if NULL could be a valid address too?).


    That was the trouble. The malloc's never return that. You have to be
    careful on that machine.


    > And no quiz on C
    > should assume *implicitly* that the quizzee thinks sizeof(long)==4.


    It's established by me in the beginning, as I posted. I've been on machines
    where the size of byte, short, and long are the same 32 bits, and sizeof()
    each of them is 1.


    > A portable "quiz":
    >
    > #include <stdio.h>
    >
    > int main()
    > {
    > char arr[100][2];
    > char *p = arr[1];
    > char *q = arr[5];
    > int alpha = &arr[5] - &arr[1];
    > int beta = q - p;
    > int gamma = sizeof **arr;
    > printf("%d %d %d\n", alpha, beta, gamma);
    > return 0;
    > }
    >
    > -Arthur


    No good for my purposes. I'm looking for something very concise and
    specific, followed by the ensuing conversation, part of which can be about
    code that behaves predictably on real compilers but is undefined by the spec.

    WHOA. Please take a look at N869 3.18:

    3.18
    1 undefined behavior
    behavior, upon use of a nonportable or erroneous program construct, of
    erroneous data, or of indeterminately valued objects, for which this
    International Standard *imposes no requirements*

    2 NOTE Possible undefined behavior ranges from ignoring the situation
    completely with unpredictable results, to behaving during translation or
    program execution in a documented manner characteristic of the environment
    (with or without the issuance of a diagnostic message), to terminating a
    translation or execution (with the issuance of a diagnostic message).

    3 EXAMPLE An example of undefined behavior is the behavior on integer
    overflow.


    When 3.18 #1 says that undefined behavior can be behavior for which this
    standard imposes no requirements, is that suggesting that there are no
    requirements for that issue? That null pointer incrementing is simply free
    of requirements?

    PLEASE do not freak out on me with ire. It does no good in this ng to do
    so. I'm trying to dig through what is what.

    I'm crossing this over to comp.lang.c so they can beat this up as well. For
    those in c.l.c, the initiating concern is over the following interview
    question of mine:

    What does the following C snippet produce on a
    "normal 32 bit system"? That is, 32 bit longs, byte
    addressable 32 bit address space, etc., etc. A
    sparcstation 1 for example IIRC---no tricks, nothing
    hidden. If there's a typo, I'm sorry, I'm typing this quickly.

    long *a = 0;
    long *b = 0;
    a++;
    b++; b++;
    printf ("%d %d %d\n", a, b, (b-a));

    Forget the newbies. 99% of the senior candidates rattle off:

    4 8 4

    (I'll add for the purists that a byte here is an octet.)

    The goal of this question was to show that the answer would be:

    4 8 1

    but really to discuss pointer issues.

    One of the big complaints here is that I'm being told that the C
    specification does not allow me to increment a null pointer. Nor can I
    subtract pointers that do not point within the same object (or 1 past an
    array).
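
    The "4 8 1" point (pointer subtraction counts elements, not bytes) can be
    shown without touching a null pointer. This is only a sketch, assuming both
    pointers point into, or one past the end of, the same array of long; the
    helper names are illustrative and not part of the original question.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t */

/* Hypothetical helper: distance between two pointers into the same
   array, measured in elements of type long. */
ptrdiff_t diff_in_elements(const long *from, const long *to)
{
    assert(from != NULL && to != NULL);
    return to - from;
}

/* Hypothetical helper: the same distance measured in bytes, obtained
   by viewing both pointers as pointers to char. */
ptrdiff_t diff_in_bytes(const long *from, const long *to)
{
    assert(from != NULL && to != NULL);
    return (const char *)to - (const char *)from;
}
```

    With "long arr[3];", diff_in_elements(arr + 1, arr + 2) is 1 on every
    implementation, while diff_in_bytes(arr + 1, arr + 2) is sizeof(long),
    which is 4 only on systems like the one described above.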

    I'm going to repost this under something more concise if there's lackluster
    response---it's getting unwieldy.
    Thomas G. Marshall, Jul 2, 2004

  2. Chris Torek

    In article <news:NyfFc.1217$>
    Thomas G. Marshall <>
    writes:
    >How is malloc() itself written then when it needs to coordinate throughout
    >the heap?


    I find this question quite confusing. What "heap" do you mean?
    Coordinate with what other agent(s)? One can write memory allocators
    in C, and they can use heaps, queues, AVL trees; first-fit, best-fit,
    etc.; the details are up to whoever writes the allocator.

    On the other hand, malloc() itself can *not* be written in Standard
    C, for the simple reason that malloc() is required (by the Standard)
    to provide "properly aligned" memory for *any* purpose, and there
    is no Standard-provided way to discover or create such alignment.
    So malloc() itself, if written in C, is necessarily written in
    not-strictly-portable C. (I find this rather irritating myself,
    and believe the Standard could have required implementations to
    provide some macros and/or functions that, together with C99's
    intptr_t, would do the trick.)
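
    For comparison, a later revision of the Standard (C11) did add alignment
    facilities: max_align_t in <stddef.h> and the _Alignof operator. The
    following is a minimal sketch of a bump allocator built on them, not a
    real malloc(); the names pool, pool_used and pool_alloc and the 4096-byte
    pool size are assumptions made up for illustration, and nothing is ever
    freed.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, max_align_t (C11) */

/* Hypothetical fixed pool. The union member of type max_align_t forces
   the pool itself to have the strictest fundamental alignment. */
static union {
    max_align_t align;
    unsigned char bytes[4096];
} pool;

static size_t pool_used;   /* bytes handed out so far */

/* Hypothetical bump allocator: returns storage aligned for any object
   with a fundamental alignment, or NULL if the request cannot be met. */
void *pool_alloc(size_t n)
{
    const size_t a = _Alignof(max_align_t);
    size_t rounded;
    void *p;

    assert(sizeof pool.bytes % a == 0);   /* pool size is a multiple of a */
    if (n == 0 || n > sizeof pool.bytes)
        return NULL;
    rounded = (n + a - 1) / a * a;        /* round up to a multiple of a */
    if (rounded > sizeof pool.bytes - pool_used)
        return NULL;
    p = pool.bytes + pool_used;
    pool_used += rounded;
    return p;
}
```

    Every returned address is a multiple of _Alignof(max_align_t) bytes from
    the start of the pool, so successive allocations stay suitably aligned.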

    >Can it be called C?


    It can be called Gronklezeeb, if you prefer. :) I suspect you mean
    "should" it be called C -- which gets into a rather different
    problem, that of communicating ideas from one human being to another.
    This is not something the C language, nor any Standard, can solve.

    >If not then we have a problem: I want to quiz someone on their knowledge of
    >what a C program will do [on, apparently, a specific machine or set of
    >machines], not on what's called defined behavior in the spec.


    In that case, ask that. But if you are posting in comp.lang.c, be
    aware that machine-specific answers are considered off-topic, in
    part because the answers *change* from one machine to the next.
    The output from the "%p" format is quite different on the IBM AS/400
    than on the average 32-bit system today (e.g., x86-based, using a
    compiler like GCC and libraries like those included with Windows
    or Linux or BSD). The pointers and perhaps even the integers
    themselves are quite different as well. A Univac 1100 uses ones'
    complement arithmetic and 9, 18 and 36-bit integers; a Tandem or
    Eclipse has multiple pointer formats; and so on.
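
    A quick way to see how much of this varies is to ask the implementation
    itself. This is a sketch only; describe_implementation is a hypothetical
    helper name, and the text it produces (in particular the "%p" part) is
    expected to differ from one implementation to the next.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <stdio.h>    /* snprintf */

/* Hypothetical helper: writes a one-line description of this
   implementation's byte width, integer sizes, and pointer format into
   buf. Returns the value of snprintf(). */
int describe_implementation(char *buf, size_t size)
{
    int probe = 0;   /* any object will do; only its address is printed */

    assert(buf != NULL);
    return snprintf(buf, size,
                    "CHAR_BIT=%d sizeof(short)=%zu sizeof(int)=%zu "
                    "sizeof(long)=%zu sizeof(void *)=%zu &probe=%p",
                    CHAR_BIT, sizeof(short), sizeof(int),
                    sizeof(long), sizeof(void *), (void *)&probe);
}
```

    The only things the Standard promises here are lower bounds, such as
    CHAR_BIT being at least 8; everything else is up to the implementation.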

    Apparently Arthur J. O'Dwyer <> wrote:
    >> And no quiz on C
    >> should assume *implicitly* that the quizzee thinks sizeof(long)==4.


    >It's established by me in the beginning, as I posted.


    This is fine, but once you do it, you are no longer talking about
    C-the-programming-language-for-the-abstract-machine-in-the-standard,
    but rather C-as-used-on-the-Fooblatz-42. The knowledge required
    is some combination of *both* "Standard C" and "Fooblatz details".

    [And, apparently a different subject; it is again not quite clear:]

    >take a look at N869 3.18:
    >
    >3.18
    >1 undefined behavior
    >behavior, upon use of a nonportable or erroneous program construct, of
    >erroneous data, or of indeterminately valued objects, for which this
    >International Standard *imposes no requirements*

    [...]
    >When 3.18 #1 says that undefined behavior can be behavior for which this
    >standard imposes no requirements, is that suggesting that there are no
    >requirements for that issue? That null pointer incrementing is simply free
    >of requirements?


    Correct. The C standard imposes *no* requirements. Any given
    implementation can do *anything*, or even *nothing*. Whatever it
    does or fails to do does not change its "conformance status".
    Most compilers do "whatever is convenient" in problem cases, and
    that is just fine, because ANYTHING they do is just fine.

    Note that undefined behavior has *two* functions, as I outline in
    <http://web.torek.net/torek/c/index.html> and
    <http://web.torek.net/torek/c/compiler.html>. Without something
    like undefined behavior, Standard C would not only be what you got,
    it would be all you could *ever* get -- which would make C far less
    useful in the real world, where one often needs to do something
    that only works on Machine X, in some cases precisely *because*
    it only works on Machine X.

    >I'm crossing this over to comp.lang.c so they can beat this up as well. For
    >those in c.l.c, the initiating concern is over the following interview
    >question of mine:
    >
    > What does the following C snippet produce on a
    > "normal 32 bit system"? That is, 32 bit longs, byte
    > addressable 32 bit address space, etc., etc. A
    > sparcstation 1 for example IIRC---no tricks, nothing
    > hidden. If there's a typo, I'm sorry, I'm typing this quickly.
    >
    > long *a = 0;
    > long *b = 0;
    > a++;
    > b++; b++;
    > printf ("%d %d %d\n", a, b, (b-a));


    This requires knowledge, not so much of C as of "a
    sparcstation 1 for example". Try this code snippet on an IBM AS/400
    and see what happens. :) What would you do with a candidate who
    knows C quite well but has *only* used an AS/400?

    Incidentally, the above code can even fail -- print bizarre
    apparently-garbage results, in this case -- on 68000-based systems,
    if the C compiler chooses to pass pointers in the "A" registers
    and integers in the "D" registers. The printf() call would fill
    in four A registers, and then the printf() engine would read out
    one A register -- for the format -- and three D registers, for the
    three %d directives. (I know of no 68000-based systems that pass
    parameters in this way, but I do know of one that *returns* pointers
    in A0 and integers in D0. This makes "failure to declare malloc()"
    result in "program crashes mysteriously", even though the same code
    works on the same machine if you use a different compiler suite.)
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Jul 3, 2004

  3. Chris Torek

    (gah, oops:)

    In article <news:> I wrote, in part:
    >> printf ("%d %d %d\n", a, b, (b-a));

    [where a and b have the same pointer type, in this case "long *"]

    >... Incidentally, the above code can even fail -- print bizarre
    >apparently-garbage results, in this case -- on 68000-based systems,
    >if the C compiler chooses to pass pointers in the "A" registers
    >and integers in the "D" registers. The printf() call would fill
    >in four A registers


    Make that "three A registers and one D" -- the result of the
    subtraction, (b-a), has integral type (type "ptrdiff_t") and on
    this compiler would go in a D register.

    >... and then the printf() engine would read out
    >one A register -- for the format -- and three D registers, for the
    >three %d directives.


    Disregarding certain other details about the initial values of the
    variables "a" and "b" above, the call:

    printf("%p %p %d\n", (void *)a, (void *)b, (int)(b - a));

    would be strictly conforming and will work on every correct C
    implementation. (Of course I assume <stdio.h> has been included,
    for instance, and any other necessary structural bits provided.)
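
    As a sketch of those "structural bits", the call can be wrapped in a
    function whose arguments are assumed to point into (or one past the end
    of) the same array of long, which sidesteps the null-pointer problem. The
    name print_pointers is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* NULL */
#include <stdio.h>    /* printf */

/* Hypothetical wrapper around the corrected call. Both pointers must
   point into, or one past the end of, the same array of long. */
int print_pointers(long *a, long *b)
{
    assert(a != NULL && b != NULL);
    /* %p expects a void *; the ptrdiff_t difference is converted to
       int so that it matches %d. */
    return printf("%p %p %d\n", (void *)a, (void *)b, (int)(b - a));
}
```

    Given "long arr[3];", print_pointers(arr + 1, arr + 2) prints two
    implementation-defined pointer representations followed by 1.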
    Chris Torek, Jul 3, 2004
