Do buffers always start with the lowest memory address being the first element?

Discussion in 'C Programming' started by kiru.sengal@gmail.com, Jan 15, 2005.

  1. Guest

    [This post is with regard to computers/OSes with stacks that grow down.
    i386/Unix is one possibility]

    I have embedded my questions/assumptions in the following sample
    code:


    #include <stdio.h>

    long num;
    /* allocated a zero-init fixed memory location (.BSS) */

    char *s = "Hello world";
    /* s allocated a fixed memory location and initialized
    to address of 'H' (DATA). Meanwhile, the literal string
    (12-byte buffer) is stored in a fixed WRITE ONLY area (TEXT) */

    short buffer[100];
    /* buffer allocated in contiguous fixed memory locations (in DATA)
    where buffer[0] is lowest memory address and buffer[99] is highest
    memory address */

    int main()
    {
    int count = 4; /* stored in main()'s stack frame */
    float fcount= 4.0; /* stored in main()'s stack frame */
    static confusion = 8; /* stored in same region as buffer (DATA) */

    long lbuffer[50];
    /* stored in main()'s stack frame, but since stack grows down, is
    the field for buffer[0] still placed in the lowest memory address?
    */

    printf("\n%s\n",s);

    int i; /* stored in main()'s stack frame, but when is memory
    allocated? */

    for(i=0; i<100; i++)
    {
    buffer[i] = i;
    }

    return 0;

    }

    Additional questions:

    - Since local variables don't have to be declared at the beginning of a
    function, during run-time, is space for all local variables to be used
    in any function allocated when the function is entered, or only when
    they are used?

    - Since main() is always the starting point for programs, do compilers
    really put its local variables in a main()-stackframe or simply place
    them in fixed memory locations? How are static locals in main()
    treated?


    Thanking everyone in advance.
     
    , Jan 15, 2005
    #1

  2. Malcolm Guest

    <> wrote
    >

    The short answer to the question is "yes".
    Technically you could write a perverse implementation that uses some weird
    and wonderful mapping between pointers and physical memory, but no-one does
    this.


    > include <stdio.h>
    >
    > long num;
    > /* allocated a zero-init fixed memory location (.BSS) */
    >
    > char *s = "Hello world";
    > /* s allocated a fixed memory location and initialized
    > to address of 'H' (DATA). Meanwhile, the literal string
    > (12-byte buffer) is stored in a fixed WRITE ONLY area (TEXT) */
    >
    > short buffer[100];
    > /* buffer allocated in contiguous fixed memory locations (in DATA)
    > where buffer[0] is lowest memory address and buffer[99] is highest
    > memory address */
    >


    BSS, TEXT, and DATA are purely concepts provided by your OS. Compilers
    usually adhere to the conventions of their host platform, but not absolutely
    always. On a different platform, there may not be this distinction between
    read-only and temporary memory.
    >
    > - Since local variables don't have to be declared at the beginning of a
    > function, during run-time, is space for all local variables to be used
    > in any function allocated when the function is entered, or only when
    > they are used?
    >

    The same goes for the stack frame. Usually the stack pointer will be
    advanced to allow space for all locals on function entry, and reset on exit.
    However you cannot assume that this will always be the case for every
    compiler.
    >
    > - Since main() is always the starting point for programs, do compilers
    > really put it's local variables in a main()-stackframe or simply place
    > it in fixed memory locations? How are static locals in main()
    > treated?
    >

    Normally on a hosted system it is not possible for an application to write
    to absolute memory addresses. So globals and static locals have got to go
    somewhere defined at runtime. This might well be in the space immediately
    before the stack where main's locals are held. However you cannot guarantee
    this, and normally it shouldn't concern you as a C programmer.
     
    Malcolm, Jan 15, 2005
    #2

  3. Eric Sosman Guest

    Re: Do buffers always start with the lowest memory address being the first element?

    wrote:
    > [This post is with regards computers/OSes with stacks that grow down.
    > i386/Unix is one possibility]
    >
    > I have embedded my questions/assumptions in the the following sample
    > code:
    >
    >
    > include <stdio.h>
    >
    > long num;
    > /* allocated a zero-init fixed memory location (.BSS) */


    Zero-initialized, yes. At a fixed location, yes.
    "BSS" is an implementation detail, not necessarily shared
    by all implementations.

    > char *s = "Hello world";
    > /* s allocated a fixed memory location and initialized
    > to address of 'H' (DATA). Meanwhile, the literal string
    > (12-byte buffer) is stored in a fixed WRITE ONLY area (TEXT) */


    Fixed location for `s', yes. Initialized to point to
    the initial 'H', yes. "Hello world" at a fixed location,
    yes. Definitely not in a write-only area, possibly in a
    read-only area or a read-write area at the implementation's
    discretion. "TEXT" is an implementation detail.

    > short buffer[100];
    > /* buffer allocated in contiguous fixed memory locations (in DATA)
    > where buffer[0] is lowest memory address and buffer[99] is highest
    > memory address */


    Contiguous fixed locations, yes. buffer[0] and [99] at
    the low and high positions, yes. "DATA" is an implementation
    detail (and probably not correct on implementations that happen
    to use "BSS").

    > int main()
    > {
    > int count = 4; /* stored in main()'s stack frame */
    > float fcount= 4.0; /* stored in main()'s stack frame */


    "Stack frame" is an implementation detail. Most
    implementations use a stack, and the C language imposes a
    LIFO ordering on the required lifetimes of `auto' variables,
    but the language does not actually require an explicit stack.

    > static confusion = 8; /* stored in same region as buffer (DATA) */


    May be stored anywhere at all, in the same region as `buffer'
    or somewhere else, so long as it exists when main() is first called
    and continues to exist until the program exits. "DATA" is an
    implementation detail.

    > long lbuffer[50];
    > /* stored in main()'s stack frame, but since stack grows down, is
    > the field for buffer[0] still placed in the lowest memory address?
    > */


    "Stack frame" is an implementation detail. lbuffer[0]
    and [49] are at the low and high ends, respectively, of the
    memory occupied by `lbuffer', no matter where it is stored.

    > printf("\n%s\n",s);
    >
    > int i; /* stored in main()'s stack frame, but when is memory
    > allocated? */


    "Stack frame" is an implementation detail. Memory is
    allocated (and deallocated) whenever the implementation
    chooses, so long as `i' becomes allocated before it is used
    and remains allocated until it is used no longer.

    > for(i=0; i<100; i++)
    > {
    > buffer[i] = i;
    > }
    >
    > return 0;
    >
    > }
    >
    > Additional questions:
    >
    > - Since local variables don't have to be declared at the beginning of a
    > function, during run-time, is space for all local variables to be used
    > in any function allocated when the function is entered, or only when
    > they are used?


    Different implementations behave differently. A conforming
    C program cannot tell.

    > - Since main() is always the starting point for programs, do compilers
    > really put it's local variables in a main()-stackframe or simply place
    > it in fixed memory locations? How are static locals in main()
    > treated?


    Compilers can do whatever they like, so long as the
    variables exist when they are supposed to. All that I have
    encountered use the same mechanisms for `auto' and `static'
    variables in main() as they do for any other function.

    Even though main() is the first function called when a
    program starts, nothing prevents it from being called again,
    recursively. Here's a stupid program to print its command-
    line arguments in reverse order:

    #include <stdio.h>
    int main(int argc, char **argv) {
        if (argc > 0) {
            main(argc - 1, argv + 1);
            puts(*argv);
        }
        return 0;
    }

    > Thanking everyone in advance.


    You're welcome. A piece of advice: It is usually better
    to concentrate on *what* the implementation does with your
    program than on *how* it does it. If you write carefully and
    portably the former is constant, while the latter changes
    from one implementation to the next.

    --
    Eric Sosman
     
    Eric Sosman, Jan 15, 2005
    #3
  4. Chris Torek Guest

    In article <>
    <> wrote:
    >[This post is with regards computers/OSes with stacks that grow down.
    >i386/Unix is one possibility]


    The C standard does not assume a downward-growing stack, nor even
    an upward-growing stack. It merely requires that automatic (local)
    variables behave in a "stack-like manner", if a function is called
    recursively.

    (Google's broken news-posting interface destroyed your indentation.
    I have tried to restore it here.)

    >I have embedded my questions/assumptions in the the following sample
    >code:
    >
    >include <stdio.h>
    >
    >long num;
    >/* allocated a zero-init fixed memory location (.BSS) */


    "BSS" is a system-specific (albeit common) method of implementing
    this; C requires only that the variable exist as long as the program
    runs, and be initialized to zero. Some implementations do the same
    with this as with:

    long num = 0;

    because, e.g., they lack anything similar to the "bss region" found
    on your example system.

    >char *s = "Hello world";
    >/* s allocated a fixed memory location and initialized
    >to address of 'H' (DATA). Meanwhile, the literal string
    >(12-byte buffer) is stored in a fixed WRITE ONLY area (TEXT) */


    Surely you mean "read only" :) In typical Unix-like systems today,
    the string literals are in a "read only data" section (".rodata"
    and the like). C allows but does not require that the array produced
    by the string literal be *physically* read-only, and some
    implementations leave it write-able. The effect of writing on
    elements of the array is undefined: it may trap, it may succeed
    (changing the array), or it may silently fail (no trap but the
    array remains unchanged), in the three "most typical" implementations,
    but as far as Standard C is concerned, *anything* is allowed.
    You -- the programmer -- are simply required not to do this,
    in order for you to make any predictions about the operation of
    your program.

    >short buffer[100];
    >/* buffer allocated in contiguous fixed memory locations (in DATA)
    >where buffer[0] is lowest memory address and buffer[99] is highest
    >memory address */


    As with "num" above, C requires only that the variable exist as long
    as the program runs, and be initialized to zero. On the implementation
    you use, you will find that "buf" is also in the ".bss" section.

    The question of "lower" and "higher" memory addresses suggests to
    me that you are asking about machine-level interpretations, but C
    does not give you direct access to machine-level interpretations.
    Instead, Standard C pastes a (usually quite thin) layer of
    abstraction atop the machine-level. You may compute pointers
    pointing into the array named "buffer", e.g.:

    short *p1 = &buffer[20];
    short *p2 = &buffer[75];

    and, given two pointers of compatible type pointing into this array,
    you may compare them using any of the four relational operators:

    int result1 = p1 < p2;
    int result2 = p1 <= p2;
    int result3 = p1 > p2;
    int result4 = p1 >= p2;

    Results 1 and 2 here are guaranteed to be nonzero if p1 is "less
    than" p2, and "less than" means "points to an element with a lower
    subscript in the array". In that sense, the elements of the array
    are indeed addressed from lowest to highest.

    (You can also, of course, use the equality operators: p1 == p2,
    p1 != p2. I mention them separately because you can use them in
    places you may *not* use the relational operators.)

    There is nothing stopping an actual C compiler on real hardware
    from putting buffer[0] at the hardware's highest physical memory
    address, and working down towards lower addresses. In this case,
    a C source expression like:

    p1 < p2

    might compile into a machine instruction that tests instead whether
    p1 is greater than p2. (Practically speaking, this would be stupid
    on today's hardware, and no one will do it, because it would also
    require negating the index in expressions like buffer[i]. But one
    might do this on a machine in which ordinary array indexing works
    by subtraction instead of addition -- and such machines have existed
    in the past.)

    At the C code level, then, it *is* the case that buffer[0] through
    buffer[99] are in contiguous memory locations starting "at the
    bottom" and "moving up", but there is no requirement that they be
    physically contiguous (consider virtual-memory systems with small
    page sizes), nor that a C-code level test like "p1 < p2" compile
    to a machine instruction testing whether p1 is less than p2. In
    C's abstract model, they are contiguous and ascending; the extent
    to which C's abstract model matches what really happens on the
    machine depends on both the C compiler and the machine.

    >int main()
    >{
    > int count = 4; /* stored in main()'s stack frame */
    > float fcount= 4.0; /* stored in main()'s stack frame */
    > static confusion = 8; /* stored in same region as buffer (DATA) */


    Again, the C standard requires only that count and fcount work "as
    if" they were on some kind of stack, and that "confusion" work "as
    if" it were in that kind of data-segment: initially 8, and valid
    throughout the lifetime of the program.

    > long lbuffer[50];
    > /* stored in main()'s stack frame, but since stack grows down, is the
    > field for buffer[0] still placed in the lowest memory address? */


    Again, everything is "as if". Like count and fcount, lbuffer need
    only exist as long as main() continues to execute, and if some
    other function in your program calls main(), you must get "new
    copies" of the variables, preserving the old copies that are still
    around because the earlier call to main() is also still around (but
    suspended until this copy of main() returns).

    The direction of stack growth, if there is even a single stack[%]
    that has a single growth direction, is irrelevant to C's abstract
    model. C promises only that &lbuffer[0] < &lbuffer[1] and so on.
    If you somehow manage to compare &lbuffer[23] relationally to
    &buffer[72], no particular result is required. (The types of the
    two buffers' elements do not match, so this requires a cast, which
    potentially changes the value, which muddies the issue even more,
    but never mind all that.) On the other hand, equality comparisons,
    after conversion to a suitable type such as "char *" or "void *",
    *are* required to produce "not equal":

    if ((char *)&lbuffer[23] == (char *)&buffer[72])
        abort(); /* never happens */

    > printf("\n%s\n",s);
    >
    > int i; /* stored in main()'s stack frame, but when is memory allocated? */


    Declarations after code are a C99 feature. Quite a few compilers
    do not support this; in C89 you could use a new block:

    printf("%s\n", s);
    {
        int i;
        ...
    }

    C requires only that the program behave "as if" i's lifetime begins
    at its declaration and continues until execution reaches the "}" that
    terminates its scope. In your C99-specific code, that is the final
    close-brace for main(); in the C89 variant, it is the close-brace
    inserted to match the open-brace I added here.

    > for(i=0; i<100; i++)
    > {
    > buffer[i] = i;
    > }
    > return 0;
    >}
    >
    >Additional questions:
    >
    >- Since local variables don't have to be declared at the beginning of a
    >function, during run-time, is space for all local variables to be used
    >in any function allocated when the function is entered, or only when
    >they are used?


    A C compiler can achieve the required behavior by allocating all
    local variables at a function's entry, or by allocating them upon
    reaching their enclosing block or (in C99) their initial definition.
    There are merits and drawbacks to either method; you will find that
    different C compilers choose different approaches.

    >- Since main() is always the starting point for programs, do compilers
    >really put it's local variables in a main()-stackframe or simply place
    >it in fixed memory locations? How are static locals in main()
    >treated?


    In C (but not in C++ -- the languages are really quite different,
    despite some syntactic similarities), you -- the programmer -- are
    allowed to call main() recursively. If you do, it must behave just
    like any other function. This makes it difficult for C compilers
    to weasel out of creating a stack frame in the usual manner on
    typical machines. (The compiler would have to determine that you
    do not in fact call main() recursively; if so, it could rewrite
    all the automatic variables in main() to have static-duration.
    This determination is not all that hard, but such rewriting is also
    not all that profitable -- the program is unlikely to be any faster
    or smaller. So why bother?)

    In both C and C++, it is possible -- via the atexit() function for
    instance -- to do dumb things with variables whose lifetime terminates
    when main() returns. Consider the following broken C code:

    #include <stdio.h>
    #include <stdlib.h>

    static int *p;

    void oops(void) {
        printf("*p = %d\n", *p);
    }

    int main(void) {
        int v = 42;
        atexit(oops);
        p = &v;
        return 0;
    }

    Here, in the C abstract machine, atexit() registers the function
    oops() to run when the program exits. Then we set p to point to
    v, which is local to main(), and then we return from main(),
    destroying the variable v. Now all atexit()-registered functions
    are run, so oops() runs, and attempts to access *p -- but p points
    to a variable whose lifetime has terminated. The effect is undefined:
    the program is allowed to crash, or print 42, or print any other
    number, or indeed do anything, such as post lies about your boss
    to USENET. :)

    We can fix the program by changing "int v" to "static int v". This
    changes the storage duration from automatic to static, so that only
    one copy of "v" exists no matter how many times we call main()
    recursively (none, in this case), and "v" exists for the lifetime
    of the program. Since the "lifetime of the program" continues
    *past* the return of main(), while atexit()-registered functions
    like oops() run, this now makes a difference.

    We can also fix the program by removing the call to atexit(), though
    of course this stops the program from printing 42.

    [% There is one quite substantial merit to having at least two
    stacks, one for "control" -- return addresses and the like -- and
    one for "data" such as local variables. In particular, a system
    with two stacks offers the opportunity to debug programs that
    overwrite local arrays. Because the data are in the "Dstack",
    pointed to by the DSP or data-stack-pointer register, while the
    control values are in the "Cstack" pointed to by the CSP or
    control-stack-pointer register, and the two stacks are "far apart"
    in memory, writing past your own DSP area clobbers only other DSP
    memory. Breakpoints set via the CSP still cause the program to
    stop where you want, and you can then observe the DSP corruption.

    This design also interferes with typical Microsoft-bug-exploits:
    buffer overruns no longer allow you to overwrite the return address.
    The distance between CSP and DSP can be randomized on each run of
    the program, as well.]
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
     
    Chris Torek, Jan 15, 2005
    #4
  5. Keith Thompson Guest

    Re: Do buffers always start with the lowest memory address being the first element?

    "" <> writes:
    [...]
    > - Since local variables don't have to be declared at the beginning of a
    > function, during run-time, is space for all local variables to be used
    > in any function allocated when the function is entered, or only when
    > they are used?


    They can be allocated whenever the compiler chooses to allocate them,
    as long as they exist when they're used.

    > - Since main() is always the starting point for programs, do compilers
    > really put it's local variables in a main()-stackframe or simply place
    > it in fixed memory locations? How are static locals in main()
    > treated?


    main() is generally treated like any other function. Local variables
    within main() can't be allocated statically (at least not without a
    lot of extra trickery) because main() can be called recursively.

    A compiler could detect that main is never called recursively in a
    given program and do something different, but I doubt that any
    compilers actually do this. First, since main() can be called from a
    separate translation unit, it can't be determined until link time.
    Second, storing main()'s locals statically isn't likely to help
    significantly anyway, so the optimization isn't even worth doing.

    As for variables declared "static" within main(), again, these are
    almost certainly treated the same way as static variables within any
    other function.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Jan 15, 2005
    #5
  6. Guest

    [Google seems to have some problem, apologies if this posts multiple
    times]

    Chris Torek wrote:
    > In article <>
    > <> wrote:
    > >[This post is with regards computers/OSes with stacks that grow down.
    > >i386/Unix is one possibility]

    >
    > The C standard does not assume a downward-growing stack, nor even
    > an upward-growing stack. It merely requires that automatic (local)
    > variables behave in a "stack-like manner", if a function is called
    > recursively.
    >
    > (Google's broken news-posting interface destroyed your indentation.
    > I have tried to restore it here.)
    >
    > >I have embedded my questions/assumptions in the the following sample
    > >code:
    > >
    > >include <stdio.h>
    > >
    > >long num;
    > >/* allocated a zero-init fixed memory location (.BSS) */

    >
    > "BSS" is a system-specific (albeit common) method of implementing
    > this; C requires only that the variable exist as long as the program
    > runs, and be initialized to zero. Some implementations do the same
    > with this as with:
    >
    > long num = 0;
    >
    > because, e.g., they lack anything similar to the "bss region" found
    > on your example system.
    >
    > >char *s = "Hello world";
    > >/* s allocated a fixed memory location and initialized
    > >to address of 'H' (DATA). Meanwhile, the literal string
    > >(12-byte buffer) is stored in a fixed WRITE ONLY area (TEXT) */

    >
    > Surely you mean "read only" :) In typical Unix-like systems today,
    > the string literals are in a "read only data" section (".rodata"
    > and the like). C allows but does not require that the array produced
    > by the string literal be *physically* read-only, and some
    > implementations leave it write-able. The effect of writing on
    > elements of the array is undefined: it may trap, it may succeed
    > (changing the array), or it may silently fail (no trap but the
    > array remains unchanged), in the three "most typical" implementations,
    > but as far as Standard C is concerned, *anything* is allowed.
    > You -- the programmer -- are simply required not to do this,
    > in order for you to make any predictions about the operation of
    > your program.
    >
    > >short buffer[100];
    > >/* buffer allocated in contiguous fixed memory locations (in DATA)
    > >where buffer[0] is lowest memory address and buffer[99] is highest
    > >memory address */

    >
    > As with "num" above, C requires only that the variable exist as long
    > as the program runs, and be initialized to zero. On the implementation
    > you use, you will find that "buf" is also in the ".bss" section.



    Nitpick: buffer, not buf


    > The question of "lower" and "higher" memory addresses suggests to
    > me that you are asking about machine-level interpretations, but C
    > does not give you direct access to machine-level interpretations.
    > Instead, Standard C pastes a (usually quite thin) layer of
    > abstraction atop the machine-level. You may compute pointers
    > pointing into the array named "buffer", e.g.:
    >
    > short *p1 = &buffer[20];
    > short *p2 = &buffer[75];
    >
    > and, given two pointers of compatible type pointing into this array,
    > you may compare them using any of the four relational operators:
    >
    > int result1 = p1 < p2;
    > int result2 = p1 <= p2;
    > int result3 = p1 > p2;
    > int result4 = p1 >= p2;
    >
    > Results 1 and 2 here are guaranteed to be zero if p1 is "not less
    > than" p2, and "not less than" means "has a lower subscript in the
    > array". In that sense, the elements of the array are indeed
    > addressed from lowest to highest.
    >
    > (You can also, of course, use the equality operators: p1 == p2,
    > p1 != p2. I mention them separately because you can use them in
    > places you may *not* use the relational operators.)
    >
    > There is nothing stopping an actual C compiler on real hardware
    > from putting buffer[0] at the hardware's highest physical memory
    > address, and working down towards lower addresses. In this case,
    > a C source expression like:
    >
    > p1 < p2
    >
    > might compile into a machine instruction that tests instead whether
    > p1 is greater than p2. (Practically speaking, this would be stupid
    > on today's hardware, and no one will do it, because it would also
    > require negating the index in expressions like buffer[i]. But one
    > might do this on a machine in which ordinary array indexing works
    > by subtraction instead of addition -- and such machines have existed
    > in the past.)
    >
    > At the C code level, then, it *is* the case that buffer[0] through
    > buffer[99] are in contiguous memory locations starting "at the
    > bottom" and "moving up", but there is no requirement that they be
    > physically contiguous (consider virtual-memory systems with small
    > page sizes), nor that a C-code level test like "p1 < p2" compile
    > to a machine instruction testing whether p1 is less than p2. In
    > C's abstract model, they are contiguous and ascending; the extent
    > to which C's abstract model matches what really happens on the
    > machine depends on both the C compiler and the machine.
    >
    > >int main()
    > >{
    > > int count = 4; /* stored in main()'s stack frame */
    > > float fcount= 4.0; /* stored in main()'s stack frame */
    > > static confusion = 8; /* stored in same region as buffer (DATA) */
    >
    > Again, the C standard requires only that count and fcount work "as
    > if" they were on some kind of stack, and that "confusion" work "as
    > if" it were in that kind of data-segment: initially 8, and valid
    > throughout the lifetime of the program.
    >
    > > long lbuffer[50];
    > > /* stored in main()'s stack frame, but since stack grows down, is the
    > > field for buffer[0] still placed in the lowest memory address? */
    >
    > Again, everything is "as if". Like count and fcount, lbuffer need
    > only exist as long as main() continues to execute, and if some
    > other function in your program calls main(), you must get "new
    > copies" of the variables, preserving the old copies that are still
    > around because the earlier call to main() is also still around (but
    > suspended until this copy of main() returns).
    >
    > The direction of stack growth, if there is even a single stack[%]
    > that has a single growth direction, is irrelevant to C's abstract
    > model. C promises only that &lbuffer[0] < &lbuffer[1] and so on.
    > If you somehow manage to compare &lbuffer[23] relationally to
    > &buffer[72], no particular result is required. (The types of the
    > two buffers' elements do not match, so this requires a cast, which
    > potentially changes the value, which muddies the issue even more,
    > but never mind all that.) On the other hand, equality comparisons,
    > after conversion to a suitable type such as "char *" or "void *",
    > *are* required to produce "not equal":
    >
    > if ((char *)&lbuffer[23] == (char *)&buffer[72])
    > abort(); /* never happens */


    I am not sure I agree with you completely on this one. The above is
    supposed to be true _only_ as long as the indexes are within the bounds
    of their arrays. I wrote this test snippet for my Borland 5.0 compiler:

    #include <stdio.h>

    int main(void)
    {
        int ibuff[1];
        char cbuff[1];

        /* &cbuff[0x42] lies far outside cbuff; merely forming it is undefined */
        if ((char *)&ibuff[0] == (char *)&cbuff[0 + 0x42])
            printf("OOps\n");
        printf("%p %p\n", (void *)&ibuff[0], (void *)&cbuff[0]);
        return 0;
    }

    > > printf("\n%s\n",s);
    > >
    > > int i; /* stored in main()'s stack frame, but when is memory allocated? */
    >
    > Declarations after code are a C99 feature. Quite a few compilers
    > do not support this; in C89 you could use a new block:
    >
    > printf("%s\n", s);
    > {
    > int i;
    > ...
    > }
    >
    > C requires only that the program behave "as if" i's lifetime begins
    > at its declaration and continues until execution reaches the "}" that
    > terminates its scope. In your C99-specific code, that is the final
    > close-brace for main(); in the C89 variant, it is the close-brace
    > inserted to match the open-brace I added here.
    >
    > > for(i=0; i<100; i++)
    > > {
    > > buffer[i] = i;
    > > }
    > > return 0;
    > >}
    > >
    > >Additional questions:
    > >
    > >- Since local variables don't have to be declared at the beginning of a
    > >function, during run-time, is space for all local variables to be used
    > >in any function allocated when the function is entered, or only when
    > >they are used?

    >
    > A C compiler can achieve the required behavior by allocating all
    > local variables at a function's entry, or by allocating them upon
    > reaching their enclosing block or (in C99) their initial definition.
    > There are merits and drawbacks to either method; you will find that
    > different C compilers choose different approaches.
    >
    > >- Since main() is always the starting point for programs, do compilers
    > >really put it's local variables in a main()-stackframe or simply place
    > >it in fixed memory locations? How are static locals in main()
    > >treated?

    >
    > In C (but not in C++ -- the languages are really quite different,
    > despite some syntactic similarities), you -- the programmer -- are
    > allowed to call main() recursively. If you do, it must behave just
    > like any other function. This makes it difficult for C compilers
    > to weasel out of creating a stack frame in the usual manner on
    > typical machines. (The compiler would have to determine that you
    > do not in fact call main() recursively; if so, it could rewrite
    > all the automatic variables in main() to have static-duration.
    > This determination is not all that hard, but such rewriting is also
    > not all that profitable -- the program is unlikely to be any faster
    > or smaller. So why bother?)
    >
    > In both C and C++, it is possible -- via the atexit() function for
    > instance -- to do dumb things with variables whose lifetime terminates
    > when main() returns. Consider the following broken C code:
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    >
    > static int *p;
    >
    > void oops(void) {
    >     printf("*p = %d\n", *p);
    > }
    >
    > int main(void) {
    >     int v = 42;
    >     atexit(oops);
    >     p = &v;
    >     return 0;
    > }
    >
    > Here, in the C abstract machine, atexit() registers the function
    > oops() to run when the program exits. Then we set p to point to
    > v, which is local to main(), and then we return from main(),
    > destroying the variable v. Now all atexit()-registered functions
    > are run, so oops() runs, and attempts to access *p -- but p points
    > to a variable whose lifetime has terminated. The effect is undefined:
    > the program is allowed to crash, or print 42, or print any other
    > number, or indeed do anything, such as post lies about your boss
    > to USENET. :)
    >
    > We can fix the program by changing "int v" to "static int v". This
    > changes the storage duration from automatic to static, so that only
    > one copy of "v" exists no matter how many times we call main()
    > recursively (none, in this case), and "v" exists for the lifetime
    > of the program. Since the "lifetime of the program" continues
    > *past* the return of main(), while atexit()-registered functions
    > like oops() run, this now makes a difference.
    >
    > We can also fix the program by removing the call to atexit(), though
    > of course this stops the program from printing 42.
    >
    > [% There is one quite substantial merit to having at least two
    > stacks, one for "control" -- return addresses and the like -- and
    > one for "data" such as local variables. In particular, a system
    > with two stacks offers the opportunity to debug programs that
    > overwrite local arrays. Because the data are in the "Dstack",
    > pointed to by the DSP or data-stack-pointer register, while the
    > control values are in the "Cstack" pointed to by the CSP or
    > control-stack-pointer register, and the two stacks are "far apart"
    > in memory, writing past your own DSP area clobbers only other DSP
    > memory. Breakpoints set via the CSP still cause the program to
    > stop where you want, and you can then observe the DSP corruption.
    >
    > This design also interferes with typical Microsoft-bug-exploits:
    > buffer overruns no longer allow you to overwrite the return address.
    > The distance between CSP and DSP can be randomized on each run of
    > the program, as well.]


    Point well taken, but I believe it's not strictly a Microsoft thingy.
    AFAIK Linux and for that matter x86 based systems save the return
    address within the current stack itself only.



    --
    Imanpreet Singh Arora

    If I am given 6 hours to chop a tree, I would spend the
    first 4 to sharpen my axe.
    Abraham Lincoln
     
    , Jan 17, 2005
    #6
  7. Chris Torek Guest

    >Chris Torek wrote:
    >> As with "num" above, C requires only that the variable exist as long
    >> as the program runs, and be initialized to zero. On the implementation
    >> you use, you will find that "buf" is also in the ".bss" section.


    In article <>
    <> wrote:
    >Nitpick: buffer, not buf


    Oops, quite right.

    >> ... On the other hand, equality comparisons,
    >> after conversion to a suitable type such as "char *" or "void *",
    >> *are* required to produce "not equal":
    >>
    >> if ((char *)&lbuffer[23] == (char *)&buffer[72])
    >> abort(); /* never happens */

    >
    >I am not sure if I agree with you completely on this one. The above is
    >supposed to be true _only_ as long as the indexes are within the limit
    >assigned to them.


    Yes (but I made sure that this was the case in the example above).
    Given an array "a" of size N, a+0 (&a[0]) through a+N (&a[N]) are
    all valid, computable addresses[%] that are all different, but only
    &a[0] through &a[N-1] are guaranteed to be distinct from other
    objects' addresses, and it is not at all unusual for a+N to have
    the same address (when converted to "char *") as some other object.
    -----
    % There is some brokenness in the C89 wording that makes "a + N"
    OK but "&a[N]" not OK. This is fixed in C99, and possibly even
    via some intermediate update to C89. It is probably better to
    write it as a+N anyway, just in case, so I did here.
    -----

    [on separating control and data stacks]
    >> This design also interferes with typical Microsoft-bug-exploits:
    >> buffer overruns no longer allow you to overwrite the return address.
    >> The distance between CSP and DSP can be randomized on each run of
    >> the program, as well.]


    >Point well taken, but I believe it's not strictly a Microsoft thingy.
    >AFAIK Linux and for that matter x86 based systems save the return
    >address within the current stack itself only.


    Yes, no doubt because the x86 instruction architecture "strongly
    encourages" this (by making it easy to use one stack, and quite
    difficult to use two separate ones). Not all other architectures
    are so hobbled -- although one still finds a single combined stack
    even where there is no "hardware encouragement", e.g., on the MIPS.
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
     
    Chris Torek, Jan 19, 2005
    #7