Is allocating large objects on the stack a good practice?

Discussion in 'C Programming' started by Michael Tsang, Mar 16, 2010.

  1. For example, if we allocate objects like this:

    void foo(void) {
        static const size_t size = 10000000; /* ten million */
        int data[size]; /* data is a very large object */
        /* do something with data */
    }

    In C99, perhaps this:

    #include <stdio.h>
    int main(void) {
        size_t n; /* This program makes sense even if n is very large */
        scanf("%zu", &n); /* n must not be negative */
        int data[n]; /* VLA: we need not care about freeing the object */
        /* do something with data */
    }

    Or, in C++:

    template<size_t n> void func_templ() {
        int data[n];
        /* do something with data */
        if(n) func_templ<n - 1>(); /* recursive template instantiation */
    }

    Will it cause any trouble? I know that writing code that does not leak is
    VERY difficult in C. (In C++, objects allocate memory only in the
    constructor and deallocate it only in the destructor, so the problem is
    not so serious.) Stack allocations (especially C99 VLAs) free
    programmers from ever calling malloc manually, so they seem to be a good
    idea. Also, I like setting the stack size to unlimited to prevent stack
    overflow from ever happening. Are all of these good ideas?
    Michael Tsang, Mar 16, 2010

  2. It's not VERY difficult to write code in C that doesn't leak. However
    you need a certain discipline - either free memory in the same
    function that you allocated it, or provide matching construct/kill
    functions that return a pointer to a dynamically-allocated structure
    and free it.
    In addition you can use tools to detect memory leaks.
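
    A minimal sketch of the matching construct/kill discipline described
    above. The names `buffer`, `buffer_create` and `buffer_kill` are
    hypothetical and chosen only for illustration; the point is that every
    pointer returned by the create function is released by exactly one call
    to the kill function.

```c
#include <stdlib.h>

/* Hypothetical type: a heap-allocated block of bytes that knows its length. */
struct buffer {
    size_t len;
    unsigned char *bytes;
};

/* Allocate a zero-filled buffer of len bytes; returns NULL on failure. */
struct buffer *buffer_create(size_t len)
{
    struct buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;

    /* Request at least one byte so a NULL result always means failure. */
    b->bytes = calloc(len ? len : 1, 1);
    if (b->bytes == NULL) {
        free(b);            /* undo the first allocation before reporting failure */
        return NULL;
    }
    b->len = len;
    return b;
}

/* Release everything buffer_create() allocated; accepts NULL like free(). */
void buffer_kill(struct buffer *b)
{
    if (b == NULL)
        return;
    free(b->bytes);
    free(b);
}
```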

    Allocating everything on the stack is no solution. The stack is
    designed for relatively small amounts of scratchspace memory. If you
    use it for huge data structures you may make code hard for the
    compiler to optimise, and you may run out of stack space. Just as
    significantly, whilst the pattern of last allocated / first freed is
    useful, it cannot be applied to all the data structures you might
    need. Consider a linked list that has to expand and contract as the
    user adds and deletes items from his workspace.
    The better solution is automatic garbage collection. However this has
    lots of issues of its own.
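
    To illustrate why last allocated / first freed does not fit every
    structure, here is a rough sketch of a singly linked list whose nodes
    are freed in whatever order the user chooses rather than in scope
    order. The names `node`, `list_push`, `list_remove` and `list_free`
    are made up for this example.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepend value; returns 0 on success, -1 if allocation fails. */
int list_push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Remove the first node holding value; returns 1 if one was removed, else 0. */
int list_remove(struct node **head, int value)
{
    struct node **link = head;

    while (*link != NULL) {
        if ((*link)->value == value) {
            struct node *dead = *link;
            *link = dead->next;     /* unlink before freeing */
            free(dead);
            return 1;
        }
        link = &(*link)->next;
    }
    return 0;
}

/* Free every remaining node and reset the head pointer. */
void list_free(struct node **head)
{
    while (*head != NULL) {
        struct node *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```

    Removing the oldest node while newer ones stay alive is exactly the
    kind of lifetime that a stack frame cannot express.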
    Dr Malcolm McLean, Mar 16, 2010

  3. This is a bad idea for my taste.

    Two ideas to cope:

    1) object construction:

    /*
     * Resource type whose initialization entails dynamic memory allocation
     * for and initialization of sub-resources.
     */
    struct res
    {
      struct sub_res1 *res1;
      struct sub_res2 *res2;
      struct sub_res3 *res3;
    };

    /*
     * Initialize a resource object that has already been allocated.
     * Return -1 for failure and 0 for success. If -1 is returned, the
     * contents of the object are indeterminate.
     */
    int
    res_init(struct res *res, int p1, int p2, int p3)
    {
      res->res1 = res1_construct(p1);

      if (0 != res->res1) {
        res->res2 = res2_construct(p2);

        if (0 != res->res2) {
          res->res3 = res3_construct(p3);

          if (0 != res->res3) {
            /* object complete */
            return 0;
          }
          res2_destruct(res->res2); /* unwind in reverse order */
        }
        res1_destruct(res->res1);
      }
      return -1;
    }

    /* Uninitialize a successfully initialized object. */
    void
    res_uninit(struct res *res)
    {
      res3_destruct(res->res3);
      res2_destruct(res->res2);
      res1_destruct(res->res1);
    }

    /* Allocate and initialize an object. */
    struct res *
    res_construct(int p1, int p2, int p3)
    {
      struct res *res;

      res = malloc(sizeof *res);
      if (0 != res) {
        if (0 == res_init(res, p1, p2, p3)) {
          return res;
        }
        free(res);
      }
      return 0;
    }

    /* Uninitialize and free an allocated and initialized object. */
    void
    res_destruct(struct res *res)
    {
      res_uninit(res);
      free(res);
    }

    This goes on recursively, that is, the res1_construct(),
    res2_construct() and res3_construct() functions called in res_init() all
    have the same structure as res_construct() itself. Additionally,
    res_construct() can participate in an even-higher level _init() routine.

    The _init() functions allow the programmer to define an "outermost"
    object with automatic or static storage duration, or to initialize a
    structure member object of an already allocated structure. (What is
    "outermost" depends on the programmer's situation, so it's useful to
    declare all _init() and _uninit() functions separately from _construct()
    and _destruct(), and with external linkage.)

    The _init() functions can allocate all kinds of system resources, not
    just memory (eg. file descriptors to all kinds of files.)

    The whole thing mimics the constructor/destructor stuff of C++.

    2) Temporary object construction for computation: this is almost
    identical to the _init() functions, except that the innermost
    success-return is replaced with storing the result and a success
    indicator, and with releasing the innermost resource. The unwinding
    happens unconditionally here:

    /*
     * Compute some result based on p1, p2, p3. If the computation was
     * successful, 0 is returned and the result is stored in *result.
     * Otherwise, -1 is returned.
     */
    int
    compute(struct result_type *result, int p1, int p2, int p3)
    {
      int ret;
      struct sub_res1 tmp1;

      ret = -1;
      if (-1 != res1_init(&tmp1, p1)) {
        struct sub_res2 tmp2;

        if (-1 != res2_init(&tmp2, p2)) {
          struct sub_res3 tmp3;

          if (-1 != res3_init(&tmp3, p3)) {
            /* All objects present, do computation. */

            *result = ...;
            ret = 0;
            res3_uninit(&tmp3);
          }
          res2_uninit(&tmp2);
        }
        res1_uninit(&tmp1);
      }
      return ret;
    }

    Ideas 1 and 2 can be recursively combined, too; for example, some
    computation may be necessary to initialize an object, and the compute()
    function above already relies on idea 1 (object initialization). No such
    call-tree leaks (unless I botched up the code above, but you get the
    idea).

    Note that the _uninit() and _destruct() functions described above are
    unable to signal errors. If such an _uninit() calls eg. fclose(), that's
    lossy, because fclose() might try to flush output and it could fail.
    This is only relevant in the compute() case, not the object construction
    case, because in the latter case, we will signal an error anyway back to
    the caller if we're on the error path.

    I can name two solutions to this:

    a) Make the _uninit() functions return a success/error value too, and
    when walking towards the exit in compute(), have any failed _uninit()
    reset "ret" to -1 and destroy (release) *result.

    b) Make fflush() part of the computation, or more generally, make sure
    that once we set "ret = 0", nothing can go wrong within reason.
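
    Solution a) might look roughly like this when the resource is a stdio
    stream. `save_ints` is a hypothetical function that is not part of the
    code above; fopen(), fprintf(), fclose() and remove() are the standard
    library calls. The release step reports its own failure: a failed
    fclose() resets the result to -1, and the partly written file is
    discarded.

```c
#include <stdio.h>

/* Write n ints to path, one per line. Returns 0 on success, -1 on failure. */
int save_ints(const char *path, const int *v, size_t n)
{
    int ret = -1;
    FILE *f = fopen(path, "w");

    if (f != NULL) {
        size_t i;

        ret = 0;
        for (i = 0; i < n; i++) {
            if (fprintf(f, "%d\n", v[i]) < 0) {
                ret = -1;
                break;
            }
        }

        /* Releasing the stream can fail too: fclose() flushes buffered output. */
        if (fclose(f) != 0)
            ret = -1;

        if (ret != 0)
            remove(path);   /* destroy the incomplete result */
    }
    return ret;
}
```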

    .... I cheated a little, because even compute() is a sort of object
    initialization -- that of "*result".

    These "patterns" cannot be used indiscriminately. The idea to take away
    is the staircase-like embedding of "if" statements. (Many people hate it
    with a passion, because it introduces a lot of basic blocks and
    increases "cyclomatic complexity".) IMHO with a reasonable resolution
    (and consequently, depth) of _init() / compute() functions, things stay
    manageable. One benefit of this approach appears to be that you never
    have to write O(n^2) pieces of _uninit() calls in error handling.

    Or something like that.

    Ersek, Laszlo, Mar 16, 2010
  4. Michael Tsang

    ImpalerCore Guest

    While I agree that, for many users, writing code that does not leak
    is difficult (and you'll likely get responses from the regulars here
    that it's not difficult, as it's usually a matter of perspective and
    experience), it is not necessarily so unmanageable that one regresses
    to the stack for everything. For dynamic data structures, like auto-
    resizing strings, arrays, linked lists, trees, etc..., using the heap
    can be the best solution even though there are the associated risks of
    using them. Managing the memory leak risk requires a commitment to
    learn the tools that help you determine leaks, to exercise your code
    in ways that may lead to memory leaks, and to learn the C programming
    styles and idioms needed to avoid such leaks. This is a long process
    and one area that I'm still improving.

    Memory debuggers are a great tool in any programmer's toolbox. They
    provide invaluable first glances if you are making simple memory
    management errors, like forgetting a free, or not properly cleaning up
    a dynamic data structure. It does require some effort to check out
    various memory debuggers, but the time spent learning the tool will be
    quickly regained in discovering or troubleshooting common allocation
    errors. I've been using dmalloc myself and have been mostly pleased
    with it so far.

    Unfortunately, the memory debugger is often only as good as the code
    it is run on. This implies that memory leaks/errors pop up when your
    program has unexpected errors. These problems often originate from
    buffer overflows, running out of resources, invalid or too large
    input, programming errors, and more. It can be very tedious to
    exercise a function or an interface in all the boundary cases, and
    even then you will likely miss some. If you expose your code to other
    people, they will use it in ways you did not expect, and all of a
    sudden you have more errors that you didn't see the last N times.

    And yet, even with all these issues, people still use the heap because
    it is, at times, the best solution for the problem at hand. One of the
    main benefits of dynamic allocation is that the space efficiency can
    be much higher than using the stack, as you allocate only the space
    that you need for the time needed. The other main benefit is that you
    have more control of the lifetime of an object, rather than
    restricting it to the scope of a function or block.
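
    Both benefits can be seen in a small growable array. This is only a
    sketch; `int_vec` and its functions are hypothetical names invented
    for the example. Storage grows on demand through realloc(), and the
    array lives until the caller decides to free it, regardless of which
    function created it.

```c
#include <stdlib.h>

struct int_vec {
    int *items;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

void int_vec_init(struct int_vec *v)
{
    v->items = NULL;
    v->len = 0;
    v->cap = 0;
}

/* Append value, doubling capacity when full. Returns 0 on success, -1 on failure. */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap;
        int *p;

        /* Refuse to grow if the doubled byte count would overflow size_t. */
        if (v->cap > (size_t)-1 / (2 * sizeof *v->items))
            return -1;
        new_cap = v->cap ? v->cap * 2 : 8;

        p = realloc(v->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;          /* v->items is still valid and unchanged */
        v->items = p;
        v->cap = new_cap;
    }
    v->items[v->len++] = value;
    return 0;
}

/* Release the storage and return the vector to its empty state. */
void int_vec_free(struct int_vec *v)
{
    free(v->items);
    int_vec_init(v);
}
```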

    I liken C programming to using a power tool without the safety on.
    You may get nicks and bruises from using it, and in the beginning some
    very frustrating times (I can remember the torment inflicted on me the
    first time I needed to write a linked list of linked list containers
    in C for a class), but learning to endure through those times will
    give you the ability to effectively use and maybe even enjoy C.
    Learning to use malloc/free in a safe way is a long and sometimes
    difficult journey, but the benefit of having the tool and knowing how
    to use it *is* worth it if you plan to spend a long enough time using
    C.

    Best regards,
    John D.
    ImpalerCore, Mar 16, 2010
  5. The stack overflow handler prints out a purchase order for more memory
    and waits for you to install it. (I didn't say it was quick.)

    Address space limitations can be resolved by ... handwaving ... hey,
    look over there!
    Keith Thompson, Mar 16, 2010
  6. Michael Tsang

    Seebs Guest

    Consider the host contamination checking implementation I suggested for
    a cross-compilation system:

    ... Hey, look! A bear!
    ... No host contamination found.

    Sadly, even a very careful analysis of the benefit vs. implementation time
    turned out not to favor this approach. Maybe I'll resubmit it in a little
    over two weeks.

    Seebs, Mar 16, 2010
  7. Isn't _init() in a reserved namespace? Or close to one anyway.
    Nick Keighley, Mar 17, 2010
  8. Whilst this is good practice, C++ does not compel you to do this.
    Nick Keighley, Mar 17, 2010
  9. Michael Tsang

    chrisbazley Guest

    No, I'm pretty sure that identifiers starting with an underscore are
    only reserved if the following letter is another underscore or a
    capital letter.

    I recently modified a library of mine to stop using reserved
    identifiers, having long ago adopted the idiom of prefixing all my
    struct tags with '_' and using the same name without this prefix for
    the corresponding types.

    It was then that I discovered only my structs with capitalised tags
    were infringing reserved namespace. Probably whoever wrote the code I
    mimicked was aware of this, but I wasn't, which shows the danger of
    adopting conventions without understanding them!
    chrisbazley, Mar 17, 2010
  10. Michael Tsang

    Eric Sosman Guest

    For C, "All identifiers that begin with an underscore are
    always reserved for use as identifiers with file scope in both
    the ordinary and tag name spaces." (7.1.3p1). So you could use
    `_init' inside a function, but not as a name for anything outside
    a function. In particular, since "a higher-level routine" would
    necessarily be a file-scope entity, naming it _init would encroach
    on reserved space.

    I don't know whether C++ rules are the same.
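
    As a small illustration of the rule quoted above (all names here are
    invented for the example): an identifier such as `_tmp` is acceptable
    inside a function, while file-scope functions are safer with a prefix
    that does not begin with an underscore, such as `res_`.

```c
/* File scope: use a prefix that does not start with an underscore. */
struct res {
    int value;
};

int res_init(struct res *r, int value)
{
    r->value = value;
    return 0;
}

int res_double(const struct res *r)
{
    /* Block scope: an underscore followed by a lowercase letter is not
     * reserved here, although many style guides avoid it anyway. */
    int _tmp = r->value;

    return _tmp * 2;
}

/*
 * Reserved at file scope, so not something a program should define:
 *     int _init(struct res *r);
 * Reserved for any use (underscore + uppercase, or double underscore):
 *     int _Init, __init;
 */
```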
    Eric Sosman, Mar 17, 2010
  11. Michael Tsang

    Nobody Guest

    It's possible to write C code which doesn't leak. And in the cases where
    you could use a stack-based array (i.e. where you only need the memory
    for the duration of the function), it isn't very difficult to avoid leaks.

    Where avoiding leaks is hard is where you return pointers to dynamically
    allocated memory and then have to keep track of whether or not it's being
    used. But you can't use a stack-based array in that situation anyhow.

    The main reason to use the stack is performance: fixed-sized arrays are
    allocated for free along with the rest of the stack frame, while alloca()
    (or a C99 VLA) may end up as a single instruction.

    If you're doing e.g. simple processing on small-ish strings, allocating a
    buffer with malloc() may take more time than the actual algorithm. OTOH,
    if the array is large or the processing is complex, the time taken by
    malloc() and free() is likely to be negligible.
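
    One common way to get both behaviours is to use a small fixed stack
    buffer for typical inputs and fall back to malloc() only when the
    input is too large. The following is a sketch under that assumption;
    `is_palindrome` and the 64-byte threshold are arbitrary choices for
    the example, not measured values.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if s reads the same in both directions when case and
 * non-alphanumeric characters are ignored, 0 if not, -1 if out of memory.
 */
int is_palindrome(const char *s)
{
    char small[64];                 /* scratch space for typical inputs */
    char *buf = small;
    size_t need = strlen(s) + 1;
    size_t n = 0, i;
    int result = 1;

    if (need > sizeof small) {      /* too big for the stack buffer */
        buf = malloc(need);
        if (buf == NULL)
            return -1;
    }

    /* Copy only letters and digits, folded to lower case. */
    for (i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (isalnum(c))
            buf[n++] = (char)tolower(c);
    }

    for (i = 0; i < n / 2; i++) {
        if (buf[i] != buf[n - 1 - i]) {
            result = 0;
            break;
        }
    }

    if (buf != small)
        free(buf);                  /* only the heap case needs cleanup */
    return result;
}
```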
    Nobody, Mar 17, 2010
  12. None that I would know of. (I meant XXX_init(), for some resource type
    called "struct XXX" -- I didn't mean "_init" verbatim, without any
    non-empty prefix. Sorry for being vague.)

    Ersek, Laszlo, Mar 17, 2010
  13. Michael Tsang

    red floyd Guest
    -- Each name that contains a double underscore (__) or begins with an
    underscore followed by an uppercase letter is reserved to the
    implementation for any use.

    -- Each name that begins with an underscore is reserved to the
    implementation for use as a name in the global namespace.
    red floyd, Mar 17, 2010
  14. Michael Tsang

    jl_post Guest

    I believe that you should always allocate on the stack -- if you
    can get away with it. Unfortunately, declaring a huge object on the
    stack doesn't always work, as there are pre-set stack-size limits.

    Fortunately, to do what you want (that is, allocate a large array),
    you can use a std::vector, like this:

    #include <vector>

    void foo(void)
    {
        const size_t size = 10000000; // ten million
        std::vector<int> data(size);
        // do something with data
    }

    Technically, the data "object" will be on the stack, but the data's
    "array" should be on the heap. And since we're technically using a C++
    object with a proper destructor, there's no need for us to clean up
    the ints that are allocated -- it will automagically be done for us at
    the end of its scope. (So we get the best of both worlds.)

    I hope this helps, Michael.

    -- Jean-Luc
    jl_post, Mar 22, 2010