Is allocating large objects on the stack a good practice?

Discussion in 'C Programming' started by Michael Tsang, Mar 16, 2010.

  1. For example, if we allocate objects like this:

    void foo(void) {
        static const size_t size = 10000000; /* ten million */
        int data[size]; /* data is a very large object */
        /* do something with data */
    }

    In C99, perhaps this:

    #include <stdio.h>
    int main(void) {
        size_t n; /* This program makes sense even if n is very large */
        scanf("%zu", &n); /* n must not be negative */
        int data[n]; /* VLA: we need not care about freeing the object */
        /* do something with data */
    }

    Or, in C++:

    template<size_t n> void func_templ() {
        int data[n];
        /* do something with data */
        if(n) func_templ<n - 1>(); /* recursive template instantiation */
    }

    Will it cause any trouble? I know that writing code that does not leak is
    VERY difficult in C. (In C++, objects allocate memory only in the
    constructor and deallocate memory only in the destructor, so the problem is
    not so serious.) Stack allocations (especially C99 VLAs) free
    programmers from ever mallocking memory manually, so it seems to be a good
    idea. Also, I like setting the stack size to unlimited to prevent stack
    overflow from ever happening. Are all these good ideas?
     
    Michael Tsang, Mar 16, 2010
    #1

  2. On 16 Mar, 13:57, Michael Tsang <> wrote:
    > Will it cause any trouble? I know that writing code that does not leak is
    > VERY difficult in C. (In C++, objects allocate memory only in the
    > constructor and deallocate memory only in the destructor, so the problem is
    > not so serious.) Stack allocations (especially C99 VLAs) free
    > programmers from ever mallocking memory manually, so it seems to be a good
    > idea. Also, I like setting the stack size to unlimited to prevent stack
    > overflow from ever happening. Are all these good ideas?
    >

    It's not VERY difficult to write code in C that doesn't leak. However
    you need a certain discipline - either free memory in the same
    function that allocated it, or provide matching construct/kill
    functions that return a pointer to a dynamically-allocated structure
    and free it.
    In addition you can use tools to detect memory leaks.

    Allocating everything on the stack is no solution. The stack is
    designed for relatively small amounts of scratchspace memory. If you
    use it for huge data structures you may make code hard for the
    compiler to optimise, and you may run out of stack space. Just as
    significantly, whilst the pattern of last allocated / first freed is
    useful, it cannot be applied to all the data structures you might
    need. Consider a linked list that has to expand and contract as the
    user adds and deletes items from his workspace.
    The better solution is automatic garbage collection. However this has
    lots of issues of its own.
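
    As a rough illustration of the linked-list point: the sketch below keeps
    every node on the heap, so a node can be removed in any order and the
    list can outlive the function that created it. The names (struct node,
    list_push, list_remove, list_length, list_clear) are made up for this
    example and are not from any library.

```c
#include <stdlib.h>

/* Hypothetical singly linked list of ints; every node lives on the heap. */
struct node {
    int value;
    struct node *next;
};

/* Push a value at the front. Returns 0 on success, -1 if malloc() fails. */
int list_push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL) {
        return -1;
    }
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Remove the first node holding value. Returns 1 if a node was removed. */
int list_remove(struct node **head, int value)
{
    struct node **link = head;

    while (*link != NULL) {
        if ((*link)->value == value) {
            struct node *dead = *link;

            *link = dead->next;   /* unlink, then release */
            free(dead);
            return 1;
        }
        link = &(*link)->next;
    }
    return 0;
}

/* Count the nodes currently in the list. */
size_t list_length(const struct node *head)
{
    size_t count = 0;

    for (; head != NULL; head = head->next) {
        count++;
    }
    return count;
}

/* Free every remaining node and reset the head to NULL. */
void list_clear(struct node **head)
{
    while (*head != NULL) {
        struct node *dead = *head;

        *head = dead->next;
        free(dead);
    }
}
```

    Removal order here is driven by the user, not by function return, which
    is exactly what a last-allocated / first-freed stack cannot express.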
     
    Dr Malcolm McLean, Mar 16, 2010
    #2

  3. In article <hno2o7$vc2$-september.org>, Michael Tsang <> writes:
    > For example, if we allocate objects like this:
    >
    > void foo(void) {
    >     static const size_t size = 10000000; /* ten million */
    >     int data[size]; /* data is a very large object */
    >     /* do something with data */
    > }


    This is a bad idea for my taste.


    > I know that, writing code that does not leak is
    > VERY difficult in C.


    Two ideas to cope:

    1) object construction:

    #include <stdlib.h> /* malloc(), free() */

    /*
    Resource type whose initialization entails dynamic memory allocation
    for and initialization of sub-resources.
    */
    struct res
    {
        struct sub_res1 *res1;
        struct sub_res2 *res2;
        struct sub_res3 *res3;
    };


    /*
    Initialize a resource object that has already been allocated.
    Return -1 for failure and 0 for success. If -1 is returned, the
    contents of the object are indeterminate.
    */
    int
    res_init(struct res *res, int p1, int p2, int p3)
    {
        res->res1 = res1_construct(p1);

        if (0 != res->res1) {
            res->res2 = res2_construct(p2);

            if (0 != res->res2) {
                res->res3 = res3_construct(p3);

                if (0 != res->res3) {
                    /* object complete */

                    return 0;
                }

                /* res3 failed: unwind res2, then fall through to res1 */
                res2_destruct(res->res2);
            }

            res1_destruct(res->res1);
        }

        return -1;
    }

    /* Uninitialize a successfully initialized object. */
    void
    res_uninit(struct res *res)
    {
        /* release in reverse order of construction */
        res3_destruct(res->res3);
        res2_destruct(res->res2);
        res1_destruct(res->res1);
    }


    /* Allocate and initialize an object. */
    struct res *
    res_construct(int p1, int p2, int p3)
    {
        struct res *res;

        res = malloc(sizeof *res);
        if (0 != res) {
            if (0 == res_init(res, p1, p2, p3)) {
                return res;
            }

            free(res);
        }
        return 0;
    }

    /* Uninitialize and free an allocated and initialized object. */
    void
    res_destruct(struct res *res)
    {
        res_uninit(res);
        free(res);
    }


    This goes on recursively, that is, the res1_construct(),
    res2_construct() and res3_construct() functions called in res_init() all
    have the same structure as res_construct() itself. Additionally,
    res_construct() can participate in an even-higher level _init() routine.

    The _init() functions allow the programmer to define an "outermost"
    object with automatic or static storage duration, or to initialize a
    structure member object of an already allocated structure. (What is
    "outermost" depends on the programmer's situation, so it's useful to
    declare all _init() and _uninit() functions separately from _construct()
    and _destruct(), and with external linkage.)

    The _init() functions can allocate all kinds of system resources, not
    just memory (eg. file descriptors to all kinds of files.)

    The whole thing mimics the constructor/destructor stuff of C++.


    2) Temporary object construction for computation: this is almost
    identical to the _init() functions, except that the innermost
    success-return is replaced with storing the result and a success
    indicator, and with releasing the innermost resource. The unwinding
    happens unconditionally here:


    /*
    Compute some result based on p1, p2, p3. If the computation was
    successful, 0 is returned and the result is stored in *result.
    Otherwise, -1 is returned.
    */
    int
    compute(struct result_type *result, int p1, int p2, int p3)
    {
        int ret;
        struct sub_res1 tmp1;

        ret = -1;
        if (-1 != res1_init(&tmp1, p1)) {
            struct sub_res2 tmp2;

            if (-1 != res2_init(&tmp2, p2)) {
                struct sub_res3 tmp3;

                if (-1 != res3_init(&tmp3, p3)) {
                    /* All objects present, do computation. */

                    *result = ...;
                    ret = 0;

                    res3_uninit(&tmp3);
                }

                res2_uninit(&tmp2);
            }

            res1_uninit(&tmp1);
        }

        return ret;
    }


    Ideas 1 and 2 can be recursively combined, too; for example, some
    computation may be necessary to initialize an object, and the compute()
    function above already relies on idea 1 (object initialization). No such
    call-tree leaks (unless I botched up the code above, but you get the
    idea).

    Note that the _uninit() and _destruct() functions described above are
    unable to signal errors. If such an _uninit() calls eg. fclose(), that's
    lossy, because fclose() might try to flush output and it could fail.
    This is only relevant in the compute() case, not the object construction
    case, because in the latter case, we will signal an error anyway back to
    the caller if we're on the error path.

    I can name two solutions to this:

    a) Make the _uninit() functions return a success/error value too, and
    when walking towards the exit in compute(), have any failed _uninit()
    reset "ret" to -1 and destroy (release) *result.

    b) Make fflush() part of the computation, or more generally, make sure
    that once we set "ret = 0", nothing can go wrong within reason.
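
    Solution (a) might look roughly like the sketch below. It assumes a
    made-up resource type (struct out_res) wrapping a stdio stream, with
    out_res_init(), out_res_uninit() and save_value() invented for this
    example. The point is only that a failed fclose(), which may be where
    buffered output actually gets written, resets "ret" to -1.

```c
#include <stdio.h>

/* Hypothetical resource wrapping a stdio output stream. */
struct out_res {
    FILE *fp;
};

/* Open path for writing. Returns 0 on success, -1 on failure. */
int out_res_init(struct out_res *res, const char *path)
{
    res->fp = fopen(path, "w");
    return res->fp == NULL ? -1 : 0;
}

/* Unlike a void _uninit(), report a failed fclose() to the caller. */
int out_res_uninit(struct out_res *res)
{
    return fclose(res->fp) == 0 ? 0 : -1;
}

/* Write value to path. Returns 0 only if every step, including the
   final close, succeeded; otherwise returns -1. */
int save_value(const char *path, int value)
{
    int ret = -1;
    struct out_res out;

    if (-1 != out_res_init(&out, path)) {
        if (fprintf(out.fp, "%d\n", value) >= 0) {
            ret = 0;
        }

        /* Unwinding is unconditional, but a failed cleanup
           turns an apparent success back into a failure. */
        if (-1 == out_res_uninit(&out)) {
            ret = -1;
        }
    }

    return ret;
}
```

    Here there is no separate *result to release on the late failure; in a
    compute() that fills *result, that release would go next to the
    "ret = -1" in the cleanup branch.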

    .... I cheated a little, because even compute() is a sort of object
    initialization -- that of "*result".

    These "patterns" cannot be used indiscriminately. The idea to take away
    is the staircase-like embedding of "if" statements. (Many people hate it
    with a passion, because it introduces a lot of basic blocks and
    increases "cyclomatic complexity". IMHO with a reasonable resolution
    (and consequently, depth) of _init() / compute() functions, things stay
    manageable. One benefit of this approach appears to be that you never
    have to write O(n^2) pieces of _uninit() calls in error handling
    sections.)

    Or something like that.

    lacos
     
    Ersek, Laszlo, Mar 16, 2010
    #3
  4. On Mar 16, 9:57 am, Michael Tsang <> wrote:
    > For example, if we allocate objects like this:
    >
    > void foo(void) {
    >         static const size_t size = 10000000; /* ten million */
    >         int data[size]; /* data is a very large object */
    >         /* do something with data */
    >
    > }
    >
    > In C99, perhaps this:
    >
    > #include <stdio.h>
    > int main(void) {
    >         size_t n;       /* This program makes sense even if n is very large */
    >         scanf("%zu", &n); /* n must not be negative */
    >         int data[n];    /* VLA: we need not care about freeing the object */
    >         /* do something with data */
    >
    > }
    >
    > Or, in C++:
    >
    > template<size_t n> void func_templ() {
    >         int data[n];
    >         /* do something with data */
    >         if(n) func_templ<n - 1>(); /* recursive template instantiation */
    >
    > }
    >
    > Will it cause any trouble? I know that writing code that does not leak is
    > VERY difficult in C. (In C++, objects allocate memory only in the
    > constructor and deallocate memory only in the destructor, so the problem is
    > not so serious.) Stack allocations (especially C99 VLAs) free
    > programmers from ever mallocking memory manually, so it seems to be a good
    > idea. Also, I like setting the stack size to unlimited to prevent stack
    > overflow from ever happening. Are all these good ideas?


    While I agree that for many users writing code that does not leak
    is difficult (and you'll likely get responses from the regulars here
    that it's not difficult, as it's usually a matter of perspective and
    experience), it is not necessarily so unmanageable that one regresses
    to the stack for everything. For dynamic data structures, like
    auto-resizing strings, arrays, linked lists, trees, etc., using the
    heap can be the best solution even though there are associated risks.
    Managing the memory leak risk requires a commitment to
    learn the tools that help you find leaks, to exercise your code
    in ways that may lead to memory leaks, and to learn the C programming
    styles and idioms needed to avoid such leaks. This is a long process
    and one area that I'm still improving.

    Memory debuggers are a great tool in any programmer's toolbox. They
    provide an invaluable first look at whether you are making simple memory
    management errors, like forgetting a free, or not properly cleaning up
    a dynamic data structure. It does require some effort to check out
    various memory debuggers, but the time spent learning the tool will be
    quickly regained in discovering or troubleshooting common allocation
    errors. I've been using dmalloc myself and have been mostly pleased
    with it so far.

    Unfortunately, the memory debugger is often only as good as the code
    it is run on. This implies that memory leaks/errors pop up when your
    program has unexpected errors. These problems often originate from
    buffer overflows, running out of resources, invalid or too large
    input, programming errors, and more. It can be very tedious to
    exercise a function or an interface in all the boundary cases, and
    even then you will likely miss some. If you expose your code to other
    people, they will use it in ways you did not expect, and all of a
    sudden you have more errors that you didn't see the last N times.

    And yet, even with all these issues, people still use the heap because
    it is, at times, the best solution for the problem at hand. One of the
    main benefits of dynamic allocation is that the space efficiency can
    be much higher than using the stack, as you allocate only the space
    that you need for the time needed. The other main benefit is that you
    have more control of the lifetime of an object, rather than
    restricting it to the scope of a function or block.
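
    A small sketch of both benefits, assuming a made-up helper called
    make_label(): it allocates exactly as many bytes as the result needs,
    and the result stays valid after the function returns, until the caller
    frees it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of prefix followed
   by name, or NULL if malloc() fails. The caller owns the result and
   must free() it. */
char *make_label(const char *prefix, const char *name)
{
    size_t plen = strlen(prefix);
    size_t nlen = strlen(name);
    char *label = malloc(plen + nlen + 1);   /* exactly what is needed */

    if (label == NULL) {
        return NULL;
    }
    memcpy(label, prefix, plen);
    memcpy(label + plen, name, nlen + 1);    /* includes the terminator */
    return label;
}
```

    A local array inside make_label() could not be returned this way, and a
    fixed-size buffer would either waste space or truncate long names.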

    I liken C programming to using a power tool without the safety on.
    You may get nicks and bruises from using it, and in the beginning some
    very frustrating times (I can remember the torment inflicted on me the
    first time I needed to write a linked list of linked list containers
    in C for a class), but learning to endure through those times will
    give you the ability to effectively use and maybe even enjoy C.
    Learning to use malloc/free in a safe way is a long and sometimes
    difficult journey, but the benefit of having the tool and knowing how
    to use it *is* worth it if you plan to spend a long enough time using
    C.

    Best regards,
    John D.
     
    ImpalerCore, Mar 16, 2010
    #4
  5. Paavo Helde <> writes:
    > Michael Tsang <> wrote in
    > news:hno2o7$vc2$-september.org:

    [...]
    >> Also, I like setting the stack size to
    >> unlimited to prevent stack overflow ever from happening.

    >
    > Wow, infinite Turing machine! Drop me a note when you have found one! :)


    The stack overflow handler prints out a purchase order for more memory
    and waits for you to install it. (I didn't say it was quick.)

    Address space limitations can be resolved by ... handwaving ... hey,
    look over there!

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 16, 2010
    #5
  6. On 2010-03-16, Keith Thompson <> wrote:
    > Address space limitations can be resolved by ... handwaving ... hey,
    > look over there!


    Consider the host contamination checking implementation I suggested for
    a cross-compilation system:

    HOST CONTAMINATION CHECKS:
    ... Hey, look! A bear!
    ... No host contamination found.

    Sadly, even a very careful analysis of the benefit vs. implementation time
    turned out not to favor this approach. Maybe I'll resubmit it in a little
    over two weeks.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
     
    Seebs, Mar 16, 2010
    #6
  7. On 16 Mar, 15:31, (Ersek, Laszlo) wrote:

    > Additionally,
    > res_construct() can participate in an even-higher level _init() routine.


    isn't _init() in a reserved namespace? Or close to one anyway.
     
    Nick Keighley, Mar 17, 2010
    #7
  8. On 16 Mar, 13:57, Michael Tsang <> wrote:

    > [...] I know that writing code that does not leak is
    > VERY difficult in C. (In C++, objects allocate memory only in the
    > constructor and deallocate memory only in the destructor, so the problem is
    > not so serious.)


    whilst this is good practice C++ does not compel you to do this
     
    Nick Keighley, Mar 17, 2010
    #8
  9. On Mar 17, 9:37 am, Nick Keighley <> wrote:
    > On 16 Mar, 15:31, (Ersek, Laszlo) wrote:
    >
    > > Additionally,
    > > res_construct() can participate in an even-higher level _init() routine.
    >
    > isn't _init() in a reserved namespace? Or close to one anyway.


    No, I'm pretty sure that identifiers starting with an underscore are
    only reserved if the following letter is another underscore or a
    capital letter.

    I recently modified a library of mine to stop using reserved
    identifiers, having long ago adopted the idiom of prefixing all my
    struct tags with '_' and using the same name without this prefix for
    the corresponding types.

    It was then that I discovered only my structs with capitalised tags
    were infringing reserved namespace. Probably whoever wrote the code I
    mimicked was aware of this, but I wasn't, which shows the danger of
    adopting conventions without understanding them!

    --
    Christopher Bazley
     
    , Mar 17, 2010
    #9
  10. On 3/17/2010 7:53 AM, wrote:
    > On Mar 17, 9:37 am, Nick Keighley<>
    > wrote:
    >> On 16 Mar, 15:31, (Ersek, Laszlo) wrote:
    >>
    >>> Additionally,
    >>> res_construct() can participate in an even-higher level _init() routine.

    >>
    >> isn't _init() in a reserved namespace? Or close to one anyway.

    >
    > No, I'm pretty sure that identifiers starting with an underscore are
    > only reserved if the following letter is another underscore or a
    > capital letter.


    For C, "All identifiers that begin with an underscore are
    always reserved for use as identifiers with file scope in both
    the ordinary and tag name spaces." (7.1.3p1). So you could use
    `_init' inside a function, but not as a name for anything outside
    a function. In particular, since "a higher-level routine" would
    necessarily be a file-scope entity, naming it _init would encroach
    on reserved space.
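
    A short sketch of where the line falls under the wording quoted above,
    together with the rule that underscore-plus-uppercase and double
    underscore names are reserved for any use. The identifiers are invented
    for illustration only.

```c
/* Hypothetical resource type; all names here are illustrative only. */
struct res {
    int ready;
};

/* Fine: "res_init" does not begin with an underscore. */
int res_init(struct res *r)
{
    /* Allowed: underscore followed by a lowercase letter, used at
       block scope rather than file scope. */
    int _tmp = 1;

    r->ready = _tmp;
    return 0;
}

/*
Not fine in user code:

    int _init(struct res *r);     file-scope ordinary identifier
    struct _res { int ready; };   file-scope tag
    int __res_init(void);         double underscore: reserved for any use
    int _Res_init(void);          underscore + uppercase: reserved for any use
*/
```

    So a family of XXX_init() functions with a non-empty prefix stays
    clear of the reserved names.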

    I don't know whether C++ rules are the same.

    --
    Eric Sosman
     
    Eric Sosman, Mar 17, 2010
    #10
  11. On Tue, 16 Mar 2010 21:57:42 +0800, Michael Tsang wrote:

    > Will it cause any trouble? I know that writing code that does not leak is
    > VERY difficult in C. (In C++, objects allocate memory only in the
    > constructor and deallocate memory only in the destructor, so the problem is
    > not so serious.) Stack allocations (especially C99 VLAs) free
    > programmers from ever mallocking memory manually, so it seems to be a good
    > idea. Also, I like setting the stack size to unlimited to prevent stack
    > overflow from ever happening. Are all these good ideas?


    It's possible to write C code which doesn't leak. And in the cases where
    you could use a stack-based array (i.e. where you only need the memory
    for the duration of the function), it isn't very difficult to avoid leaks.

    Where avoiding leaks is hard is where you return pointers to dynamically
    allocated memory and then have to keep track of whether or not it's being
    used. But you can't use a stack-based array in that situation anyhow.

    The main reason to use the stack is performance: fixed-sized arrays are
    allocated for free along with the rest of the stack frame, while alloca()
    (or a C99 VLA) may end up as a single instruction.

    If you're doing e.g. simple processing on small-ish strings, allocating a
    buffer with malloc() may take more time than the actual algorithm. OTOH,
    if the array is large or the processing is complex, the time taken by
    malloc() and free() is likely to be negligible.
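
    One common way to get both behaviours is a small fixed-size buffer on
    the stack with a malloc() fallback for large inputs. The sketch below
    is only an illustration: is_palindrome() and the SMALL_BUF threshold of
    256 bytes are arbitrary choices for this example, not a recommendation.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum { SMALL_BUF = 256 };   /* arbitrary threshold chosen for this sketch */

/* Returns 1 if s reads the same forwards and backwards when case and
   non-alphanumeric characters are ignored, 0 if it does not, and -1 if
   scratch memory could not be obtained. */
int is_palindrome(const char *s)
{
    char small[SMALL_BUF];      /* allocated with the stack frame */
    size_t len = strlen(s);
    char *buf = small;
    size_t n = 0;
    size_t i;
    int result = 1;

    if (len > sizeof small) {   /* too big for the stack buffer */
        buf = malloc(len);
        if (buf == NULL) {
            return -1;
        }
    }

    /* Copy only the characters that take part in the comparison. */
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];

        if (isalnum(c)) {
            buf[n++] = (char)tolower(c);
        }
    }

    for (i = 0; i < n / 2; i++) {
        if (buf[i] != buf[n - 1 - i]) {
            result = 0;
            break;
        }
    }

    if (buf != small) {         /* free only what malloc() returned */
        free(buf);
    }
    return result;
}
```

    Short strings never touch the allocator, while long ones cannot
    overflow the stack; the single exit path keeps the free() easy to audit.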
     
    Nobody, Mar 17, 2010
    #11
  12. In article <>, Nick Keighley <> writes:
    > On 16 Mar, 15:31, (Ersek, Laszlo) wrote:
    >
    >> Additionally,
    >> res_construct() can participate in an even-higher level _init() routine.

    >
    > isn't _init() in a reserved namespace? Or close to one anyway.


    None that I would know of. (I meant XXX_init(), for some resource type
    called "struct XXX" -- I didn't mean "_init" verbatim, without any
    non-empty prefix. Sorry for being vague.)

    lacos
     
    Ersek, Laszlo, Mar 17, 2010
    #12
  13. On Mar 17, 5:22 am, Eric Sosman <> wrote:
    > On 3/17/2010 7:53 AM, wrote:
    >
    >
    >
    > > On Mar 17, 9:37 am, Nick Keighley<>
    > > wrote:
    > >> On 16 Mar, 15:31, (Ersek, Laszlo) wrote:

    >
    > >>> Additionally,
    > >>> res_construct() can participate in an even-higher level _init() routine.

    >
    > >> isn't _init() in a reserved namespace? Or close to one anyway.

    >
    > > No, I'm pretty sure that identifiers starting with an underscore are
    > > only reserved if the following letter is another underscore or a
    > > capital letter.

    >
    >      For C, "All identifiers that begin with an underscore are
    > always reserved for use as identifiers with file scope in both
    > the ordinary and tag name spaces." (7.1.3p1).  So you could use
    > `_init' inside a function, but not as a name for anything outside
    > a function.  In particular, since "a higher-level routine" would
    > necessarily be a file-scope entity, naming it _init would encroach
    > on reserved space.
    >
    >      I don't know whether C++ rules are the same.
    >


    17.4.3.1.2/1
    -- Each name that contains a double underscore (__) or begins with an
    underscore followed by an uppercase letter is reserved to the
    implementation for any use.

    -- Each name that begins with an underscore is reserved to the
    implementation for use as a name in the global namespace.
     
    red floyd, Mar 17, 2010
    #13
  14. On Mar 16, 7:57 am, Michael Tsang <> wrote:
    > For example, if we allocate objects like this:
    >
    > void foo(void) {
    >         static const size_t size = 10000000; /* ten million */
    >         int data[size]; /* data is a very large object */
    >         /* do something with data */
    > }



    I believe that you should always allocate on the stack -- if you
    can get away with it. Unfortunately, declaring a huge object on the
    stack doesn't always work, as there are pre-set stack-size limits.

    Fortunately, to do what you want (that is, allocate a large array),
    you can use a std::vector, like this:


    #include <cstddef>
    #include <vector>

    void foo(void)
    {
        const std::size_t size = 10000000; // ten million
        std::vector<int> data(size);       // elements live on the heap
        // do something with data
    }


    Technically, the data "object" will be on the stack, but the data's
    "array" should be on the heap. And since we're technically using a
    C++ object with a proper destructor, there's no need for us to clean up
    the ints that are allocated -- it will automagically be done for us at
    the end of its scope. (So we get the best of both worlds.)
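
    For plain C, where there is no destructor to do the cleanup, a rough
    counterpart is one malloc() paired with one free() in the same
    function. This is only a sketch; sum_indices() is a made-up name, and
    it assumes count is small enough for each index to fit in an int.

```c
#include <stdlib.h>

/* Fill a heap array with 0 .. count-1 and return the sum of its
   elements, or -1 if the array cannot be allocated.
   Assumes count does not exceed INT_MAX. */
long long sum_indices(size_t count)
{
    long long total = 0;
    int *data;
    size_t i;

    if (count == 0) {
        return 0;
    }
    if (count > (size_t)-1 / sizeof *data) {
        return -1;              /* count * sizeof *data would overflow */
    }

    data = malloc(count * sizeof *data);
    if (data == NULL) {
        return -1;              /* unlike a stack overflow, this is detectable */
    }

    for (i = 0; i < count; i++) {
        data[i] = (int)i;
    }
    for (i = 0; i < count; i++) {
        total += data[i];
    }

    free(data);                 /* the one release matching the one allocation */
    return total;
}
```

    The array's size is limited by available memory rather than by the
    stack limit, and an allocation failure shows up as a return value
    instead of a crash.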

    I hope this helps, Michael.

    -- Jean-Luc
     
    , Mar 22, 2010
    #14