Variable-length arrays: should they be used at all?

Discussion in 'C Programming' started by Rui Maciel, Jun 26, 2012.

  1. Rui Maciel

    In the thread "Learning C as an existing programmer", an interesting
    discussion arose over the use of variable-length arrays (VLAs), specifically
    the dangers they pose by not providing a way to detect potential memory
    allocation bugs.

    GCC's page on variable-length arrays says nothing about what to expect when
    a VLA is too large to handle.[1] In addition, what has been said in GCC's
    mailing list about avoiding segfaults induced by huge VLAs isn't very
    reassuring.[2]

    With this in mind, and considering that VLAs were made optional in C11, is
    it a good idea to simply refuse using them?


    Rui Maciel

    [1] http://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html
    [2] http://gcc.gnu.org/ml/gcc/2007-01/msg00179.html
     
    Rui Maciel, Jun 26, 2012
    #1

  2. Stefan Ram

    Rui Maciel writes:
    >With this in mind, and considering that VLAs were made optional in C11, is
    >it a good idea to simply refuse using them?


    Common auto objects just differ in the quantity, but not in
    the quality of the problem of automatic storage allocation.

    One would like to have a call »size_t auto();« so that there
    are at least auto() bytes still available for automatic
    storage allocation.

    int factorial( int const n )
    { if( n <= 1 )return 1; /* base case ends the recursion */
      if( auto() < sizeof( int )*8 + 1024 )return -1;
      else
      { int const fn1 = factorial( n - 1 );
        if( fn1 == -1 )return -1;
        else return n * fn1; }}

    (The expression »sizeof( int )*8 + 1024« above is just a
    heuristic estimation chosen to very probably still be
    sufficient for another function call. Also, an implementation
    is expected to lie on the safe side, i.e., to return a
    value that is somewhat smaller than the actual amount
    available, by at least a small multiple of the minimum
    size of a stack frame.)

    (Since »auto« is a keyword, »auto()« is no function call,
    but a kind of an operator expression with a special meaning
    to the compiler.)
     
    Stefan Ram, Jun 26, 2012
    #2

  3. Eric Sosman

    On 6/26/2012 10:28 AM, Rui Maciel wrote:
    > In the thread "Learning C as an existing programmer", an interesting
    > discussion arose over the use of variable-length arrays (VLAs), specifically
    > the dangers they pose by not providing a way to detect potential memory
    > allocation bugs.
    >
    > GCC's page on variable-length arrays says nothing about what to expect when
    > a VLA is too large to handle.[1] In addition, what has been said in GCC's
    > mailing list about avoiding segfaults induced by huge VLAs isn't very
    > reassuring.[2]
    >
    > With this in mind, and considering that VLAs were made optional in C11, is
    > it a good idea to simply refuse using them?


    It may come down to personal preference, and to the "kind" of
    programming you're doing. VLA's are a great notational convenience,
    especially for multi-dimensional arrays. Also, they relieve the
    coder of worrying about releasing memory, which can be a help if
    there are multiple ways to exit the allocating block.

    The disadvantage, of course, is that there is no portable way
    to detect an allocation failure. Even if the program is unable to
    complete its work in the event of malloc() failure, the ability to
    detect it allows the coder to arrange for a clean shutdown rather
    than an abrupt ka-BOOM. But undetectable allocation failure is
    not unique to VLA's, as your reference [2] indicates: The problem
    in that thread was an auto array of fixed size that happened to be
    too large. Pretty much any block might fail to allocate memory for
    its auto variables, even if its own space requirement is modest: A
    paltry four ints could be the straw that breaks the camel's stack.
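
    A minimal sketch of that contrast, assuming a hypothetical helper
    sum_squares() that needs scratch space: with malloc() a failed request
    comes back as a null pointer, so the caller can shut down cleanly, whereas
    the same buffer declared as a VLA would offer no way to report failure.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: squares the first n inputs into scratch storage
   and stores their sum in *out. Returns 0 on success, -1 if the scratch
   buffer cannot be obtained. */
int sum_squares(const double *in, size_t n, double *out)
{
    if (n > SIZE_MAX / sizeof(double))
        return -1;                      /* byte count would overflow */

    /* Declaring "double tmp[n];" here instead would leave no
       detectable failure mode for an oversized n. */
    double *tmp = malloc(n ? n * sizeof *tmp : 1);
    if (tmp == NULL)
        return -1;                      /* caller can exit cleanly */

    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        tmp[i] = in[i] * in[i];
        total += tmp[i];
    }
    free(tmp);
    *out = total;
    return 0;
}
```

    The price is the explicit free() on every exit path, which is exactly
    the burden a VLA would have removed.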

    The fact that VLA's became optional with C11 may or may not
    be important. Support for IEEE floating-point has been optional
    for years, but that doesn't seem to have stopped people from
    relying on it. What will happen to VLA support in future compilers
    remains to be seen.[*]

    Perhaps a bigger issue than VLA's possible disappearance is
    their tardy APpearance: C99 support has not been quick to arrive,
    and even today it might not be unusual to encounter an implementation
    that lacked VLA's. Between "They may be going away" and "They're not
    even here yet," VLA's might be seen as diminishing the portability
    of code that uses them.

    Okay, so: The pros are notational convenience and relief from
    some memory-management burden, the cons are additional chances for
    ka-BOOM and possible portability/version issues. Wrap it all up
    in your own personal preference and your project's needs, and make
    your own call. Personally, I avoid 'em -- but YMMV.


    [*] I find it distressing that successive Standards seem to
    be turning away from the principle expressed in the Rationale:

    "Beyond this two-level scheme [hosted and freestanding],
    no additional subsetting is defined for C, since the C89
    Committee felt strongly that too many levels dilutes the
    effectiveness of a standard."

    That's from the C99 Rationale, but I think it's a paraphrase from
    the original (which I saw once but don't have). If the C11 Rationale
    includes this text, it might be accused of being insincere.

    --
    Eric Sosman
     
    Eric Sosman, Jun 26, 2012
    #3
  4. jacob navia

    On 26/06/12 16:28, Rui Maciel wrote:
    > In the thread "Learning C as an existing programmer", an interesting
    > discussion arose over the use of variable-length arrays (VLAs), specifically
    > the dangers they pose by not providing a way to detect potential memory
    > allocation bugs.
    >
    > GCC's page on variable-length arrays says nothing about what to expect when
    > a VLA is too large to handle.[1] In addition, what has been said in GCC's
    > mailing list about avoiding segfaults induced by huge VLAs isn't very
    > reassuring.[2]
    >
    > With this in mind, and considering that VLAs were made optional in C11, is
    > it a good idea to simply refuse using them?
    >
    >
    > Rui Maciel
    >
    > [1] http://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html
    > [2] http://gcc.gnu.org/ml/gcc/2007-01/msg00179.html
    >


    VLAs are indispensable when you want to avoid unnecessary calls
    to the expensive malloc function.

    For instance, if you have some structure X that is hidden as an opaque
    structure to keep client code away from its internals, you can of course
    get the size of the hidden structure by calling a library function.

    Example:

    int main(void)
    {
        char buffer[iList.GetSize(NULL)];
        List *L = (List *)&buffer[0];
        iList.InitList(L);

        // Here you use your list.

        iList.Clear(L);
        // Here you do not need to call iList.Finalize since
        // you haven't allocated it with malloc
    }

    The crucial call here is:
    char buffer[iList.GetSize(NULL)];

    This allows the library to return the size of the structure WITHOUT
    disclosing its internal state. This is very important.

    The alternatives are very bad since you would have to

    #define SIZEOF_LIST_HEADER 56

    and remember to update that each time your List structure changes.
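
    A self-contained sketch of the idea, under stated assumptions: the
    list_* names below are hypothetical stand-ins for the iList interface,
    and the buffer is a VLA of max_align_t rather than char so that the
    storage is suitably aligned. Whether accessing a declared array through
    a List * is strictly conforming is debatable under the effective-type
    rules; the sketch mirrors the technique in the post rather than
    settling that question.

```c
#include <stddef.h>

/* ---- what a hypothetical public header would expose ---- */
typedef struct List List;            /* opaque: layout not disclosed */
size_t list_size(void);              /* size in bytes of a List      */
void   list_init(List *l);
void   list_push(List *l, int v);
int    list_count(const List *l);

/* ---- what the library would keep private ---- */
struct List { int count; int last; };

size_t list_size(void)            { return sizeof(struct List); }
void   list_init(List *l)         { l->count = 0; l->last = 0; }
void   list_push(List *l, int v)  { l->last = v; l->count++; }
int    list_count(const List *l)  { return l->count; }

/* ---- client code: relies only on the declarations above ---- */
int client_demo(void)
{
    /* Round the byte size up to a whole number of max_align_t elements. */
    size_t n = (list_size() + sizeof(max_align_t) - 1) / sizeof(max_align_t);
    max_align_t buffer[n];           /* VLA sized at run time */
    List *l = (List *)buffer;

    list_init(l);
    list_push(l, 10);
    list_push(l, 20);
    return list_count(l);            /* storage is automatic: no free() */
}
```

    The client never sees struct List, so the library can change its layout
    without the client being recompiled against a new size constant.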
     
    jacob navia, Jun 26, 2012
    #4
  5. Jens Gustedt

    On 26.06.2012 16:28, Rui Maciel wrote:
    > In the thread "Learning C as an existing programmer", an interesting
    > discussion arose over the use of variable-length arrays (VLAs), specifically
    > the dangers they pose by not providing a way to detect potential memory
    > allocation bugs.
    >
    > GCC's page on variable-length arrays says nothing about what to expect when
    > a VLA is too large to handle.[1] In addition, what has been said in GCC's
    > mailing list about avoiding segfaults induced by huge VLAs isn't very
    > reassuring.[2]
    >
    > With this in mind, and considering that VLAs were made optional in C11, is
    > it a good idea to simply refuse using them?


    - Pointers to VLAs are very convenient when you have to deal with
      multi-dimensional arrays
    - VLA types themselves greatly improve the readability of malloc calls
    - VLAs are a really great help for function calls

    double (*A)[n] = malloc(sizeof(double[n][n]));

    void init(size_t n, double A[n][n]) {
        for (size_t i = 0; i < n; ++i)
            for (size_t j = 0; j < n; ++j)
                A[i][j] = 0.0;
    }

    Jens
     
    Jens Gustedt, Jun 26, 2012
    #5
  6. Stefan Ram

    jacob navia writes:
    >char buffer[iList.GetSize(NULL)];
    >List *L = (List *)&buffer[0];
    >iList.InitList(L);


    You can get the same with

    ListBuffer buffer;
    List * L =( List * )&buffer;
    iList.InitList( L );
     
    Stefan Ram, Jun 26, 2012
    #6
  7. Rui Maciel

    Jens Gustedt wrote:

    > - Pointers to VLA are very convenient when you have to deal with
    > multi-dimensional arrays
    > - VLA types themselves greatly improve readability of malloc calls
    > - VLA are really great help for function calls
    >
    > double (*A)[n] = malloc(sizeof(double[n][n]));
    >
    > void init(size_t n, double A[n][n]) {
    >     for (size_t i = 0; i < n; ++i)
    >         for (size_t j = 0; j < n; ++j)
    >             A[i][j] = 0.0;
    > }


    Undoubtedly, VLAs are convenient. Yet, with that convenience comes the
    danger of making an otherwise flawless program susceptible to nasty memory
    allocation problems. No matter how convenient VLAs might be, having to
    handle unexplainable segfaults that can't be avoided with the diligent use
    of safeguards may not be an acceptable tradeoff.


    Rui Maciel
     
    Rui Maciel, Jun 26, 2012
    #7
  8. Rui Maciel

    jacob navia wrote:

    > VLAs are indispensable when you want to avoid unnecessary calls
    > to the expensive malloc function.


    But what about the inability to gracefully recover from a memory allocation
    error? If a VLA is defined with the wrong size at the wrong moment then it
    appears that it isn't possible to do anything about it, nor is it even
    possible to put in place any safeguard to avoid that. In fact, are there
    any scenarios where a VLA defined with an arbitrary size is guaranteed to
    work as expected?


    Rui Maciel
     
    Rui Maciel, Jun 26, 2012
    #8
  9. Keith Thompson

    Rui Maciel writes:
    > Jens Gustedt wrote:
    >> - Pointers to VLA are very convenient when you have to deal with
    >> multi-dimensional arrays
    >> - VLA types themselves greatly improve readability of malloc calls
    >> - VLA are really great help for function calls
    >>
    >> double (*A)[n] = malloc(sizeof(double[n][n]));
    >>
    >> void init(size_t n, double A[n][n]) {
    >>     for (size_t i = 0; i < n; ++i)
    >>         for (size_t j = 0; j < n; ++j)
    >>             A[i][j] = 0.0;
    >> }

    >
    > Undoubtedly, VLAs are convenient. Yet, with that convenience comes the
    > danger of making an otherwise flawless program susceptible to nasty memory
    > allocation problems. No matter how convenient VLAs might be, having to
    > handle unexplainable segfaults that can't be avoided with the diligent use
    > of safeguards may not be an acceptable tradeoff.


    Except that the code you quoted isn't subject to unexplainable
    segfaults. It uses a VLA type, but there are no VLA objects with
    automatic storage duration; A is allocated with malloc(), which returns
    NULL on failure.

    The use of a VLA type makes indexing more convenient.
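
    A compilable sketch of that distinction, assuming hypothetical helpers
    fill() and trace_of_filled(): the matrix lives on the heap, the VLA
    *type* is used only for indexing, and a failed allocation is reported
    through the null pointer from malloc(). The overflow guard is an
    addition that the quoted snippet does not have.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The parameter has a variably modified type, but no VLA object is
   created: A is just a pointer to rows of n doubles. */
void fill(size_t n, double A[n][n], double value)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            A[i][j] = value;
}

/* Hypothetical helper: builds an n-by-n matrix filled with value and
   stores its trace in *out. Returns 0 on success, -1 on a bad size or
   allocation failure. */
int trace_of_filled(size_t n, double value, double *out)
{
    /* A zero length is not allowed for a VLA type, and n*n*sizeof(double)
       must not overflow size_t. */
    if (n == 0 || n > SIZE_MAX / sizeof(double) / n)
        return -1;

    double (*A)[n] = malloc(sizeof(double[n][n]));
    if (A == NULL)
        return -1;                   /* detectable, unlike an auto VLA */

    fill(n, A, value);
    double trace = 0.0;
    for (size_t i = 0; i < n; ++i)
        trace += A[i][i];

    free(A);
    *out = trace;
    return 0;
}
```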

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Jun 26, 2012
    #9
  10. Keith Thompson

    Rui Maciel writes:
    > jacob navia wrote:
    >> VLAs are indispensable when you want to avoid unnecessary calls
    >> to the expensive malloc function.

    >
    > But what about the inability to gracefully recover from a memory allocation
    > error? If a VLA is defined with the wrong size at the wrong moment then it
    > appears that it isn't possible to do anything about it, nor is it even
    > possible to put in place any safeguard to avoid that. In fact, are there
    > any scenarios where a VLA defined with an arbitrary size is guaranteed to
    > work as expected?


    No (unless the implementation defines its own error-detection
    mechanism).

    But the same applies to old-style constant-size arrays. If you declare

    double mat[100][100];

    either at block scope or at file scope, there's no mechanism to detect
    an allocation failure.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Jun 26, 2012
    #10
  11. David Resnick

    On Tuesday, June 26, 2012 3:47:08 PM UTC-4, Rui Maciel wrote:
    > Undoubtedly, VLAs are convenient. Yet, with that convenience comes the
    > danger of making an otherwise flawless program susceptible to nasty memory
    > allocation problems. No matter how convenient VLAs might be, having to
    > handle unexplainable segfaults that can't be avoided with the diligent use
    > of safeguards may not be an acceptable tradeoff.


    Yep, they should be used with caution. Mind you, the same arguments apply
    to using recursion. How deeply you can safely recurse, and the consequences
    if you go too deep, are also murky. I had a recursion-related crash where
    somebody markedly increased the size of an automatic buffer in moderately
    deeply recursive code.
     
    David Resnick, Jun 26, 2012
    #11
  12. Malcolm McLean

    On Tuesday, 26 June 2012 21:51:12 UTC+1, David Resnick wrote:
    >
    > Yep, they should be used with caution. Mind you, the same arguments apply
    > to using recursion. How deeply you can safely recurse, and the consequences
    > if you go too deep, are also murky. I had a recursion-related crash where
    > somebody markedly increased the size of an automatic buffer in moderately
    > deeply recursive code.
    >

    The difference is that recursion is usually logarithmic in depth. A
    malicious or careless user has to construct a degenerate tree. That's a
    lot harder than simply declaring that a company has 4 billion employees.
     
    Malcolm McLean, Jun 27, 2012
    #12
  13. jacob navia

    On 26/06/12 21:18, Stefan Ram wrote:
    > jacob navia writes:
    >> char buffer[iList.GetSize(NULL)];
    >> List *L = (List *)&buffer[0];
    >> iList.InitList(L);

    >
    > You can get the same with
    >
    > ListBuffer buffer;
    > List * L =( List * )&buffer;
    > iList.InitList( L );
    >


    Sorry but I do not understand that: what would be "ListBuffer"?

    A predefined type of buffer long enough to hold a list header
    structure?

    If that is the case then its length MUST be known to the compiler, and
    that means that the definition of the list header structure must be
    disclosed, which is precisely what we want to avoid.
     
    jacob navia, Jun 27, 2012
    #13
  14. Ian Collins

    On 06/27/12 08:09 AM, Keith Thompson wrote:
    > Rui Maciel writes:
    >> jacob navia wrote:
    >>> VLAs are indispensable when you want to avoid unnecessary calls
    >>> to the expensive malloc function.

    >>
    >> But what about the inability to gracefully recover from a memory allocation
    >> error? If a VLA is defined with the wrong size at the wrong moment then it
    >> appears that it isn't possible to do anything about it, nor is it even
    >> possible to put in place any safeguard to avoid that. In fact, are there
    >> any scenarios where a VLA defined with an arbitrary size is guaranteed to
    >> work as expected?

    >
    > No (unless the implementation defines its own error-detection
    > mechanism).
    >
    > But the same applies to old-style constant-size arrays. If you declare
    >
    > double mat[100][100];
    >
    > either at block scope or at file scope, there's no mechanism to detect
    > an allocation failure.


    But you can do a static analysis.

    --
    Ian Collins
     
    Ian Collins, Jun 27, 2012
    #14
  15. jacob navia

    On 27/06/12 07:50, Ian Collins wrote:
    > On 06/27/12 08:09 AM, Keith Thompson wrote:
    >> Rui Maciel writes:
    >>> jacob navia wrote:
    >>>> VLAs are indispensable when you want to avoid unnecessary calls
    >>>> to the expensive malloc function.
    >>>
    >>> But what about the inability to gracefully recover from a memory allocation
    >>> error? If a VLA is defined with the wrong size at the wrong moment then it
    >>> appears that it isn't possible to do anything about it, nor is it even
    >>> possible to put in place any safeguard to avoid that. In fact, are there
    >>> any scenarios where a VLA defined with an arbitrary size is guaranteed to
    >>> work as expected?

    >>
    >> No (unless the implementation defines its own error-detection
    >> mechanism).
    >>
    >> But the same applies to old-style constant-size arrays. If you declare
    >>
    >> double mat[100][100];
    >>
    >> either at block scope or at file scope, there's no mechanism to detect
    >> an allocation failure.

    >
    > But you can do a static analysis.
    >


    lcc-win was ported to a 16 bit Analog Devices chip with something like
    40K RAM available.

    There was no stack, so the stack was created by the compiler using
    a memory array: this implied a static analysis.

    Recursion was forbidden (it couldn't be analyzed well) and indirect
    recursion was detected: function A calls B that calls C that calls A.

    If you are doing that, VLA's are of course off limits. But, as far as
    I know, recursion is allowed in C and nobody is screaming to
    eliminate it because in some RAM constrained environments it could
    lead to crashes.

    We have to separate the C language from the implementations of C
    in very constrained environments.
     
    jacob navia, Jun 27, 2012
    #15
  16. gwowen

    On Jun 27, 6:57 am, jacob navia wrote:

    > Recursion was forbidden (couldn't be analyzed well) and indirect
    > recursion was detected:


    ....

    > recursion is allowed in C and nobody is screaming to
    > eliminate it because in some RAM constrained environments it could
    > lead to crashes.


    Clearly, some people (e.g. the lcc authors) *are* eliminating
    recursion - in certain cases - for *precisely* those reasons.

    Like recursion, VLAs are fine as long as you *know* the VLA length
    (recursion depth) is going to remain pretty small relative to your
    stack. If you know that, no problem. If you don't, something may blow
    up in a horrible way.
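
    One way to "know" the length is to enforce it. A sketch, assuming an
    arbitrary cap SMALL_MAX and a hypothetical helper reverse_bytes(): small
    requests use a VLA, larger ones fall back to malloc(), where failure is
    detectable. The cap of 256 bytes is an assumption for illustration; a
    real limit depends on the platform's stack size and the call depth.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { SMALL_MAX = 256 };   /* assumed per-call stack budget, in bytes */

/* Copies tmp[0..n-1] into dst in reverse order. */
static void reverse_from(const unsigned char *tmp, unsigned char *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = tmp[n - 1 - i];
}

/* Hypothetical helper: writes the n bytes of src into dst in reverse
   order, going through scratch storage so that src == dst is allowed.
   Returns 0 on success, -1 on allocation failure. */
int reverse_bytes(const unsigned char *src, unsigned char *dst, size_t n)
{
    if (n == 0)
        return 0;

    if (n <= SMALL_MAX) {
        unsigned char tmp[n];        /* VLA whose length is bounded above */
        memcpy(tmp, src, n);
        reverse_from(tmp, dst, n);
        return 0;
    }

    unsigned char *tmp = malloc(n);  /* large request: failure is visible */
    if (tmp == NULL)
        return -1;
    memcpy(tmp, src, n);
    reverse_from(tmp, dst, n);
    free(tmp);
    return 0;
}
```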
     
    gwowen, Jun 27, 2012
    #16
  17. Ben Bacarisse

    jacob navia writes:
    <snip>
    > lcc-win was ported to a 16 bit Analog Devices chip with something like
    > 40K RAM available.
    >
    > There was no stack, and the stack was created by the compiler using
    > a memory array: this implied a static analysis.
    >
    > Recursion was forbidden (couldn't be analyzed well) and indirect
    > recursion was detected: function A calls B that calls C that calls A.


    If recursion is forbidden, would it not be worthwhile having a single
    static area for each function? This is what some IBM compilers used to
    do (and they may still do for all I know).

    <snip>
    --
    Ben.
     
    Ben Bacarisse, Jun 27, 2012
    #17
  18. Eric Sosman

    On 6/27/2012 8:50 AM, Ben Bacarisse wrote:
    > jacob navia writes:
    > <snip>
    >> lcc-win was ported to a 16 bit Analog Devices chip with something like
    >> 40K RAM available.
    >>
    >> There was no stack, and the stack was created by the compiler using
    >> a memory array: this implied a static analysis.
    >>
    >> Recursion was forbidden (couldn't be analyzed well) and indirect
    >> recursion was detected: function A calls B that calls C that calls A.

    >
    > If recursion is forbidden, would it not be worthwhile having a single
    > static area for each function? This is what some IBM compilers used to
    > do (and they may still do for all I know).


    A shared stack would use less memory, unless there was some
    execution path in which all functions were active simultaneously.

    Per-function (or per-block) static storage might be attractive
    on machines where stack-relative addressing is cumbersome. I recall
    that Turbo Pascal on the Z80 used this technique. It allowed
    recursive functions and procedures, though: You told the compiler
    which could be called recursively, it generated prologue and
    epilogue code to swap the static data out to a stack and back, and
    each invocation re-used the same static area. This wouldn't work
    for C, of course, since you wouldn't want an auto variable to be
    moved after you'd formed a pointer to it ...
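
    The pointer problem can be made concrete with a small sketch (the
    struct frame type and both function names are hypothetical). Each
    active call keeps a pointer to an automatic object of its caller, so
    every activation needs its own storage at a stable address; a scheme
    that swaps one shared static area in and out would leave those
    pointers referring to reused memory.

```c
#include <stddef.h>

struct frame {
    int value;
    const struct frame *parent;      /* points into the caller's locals */
};

/* Walks the chain of frames from innermost to outermost. */
static int sum_chain(const struct frame *f)
{
    int total = 0;
    for (; f != NULL; f = f->parent)
        total += f->value;
    return total;
}

/* Recurses depth times; each activation links its own automatic object
   to the one in its caller. All of them are alive at the deepest call. */
int descend(int depth, const struct frame *parent)
{
    struct frame self = { depth, parent };
    if (depth <= 0)
        return sum_chain(&self);
    return descend(depth - 1, &self);
}
```

    Calling descend(3, NULL) links four frames holding 3, 2, 1 and 0, and
    the innermost call reads all of them through the parent pointers.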

    --
    Eric Sosman
     
    Eric Sosman, Jun 27, 2012
    #18
  19. Stefan Ram

    jacob navia writes:
    >On 26/06/12 21:18, Stefan Ram wrote:
    >>ListBuffer buffer;
    >>List * L =( List * )&buffer;
    >>iList.InitList( L );

    >Sorry but I do not understand that: what would be "ListBuffer"?
    >A predefined type of buffer long enough to hold a list header
    >structure?


    Yes, something like

    typedef struct { char dummy[ LIST_SIZE ]; } ListBuffer;
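
    A sketch of how such a fixed-size buffer might be kept honest, under
    stated assumptions: LIST_SIZE is a deliberately generous bound published
    in a hypothetical header, the union member of type max_align_t is an
    addition to force suitable alignment, and a C11 _Static_assert inside
    the library rejects the build if struct List ever outgrows the bound.
    None of these names come from the posts, and the same effective-type
    caveat applies as for any cast from a byte buffer.

```c
#include <stddef.h>

/* ---- hypothetical public header ---- */
#define LIST_SIZE 64                 /* assumed upper bound, in bytes */

typedef union {
    unsigned char bytes[LIST_SIZE];
    max_align_t   align;             /* forces suitable alignment */
} ListBuffer;

typedef struct List List;            /* still opaque to clients */
List *list_init_in(ListBuffer *buf);
int   list_is_empty(const List *l);

/* ---- private to the library ---- */
struct List { size_t count; void *first; void *last; };

/* Fails at compile time if the published bound becomes too small. */
_Static_assert(sizeof(struct List) <= sizeof(ListBuffer),
               "LIST_SIZE is too small for struct List");

List *list_init_in(ListBuffer *buf)
{
    List *l = (List *)buf;
    l->count = 0;
    l->first = NULL;
    l->last  = NULL;
    return l;
}

int list_is_empty(const List *l) { return l->count == 0; }
```

    The client learns an upper bound on the size but not the layout, at
    the cost of some wasted bytes and a recompile when the bound changes.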
     
    Stefan Ram, Jun 27, 2012
    #19
  20. Tim Rentsch

    Rui Maciel writes:

    > In the thread "Learning C as an existing programmer", an interesting
    > discussion arose over the use of variable-length arrays (VLAs), specifically
    > the dangers they pose by not providing a way to detect potential memory
    > allocation bugs.
    >
    > GCC's page on variable-length arrays says nothing about what to expect when
    > a VLA is too large to handle.[1] In addition, what has been said in GCC's
    > mailing list about avoiding segfaults induced by huge VLAs isn't very
    > reassuring.[2]
    >
    > With this in mind, and considering that VLAs were made optional in C11, is
    > it a good idea to simply refuse using them?


    There are two distinct concerns here. Let's take them one at a
    time.

    First, encountering a VLA declaration during execution may blow
    the stack and crash, and there is no portable way to detect or
    deal with that. However, it is easy for implementations to
    provide a way of doing that, without adding any new language
    constructs, as I explained in another posting.

    Second, VLAs have not been implemented by some vendors (a certain
    laggard major software company comes to mind here), and VLA
    support is optional in C11, presumably so those vendors can claim
    full C11 compliance without having to provide VLA support.

    The flip side to the second concern is that many or most major
    implementations (at least hosted implementations) do provide VLA
    support, and will continue to do so under C11, despite its being
    optional. The laggards will keep being laggards, whether VLA
    support is optional or not, because they think their user base
    doesn't care about it (or perhaps because they think their user
    base has to accept what they do whether the user base cares about
    it or not).
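
    Code that wants to cope with both kinds of implementation can test the
    C11 feature macro __STDC_NO_VLA__, which a conforming C11 implementation
    defines when it does not support VLAs. A sketch, assuming a C99 or later
    compiler; the helper names and the cutoff SMALL are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

enum { SMALL = 64 };   /* assumed cutoff for using automatic storage */

/* Fills tmp[0..n-1] with 0..n-1 and returns the sum. */
static long fill_and_sum(long *tmp, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        tmp[i] = (long)i;
        total += tmp[i];
    }
    return total;
}

/* Hypothetical helper: returns 0 + 1 + ... + (n-1) computed through a
   scratch array, or -1 if the scratch array cannot be allocated. */
long sum_iota(size_t n)
{
    if (n == 0)
        return 0;

#ifndef __STDC_NO_VLA__
    if (n <= SMALL) {
        long tmp[n];                 /* used only where VLAs exist */
        return fill_and_sum(tmp, n);
    }
#endif

    if (n > (size_t)-1 / sizeof(long))
        return -1;                   /* byte count would overflow */
    long *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    long total = fill_and_sum(tmp, n);
    free(tmp);
    return total;
}
```

    On an implementation without VLAs the function still builds and simply
    takes the malloc() path for every size.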

    The flip side of the first concern is that, one, it often isn't a
    big deal in practice; two, implementors should be encouraged to
    supply a mechanism for detecting/handling VLA allocation failure,
    since it is easy to provide such a mechanism; and three, VLA
    support provides an important benefit that does not have the
    associated risk of undetectable allocation failure, namely,
    variably modified types and more specifically pointers to VLAs,
    which are useful even if VLA objects are never declared.

    Personally, I find VLA support useful and convenient, whether
    using VLAs themselves or just pointers to them, and in a wider
    variety of circumstances than I originally expected. Vendors
    and implementors will provide VLA support if developers use
    them, and very likely won't if they don't. So my conclusion
    is somewhat the opposite of yours -- developers *should* use
    VLAs and variably modified types whenever they are useful and
    convenient to express the programming task at hand, and also
    should encourage and prevail upon vendors and implementors to
    supply better VLA support, such as a mechanism for detecting
    and handling allocation failure like the one described in
    another thread. VLA support is both convenient and useful;
    if demand for it is high enough, implementations that provide
    VLAs will become both better and more ubiquitous.
     
    Tim Rentsch, Jun 27, 2012
    #20