Variable-length arrays: should they be used at all?

Rui Maciel · Jun 26, 2012

In the thread "Learning C as an existing programmer", an interesting
discussion arose over the use of variable-length arrays (VLAs), specifically
the dangers they pose by not providing a way to detect potential memory
allocation bugs.

GCC's page on variable-length arrays says nothing about what to expect when
a VLA is too large to handle.[1] In addition, what has been said in GCC's
mailing list about avoiding segfaults induced by huge VLAs isn't very
reassuring.[2]

With this in mind, and considering that VLAs were made optional in C11, is
it a good idea to simply refuse using them?

Rui Maciel

[1] http://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html
[2] http://gcc.gnu.org/ml/gcc/2007-01/msg00179.html

Stefan Ram · Jun 26, 2012

Rui Maciel said:
With this in mind, and considering that VLAs were made optional in C11, is
it a good idea to simply refuse using them?

Common auto objects just differ in the quantity, but not in
the quality of the problem of automatic storage allocation.

One would like to have a call »size_t auto();« so that there
are at least auto() bytes still available for automatic
storage allocation.

int factorial( int const n )
{ if( auto() < sizeof( int )*8 + 1024 )return -1;
else
{ int const fn1 = factorial( n - 1 );
if( fn1 == -1 )return -1;
else return n * fn1; }}

(The expression »sizeof( int )*8 + 1024« above is just a
heuristic estimation chosen to very probably still be
sufficient for another function call. Also, an implementation
is expected to lie on the safe side, i.e., to return a
value that is somewhat smaller than the actual amount
available, by at least a small multiple of the minimum
size of a stack frame.)

(Since »auto« is a keyword, »auto()« is no function call,
but a kind of an operator expression with a special meaning
to the compiler.)
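
No such operator exists in standard C, but the idea can be approximated portably by passing an explicit budget down the call chain. What follows is only a sketch under stated assumptions: FRAME_COST is a made-up per-call cost and factorial_budgeted is a hypothetical name. The budget is bookkeeping supplied by the caller, not a measurement of the real stack.

```c
#include <stddef.h>

/* Assumed cost of one call in bytes; a guess, not a measured value. */
#define FRAME_COST 64u

/* Factorial that refuses to recurse once the caller-supplied budget
   is exhausted. Returns -1 when the budget runs out; arguments below
   2 yield 1. */
long factorial_budgeted(int n, size_t budget)
{
    if (n <= 1)
        return 1;
    if (budget < FRAME_COST)
        return -1;                      /* not enough budget for another call */

    long rest = factorial_budgeted(n - 1, budget - FRAME_COST);
    if (rest == -1)
        return -1;                      /* propagate the failure upwards */
    return n * rest;
}
```

A call such as factorial_budgeted(5, 1024) completes normally, while factorial_budgeted(10, 128) gives up after two levels and reports -1 instead of overrunning its allowance.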

Eric Sosman · Jun 26, 2012

In the thread "Learning C as an existing programmer", an interesting
discussion arose over the use of variable-length arrays (VLAs), specifically
the dangers they pose by not providing a way to detect potential memory
allocation bugs.

GCC's page on variable-length arrays says nothing about what to expect when
a VLA is too large to handle.[1] In addition, what has been said in GCC's
mailing list about avoiding segfaults induced by huge VLAs isn't very
reassuring.[2]

With this in mind, and considering that VLAs were made optional in C11, is
it a good idea to simply refuse using them?

It may come down to personal preference, and to the "kind" of
programming you're doing. VLA's are a great notational convenience,
especially for multi-dimensional arrays. Also, they relieve the
coder of worrying about releasing memory, which can be a help if
there are multiple ways to exit the allocating block.

The disadvantage, of course, is that there is no portable way
to detect an allocation failure. Even if the program is unable to
complete its work in the event of malloc() failure, the ability to
detect it allows the coder to arrange for a clean shutdown rather
than an abrupt ka-BOOM. But undetectable allocation failure is
not unique to VLA's, as your reference [2] indicates: The problem
in that thread was an auto array of fixed size that happened to be
too large. Pretty much any block might fail to allocate memory for
its auto variables, even if its own space requirement is modest: A
paltry four ints could be the straw that breaks the camel's stack.

The fact that VLA's became optional with C11 may or may not
be important. Support for IEEE floating-point has been optional
for years, but that doesn't seem to have stopped people from
relying on it. What will happen to VLA support in future compilers
remains to be seen.[*]

Perhaps a bigger issue than VLA's possible disappearance is
their tardy APpearance: C99 support has not been quick to arrive,
and even today it might not be unusual to encounter an implementation
that lacked VLA's. Between "They may be going away" and "They're not
even here yet," VLA's might be seen as diminishing the portability
of code that uses them.
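
One way to hedge against both situations is to test for VLA support at compile time. C11 specifies that an implementation without VLAs defines __STDC_NO_VLA__. The sketch below assumes that macro is an adequate test; HAVE_VLA and reverse_copy are names invented for illustration. Only the heap fallback can report an allocation failure.

```c
#include <stddef.h>
#include <stdlib.h>

/* VLAs are required in C99 and optional from C11 on, where an
   implementation lacking them defines __STDC_NO_VLA__. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L && !defined(__STDC_NO_VLA__)
#define HAVE_VLA 1
#else
#define HAVE_VLA 0
#endif

/* Reverse src[0..n-1] into dst; dst may be the same array as src.
   Returns 0 on success, -1 if the heap fallback cannot allocate. */
int reverse_copy(size_t n, const int *src, int *dst)
{
    if (n == 0)
        return 0;
#if HAVE_VLA
    int tmp[n];                         /* a failure here cannot be detected */
#else
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
#endif
    for (size_t i = 0; i < n; ++i)
        tmp[i] = src[n - 1 - i];
    for (size_t i = 0; i < n; ++i)
        dst[i] = tmp[i];
#if !HAVE_VLA
    free(tmp);
#endif
    return 0;
}
```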

Okay, so: The pros are notational convenience and relief from
some memory-management burden, the cons are additional chances for
ka-BOOM and possible portability/version issues. Wrap it all up
in your own personal preference and your project's needs, and make
your own call. Personally, I avoid 'em -- but YMMV.

[*] I find it distressing that successive Standards seem to
be turning away from the principle expressed in the Rationale:

"Beyond this two-level scheme [hosted and freestanding],
no additional subsetting is defined for C, since the C89
Committee felt strongly that too many levels dilutes the
effectiveness of a standard."

That's from the C99 Rationale, but I think it's a paraphrase from
the original (which I saw once but don't have). If the C11 Rationale
includes this text, it might be accused of being insincere.

jacob navia · Jun 26, 2012

On 26/06/12 16:28, Rui Maciel wrote:

In the thread "Learning C as an existing programmer", an interesting
discussion arose over the use of variable-length arrays (VLAs), specifically
the dangers they pose by not providing a way to detect potential memory
allocation bugs.

GCC's page on variable-length arrays says nothing about what to expect when
a VLA is too large to handle.[1] In addition, what has been said in GCC's
mailing list about avoiding segfaults induced by huge VLAs isn't very
reassuring.[2]

With this in mind, and considering that VLAs were made optional in C11, is
it a good idea to simply refuse using them?

Rui Maciel

[1] http://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html
[2] http://gcc.gnu.org/ml/gcc/2007-01/msg00179.html

VLAs are indispensable when you want to avoid unnecessary calls
to the expensive malloc function.

For instance, suppose you have some structure X that is hidden as an opaque
structure so that client code cannot depend on its internals.

You can of course get the size of the hidden structure by calling a
library function.

Example:

int main(void)
{
    char buffer[iList.GetSize(NULL)];
    List *L = (List *)&buffer[0];
    iList.InitList(L);

    // Here you use your list.

    iList.Clear(L);
    // Here you do not need to call iList.Finalize since
    // you haven't allocated it with malloc
}

The crucial call here is:
char buffer[iList.GetSize(NULL)];

This allows the library to return the size of the structure WITHOUT
disclosing its internal state. This is very important.

The alternatives are very bad since you would have to

#define SIZEOF_LIST_HEADER 56

and do not forget to update that each time your List structure changes.
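
For readers without the library at hand, the pattern can be sketched in a single file. Everything here is hypothetical: the names list_sizeof, list_init and list_count, and the fields of struct List, stand in for whatever the real library provides. The sketch uses an array of max_align_t (C11) rather than char so that the storage is suitably aligned for any object type. Reusing a declared array as another type is also something the effective-type rules frown upon, so treat this as an illustration of the idea and not as a strictly conforming recipe.

```c
#include <stddef.h>

/* What a public header might expose (hypothetical names). */
typedef struct List List;               /* opaque to client code */
size_t list_sizeof(void);
void   list_init(List *l);
size_t list_count(const List *l);

/* Library side, normally hidden in its own translation unit. */
struct List { size_t count; void *first; void *last; };

size_t list_sizeof(void) { return sizeof(struct List); }

void list_init(List *l)
{
    l->count = 0;
    l->first = NULL;
    l->last  = NULL;
}

size_t list_count(const List *l) { return l->count; }

/* Client side: storage sized at run time, no malloc involved. */
size_t client_demo(void)
{
    /* Round the byte size up to a whole number of max_align_t cells. */
    size_t cells = (list_sizeof() + sizeof(max_align_t) - 1) / sizeof(max_align_t);
    max_align_t storage[cells];         /* the VLA */
    List *l = (List *)storage;

    list_init(l);
    return list_count(l);               /* an empty list holds 0 elements */
}
```

The client never sees sizeof(struct List) as a compile-time constant, so the library can change the layout without forcing client code to be edited.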

Jens Gustedt · Jun 26, 2012

On 26.06.2012 16:28, Rui Maciel wrote:

In the thread "Learning C as an existing programmer", an interesting
discussion arose over the use of variable-length arrays (VLAs), specifically
the dangers they pose by not providing a way to detect potential memory
allocation bugs.

GCC's page on variable-length arrays says nothing about what to expect when
a VLA is too large to handle.[1] In addition, what has been said in GCC's
mailing list about avoiding segfaults induced by huge VLAs isn't very
reassuring.[2]

With this in mind, and considering that VLAs were made optional in C11, is
it a good idea to simply refuse using them?

- Pointers to VLA are very convenient when you have to deal with
multi-dimensional arrays
- VLA types themselves greatly improve readability of malloc calls
- VLA are really great help for function calls

double (*A)[n] = malloc(sizeof(double[n][n]));

void init(size_t n, double A[n][n]) {
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            A[i][j] = 0.0;
}

Jens

Stefan Ram · Jun 26, 2012

jacob navia said:
char buffer[iList.GetSize(NULL)];
List *L = (List *)&buffer[0];
iList.InitList(L);

You can get the same with

ListBuffer buffer;
List * L =( List * )&buffer;
iList.InitList( L );

Rui Maciel · Jun 26, 2012

Jens said:
- Pointers to VLA are very convenient when you have to deal with
multi-dimensional arrays
- VLA types themselves greatly improve readability of malloc calls
- VLA are really great help for function calls

double (*A)[n] = malloc(sizeof(double[n][n]));

void init(size_t n, double A[n][n]) {
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            A[i][j] = 0.0;
}

Undoubtedly, VLAs are convenient. Yet, with that convenience comes the
danger of making an otherwise flawless program susceptible to nasty memory
allocation problems. No matter how convenient VLAs might be, having to
handle unexplainable segfaults that can't be avoided with the diligent use
of safeguards may not be an acceptable tradeoff.

Rui Maciel

Rui Maciel · Jun 26, 2012

jacob said:
VLAs are indispensable when you want to avoid unnecessary calls
to the expensive malloc function.

But what about the inability to gracefully recover from a memory allocation
error? If a VLA is defined with the wrong size at the wrong moment then it
appears that it isn't possible to do anything about it, nor is it even
possible to put in place any safeguard to avoid that. In fact, are there
any scenarios where a VLA defined with an arbitrary size is guaranteed to
work as expected?

Rui Maciel

Keith Thompson · Jun 26, 2012

Rui Maciel said:
Jens said:

- Pointers to VLA are very convenient when you have to deal with
multi-dimensional arrays
- VLA types themselves greatly improve readability of malloc calls
- VLA are really great help for function calls

double (*A)[n] = malloc(sizeof(double[n][n]));

void init(size_t n, double A[n][n]) {
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            A[i][j] = 0.0;
}


Undoubtedly, VLAs are convenient. Yet, with that convenience comes the
danger of making an otherwise flawless program susceptible to nasty memory
allocation problems. No matter how convenient VLAs might be, having to
handle unexplainable segfaults that can't be avoided with the diligent use
of safeguards may not be an acceptable tradeoff.

Except that the code you quoted isn't subject to unexplainable
segfaults. It uses a VLA type, but there are no VLA objects with
automatic storage duration; A is allocated with malloc(), which returns
NULL on failure.

The use of a VLA type makes indexing more convenient.
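
To make the distinction concrete, here is a sketch in which only the type is variably modified while the storage comes from the heap, so failure is reported as NULL. make_matrix is a name invented for this example, and it returns void * because a function cannot return a pointer-to-VLA type directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Zero an n-by-n matrix; A has a variably modified type. */
void init(size_t n, double A[n][n])
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            A[i][j] = 0.0;
}

/* Allocate and zero an n-by-n matrix on the heap.
   Returns NULL when n is 0, when the byte count would overflow,
   or when malloc fails. The caller frees the result with free(). */
void *make_matrix(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(double) / n)
        return NULL;

    double (*A)[n] = malloc(sizeof(double[n][n]));
    if (A == NULL)
        return NULL;                    /* detectable, unlike an automatic VLA */

    init(n, A);
    return A;
}
```

A caller would write double (*M)[n] = make_matrix(n); check M against NULL, index it as M[i][j], and release it with free(M).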

Keith Thompson · Jun 26, 2012

Rui Maciel said:
But what about the inability to gracefully recover from a memory allocation
error? If a VLA is defined with the wrong size at the wrong moment then it
appears that it isn't possible to do anything about it, nor is it even
possible to put in place any safeguard to avoid that. In fact, are there
any scenarios where a VLA defined with an arbitrary size is guaranteed to
work as expected?

No (unless the implementation defines its own error-detection
mechanism).

But the same applies to old-style constant-size arrays. If you declare

double mat[100][100];

either at block scope or at file scope, there's no mechanism to detect
an allocation failure.

David Resnick · Jun 26, 2012

Undoubtedly, VLAs are convenient. Yet, with that convenience comes the
danger of making an otherwise flawless program susceptible to nasty memory
allocation problems. No matter how convenient VLAs might be, having to
handle unexplainable segfaults that can't be avoided with the diligent use
of safeguards may not be an acceptable tradeoff.

Yep, they should be used with caution. Mind you, the same arguments apply to using recursion. How deeply you can safely recurse, and the consequences if you go too deep, also murky. Had a recursion-related crash where somebody markedly increased the size of an automatic buffer in moderately deeply recursive code.

Malcolm McLean · Jun 27, 2012

On Tuesday, 26 June 2012 21:51:12 UTC+1, David Resnick wrote:

Yep, they should be used with caution. Mind you, the same arguments apply to using recursion. How deeply you can safely recurse, and the consequences if you go too deep, also murky. Had a recursion related crash where somebody markedly increased the size of an automatic buffer in moderately deeply recursive code.

The difference is that recursion is usually logarithmic in depth. A malicious or careless user has to construct a degenerate tree. That's a lot harder than simply declaring that a company has 4 billion employees.
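
Where a degenerate tree is a realistic worry, the recursion itself can carry a depth limit. The following is a sketch only: struct node and tree_height are hypothetical, and the right limit depends on the platform's stack and the function's frame size.

```c
#include <stddef.h>

/* Minimal binary tree node, for illustration only. */
struct node {
    struct node *left;
    struct node *right;
};

/* Height of the tree rooted at t (an empty tree has height 0).
   Returns -1 if any node lies deeper than `limit` levels, so a
   degenerate tree is rejected instead of overrunning the stack. */
int tree_height(const struct node *t, int limit)
{
    if (t == NULL)
        return 0;
    if (limit == 0)
        return -1;                      /* too deep: refuse to go further */

    int l = tree_height(t->left, limit - 1);
    if (l < 0)
        return -1;
    int r = tree_height(t->right, limit - 1);
    if (r < 0)
        return -1;
    return 1 + (l > r ? l : r);
}
```

A balanced tree with a million nodes needs a limit of only about 20, which is what makes the logarithmic case cheap to guard.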

jacob navia · Jun 27, 2012

On 26/06/12 21:18, Stefan Ram wrote:

Stefan Ram said:
jacob navia said:

char buffer[iList.GetSize(NULL)];
List *L = (List *)&buffer[0];
iList.InitList(L);


You can get the same with

ListBuffer buffer;
List * L =( List * )&buffer;
iList.InitList( L );

Sorry but I do not understand that: what would be "ListBuffer"?

A predefined type of buffer long enough to hold a list header
structure?

If that is the case then its length MUST be known to the compiler, and
that means that the definition of the list header structure must be
disclosed, which is precisely what we want to avoid.

Ian Collins · Jun 27, 2012

Keith Thompson said:
Rui Maciel said:

But what about the inability to gracefully recover from a memory allocation
error? If a VLA is defined with the wrong size at the wrong moment then it
appears that it isn't possible to do anything about it, nor is it even
possible to put in place any safeguard to avoid that. In fact, are there
any scenarios where a VLA defined with an arbitrary size is guaranteed to
work as expected?


No (unless the implementation defines its own error-detection
mechanism).

But the same applies to old-style constant-size arrays. If you declare

double mat[100][100];

either at block scope or at file scope, there's no mechanism to detect
an allocation failure.

But you can do a static analysis.

jacob navia · Jun 27, 2012

On 27/06/12 07:50, Ian Collins wrote:

Ian Collins said:
Keith Thompson said:
Rui Maciel said:

jacob navia wrote:
VLAs are indispensable when you want to avoid unnecessary calls
to the expensive malloc function.

But what about the inability to gracefully recover from a memory allocation
error? If a VLA is defined with the wrong size at the wrong moment then it
appears that it isn't possible to do anything about it, nor is it even
possible to put in place any safeguard to avoid that. In fact, are there
any scenarios where a VLA defined with an arbitrary size is guaranteed to
work as expected?

No (unless the implementation defines its own error-detection
mechanism).

But the same applies to old-style constant-size arrays. If you declare

double mat[100][100];

either at block scope or at file scope, there's no mechanism to detect
an allocation failure.


But you can do a static analysis.

lcc-win was ported to a 16 bit Analog Devices chip with something like
40K RAM available.

There was no stack, and the stack was created by the compiler using
a memory array: this implied a static analysis.

Recursion was forbidden (couldn't be analyzed well) and indirect
recursion was detected: function A calls B that calls C that calls A.

If you are doing that, VLA's are of course off limits. But, as far as
I know, recursion is allowed in C and nobody is screaming to
eliminate it because in some RAM constrained environments it could
lead to crashes.

We have to separate the C language from the implementations of C
in very constrained environments.
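
The indirect-recursion check mentioned above (A calls B that calls C that calls A) amounts to finding a cycle in the call graph. The sketch below is not lcc-win's implementation, which is not shown in this thread; it is a generic depth-first search over a small adjacency matrix, with MAX_FUNCS and all function names invented for illustration. The checker itself recurses, which would be acceptable on the host doing the analysis but not on the target.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8                     /* illustrative upper bound on n */

enum { WHITE, GREY, BLACK };            /* unvisited, in progress, finished */

/* Depth-first visit of function f; calls[f][g] means "f calls g".
   Returns true if a call chain leads back to a function still in progress. */
static bool visit(size_t f, size_t n, bool calls[][MAX_FUNCS], int state[])
{
    state[f] = GREY;
    for (size_t g = 0; g < n; ++g) {
        if (!calls[f][g])
            continue;
        if (state[g] == GREY)
            return true;                /* back edge: direct or indirect recursion */
        if (state[g] == WHITE && visit(g, n, calls, state))
            return true;
    }
    state[f] = BLACK;
    return false;
}

/* True if any of the first n functions (n <= MAX_FUNCS) can reach itself. */
bool call_graph_has_recursion(size_t n, bool calls[][MAX_FUNCS])
{
    int state[MAX_FUNCS] = { WHITE };   /* every entry starts as WHITE (0) */

    for (size_t f = 0; f < n; ++f)
        if (state[f] == WHITE && visit(f, n, calls, state))
            return true;
    return false;
}
```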

gwowen · Jun 27, 2012

Recursion was forbidden (couldn't be analyzed well) and indirect
recursion was detected:
....

recursion is allowed in C and nobody is screaming to
eliminate it because in some RAM constrained environments it could
lead to crashes.

Clearly, some people (e.g. the lcc authors) *are* eliminating
recursion - in certain cases - for *precisely* those reasons.

Like recursion, VLAs are (recursion is) fine as long as you *know* the
VLA length (recursion depth) is going to remain pretty small relative
to your stack. If you know that, no problem. If you don't, something
may blow in a horrible way.
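
In code, "knowing" the length usually means enforcing a cap and sending larger requests to the heap, where failure can be detected. A sketch under assumptions: VLA_LIMIT is an arbitrary figure to be tuned per platform, and sum_of_squares is only a stand-in for real work on a scratch buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define VLA_LIMIT 256u                  /* assumed safe element count */

/* Fill buf[i] with i*i and return the total; stands in for real work. */
static unsigned long fill_and_sum(size_t n, unsigned long *buf)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; ++i)
        buf[i] = (unsigned long)i * (unsigned long)i;
    for (size_t i = 0; i < n; ++i)
        total += buf[i];
    return total;
}

/* Stores the sum of squares of 0..n-1 in *out.
   Returns 0 on success, -1 if a large request cannot be allocated. */
int sum_of_squares(size_t n, unsigned long *out)
{
    if (n == 0) {
        *out = 0;
        return 0;
    }
    if (n <= VLA_LIMIT) {
        unsigned long small[n];         /* bounded, so the stack cost is known */
        *out = fill_and_sum(n, small);
        return 0;
    }
    if (n > SIZE_MAX / sizeof(unsigned long))
        return -1;                      /* byte count would overflow */

    unsigned long *big = malloc(n * sizeof *big);
    if (big == NULL)
        return -1;                      /* detectable failure, clean exit */
    *out = fill_and_sum(n, big);
    free(big);
    return 0;
}
```

With this shape, a user-supplied n of 4 billion ends in a -1 return value instead of a crash.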

Ben Bacarisse · Jun 27, 2012

jacob navia said:
lcc-win was ported to a 16 bit Analog Devices chip with something like
40K RAM available.

There was no stack, and the stack was created by the compiler using
a memory array: this implied a static analysis.

Recursion was forbidden (couldn't be analyzed well) and indirect
recursion was detected: function A calls B that calls C that calls A.

If recursion is forbidden, would it not be worthwhile having a single
static area for each function? This is what some IBM compilers used to
do (and they may still do for all I know).

<snip>

Eric Sosman · Jun 27, 2012

If recursion is forbidden, would it not be worthwhile having a single
static area for each function? This is what some IBM compilers used to
do (and they may still do for all I know).

A shared stack would use less memory, unless there was some
execution path in which all functions were active simultaneously.

Per-function (or per-block) static storage might be attractive
on machines where stack-relative addressing is cumbersome. I recall
that Turbo Pascal on the Z80 used this technique. It allowed
recursive functions and procedures, though: You told the compiler
which could be called recursively, it generated prologue and
epilogue code to swap the static data out to a stack and back, and
each invocation re-used the same static area. This wouldn't work
for C, of course, since you wouldn't want an auto variable to be
moved after you'd formed a pointer to it ...

Stefan Ram · Jun 27, 2012

jacob navia said:
On 26/06/12 21:18, Stefan Ram wrote:
Sorry but I do not understand that: what would be "ListBuffer"?
A predefined type of buffer long enough to hold a list header
structure?

Yes, something like

typedef struct { char dummy[ LIST_SIZE ]; } ListBuffer;

Tim Rentsch · Jun 27, 2012

Rui Maciel said:
In the thread "Learning C as an existing programmer", an interesting
discussion arose over the use of variable-length arrays (VLAs), specifically
the dangers they pose by not providing a way to detect potential memory
allocation bugs.

GCC's page on variable-length arrays says nothing about what to expect when
a VLA is too large to handle.[1] In addition, what has been said in GCC's
mailing list about avoiding segfaults induced by huge VLAs isn't very
reassuring.[2]

With this in mind, and considering that VLAs were made optional in C11, is
it a good idea to simply refuse using them?

There are two distinct concerns here. Let's take them one at a
time.

First, encountering a VLA declaration during execution may blow
the stack and crash, and there is no portable way to detect or
deal with that. However, it is easy for implementations to
provide a way of doing that, without adding any new language
constructs, as I explained in another posting.

Second, VLAs have not been implemented by some vendors (a certain
laggard major software company comes to mind here), and VLA
support is optional in C11, presumably so those vendors can claim
full C11 compliance without having to provide VLA support.

The flip side to the second concern is that many or most major
implementations (at least hosted implementations) do provide VLA
support, and will continue to do so under C11, despite its being
optional. The laggards will keep being laggards, whether VLA
support is optional or not, because they think their user base
doesn't care about it (or perhaps because they think their user
base has to accept what they do whether the user base cares about
it or not).

The flip side of the first concern is that, one, it often isn't a
big deal in practice; two, implementors should be encouraged to
supply a mechanism for detecting/handling VLA allocation failure,
since it is easy to provide such a mechanism; and three, VLA
support provides an important benefit that does not have the
associated risk of undetectable allocation failure, namely,
variably modified types and more specifically pointers to VLAs,
which are useful even if VLA objects are never declared.

Personally, I find VLA support useful and convenient, whether
using VLAs themselves or just pointers to them, and in a wider
variety of circumstances than I originally expected. Vendors
and implementors will provide VLA support if developers use
them, and very likely won't if they don't. So my conclusion
is somewhat the opposite of yours -- developers *should* use
VLAs and variably modified types whenever they are useful and
convenient to express the programming task at hand, and also
should encourage and prevail upon vendors and implementors to
supply better VLA support, such as a mechanism for detecting
and handling allocation failure like the one described in
another thread. VLA support is both convenient and useful;
if demand for it is high enough, implementations that provide
VLAs will become both better and more ubiquitous.
