On VLAs and incomplete types

Discussion in 'C Programming' started by Sensei, Mar 21, 2008.

  1. Sensei

    Sensei Guest

    Hi! I am still learning a lot reading this NG, so I am now turning
    here to clarify my doubts about VLAs. I hope I won't be that silly or
    naive :)

    In (6.7.5.2) I read about the correct declaration of arrays, although I
    don't understand how a variable is handled by the compiler in declaring
    a VLA. As I understand, an array must be declared with an integer size
    specifier:

    int x[3];

    That would give a complete type, and I can use x happily. If no integer
    is specified, then

    int x[];

    would be an incomplete type. This quite puzzles me, since I don't
    understand how x[] could then be used in a code block. I can only think
    about a function with two parameters, the incomplete array and the
    size. Is it possible to make use of that variable in other ways?

    The other doubt is about VLAs. If some variable is specified instead of
    a constant, then the variable should be in the block. In the document I
    see a prototype and a variable inside the code block:

    void fvla(int m, int C[m][m])
    ....
    int D[m];


    My question is probably stupid, but how are C and D handled by the
    compiler? My concerns are about some code like the following (mixed
    code and variables, since it is C99):

    /* checks about argc and argv */

    int a = atoi(argv[1]);

    /* follows some check about a */

    int x[a];



    I know it's risky, but it's allowed, isn't it? Is x allocated on the
    fly as we enter the code block (somehow, on the heap or by any other
    non C-related means) and then subsequently freed when exit? Then what
    are the advantages (not counting the "syntactic sugar" I can think of)
    of having VLAs into the standard?

    I am puzzled, but forgive me: I am still learning a lot! :)

    Thanks!

    --

    Sensei <Sensei's e-mail is at Mac-dot-com>
    Sensei, Mar 21, 2008
    #1

  2. Sensei <Sensei's e-mail is at Mac-dot-com> writes:

    > In (6.7.5.2) I read about the correct declaration of arrays, although
    > I don't understand how a variable is handled by the compiler in
    > declaring a VLA. As I understand, an array must be declared with an
    > integer size specifier:
    >
    > int x[3];
    >
    > That would give a complete type, and I can use x happily. If no
    > integer is specified, then
    >
    > int x[];
    >
    > would be an incomplete type. This quite puzzles me, since I don't
    > understand how x[] could then be used in a code block.


    It can't be.

    > I can only
    > think about a function with two parameters, the incomplete array and
    > the size. Is it possible to make use of that variable in other ways?


    No, but I think you are being confused by one of the things that most
    frequently confuses people new to C. The [] syntax in a parameter
    specification has a special meaning. It does not denote an incomplete
    type -- it is essentially the same as declaring that parameter to be a
    pointer. In other words:

    void f(int x[]);

    and

    void f(int *x);

    are the same. This only applies to the "first" set of []s. Thus

    void f(int x[][])

    *is* illegal because it declares x as having an incomplete element type.

    None of this applies to array objects. Trying to define an array x,
    like this:

    int x[];

    as an automatic variable (i.e. in the body of a function) is wrong.
    You *can* use empty []s if an initializer is used to give the size:

    int x[] = {1, 2, 3};

    is OK. There are some special cases when [] is used in an array
    declaration at file scope (outside any function body) but it is best
    to put these to one side for the moment.

    > The other doubt is about VLAs. If some variable is specified instead
    > of a constant, then the variable should be in the block. In the
    > document I see a prototype and a variable inside the code block:
    >
    > void fvla(int m, int C[m][m])
    > ...
    > int D[m];
    >
    >
    > My question is probably stupid, but how are C and D handled by the
    > compiler?


    This is a common question. How does this work? What is the compiler
    doing here? The trouble with this sort of question is that you are
    asking for more than you need. Why add understanding the C compiler
    to the task of understanding C? Sometimes it helps, but with more
    complex languages it definitely does not. (For example, it is much
    easier to understand what a Haskell program means than it is to see
    how the compiler makes it work.)

    Anyway, I'll have a go... In this program:

    void f(int m, int C[m][m])
    {
        int D[m];
        ...
    }

    int main(void)
    {
        int x[4][4];
        f(4, x);
        return 0;
    }

    x is simply a 2D array: 4 arrays of 4 ints. It is passed to f (as
    all arrays are) as a pointer to its first element, so x is
    converted to a pointer of type 'int (*)[4]'. f would be the same if
    it were declared:

    void f(int m, int C[][m]);

    or

    void f(int m, int (*C)[m]);

    Only the second m matters to the compiler. This is because, given a
    pointer to that first array of 4 ints, all the compiler needs to know
    is how to get to the next element. It needs to know the element size,
    not how many elements there are.

    When the program flow reaches the declaration of D, space for an array
    of m (in this case 4) ints is allocated. This storage lasts until the
    function returns, so it is natural for the compiler to take the
    storage from the same place it uses for other automatic variables --
    usually this is from a stack.

    > My concerns are about some code like the following (mixed
    > code and variables, since it is C99):
    >
    > /* checks about argc and argv */
    >
    > int a = atoi(argv[1]);
    >
    > /* follows some check about a */
    >
    > int x[a];
    >
    >
    > I know it's risky, but it's allowed, isn't it?


    Yes.

    > Is x allocated on the
    > fly as we enter the code block (somehow, on the heap or by any other
    > non C-related means)


    Well put. How or where does not matter. It is likely to be allocated
    from a stack ("the" stack if the compiler is using only one).

    > and then subsequently freed when exit?


    The way I used to put it was: the storage lasts until the program flow
    passes the end of the variable's scope (unless the program exits before
    that). It is surprisingly hard to get the words exactly right, but
    most people find the intent more natural than the wording used to
    explain it!

    > Then what
    > are the advantages (not counting the "syntactic sugar" I can think of)
    > of having VLAs into the standard?


    It allows you to have arrays of variable size, efficiently allocated.
    malloc and free will usually be slower and you need to do the freeing.
    The real advantage, though, comes with code like your 2D array
    parameter called C. Such things are simpler with VLA parameters.

    --
    Ben.
    Ben Bacarisse, Mar 21, 2008
    #2

  3. Sensei

    Sensei Guest

    On 2008-03-21 16:19:10 +0100, Ben Bacarisse <> said:

    >> Is x allocated on the
    >> fly as we enter the code block (somehow, on the heap or by any other
    >> non C-related means)

    >
    > Well put. How or where does not matter. It is likely to be allocated
    > from a stack ("the" stack if the compiler is using only one).
    >
    >> and then subsequently freed when exit?

    >
    > The way I used to put it was: the storage lasts until the program flow
    > passes the end of the variable's scope (unless the program exits before
    > that). It is surprisingly hard to get the words exactly right, but
    > most people find the intent more natural than the wording used to
    > explain it!
    >
    >> Then what
    >> are the advantages (not counting the "syntactic sugar" I can think of)
    >> of having VLAs into the standard?

    >
    > It allows you to have arrays of variable size, efficiently allocated.
    > malloc and free will usually be slower and you need to do the freeing.
    > The real advantage, though, comes with code like your 2D array
    > parameter called C. Such things are simpler with VLA parameters.



    Well, I think you clarified a lot. Thanks!

    One last question, you are saying that VLAs are "usually" faster than
    the malloc/free counterpart, although arrays are passed by means of
    pointers. How can it be so, if the array lives as long as the code
    block it refers to is executing? I mean, the code will effectively
    allocate and deallocate memory, so there must be an explanation for
    being faster. Is heap and stack actually different in this?

    As for 2D arrays, I'm using a simple array with appropriate indexing,
    is there a performance reason for preferring a variable length array?

    Thanks for bearing with me :)

    --

    Sensei <Sensei's e-mail is at Mac-dot-com>

    Basic research is what I am doing when I don't know what I am doing.
    (Wernher von Braun)
    Sensei, Mar 21, 2008
    #3
  4. Sensei

    user923005 Guest

    On Mar 21, 11:55 am, Sensei <Sensei's e-mail is at Mac-dot-com> wrote:
    > On 2008-03-21 16:19:10 +0100, Ben Bacarisse <> said:
    >
    > >> Is x allocated on the
    > >> fly as we enter the code block (somehow, on the heap or by any other
    > >> non C-related means)

    >
    > > Well put.  How or where does not matter.  It is likely to be allocated
    > > from a stack ("the" stack if the compiler is using only one).

    >
    > >> and then subsequently freed when exit?

    >
    > > The way I used to put it was: the storage lasts until the program flow
    > > passes the end of the variable's scope (unless the program exits before
    > > that).  It is surprisingly hard to get the words exactly right, but
    > > most people find the intent more natural than the wording used to
    > > explain it!

    >
    > >> Then what
    > >> are the advantages (not counting the "syntactic sugar" I can think of)
    > >> of having VLAs into the standard?

    >
    > > It allows you to have arrays of variable size, efficiently allocated.
    > > malloc and free will usually be slower and you need to do the freeing.
    > > The real advantage, though, comes with code like your 2D array
    > > parameter called C.  Such things are simpler with VLA parameters.

    >
    > Well, I think you clarified a lot. Thanks!
    >
    > One last question, you are saying that VLAs are "usually" faster than
    > the malloc/free counterpart, although arrays are passed by means of
    > pointers. How can it be so, if the array lives as long as the code
    > block it refers to is executing? I mean, the code will effectively
    > allocate and deallocate memory, so there must be an explanation for
    > being faster. Is heap and stack actually different in this?
    >
    > As for 2D arrays, I'm using a simple array with appropriate indexing,
    > is there a performance reason for preferring a variable length array?


    Automatic variables are typically placed on a *cough* stack. The
    generation of these variables is really just a simple subtraction.
    The generation of dynamic memory is more complicated and requires
    functions to track the allocations carefully.
    The speed of access will be similar between automatic and dynamically
    allocated memory. It is the bookkeeping of generation and disposal
    that differs significantly.

    So why not just use automatic memory all the time? It is a very
    limited resource, and when it fails, it does not fail gracefully like
    malloc(), which will tell you that something went wrong by returning a
    NULL pointer. If an automatic allocation fails you can expect a core
    dump or some sort of undefined behavior.



    > Thanks for bearing with me :)
    user923005, Mar 21, 2008
    #4
  5. Sensei

    santosh Guest

    Sensei <Sensei's e-mail is at Mac-dot-com> wrote:

    > On 2008-03-21 16:19:10 +0100, Ben Bacarisse <>
    > said:
    >
    >>> Is x allocated on the
    >>> fly as we enter the code block (somehow, on the heap or by any other
    >>> non C-related means)

    >>
    >> Well put. How or where does not matter. It is likely to be allocated
    >> from a stack ("the" stack if the compiler is using only one).
    >>
    >>> and then subsequently freed when exit?

    >>
    >> The way I used to put it was: the storage lasts until the program
    >> flow passes the end of the variable's scope (unless the program exits
    >> before
    >> that). It is surprisingly hard to get the words exactly right, but
    >> most people find the intent more natural than the wording used to
    >> explain it!
    >>
    >>> Then what
    >>> are the advantages (not counting the "syntactic sugar" I can think
    >>> of) of having VLAs into the standard?

    >>
    >> It allows you to have arrays of variable size, efficiently allocated.
    >> malloc and free will usually be slower and you need to do the
    >> freeing. The real advantage, though, comes with code like your 2D
    >> array
    >> parameter called C. Such things are simpler with VLA parameters.

    >
    >
    > Well, I think you clarified a lot. Thanks!
    >
    > One last question, you are saying that VLAs are "usually" faster than
    > the malloc/free counterpart, although arrays are passed by means of
    > pointers. How can it be so, if the array lives as long as the code
    > block it refers to is executing? I mean, the code will effectively
    > allocate and deallocate memory, so there must be an explanation for
    > being faster. Is heap and stack actually different in this?


    He is talking about the speed of allocation. On most implementations
    VLAs are allocated in a special area of storage called the stack. In
    most cases, the machine provides instructions to manage this kind of
    storage efficiently. This turns out to be *much* faster than allocation
    through *alloc(), which usually allocates storage from the so-called
    "heap" and may involve relatively time-consuming negotiations with the
    operating system.

    > As for 2D arrays, I'm using a simple array with appropriate indexing,
    > is there a performance reason for preferring a variable length array?


    Well, VLAs are usually used when you do not know the size of the array
    at compile time. They are, as I said above, usually much more efficient
    than allocating through malloc, and have the added advantage of being
    managed by the compiler. You need not (and cannot) explicitly free
    them.

    If you do know the array size at compile time, ordinary fixed-size
    arrays may bring you some benefits in terms of increased portability.

    alloca() is another, non-standard, alternative to both of the
    mechanisms discussed above. We had a long and detailed thread about it
    a week or two ago. You'll find it in the Google Groups archive.
    santosh, Mar 21, 2008
    #5
  6. Sensei

    Sensei Guest

    On 2008-03-21 20:24:53 +0100, user923005 <> said:


    > Automatic variables are typically placed on a *cough* stack. The
    > generation of these variables is really just a simple subtraction.
    > The generation of dynamic memory is more complicated and requires
    > functions to track the allocations carefully.
    > The speed of access will be similar between automatic and dynamically
    > allocated memory. It is the bookkeeping of generation and disposal
    > that differs significantly.
    >
    > So why not just use automatic memory all the time? It is a very
    > limited resource, and when it fails, it does not fail gracefully like
    > malloc(), which will tell you that something went wrong by returning a
    > NULL pointer. If an automatic allocation fails you can expect a core
    > dump or some sort of undefined behavior.



    Ok, so I understand why VLAs are usually faster, and I understand also
    that sizes of VLAs may be limited compared to malloc'ed arrays, since
    the memory VLAs are allocated into is quite precious and limited.

    Thanks for helping me clarifying my (perhaps) silly doubts!

    --

    Sensei <Sensei's e-mail is at Mac-dot-com>

    There is no reason for any individual to have a computer in his home.
    (Ken Olsen, President, Digital Equipment, 1977)
    Sensei, Mar 22, 2008
    #6
  7. Sensei

    Sensei Guest

    On 2008-03-21 21:05:45 +0100, santosh <> said:

    > He is talking about the speed of allocation. On most implementations
    > VLAs are allocated in a special area of storage called the stack. In
    > most cases, the machine provides instructions to manage this kind of
    > storage efficiently. This turns out to be *much* faster than allocation
    > through *alloc(), which usually allocates storage from the so-called
    > "heap" and may involve relatively time-consuming negotiations with the
    > operating system.
    >
    >> As for 2D arrays, I'm using a simple array with appropriate indexing,
    >> is there a performance reason for preferring a variable length array?

    >
    > Well, VLAs are usually used when you do not know the size of the array
    > at compile time. They are, as I said above, usually much more efficient
    > than allocating through malloc, and have the added advantage of being
    > managed by the compiler. You need not (and cannot) explicitly free
    > them.
    >
    > If you do know the array size at compile time, ordinary fixed-size
    > arrays may bring you some benefits in terms of increased portability.
    >
    > alloca() is another, non-standard, alternative to both of the
    > mechanisms discussed above. We had a long and detailed thread about it
    > a week or two ago. You'll find it in the Google Groups archive.



    Thanks for the reply, it's always a good thing learning more about
    these matters! :)

    --

    Sensei <Sensei's e-mail is at Mac-dot-com>

    Basic research is what I am doing when I don't know what I am doing.
    (Wernher von Braun)
    Sensei, Mar 22, 2008
    #7
  8. Sensei

    santosh Guest

    Sensei <Sensei's e-mail is at Mac-dot-com> wrote:

    > On 2008-03-21 20:24:53 +0100, user923005 <> said:
    >
    >
    >> Automatic variables are typically placed on a *cough* stack. The
    >> generation of these variables is really just a simple subtraction.
    >> The generation of dynamic memory is more complicated and requires
    >> functions to track the allocations carefully.
    >> The speed of access will be similar between automatic and dynamically
    >> allocated memory. It is the bookkeeping of generation and disposal
    >> that differs significantly.
    >>
    >> So why not just use automatic memory all the time? It is a very
    >> limited resource, and when it fails, it does not fail gracefully like
    >> malloc(), which will tell you that something went wrong by returning
    >> a NULL pointer. If an automatic allocation fails you can expect a core
    >> dump or some sort of undefined behavior.

    >
    >
    > Ok, so I understand why VLAs are usually faster, and I understand also
    > that sizes of VLAs may be limited compared to malloc'ed arrays, since
    > the memory VLAs are allocated into is quite precious and limited.


    On modern virtual memory OSes like UNIX systems or Windows, the stack
    can have a theoretical maximum size of 4 GB (on a 32-bit system), but
    it's usually limited to under 16 MB by the OS.

    Since it's mapped onto main memory the same way the heap is, it is only
    as precious and limited as the OS and the system's physical memory
    constrain it to be.
    santosh, Mar 22, 2008
    #8
  9. Sensei

    Old Wolf Guest

    On Mar 22, 4:19 am, Ben Bacarisse <> wrote:
    > Sensei <Sensei's e-mail is at Mac-dot-com> writes:
    >
    > > int x[];

    >
    > > would be an incomplete type. This quite puzzles me, since I don't
    > > understand how x[] could then be used in a code block.

    >
    > It can't be.


    In fact it can, here is an example. Arrays of
    incomplete type can still decay to pointers.
    In fact it's not uncommon to have 'extern int y[];'
    where 'y' has incomplete type but is then used.

    #include <stdio.h>

    int x[];

    int main(void)
    {
        for (int i = 0; i != 5; ++i)
            printf("%d\n", x[i]);
    }

    int x[5] = { 3, 6, 9, 12, 15 };
    Old Wolf, Mar 23, 2008
    #9
  10. Old Wolf <> writes:

    > On Mar 22, 4:19 am, Ben Bacarisse <> wrote:
    >> Sensei <Sensei's e-mail is at Mac-dot-com> writes:
    >>
    >> > int x[];

    >>
    >> > would be an incomplete type. This quite puzzles me, since I don't
    >> > understand how x[] could then be used in a code block.

    >>
    >> It can't be.

    >
    > In fact it can,


    Yes, and I went on to say this (albeit without an example). It seems
    rather unfair to clip that and claim I missed something.

    It will become impossible to write explanations for people new to C if
    every statement must stand up when taken out of the context of the
    reply. If you wanted to explain this point to the OP, then you could
    have replied to their "I don't understand how x[] could then be used
    in a code block" (with a note to say you are taking "code block" to
    mean file-scope declarations).

    --
    Ben.
    Ben Bacarisse, Mar 23, 2008
    #10
  11. Sensei

    Old Wolf Guest

    On Mar 24, 1:57 am, Ben Bacarisse <> wrote:
    > Old Wolf <> writes:
    > > On Mar 22, 4:19 am, Ben Bacarisse <> wrote:
    > >> Sensei <Sensei's e-mail is at Mac-dot-com> writes:

    >
    > >> > int x[];

    >
    > >> > would be an incomplete type. This quite puzzles me, since I don't
    > >> > understand how x[] could then be used in a code block.

    >
    > >> It can't be.

    >
    > > In fact it can,

    >
    > Yes, and I went on to say this (albeit without an example).


    You contradicted yourself, then. I don't think there are any
    grounds to make an absolute statement "It can't be" (which
    you made an entire paragraph on its own), when in fact it
    can be, and it is not uncommon to do so in real code.

    You mention the example of int x[]; but didn't mention at
    all the much more common:
    extern int x[];

    > It seems
    > rather unfair to clip that and claim I missed something.


    I'm sure you didn't miss it personally, but I think it
    is confusing to say the very least to say "Y is not possible."
    and go on quoting more text from the original, and then later
    down the page bury a sentence "except for..."

    > It will become impossible to write explanations for people new to C if
    > every statement must stand up when taken out of the context of the
    > reply. If you wanted to explain this point to the OP, then you could
    > have replied to their "I don't understand how x[] could then be used
    > in a code block" (with a note to say you are taking "code block" to
    > mean file-scope declarations).


    I'm taking "code block" to mean stuff at block scope. The case
    I am highlighting is the one where x is used at block scope,
    as the OP said, but declared at file scope.
    Old Wolf, Mar 24, 2008
    #11
  12. Sensei

    user923005 Guest

    On Mar 22, 12:33 am, Sensei <Sensei's e-mail is at Mac-dot-com> wrote:
    > On 2008-03-21 20:24:53 +0100, user923005 <> said:
    >
    > > Automatic variables are typically placed on a *cough* stack. The
    > > generation of these variables is really just a simple subtraction.
    > > The generation of dynamic memory is more complicated and requires
    > > functions to track the allocations carefully.
    > > The speed of access will be similar between automatic and dynamically
    > > allocated memory.  It is the bookkeeping of generation and disposal
    > > that differs significantly.

    >
    > > So why not just use automatic memory all the time?  It is a very
    > > limited resource, and when it fails, it does not fail gracefully like
    > > malloc(), which will tell you that something went wrong by returning a
    > > NULL pointer.  If an automatic allocation fails you can expect a core
    > > dump or some sort of undefined behavior.

    >
    > Ok, so I understand why VLAs are usually faster, and I understand also
    > that sizes of VLAs may be limited compared to malloc'ed arrays, since
    > the memory VLAs are allocated into is quite precious and limited.


    This is typically true. Many systems have a fixed size for automatic
    variables (and often you can change the amount by a linker switch or
    program that modifies the executable after linking).
    With allocated memory, you will sometimes get the amount of virtual
    memory that is possible on the machine, and sometimes you will get the
    amount allowed by a user limit of some kind.

    At any rate, you have perceived correctly that the (typically) faster
    allocation of automatic memory comes at a price of danger.
    Because C allows recursion, it is hard to know how big automatic
    memory can become except heuristically by measurement. The same goes
    for allocated memory, of course, but often there is more allocated
    memory available to a process and it is always possible to catch a
    problem with calloc() or malloc() because NULL will be returned if the
    memory is not available.

    These sorts of details are more or less common practice but do not
    necessarily predict how an arbitrary C implementation might behave.
    user923005, Mar 24, 2008
    #12
  13. Sensei

    Sensei Guest

    On 2008-03-24 23:58:09 +0100, user923005 <> said:

    >> Ok, so I understand why VLAs are usually faster, and I understand also
    >> that sizes of VLAs may be limited compared to malloc'ed arrays, since
    >> the memory VLAs are allocated into is quite precious and limited.

    >
    > This is typically true. Many systems have a fixed size for automatic
    > variables (and often you can change the amount by a linker switch or
    > program that modifies the executable after linking).
    > With allocated memory, you will sometimes get the amount of virtual
    > memory that is possible on the machine, and sometimes you will get the
    > amount allowed by a user limit of some kind.
    >
    > At any rate, you have perceived correctly that the (typically) faster
    > allocation of automatic memory comes at a price of danger.
    > Because C allows recursion, it is hard to know how big automatic
    > memory can become except heuristically by measurement. The same goes
    > for allocated memory, of course, but often there is more allocated
    > memory available to a process and it is always possible to catch a
    > problem with calloc() or malloc() because NULL will be returned if the
    > memory is not available.
    >
    > These sorts of details are more or less common practice but do not
    > necessarily predict how an arbitrary C implementation might behave.


    Thanks! It's always a good thing to know more about common behaviors
    and good practices.

    --

    Sensei <Sensei's e-mail is at Mac-dot-com>

    We know Linux is the best, it can do infinite loops in five seconds.
    (Linus Torvalds)
    Sensei, Mar 25, 2008
    #13