VLA question

Discussion in 'C Programming' started by Philip Lantz, Jun 14, 2013.

  1. Philip Lantz Guest

    The following declaration of v is a variable length array, and is thus
    a constraint violation in compilers for C prior to C99 and in C11
    implementations that define __STDC_NO_VLA__.

    void f()
    {
        const int n = 5;
        int v[n];
    }

    For implementations that do support it, is there any reason they can't
    treat it exactly as they would treat the following, regardless of what
    appears in the rest of the function body?

    void f()
    {
        const int n = 5;
        int v[5];
    }
    Philip Lantz, Jun 14, 2013

  2. No, there is not, because any attempt to modify `n` results in undefined
    behavior and compilers are allowed to presume that there is no undefined
    behavior in the programs they compile.

    On the other hand, compilers are also not required to notice that `n` is
    initialized with a compile-time constant and thus that this
    transformation is possible.

    Bart v Ingen Schenau
    Bart van Ingen Schenau, Jun 14, 2013

  3. Even if you could change the value of n, it wouldn't affect the VLA.
    The size of a VLA is determined when it's defined:

    int n = 5;
    int v[n];
    n = 42; /* v still has 5 elements */

    VLAs are an exception to the rule that the operand of sizeof is not
    evaluated, but I don't think that matters in this case. Evaluating
    `sizeof v` in principle evaluates `v`, but that has no visible effects.
    I'm not even sure what it means to evaluate `v`. (The rule does mean
    that evaluating `sizeof (int[func()])` will call func.)
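    That last point can be sketched in a few lines; `func` here is a
    hypothetical stand-in for the func() in the text:

    ```c
    #include <stdio.h>

    static int calls = 0;

    /* hypothetical function standing in for func() from the text */
    static int func(void) {
        calls++;
        return 3;
    }

    int main(void) {
        /* the operand is a variable length array type, so unlike the
           usual sizeof rule, the size expression IS evaluated */
        size_t s = sizeof (int[func()]);
        (void)s;
        printf("calls=%d\n", calls);   /* func was called exactly once */
        return 0;
    }
    ```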

    One difference is type compatibility. Two array types with size
    specifiers with compatible element types are compatible if their
    sizes are constant and equal *or* if their sizes are not both
    constant. If they're not constant and are unequal, using them in
    a context that requires compatible types has undefined behavior.

    So this:

    int v[5];
    int (*ptr)[6] = &v;

    is a constraint violation, but this:

    int n = 5;
    int v[n];
    int (*ptr)[6] = &v;

    is not a constraint violation, but its behavior is undefined.

    Still, a compiler is permitted to treat them identically. It can
    generate the same code for both. Most compilers will treat them
    differently in the sense that they'll issue a diagnostic (or at
    least a *different* diagnostic) in the first case, but I don't
    think that distinction is required; when the behavior is undefined,
    a diagnostic is permitted, and when the behavior is defined,
    compilers can issue whatever diagnostics they like.
    Keith Thompson, Jun 14, 2013
    Without VLAs, array sizes are constrained to integer constant
    expressions. For some reason, a `const int` is not an integer constant
    expression, so using one as such would be a constraint violation.

    This is why most "constants" in C are done with #define rather than
    const int, even though that defeats type safety.
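    A minimal illustration of the difference (the names N, M, and a are
    mine):

    ```c
    #include <stdio.h>

    #define N 5          /* textual constant: a true integer constant expression */
    const int M = 5;     /* in C, M is NOT a constant expression */

    int a[N];            /* fine even at file scope */
    /* int b[M]; */      /* constraint violation: file-scope arrays need
                            constant sizes, and M doesn't qualify in C */

    int main(void) {
        printf("%zu\n", sizeof a / sizeof a[0]);
        return 0;
    }
    ```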

    The C-like subset of C++ does not have this defect, so it's puzzling why
    later revisions of C itself have not fixed it. This is simpler, for
    instance, than introducing VLAs in the first place.
    As far as I know, they do.

    Stephen Sprunk, Jun 15, 2013
  5. Eric Sosman Guest

    Simpler, well, maybe -- but some complications remain:

    const int n = 1 + rand() % 53;
    int v[n];

    Just knowing that `n' is `const' is not enough; one must also
    remember things about its initializer.

    extern const int n;
    int v[n];

    Again, just knowing that it's `const' is not enough.

    One can write rules to deal with such things -- C++ has
    such rules -- but the very existence of the rules shows that
    "simpler" is a bit more complex than "dead easy."
    Eric Sosman, Jun 15, 2013
  6. Surely the compiler would know whether n's value was known or not. An
    unknown value would only be allowed if VLAs were supported.
    Fair enough. However, I think it's reasonable to expect this to be
    fixed by now, especially given that the C-like subset of C++ did so long
    ago. Just borrow those rules--as C has done with several of its other
    improvements over the years, e.g. prototypes, // comments, etc.

    Stephen Sprunk, Jun 15, 2013
  7. You can also define enumeration constants:

    void f() {
        enum { n = 5 };
        int v[n];
    }

    Now v is not a VLA.

    Enumeration constants can only be of type int, and this trick is
    arguably an abuse of the enumeration type mechanism (the latter
    doesn't particularly bother me).
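    A sketch of the trick in action; because v is an ordinary array rather
    than a VLA, it may even be initialized:

    ```c
    #include <stdio.h>

    void f(void) {
        enum { n = 5 };        /* n is an int constant expression */
        int v[n] = { 0 };      /* ordinary array: initializer allowed */
        printf("%zu\n", sizeof v / sizeof v[0]);
    }

    int main(void) {
        f();
        return 0;
    }
    ```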
    Keith Thompson, Jun 16, 2013
    There was a lengthy discussion of this in comp.std.c a couple of years
    ago (which didn't reach any real conclusions other than that it hadn't
    been officially proposed for C201X, and it was already too late to add
    it anyway -- a moot point now).

    Look for a thread with the subject "C++-style treatment of const?",
    Message-ID: <>.
    Keith Thompson, Jun 16, 2013
  9. This doesn't directly address your question, but one difference is that
    VLAs can't be initialized.

    int not_a_vla[5] = { 0 }; /* ok */
    int vla[n] = { 0 }; /* constraint violation */
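    Since a VLA can't have an initializer, the usual workaround is to zero
    it at run time, for example with memset:

    ```c
    #include <stdio.h>
    #include <string.h>

    void f(int n) {
        int vla[n];
        memset(vla, 0, sizeof vla);   /* sizeof vla is computed at run time */
        printf("%d %d\n", vla[0], vla[n - 1]);
    }

    int main(void) {
        f(5);
        return 0;
    }
    ```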
    Keith Thompson, Jun 16, 2013
  10. Tim Rentsch Guest

    This assumes that other people agree that it's a problem, and
    obviously they don't. What are the benefits, and what makes
    them worth the costs? Unless and until someone presents a
    compelling argument on this question, it's unlikely that C
    will adopt such a change. And rightly so.
    Tim Rentsch, Jun 16, 2013
  11. Ian Collins Guest

    One could argue that VLAs are the cost of not making this simple and
    long overdue change. The archaic kludges (textual substitution, abuse
    of enums) used to work around the lack of proper constants are yet
    another cost we have to bear.
    Ian Collins, Jun 16, 2013
  12. Öö Tiib Guest

    The dual meaning of 'const' in C++ is sometimes confusing. 'constexpr'
    was introduced to make it possible to differentiate "compile-time
    constants" from "immutable objects". C would likely benefit if it kept
    'const' as it is (better than in C++, IMO) and borrowed 'constexpr' to
    get rid of those enum and preprocessor workarounds.
    Öö Tiib, Jun 17, 2013
  13. James Kuyper Guest

    On 06/16/2013 06:44 PM, william@wilbur.25thandClement.com wrote:
    I think that they make handling multi-dimensional arrays with dimensions
    that are not compile-time constants a lot easier than it was in C90. I
    think there are a lot of people other than Fortran emigres who would
    appreciate that.
    James Kuyper, Jun 17, 2013
  14. I don't think one could argue that *persuasively*. VLAs were intended
    to make it possible to define arrays whose length is determined at
    execution time. The fact that they make

    const int n = 5;
    int vla[n];

    legal is just a side effect, and probably an unintended one.
    Keith Thompson, Jun 17, 2013
  15. Tim Rentsch Guest

    If having VLAs is one of the consequences of not adopting the C++
    rule allowing constant-expression-defined-variables in other
    constant expressions, then IMO that consequence alone makes it
    worth not including the C++ feature, even if there were no other
    reasons favoring excluding it.

    Ignoring that aspect (which I think was given not completely
    seriously anyway), let's draw up a list of consequences, or
    attributes, for each of the three approaches. This turns out to
    be a bigger list than I was expecting:

    #define:
      - textual binding; no scoping (or scope may be limited with #undef)
      - no reference as a variable
      - uses the preprocessor; works easily with -D=
      - arithmetic types only (address values allowed, but using textual
        binding)
      - individual values
      - available since K&R C
      - textual + semantic: whether a value may be used in a constant
        expression depends on both textual analysis and semantic analysis
      - no change to other language aspects
      - same in C and C++

    enum:
      - lexical binding; regular scoping
      - no reference as a variable
      - only type int
      - sets of values (allowing enhanced type checking)
      - available since C90 (or before?)
      - syntactic: a name so defined may always be used in a constant
        expression
      - no change to other language aspects
      - same in C and C++ (slightly different semantics in the two
        languages?)

    'const' variables:
      - lexical binding; regular scoping
      - may be referenced as a variable, eg, address taken
      - arithmetic types; pointer types?
      - individual values
      - would need additional language definition (for C)
      - semantic: to know whether a variable can be used as a constant,
        some semantic analysis is needed
      - interacts with other language aspects (based on my perhaps flawed
        understanding of the C++ feature)
      - present in C++, not present in C

    Now on to (my own) subjective reactions.

    Comparing #define and 'const' variables, the main plusses for
    using #define are also its main minuses: it uses a simple,
    long-known and well-understood mechanism, with all of the usual
    warts associated with the preprocessor. Some people deplore
    using the preprocessor on general principles and try to avoid it
    pretty much at all costs, but I'm not in that camp. I know the
    preprocessor has warts, but really they aren't that bad here for
    just #define of simple constant expressions. Lack of lexically
    scoped definitions is a drawback for #define; this is partially
    fixable with #undef, which works sort of okay but definitely has
    a high wart factor.

    Looking at the flip side, 'const' variables don't have the warts
    that using #define does: analysis is easier, and lexical scoping
    is a plus. I'm not sure how important the scoping issue is; I
    would want to look at a variety of code bases before assigning a
    particular weight to that aspect. The big drawback of 'const'
    variables is that adding them to C would mean a larger and more
    complicated language definition. It's easy to underestimate the
    impact of "little" changes like this. I recently went through
    the exercise of reading the C++ language definition document (ie,
    the C++ Standard, although not the most recently approved one).
    Until doing that I didn't really appreciate just how large and
    complicated C++ has become. How did it get that way? As the
    saying goes, one feature at a time...

    Another consideration: not having 'const' variables in C
    increases the semantic gap between C and C++. I'm not sure if
    the weight for this should be positive or negative, but in any
    case the absolute value is small IMO.

    Comparing using 'enum' and const variables is more interesting.
    Relative to const variables, the biggest downside of defining
    constants using 'enum' is that they are limited to values of type
    int. (I know the syntactic form is unappealing to some but to me
    this seems like a minor issue.) Now look at the positives: we
    know something defined in an enum can be used in a constant
    expression; enum lets us define several related values that can
    be identified together, eg, for checking the cases of 'switch()'
    statements; a value defined in an enum is identified with a
    particular type, which facilitates improved type checking should
    we want to do that.
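    A sketch of that "sets of values" point (the enum and the names here
    are mine): many compilers can warn when a switch over an enum type
    fails to cover every enumerator.

    ```c
    #include <stdio.h>

    enum color { RED, GREEN, BLUE };   /* a set of related constant values */

    static const char *name(enum color c) {
        switch (c) {                   /* compilers can check case coverage */
        case RED:   return "red";
        case GREEN: return "green";
        case BLUE:  return "blue";
        }
        return "?";
    }

    int main(void) {
        printf("%s\n", name(GREEN));
        return 0;
    }
    ```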

    Besides offering a larger range of types, const variables are
    part of "ordinary" C, ie, the definitions look just like
    executable C code. I don't think this is a big plus. In fact it
    may not be a plus at all - compile time and execution time are
    fundamentally different regimes, and making them look the same
    may be more confusing than helpful. Using 'const' to define
    compile time variables may be seen as a very limited form of
    metaprogramming. It certainly isn't obvious that lowering
    the boundary between programming and metaprogramming leads to
    better programs. Clearly it is possible to make them look the
    same, but that doesn't mean it's desirable.

    Bottom line: I stand by my original assessment -- until and
    unless someone presents a compelling argument for introducing a
    new language feature (for C) in this area, the language is better
    left as is. (Note: for such an argument to be compelling, it
    should include a statement of what use cases it means to address,
    along with some sort of evidence that those use cases are
    significant and substantial.) Furthermore, even assuming that
    adopting a new language feature seems advisable, extending enums
    in some way that would allow for greater type variability looks
    like a better bet than adopting the const variable feature.
    Tim Rentsch, Jun 21, 2013
  16. Lew Pitcher Guest

    enum was documented in Nov 15, 1978 in an addendum to the K&R C manual,
    entitled "Recent Changes to C".

    While not strictly "K&R" C (i.e., not stated in "The C Programming Language"
    (c) 1978), it /did/ appear (along with K&R Appendix A "C Reference Manual")
    in the Seventh Edition, Volume 2 Unix Programmers Manual (c) 1983, 1979.

    So, enum /does/ predate C90, and postdates K&R by only a few months.

    Lew Pitcher, Jun 21, 2013
  17. A thorough and thoughtful analysis; I'm not sure I agree with you on
    every point, but most of my disagreement would probably fall into how
    each is weighted anyway.
    I come at this from a rather different angle. Looking at the C-like
    subset of C++ (no classes, templates, namespaces, and overloading), I
    find it to be a superior version of C than C itself. That gap has been
    narrowing over time, eg. adopting // comments, so what I propose is to
    merely complete that task all at once. The individual changes could
    probably not be justified on their own, but as a group, IMHO they are.

    Stephen Sprunk, Jun 25, 2013
    Quirks such as sizeof('x') giving different values in C and C++, and
    int *x = malloc(10 * sizeof(int)) being a type-incompatibility error in
    C++ but not C, are best ironed out, I agree. // comments have been
    widely accepted, but I have been bitten by compilers rejecting them.
    Whilst you can usually set a flag, this often isn't acceptable; the
    user needs to be able to compile with a simple "C_compiler compile
    code.c" command.
    Malcolm McLean, Jun 26, 2013
  19. James Kuyper Guest

    On 06/26/2013 08:50 AM, Malcolm McLean wrote:
    Few compilers are fully conforming in their default mode, and the ways
    in which they fail to conform are different for different compilers. Do
    you really want to insist on code being compilable in every compiler's
    default mode?
    James Kuyper, Jun 26, 2013
    Ideally yes.
    Whilst it's possible to artificially construct legal, defined ANSI C
    programs which will break on various compilers in default mode, usually
    you do that either by artificially using little-used constructs

    x = y //*ha ha two fingers to you */z;

    or, more forgivably, you use identifiers like min which are often
    defined by the environment.

    There are a few areas where things are difficult, e.g. handling PI and
    nan().
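    How that construct parses, sketched out: under C90 rules the /* starts
    a block comment, so the line contains a division; under C99 and later
    the // wins and everything after it disappears, so a continuation line
    is needed for the statement to stay legal in both dialects:

    ```c
    #include <stdio.h>

    int main(void) {
        int x, y = 10, z = 2;
        x = y //* C90: start of block comment; C99+: line comment */ z
            + 1;
        /* C90 sees:  x = y / z + 1;   (x == 6)
           C99+ sees: x = y + 1;       (x == 11) */
        printf("%d\n", x);
        return 0;
    }
    ```

    Compiled as C99 or later this prints 11; a strict C90 compiler would
    print 6.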
    Malcolm McLean, Jun 26, 2013
