disgusting compiler !! hahaha!!

Discussion in 'C Programming' started by raj shekar, May 7, 2014.

  1. So why did Fortran do that in 1956, and standardize it in 1966?

    The first might be related to the index registers, but the latter
    is harder to explain. It wasn't fixed until 1977.

    -- glen
    glen herrmannsfeldt, May 8, 2014

  2. raj shekar

    David Brown Guest

    And C++ now uses [[ ]] for attribute syntax, such as for marking
    parameters "unused" or marking functions with "noreturn".


    Since attributes could work just as well in C as in C++ (they don't need
    any C++ specific features), it is not unlikely that this feature will
    get "back-ported" to C, at least as extensions in some compilers.

    And while the C++ people don't seem to mind using the same keywords or
    symbols for completely different purposes, even when the parsing can be
    ambiguous, C people are less keen. So overall, I think that rules out
    [[ ]] as a choice of operators here. (Not that I think C would benefit
    from a 1-based index operator, but that's beside the point.)
    David Brown, May 8, 2014

  3. raj shekar

    Kaz Kylheku Guest

    Fortran was implemented by people who had no idea what they were doing,
    and had nobody to learn it from.

    They had no reason to suspect that one-based arrays suck for programming.

    They made other mistakes, like removing all whitespace before tokenizing and
    parsing, thinking it might be a good idea.

    Today, we really should scrub the word Fortran from our vocabularies.
    Kaz Kylheku, May 8, 2014
  4. raj shekar

    BartC Guest

    So [[x]] just means [(x)-1] ?

    The latter is probably better: no language or compiler upgrades are needed,
    it will work in every C system as it is; it tells you exactly what it means
    compared with the more cryptic [[x]] (which has other meanings elsewhere);
    and no true 1-based indexing syntax uses [[x]]; they will use [x].

    And most of the time it can also be written as [x-1] (although it will be
    confusing when the index is already x-1 or x+1, giving an index of x-2 or
    just x).

    The idea however is just to be able to write [x] like everywhere else.
    BartC, May 8, 2014
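As a minimal sketch of the [(x)-1] idea in the post above: the macro name AT1 and the function sum_first are hypothetical, chosen only for illustration. The macro works in any C compiler as-is and only hides the subtraction.

```c
#include <stddef.h>

/* AT1 is a hypothetical helper name, not a standard macro.
   It maps a 1-based position n onto C's 0-based element n-1. */
#define AT1(arr, n) ((arr)[(n) - 1])

/* Sum the first `count` elements using 1-based positions 1..count. */
int sum_first(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 1; i <= count; i++) {
        total += AT1(values, i);
    }
    return total;
}
```

Because the macro expands to an lvalue, it can also appear on the left of an assignment, for example AT1(v, 2) = 5 stores into v[1].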
  5. raj shekar

    Kaz Kylheku Guest


    Kaz Kylheku, May 8, 2014
  6. raj shekar

    BartC Guest

    Usually from contamination with C. If it has curly braces, or the
    implementation is based on a curly-braced language, the probability is that
    the arrays are 0-based.

    Higher-level languages have less reason to start counting from zero, which is
    less intuitive. Most counting in real life is 1-based.
    I would be surprised if Lisp couldn't support N-based arrays if it wanted
    to; if not then it would be the only thing it couldn't do.
    Lua and Ada are 1-based. The latter also has N-based.
    What's stupid is having the 1st, 2nd, 3rd and 4th elements of an array
    indexed as 0, 1, 2 and 3.
    If the choice is *only* between 0 and 1, then 0 is more versatile. But it's
    not hard to allow both or any. Both 0 and 1 bases have their uses; 1-based I
    think is more useful as the default base.

    But imagine you had a language feature that looked like this:

    x = ( n | a, b, c, ... | z);

    which selects one of a, b, c etc depending on n (with z being a default
    value); ie. it selects the nth value.

    Should n be 1-based (so n=1 selects a), or 0-based (n=1 selects b); which is
    more natural?
    BartC, May 8, 2014
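The ( n | a, b, c, ... | z ) selector described above has no direct C equivalent, but a rough approximation might look like the following. The function name select_nth is hypothetical, and this sketch assumes the 1-based reading (n=1 selects a).

```c
#include <stddef.h>

/* select_nth is a hypothetical name. n is 1-based: n == 1 picks choices[0].
   Any n outside 1..count yields the fallback, playing the role of z. */
int select_nth(size_t n, const int *choices, size_t count, int fallback)
{
    if (n >= 1 && n <= count) {
        return choices[n - 1];
    }
    return fallback;
}
```

Switching to the 0-based reading would only change the range test to n < count and the index to choices[n].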
  7. raj shekar

    David Brown Guest

    When you are just going to have one type of array, then indexing by integers
    starting from 0 is the simplest and clearest - you have the offset from
    the start of the array.

    If you are going to look for more options, then you should really allow
    ranges and different integral types (such as different sized integers,
    integer ranges, contiguous enumerated types, etc.). That would lead to
    clearer code and better compile-time checking for ranges and types -
    similar to Pascal:

    int xs[1 .. 100];
    char rotationCypher['a' .. 'z'];

    I don't see it happening in the C world - especially as it is already
    possible in C++ (but with an uglier template declaration syntax, of course).
    David Brown, May 8, 2014
  8. raj shekar

    BartC Guest

    But it doesn't always make sense to keep thinking of offsets. Look at the
    'A'..'Z' example below, and tell me where offsets from the start of the
    array come in, on the line with rotationcypher['F'].
    Why can't we just have those 1..100 and 'a'..'z' bounds without bringing all
    that other stuff into it?

    I can write this [** not in C **]:

    ['A'..'Z']char rotationcypher
    print rotationcypher['F']

    Which I can machine-translate to C and it takes care of the offsets needed
    to make it work with C's 0-based arrays. Something like this:

    unsigned char rotationcypher[26];

    No special type attributes for the index or bounds, no templates, nothing
    special except providing the right offsets.

    Clearly such bounds are useful and can make for more readable code; why do
    you want to deny this to C programmers?
    BartC, May 8, 2014
  9. raj shekar

    David Brown Guest

    The point is that when you use 0-based arrays, you can think of offsets.

    For non-zero based arrays, the compiler hides the offset from you. In
    the 'a' .. 'z', the compiler would put the " -'a' " in for you.
    That format is pointlessly different from C style - if these sorts of
    arrays are ever to make sense in C, C++, or a C-like language, then they
    would be written in roughly the form I gave.
    I can't make sense of you here. I said specifically that arrays with
    ranges or enumerated types as indexes "would lead to clearer code and
    better compile-time checking for ranges and types". I am not "trying to
    deny this to C programmers" - I think it would be a useful enhancement
    to the language, which is why I wrote about it. But I also think that
    it is unlikely to become a part of C, especially as you can implement it
    in C++.
    David Brown, May 8, 2014
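To make the hidden " -'a' " offset concrete, here is a sketch in today's C of what a compiler would have to generate for an array declared over 'a' .. 'z'. All names are illustrative, and the 13-place rotation is just an assumed example of what such a table might hold. The code also assumes the lowercase letters are contiguous in the execution character set, which is true for ASCII but not guaranteed by the C standard.

```c
/* Rotation table indexed conceptually by 'a'..'z'; names are illustrative.
   Assumes 'a'..'z' are contiguous character codes (true for ASCII). */
static char rotation_cypher['z' - 'a' + 1];

/* Fill the table with a 13-place rotation (an assumed example payload). */
void init_rotation(void)
{
    for (int c = 'a'; c <= 'z'; c++) {
        /* The "- 'a'" is the offset a range-aware compiler would insert. */
        rotation_cypher[c - 'a'] = (char)('a' + (c - 'a' + 13) % 26);
    }
}

/* Look up a lowercase letter; anything else is returned unchanged. */
char rotate(char c)
{
    if (c < 'a' || c > 'z') {
        return c;
    }
    return rotation_cypher[c - 'a'];
}
```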
  10. raj shekar

    Stefan Ram Guest

    If intermediate pointers to unallocated memory were not a problem,
    we could have any offset we like in C:

    int a_[] ={ 1, 2, 3 }; int * a = a_ - 1; /* a now is 1-based */

    This might cause UB, but should work in many environments.
    Stefan Ram, May 8, 2014
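One way to get the same 1-based view without forming a pointer before the start of the array is to allocate one extra element and leave slot 0 unused. This is a different technique from the one in the post above, shown only as a sketch; alloc_one_based is a hypothetical helper name.

```c
#include <stdlib.h>

/* alloc_one_based is a hypothetical helper. It allocates n + 1 zeroed ints
   so that a[1]..a[n] are valid; a[0] exists but is simply left unused.
   Returns NULL on failure. Release the result with free(). */
int *alloc_one_based(size_t n)
{
    if (n == (size_t)-1) {
        return NULL;            /* n + 1 would wrap around */
    }
    return calloc(n + 1, sizeof(int));
}
```

The cost is one wasted element per array; in exchange every pointer involved stays inside the allocated object.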
  11. raj shekar

    Walter Banks Guest

    I have written a few compilers for several languages. C
    compilers are actually fairly complex compared to those for many
    other languages, largely because the language has evolved
    over time and so much of it is compiler defined but
    constrained by conventional wisdom.

    Walter Banks, May 8, 2014
  12. raj shekar

    Walter Banks Guest

    Pascal can define index ranges when an array is declared. In code
    generation the differences are trivial but it often makes applications
    very readable.

    Walter Banks, May 8, 2014
  13. raj shekar

    Walter Banks Guest

    A C implementation of arrays has many choices. For some processors,
    conflating arrays with pointers is a good approach; for others, the ISA is
    more effective at accessing arrays when the compiler manages the indices.

    Walter Banks, May 8, 2014
  14. raj shekar

    Walter Banks Guest

    There is a lot of evidence that the 4 added bits of a 36-bit float (as opposed
    to a 32-bit single precision) would change a lot of the precision problems
    for most applications.

    Something that has been missed in many processor designs is the effect of data
    widths on the ability of a processor to be used in applications. 2^N widths
    have a minimal advantage in hardware implementation. I have worked on
    several processor designs where a non-standard data width contributed
    substantially to application throughput.

    Walter Banks, May 8, 2014
  15. raj shekar

    BartC Guest

    Exactly. So why shouldn't there be the option for a lower bound that isn't
    zero? As you say, it's not that difficult for a compiler to deal with it.
    I said that was not C. I wrote it that way because it is *an actual working
    example* of a language that does C-like things yet allows any array lower
    bound. And that C output was actual output (with some types adjusted to make
    it clearer).

    In C you'd put the dimensions where it normally expects them (I don't know
    what syntax would be proposed. I use either [lower..upper] or
    [lower:length] or [length] or [lower:].)
    I think that simply having a choice of lower bound would also be a useful
    enhancement, and one considerably simpler to implement than turning C into
    Pascal or Ada at the same time, with all these type checks (which are much
    more difficult than you might think when you do it properly).

    The compiler can do range checking if it likes. But at present, an array
    like char a[100] already has a range of 0..99, and few compilers seem to
    bother checking it! (A quick test shows only 2 out of 6 checking it at
    normal warning levels. gcc doesn't seem to do it at any level, but
    doubtless there will be an option hidden away to do so.)
    A decision to switch languages is not really an answer! (And it sounds like
    C++ only manages it with some trouble.)

    Anyway not everyone likes to drag in the complexity of a C++ compiler just
    for one or two extra features which ought to be in C already.
    BartC, May 8, 2014
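Absent language support, a chosen lower bound plus a run-time range check can be approximated by hand. The following is only a sketch under that assumption; struct ranged_ints and ranged_at are hypothetical names, not part of any C library.

```c
#include <stddef.h>

/* Hypothetical descriptor for an int array whose valid indices are
   lower .. lower + length - 1. */
struct ranged_ints {
    int   *data;     /* storage, 0-based as far as C is concerned */
    long   lower;    /* chosen lower bound, e.g. 1 */
    size_t length;   /* number of elements */
};

/* Return a pointer to element `index`, or NULL when index is out of range. */
int *ranged_at(const struct ranged_ints *r, long index)
{
    if (index < r->lower) {
        return NULL;
    }
    /* Subtract in unsigned arithmetic so a very negative lower bound
       cannot cause signed overflow. */
    unsigned long offset = (unsigned long)index - (unsigned long)r->lower;
    if (offset >= r->length) {
        return NULL;
    }
    return &r->data[offset];
}
```

With storage declared as int storage[100] and a descriptor { storage, 1, 100 }, ranged_at(&r, 1) refers to storage[0], while ranged_at(&r, 0) and ranged_at(&r, 101) both return NULL.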
  16. raj shekar

    Kaz Kylheku Guest

    1. Indexing isn't necessarily counting!

    [ ] [ ] [ ]
    ^   ^   ^   ^
    0   1   2   3

    The array index is the left corner of the box. The count is the right corner.

    The index indicates: what is the displacement? How many elements are
    before this one?

    2. Counting is not one based. It is zero based. To count items, you must
    start with zero:

    count = 0

    while (uncounted items remain) {
        check off next item
        count++
    }

    If there are no items, the count is zero. With the indexing
    diagram, again, counting works like this:

    step 0: initialize

    [ ] [ ]

    step 1: first box is counted

    [ ] [ ]
    ^   ^
    0   1

    step 2: second box is counted

    [ ] [ ]
    ^   ^   ^
    0   1   2

    C certainly uses one-based counting for array length: an array
    which contains only element [0] has length 1.
    Indeed, ANSI Common Lisp has displaced arrays: array objects which virtually
    reference the data in other arrays, with displacement.

    Those were not there in the beginning; just zero-based arrays.
    Not at all. Index 0 indicates that the array is empty before we push
    the first element there.

    These concepts can coexist in a language. Lisp:

    (elt '(a b c) 0) -> a

    (first '(a b c)) -> a

    The symbols first, second, ... are never subject to scaling or
    displacement, and correspond to natural language concepts.

    Note that clocks measure the day from 00:00 to 23:59, not from 01:01
    to 24:60. People generally do not have a problem with this.

    Also, countdown timers go to zero. If you cook something with your
    microwave for 27 seconds, it starts at 27, and counts down to zero,
    once per second.

    When year 2000 rolled around, numerous people around the world
    thought that it's the start of the new millennium and celebrated.

    Those pointing out that it actually ranges from 2001 to 3000 were ridiculed as
    dweebs and party poopers.

    "Ordinary people" can, and do, regard zero based systems as natural,
    while at the same time regarding one based counting as natural also,
    depending on context.
    So "useful" and "versatile" are opposites, of sorts.
    Zero all the way, without a question.

    For instance, suppose I want to regard that list as pairs. I want to select
    either (a, b) or (c, d) based on n. It's easy: just take elements 2*n,
    and 2*n+1. If n is 1 based, I have to do algebra: 2*(n-1) and 2*(n-1)+1,
    which goes to 2n-2 and 2n-2+1 = 2n-1.

    Indexing multi-dimensionally gets even more retarded.
    Kaz Kylheku, May 8, 2014
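The pair-selection arithmetic from the post above can be written out as follows. The function names are hypothetical; in both variants the underlying list is a 0-based C array, and only the pair number n changes base.

```c
/* 0-based pair number n: pair n occupies flat indices 2n and 2n+1. */
int pair_first_zero_based(int n)  { return 2 * n; }
int pair_second_zero_based(int n) { return 2 * n + 1; }

/* 1-based pair number n: 2*(n-1) and 2*(n-1)+1, i.e. 2n-2 and 2n-1. */
int pair_first_one_based(int n)   { return 2 * n - 2; }
int pair_second_one_based(int n)  { return 2 * n - 1; }
```

For the list a, b, c, d, the 0-based n = 1 and the 1-based n = 2 both select the pair (c, d).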
  17. raj shekar

    Kaz Kylheku Guest

    Because one would hope that C programmers can work it out to:

    unsigned char cipher['Z' - 'A' + 1];

    printf("%c", cipher['F' - 'A']);
    Kaz Kylheku, May 8, 2014
  18. With zero-based arrays

    image[y*width+x] = value;

    in C you have to do this all the time.
    Malcolm McLean, May 8, 2014
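For comparison, here is a sketch of the same row-major flattening with 0-based and with 1-based coordinates. The function names flat_index0 and flat_index1 are hypothetical; the 1-based version assumes x and y are both at least 1.

```c
#include <stddef.h>

/* 0-based coordinates: x in 0..width-1, y in 0..height-1. */
size_t flat_index0(size_t x, size_t y, size_t width)
{
    return y * width + x;
}

/* 1-based coordinates: x in 1..width, y in 1..height.
   Both coordinates need a correction before they can be combined. */
size_t flat_index1(size_t x, size_t y, size_t width)
{
    return (y - 1) * width + (x - 1);
}
```

With width 4, flat_index0(3, 2, 4) and flat_index1(4, 3, 4) both give 11, the last pixel of the third row.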
  19. Ada arrays are not 1-based; you always have to define both lower and
    upper bounds. (The predefined type String happens to use an index
    subtype with a lower bound of 1.)

    I find arguments of the form "This is stupid!" "No, that's stupid!"
    intensely boring.
    Keith Thompson, May 8, 2014
  20. raj shekar

    Joe Pfeiffer Guest

    My first language (not counting BASIC in high school) was Pascal -- you
    could pick any integer upper and lower bound you wanted, and it did
    range checking. I regard this as doing it Right.
    Joe Pfeiffer, May 8, 2014
