Why index starts in C from 0 and not 1

Discussion in 'C Programming' started by kapilk, Aug 16, 2004.

  1. kapilk

    kapilk Guest

    Sir,

    I know that the array index in C starts from 0 and not 1. Can anybody
    please tell me the reason?

    Is it because the subscript can be an unsigned integer, and
    these start from 0?

    Thanks
    kapilk, Aug 16, 2004
    #1

  2. kapilk

    Allan Bruce Guest

    "kapilk" <> wrote in message
    news:...
    > Sir,
    >
    > I know that the array index starts in C from 0 and not 1 can any
    > body pls. tell me the reason.
    >
    > Is it because in the subscript i can have a unsigned integer and
    > these start from 0
    >
    > Thanks


    I think it is due to the way that the compilers work. If you have an array
    of sometype then the way to access these uses the notation

    addressOfStartOfArray + (index * sizeof(sometype))

    if the accesses were from 1, then this would add extra computation and
    therefore be slower. Also, almost every programming language adopts 0 as
    the initial index.

    Allan
    Allan Bruce, Aug 16, 2004
    #2

  3. kapilk on 16 Aug 2004 04:38:08 -0700 writes:

    > Sir,
    >
    > I know that the array index starts in C from 0 and not 1 can any
    > body pls. tell me the reason.
    >
    > Is it because in the subscript i can have a unsigned integer and
    > these start from 0


    IMHO no, it was an arbitrary decision, it just made sense that way.

    You can think of the index like a value to add to the basic pointer.

    #include <stdio.h>

    int main(void)
    {
        /* test is an array of 5 consecutive characters; in most
           expressions it converts to a pointer to its first element */
        char test[5] = {'t', 'e', 's', 't', '\n'};

        printf("%c == %c\n", test[0], *(test + 0));
        /* adding 1 points to the next character */
        printf("%c == %c\n", test[1], *(test + 1));
        printf("%c == %c\n", test[2], *(test + 2));
        printf("%c == %c\n", test[3], *(test + 3));
        return 0;
    }

    --
    Marco Parrone <> [0x45070AD6]
    Marco Parrone, Aug 16, 2004
    #3
  4. On Mon, 16 Aug 2004, kapilk wrote:

    > Sir,
    >
    > I know that the array index starts in C from 0 and not 1 can any
    > body pls. tell me the reason.
    >
    > Is it because in the subscript i can have a unsigned integer and
    > these start from 0


    I would suspect it has something to do with the fact that C is a
    language designed to work closely with the hardware architecture, and most
    assembly languages that have an indexed addressing mode start at zero.

    On the other hand, C language originated on a PDP-11. The PDP-11 assembly
    language just uses a fixed source and destination for things like
    assignment (MOV), addition (ADD), subtraction (SUB) and comparison (CMP).
    In other words, there is not address+offset mode like the C68000 or more
    modern processors.

    If you believe this is why C starts at zero you'll have to ask the
    question, why does assembly language start at zero? But you'll have to ask
    it in an assembly language newsgroup.

    --
    Send e-mail to: darrell at cs dot toronto dot edu
    Don't send e-mail to
    Does It Matter, Aug 16, 2004
    #4
  5. Does It Matter <> wrote:

    > On the other hand, C language originated on a PDP-11. The PDP-11 assembly
    > language just uses a fixed source and destination for things like
    > assignment (MOV), addition (ADD), subtraction (SUB) and comparison (CMP).
    > In other words, there is not address+offset mode like the C68000 or more
    > modern processors.


    That's incorrect (the PDP-11 has 8 addressing modes - including offsets
    from a register value).

    --
    Thomas E. Dickey
    http://invisible-island.net
    ftp://invisible-island.net
    Thomas Dickey, Aug 16, 2004
    #5
  6. Does It Matter <> wrote:

    > If you believe this is why C starts at zero you'll have to ask the
    > question, why does assembly language start at zero? But you'll have to ask
    > it in an assembly language newsgroup.


    ....or Pascal, or other languages that don't date from 1959.

    --
    Thomas E. Dickey
    http://invisible-island.net
    ftp://invisible-island.net
    Thomas Dickey, Aug 16, 2004
    #6
  7. kapilk

    Default User Guest

    kapilk wrote:
    >
    > Sir,
    >
    > I know that the array index starts in C from 0 and not 1 can any
    > body pls. tell me the reason.
    >
    > Is it because in the subscript i can have a unsigned integer and
    > these start from 0



    Probably because the array indexing operator is really syntactic sugar
    for pointer operations.


    ptr[i] == *(ptr + i);


    Obviously, when using pointer arithmetic, the first element is at ptr +
    0, so the first element when using [] to access it is ptr[0].




    Brian Rodenborn
    Default User, Aug 16, 2004
    #7
  8. > I know that the array index starts in C from 0 and not 1 can any
    >body pls. tell me the reason.
    >
    > Is it because in the subscript i can have a unsigned integer and
    >these start from 0


    My answer to this is that C starting from zero is likely to be influenced
    by a lot of *MATHEMATICS* starting from zero.

    Also, it is more likely that loading or storing an element of an array can
    be accomplished with a single machine instruction if you don't have to
    deal with the offset of 1.
    Gordon L. Burditt
    Gordon Burditt, Aug 16, 2004
    #8
  9. kapilk wrote:
    > Sir,
    >
    > I know that the array index starts in C from 0 and not 1 can any
    > body pls. tell me the reason.
    >
    > Is it because in the subscript i can have a unsigned integer and
    > these start from 0
    >


    Maybe because the index value is really the offset from the start
    of the array...

    One never knows though.

    --
    Thomas.
    Thomas Stegen, Aug 16, 2004
    #9
  10. kapilk

    Lew Pitcher Guest

    Thomas Stegen wrote:

    > kapilk wrote:
    >
    >> Sir,
    >> I know that the array index starts in C from 0 and not 1 can any
    >> body pls. tell me the reason.
    >>
    >> Is it because in the subscript i can have a unsigned integer and
    >> these start from 0
    >>

    >
    > Maybe because the index value is really the offset from the start
    > of the array...


    Bingo!

    "Rather more surprising, at least at first sight, is the fact that a reference
    to a[i] can also be written as *(a+i). In evaluating a[i], C converts it to
    *(a+i) immediately; the two forms are completely equivalent. Applying the
    operator & to both parts of this equivalence, it follows that &a[i] and a+i are
    identical: a+i is the address of the i-th element beyond a." (from Section 5.3
    of "The C Programming Language" by Brian W. Kernighan and Dennis M. Ritchie, (c)
    1978)

    So, the genesis of C has a+i being the same as &a[i]. If a is an array, then
    &a[1] is the same as a+1, and thus a+0 must be the same as &a[0]. This makes
    arrays zero based.


    This is not to say that the C standard retains this bias. Simply that it came
    from the fact that the index value of an array was really the offset of the
    specific item from the start of the array.

    --
    Lew Pitcher
    IT Consultant, Enterprise Application Architecture,
    Enterprise Technology Solutions, TD Bank Financial Group

    (Opinions expressed are my own, not my employers')
    Lew Pitcher, Aug 16, 2004
    #10
  11. Lew Pitcher wrote:

    > Thomas Stegen wrote:
    >
    >>kapilk wrote:
    >>
    >>> I know that the array index starts in C from 0 and not 1.
    >>> Can anybody please tell me the reason?
    >>>
    >>> Is it because in the subscript i can have a unsigned integer
    >>> and these start from 0


    No.

    >> Maybe because the index value is really
    >> the offset from the start of the array...

    >
    > Bingo!
    >
    > "Rather more surprising, at least at first sight,
    > is the fact that a reference to a[i] can also be written as *(a+i).
    > In evaluating a[i], C converts it to *(a+i) immediately;
    > the two forms are completely equivalent.
    > Applying the operator & to both parts of this equivalence,
    > it follows that &a[i] and a+i are identical:
    > a+i is the address of the i-th element beyond a."
    > (from Section 5.3 of "The C Programming Language"
    > by Brian W. Kernighan and Dennis M. Ritchie, (c) 1978)
    >
    > So, the genesis of C has a+i being the same as &a[i].
    > If a is an array, then &a[1] is the same as a+1,
    > and thus a+0 must be the same as &a[0]. This makes arrays zero based.
    >
    > This is not to say that the C standard retains this bias.
    > Simply that it came from the fact that the index value of an array
    > was really the offset of the specific item from the start of the array.


    You forgot to answer, "Why?"

    In order to reference element a[i],
    the computer must first calculate its address.
    If you use a [one-based] index,
    the compiler would be obliged to calculate

    (a + i - 1)

    Today, good optimizing C compilers
    would eliminate the superfluous subtraction
    but, when K & R were designing C,
    compilers usually didn't have the resources
    (fast processors and large memories)
    required to perform such optimizations.
    E. Robert Tisdale, Aug 16, 2004
    #11
  12. Does It Matter wrote:


    > On the other hand, C language originated on a PDP-11. The PDP-11 assembly
    > language just uses a fixed source and destination for things like
    > assignment (MOV), addition (ADD), subtraction (SUB) and comparison (CMP).
    > In other words, there is not address+offset mode like the C68000 or more
    > modern processors.


    This is just silly. Please check the eight addressing modes in the
    PDP-11 before posting more (just barely topical) "information."
    Martin Ambuhl, Aug 16, 2004
    #12
  13. kapilk

    Joe Wright Guest

    kapilk wrote:
    > Sir,
    >
    > I know that the array index starts in C from 0 and not 1 can any
    > body pls. tell me the reason.
    >
    > Is it because in the subscript i can have a unsigned integer and
    > these start from 0
    >
    > Thanks


    Because I like it that way! But really, it's hard to say.

    IBM was the first major OEM disk drive maker. IBM numbers tracks
    from 0 and sectors from 1. Why? Seagate, Western Digital, Maxtor,
    etc. do the same. Why?

    Bytes in a record are numbered from 0 while columns on a punch card
    number from 1. Go figure.
    --
    Joe Wright mailto:
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
    Joe Wright, Aug 17, 2004
    #13
  14. kapilk

    Dan Pop Guest

    In <> (kapilk) writes:

    > I know that the array index starts in C from 0 and not 1 can any
    >body pls. tell me the reason.


    Because the language designers decided to make array[i] an alternate
    notation for *(array + i). They could have chosen to make array[i]
    an alternate notation for *(array + i - 1), in which case array
    indices would have been 1-based, but they didn't.

    I don't know if this is an original C feature or merely inherited from
    one of its predecessors (CPL, BCPL, B).

    To someone with a solid assembly background, 0-based indexing appears as
    the most natural option, because this is how indexed addressing modes
    work on most processors supporting them. And the processor for which
    C was originally designed was no exception.

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email:
    Dan Pop, Aug 17, 2004
    #14
  15. kapilk

    Dan Pop Guest

    In <cfq6sn$q5u$2surf.net> "Allan Bruce" <> writes:

    >if the accesses were from 1, then this would add extra computation and
    >therefore be slower. Also, almost every programming language adopts 0 as
    >the initial index.


    The most popular languages at the time C was designed used 1-based
    indexing: FORTRAN, BASIC, Pascal.

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email:
    Dan Pop, Aug 17, 2004
    #15
  16. In article <cfq6sn$q5u$2surf.net>,
    Allan Bruce <> wrote:

    >I think it is due to the way that the compilers work. If you have an array
    >of sometype then the way to access these uses the notation
    >
    >addressOfStartOfArray + (index * sizeof(sometype))
    >
    >if the accesses were from 1, then this would add extra computation and
    >therefore be slower.


    Only if the compilers were particularly stupid.

    Real compilers would just produce

    (addressOfStartOfArray - sizeof(sometype)) + (index * sizeof(sometype))

    where the first parenthesized expression is known at compile time.

    C arrays start at zero because it's The Right Thing to do.

    -- Richard
    Richard Tobin, Aug 17, 2004
    #16
  17. kapilk

    boa Guest

    Richard Tobin wrote:

    > In article <cfq6sn$q5u$2surf.net>,
    > Allan Bruce <> wrote:
    >
    >
    >>I think it is due to the way that the compilers work. If you have an array
    >>of sometype then the way to access these uses the notation
    >>
    >>addressOfStartOfArray + (index * sizeof(sometype))
    >>
    >>if the accesses were from 1, then this would add extra computation and
    >>therefore be slower.

    >
    >
    > Only if the compilers were particularly stupid.
    >
    > Real compilers would just produce
    >
    > (addressOfStartOfArray - sizeof(sometype)) + (index * sizeof(sometype))
    >
    > where the first parenthesized expression is known at compile time.


    Always? Even when the "array" is a pointer to dynamically allocated memory?

    >
    > C arrays start at zero because it's The Right Thing to do.


    Agreed. ;-)

    boa@home
    boa, Aug 17, 2004
    #17
  18. In article <2hqUc.172$>,
    boa <> wrote:

    >>>addressOfStartOfArray + (index * sizeof(sometype))
    >>>
    >>>if the accesses were from 1, then this would add extra computation and
    >>>therefore be slower.


    >> Real compilers would just produce
    >>
    >> (addressOfStartOfArray - sizeof(sometype)) + (index * sizeof(sometype))
    >>
    >> where the first parenthesized expression is known at compile time.


    >Always? Even when the "array" is a pointer to dynamically allocated memory?


    True, I was assuming addressOfStartOfArray was supposed to be a constant.

    But in many common cases, other optimizations will remove the
    overhead. For example, when looping over the array, the index can be
    adjusted instead of the base.

    -- Richard
    Richard Tobin, Aug 17, 2004
    #18
  19. (Dan Pop) writes:
    > In <cfq6sn$q5u$2surf.net> "Allan Bruce"
    > <> writes:
    >
    > >if the accesses were from 1, then this would add extra computation and
    > >therefore be slower. Also, almost every programming language adopts 0 as
    > >the initial index.

    >
    > The most popular languages at the time C was designed used 1-based
    > indexing: FORTRAN, BASIC, Pascal.


    Quibble: Pascal allows arrays to be based however the user specifies.
    For example (if I remember the syntax correctly):

    type
    My_Array = array[37 .. 42] of Integer;

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Aug 17, 2004
    #19
  20. kapilk

    kal Guest

    (Richard Tobin) wrote in message news:<cftdmd$1k89$>...

    > True, I was assuming addressOfStartOfArray was supposed to be a constant.
    >
    > But in many common cases, other optimizations will remove the
    > overhead. For example, when looping over the array, the index can be
    > adjusted instead of the base.


    Not a satisfactory explanation. Your earlier statement was incorrect.
    kal, Aug 18, 2004
    #20