String and Character Array...

Discussion in 'C Programming' started by Shhnwz.a, Dec 5, 2006.

  1. Shhnwz.a

    Shhnwz.a Guest

    Hi,
    I am confused about the terminology.
    When is it technically correct to say "string" and when "character array" in C?
    Please give me your perspectives on this issue.
    Thanks in advance.
     
    Shhnwz.a, Dec 5, 2006
    #1

  2. Shhnwz.a

    santosh Guest

    Shhnwz.a wrote:
    > Hi,
    > I am in confusion regarding jargons.
    > When it is technically correct to say.. String or Character Array.in c.
    > just give me your perspectives in this issue.
    > Thanx in Advance.


    In C, all strings are character arrays but the reverse is not always
    true. For example:
    char carray[BUFSIZ];
    declares an array of BUFSIZ characters.
    char string[] = "Hello";
    declares an array of type char having six elements. It can also be
    treated as a string as long as string[5] == '\0' is true.

    If you do string[5] = 'o', then string becomes an array of char but is
    no longer a string. It will invoke undefined behaviour if passed to
    functions that expect a nul terminated string.

    In short, a string is an array of characters, terminated by a nul
    character. A plain array of characters has no such requirement.
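
    A minimal sketch of that check, assuming you know the size of the
    array: the helper name holds_string is made up for illustration and
    is not a standard function. It only asks whether a '\0' occurs
    somewhere inside the array's bounds.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the first n bytes at buf contain a '\0',
   i.e. the array currently holds a string. memchr never reads past n
   bytes, so this is safe even when no terminator is present. */
static bool holds_string(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}
```

    Given char string[] = "Hello"; the call holds_string(string, sizeof
    string) is true. After string[5] = 'o' it is false, and passing string
    to strlen() at that point would be undefined behaviour.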
     
    santosh, Dec 5, 2006
    #2

  3. "Shhnwz.a" <> writes:
    > I am in confusion regarding jargons.
    > When it is technically correct to say.. String or Character Array.in c.
    > just give me your perspectives in this issue.


    A character array is a type (or an object of that type). For example,
    "char[10]" is a particular character array type; given "char
    obj[10];", "obj" is an object of that type.

    A string in C is not a data type; it's a data layout. As the standard
    says:

    A _string_ is a contiguous sequence of characters terminated by
    and including the first null character.
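
    One way to see "layout, not type" is that a string can live in
    storage that was never declared as a character array at all. The
    sketch below is only an illustration; heap_hello is a hypothetical
    name, and the caller is assumed to free the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns allocated storage holding the string
   "Hello", or NULL if allocation fails. No array object is declared
   anywhere, yet the bytes have exactly the layout the standard calls
   a string. The caller is responsible for calling free(). */
static char *heap_hello(void)
{
    char *p = malloc(6);          /* 5 characters plus the '\0' */
    if (p != NULL)
        memcpy(p, "Hello", 6);    /* the 6th byte copied is the '\0' */
    return p;
}
```

    strlen() and the other string functions work on the returned pointer
    because they care only about the bytes up to the first null
    character, not about how the storage was declared.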

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Dec 5, 2006
    #3
  4. Shhnwz.a said:

    > Hi,
    > I am in confusion regarding jargons.
    > When it is technically correct to say.. String or Character Array.in c.


    An array is a cardboard box[1]. A string is something you might put in it.
    You might put something else in there instead, of course - or the box might
    just be full of junk - but it's kind of made for a string, just like
    egg-boxes are kind of made for eggs.


    [1] ...except that, nowadays, arrays are hardly ever made from cardboard.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
     
    Richard Heathfield, Dec 5, 2006
    #4
  5. Shhnwz.a

    jaysome Guest

    On Tue, 05 Dec 2006 06:47:47 GMT, Keith Thompson <>
    wrote:

    >"Shhnwz.a" <> writes:
    >> I am in confusion regarding jargons.
    >> When it is technically correct to say.. String or Character Array.in c.
    >> just give me your perspectives in this issue.

    >
    >A character array is a type (or an object of that type). For example,
    >"char[10]" is a particular character array type; given "char
    >obj[10];", "obj" is an object of that type.
    >
    >A string in C is not a data type; it's a data layout. As the standard
    >says:
    >
    > A _string_ is a contiguous sequence of characters terminated by
    > and including the first null character.


    And here is an _object_ of a character array _type_ initialized to a
    _string_:

    char obj[10] = "Hello!";

    --
    jay
     
    jaysome, Dec 5, 2006
    #5
  6. Shhnwz.a

    santosh Guest

    Keith Thompson wrote:
    > "Shhnwz.a" <> writes:
    > > I am in confusion regarding jargons.
    > > When it is technically correct to say.. String or Character Array.in c.
    > > just give me your perspectives in this issue.

    >
    > A character array is a type (or an object of that type). For example,
    > "char[10]" is a particular character array type; given "char
    > obj[10];", "obj" is an object of that type.


    But it's not a first class type. Also dynamic arrays are an exception.
     
    santosh, Dec 5, 2006
    #6
  7. "santosh" <> writes:
    > Keith Thompson wrote:
    >> "Shhnwz.a" <> writes:
    >> > I am in confusion regarding jargons.
    >> > When it is technically correct to say.. String or Character Array.in c.
    >> > just give me your perspectives in this issue.

    >>
    >> A character array is a type (or an object of that type). For example,
    >> "char[10]" is a particular character array type; given "char
    >> obj[10];", "obj" is an object of that type.

    >
    > But it's not a first class type.


    That's probably true for any reasonable definition of "first class",
    but it's not at all clear what that means. The language defines
    different sets of operations for different types; array types just
    happen to have relatively fewer defined operations than other types.

    We can't add pointers, and we can't compare structures. Maybe
    pointers are second class, structures are third class, and arrays are
    fourth class?

    > Also dynamic arrays are an exception.


    Um, an exception to what?

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Dec 5, 2006
    #7
  8. Shhnwz.a

    pete Guest

    jaysome wrote:
    >
    > On Tue, 05 Dec 2006 06:47:47 GMT, Keith Thompson <>
    > wrote:
    >
    > >"Shhnwz.a" <> writes:
    > >> I am in confusion regarding jargons.
    > >> When it is technically correct to say..
    > >> String or Character Array.in c.
    > >> just give me your perspectives in this issue.

    > >
    > >A character array is a type (or an object of that type).
    > >For example,
    > >"char[10]" is a particular character array type; given "char
    > >obj[10];", "obj" is an object of that type.
    > >
    > >A string in C is not a data type;
    > > it's a data layout. As the standard says:
    > >
    > > A _string_ is a contiguous sequence of characters terminated by
    > > and including the first null character.

    >
    > And here is an _object_ of a character array _type_ initialized to a
    > _string_:
    >
    > char obj[10] = "Hello!";


    There's actually seven different strings contained
    within that object.

    strlen(obj)
    strlen(obj + 6)

    --
    pete
     
    pete, Dec 6, 2006
    #8
  9. pete <> writes:
    > jaysome wrote:

    [...]
    >> And here is an _object_ of a character array _type_ initialized to a
    >> _string_:
    >>
    >> char obj[10] = "Hello!";

    >
    > There's actually seven different strings contained
    > within that object.
    >
    > strlen(obj)
    > strlen(obj + 6)


    Very cute. But there are actually ten of them; you missed:

    strlen(obj + 7)
    strlen(obj + 8)
    strlen(obj + 9)

    (The array elements not explicitly initialized are set to zero, i.e.,
    '\0'.)
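
    The count can be checked mechanically. This is just a sketch, with
    count_strings as a made-up helper name: it counts the offsets in an
    array from which a '\0' is reachable before the end of the array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the positions in buf[0..n-1] from which a
   '\0' can be reached without leaving the array, i.e. the number of
   distinct strings that start inside the array. */
static size_t count_strings(const char *buf, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Search only the n - i bytes that remain in the array. */
        if (memchr(buf + i, '\0', n - i) != NULL)
            count++;
    }
    return count;
}
```

    For char obj[10] = "Hello!"; the call count_strings(obj, sizeof obj)
    gives 10: offsets 0 through 5 start strings of length 6 down to 1,
    and offsets 6 through 9 each start an empty string.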

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Dec 6, 2006
    #9
  10. Shhnwz.a

    Guest

    Shhnwz.a wrote:
    > I am in confusion regarding jargons.
    > When it is technically correct to say.. String or Character Array.in c.
    > just give me your perspectives in this issue.


    Well, in C the issue is highly clouded because it's not done in the same
    way in any other language, and the syntax doesn't lead to an obvious
    disambiguation between the two.

    A C language string is a sequence of characters, and is supposed to be
    terminated by the first '\0' appearing at the character beyond the end
    of the string. Thus, the length of the string is dependent on its
    content (i.e., where you first find the '\0').

    An array in C is a sequence of base type elements with a given size,
    but whose contents are arbitrary. So an array of characters may be 100
    elements long with half of them being the value '\0'. I.e., the length
    of the array is not dependent on its contents.

    Now the confusion arises because C does not have any kind of special
    container for where strings are stored. In fact, the only sensible way
    of doing things in C is to store the string inside of a character
    array. Sometimes this storage is implicit and sometimes it's explicit.
    The problem is that you can declare an array of characters, and store a
    string in it, but then you refer to the variable name for the array of
    characters and use it as if it were the string. So you see the
    difference between the two has to be conceptual and semantic -- you
    don't even have a separation of syntax to help you.

    So:

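    #include <string.h>

    char a[100] = "Hello ";

    /* Appends "world\n" to the string that b points to. */
    void foo (char * b) {
        char c[] = "world\n";   /* array of 7 chars, including the '\0' */
        char * d;
        d = b + strlen(b);      /* d points at b's terminating '\0' */
        strcpy (d, c);          /* overwrite from there, copying c's '\0' too */
    }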

    The variable a is an array of 100 characters which happens to hold the
    string "Hello ". The c variable is an array whose length is 7 and
    holds the string "world\n" -- this is just a special case in C that
    lets the length of an array be determined by its initializer, with C
    recognizing the "world\n" initializer as basically being { 'w', 'o',
    'r', 'l', 'd', '\n', '\0' }. d is, of course, a pointer, which can
    point inside of arrays, or any other valid storage. The parameter b is
    a pointer to a character -- it is not enforced by the language to point
    to a character array, nor is it required that the contents be a valid
    string (it may in fact be NULL). The line d = b + strlen(b) interprets
    b *as if* b were pointing to valid string contents, and computes the
    pointer which points to one position past the end of the string b
    (i.e., it should be pointing to a '\0'). Then the strcpy() call at the
    end interprets both d and c as if they were strings, and will
    implicitly end up using the tail of b's storage container to hold the
    final concatenated result.

    So whether or not a, b, or c is a string, depends on how you use them.
    If at some point in the foo function, I had put c[2] = (char) rand();
    then we might interpret that as using c in an array-like fashion
    (rand() might output 0). If we invoke the expressions "sizeof (a)" or
    "sizeof (c)" we would also see that these array interpretations are
    not affected by the length of the string content they may contain.
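
    The two views can be put side by side in a bounds-checked append.
    This is a sketch under stated assumptions, not something from the
    standard library: append_checked is a hypothetical name, and dst is
    assumed to already hold a string within its first dst_size bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends the string src to the string in dst only
   if the result, including its '\0', fits in the dst_size bytes of the
   underlying array. Returns false and leaves dst untouched otherwise.
   Assumes dst already holds a string within dst_size bytes. */
static bool append_checked(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);        /* string view: depends on content */
    size_t need = strlen(src) + 1;    /* characters plus the terminator */

    if (need > dst_size - used)       /* array view: fixed capacity */
        return false;
    memcpy(dst + used, src, need);    /* overwrites the old '\0' */
    return true;
}
```

    With char a[100] = "Hello "; the call append_checked(a, sizeof a,
    "world\n") succeeds, strlen(a) changes from 6 to 12, and sizeof a is
    100 both before and after.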

    Learning all this nonsense is kind of a rite of passage for becoming a
    C programmer. It might give you an appreciation for one particular
    twisted way of doing low-level string programming. What is often not
    stated in all this, is the fact that this manifestation of strings is
    both slow and error-prone. There is no reward for the pain of doing
    things this way. It's just penance without purpose. But such is the
    way things are in the C language.

    If you have learnt strings from another programming language, then
    chances are you will find my alternate implementation of strings for
    the C language far more intuitive and sensible: http://bstring.sf.net/
    (However, if you are a beginner just learning C you still need to
    learn about memory allocation semantics to use it without issue.) In
    "The Better String Library" strings are their own type and are not
    directly interchangeable with character arrays, so the distinction
    between the two is completely obvious.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    , Dec 6, 2006
    #10
  11. writes:
    > Shhnwz.a wrote:
    >> I am in confusion regarding jargons.
    >> When it is technically correct to say.. String or Character Array.in c.
    >> just give me your perspectives in this issue.

    >
    > Well, in C the issue is highly clouded because it's not done in the same
    > way in any other language, and the syntax doesn't lead to an obvious
    > disambiguation between the two.


    I disagree (see below).

    > A C language string is a sequence of characters, and is supposed to be
    > terminated by the first '\0' appearing at the character beyond the end
    > of the string. Thus, the length of the string is dependent on its
    > content (i.e., where you first find the '\0').


    Yes, except that the '\0' isn't beyond the end of the string; it's
    part of the string.

    C99 7.1.1p1:
    A _string_ is a contiguous sequence of characters terminated by
    and including the first null character.

    > An array in C is a sequence of base type elements with a given size,
    > but whose contents are arbitrary. So an array of characters may be 100
    > elements long with half of them being the value '\0'. I.e., the length
    > of the array is not dependent on its contents.


    Yes.

    But I don't agree that the distinction is all that difficult. You
    just have to realize that an array is a data *type*, and a string is a
    data *format*. And you can naturally use an array to store a string
    (or to store data that might not be a string).

    I acknowledge that C strings have their share of problems, but
    understanding just what a "string" is shouldn't be one of them.
    (And yes, a "string" is a data type in some other languages.)
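
    As an illustration of an array storing data that is not a string,
    consider a fixed-width field. Everything named here (struct record,
    code_to_string) is hypothetical. The field is exactly 4 characters
    with no terminator, and a precision in the format keeps snprintf
    from reading past it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record with a fixed-width code field: exactly 4
   characters, with no room for (and no need of) a terminating '\0'. */
struct record {
    char code[4];
};

/* Hypothetical helper: writes the field into out as a proper string.
   The "%.*s" precision limits snprintf to sizeof r->code bytes, so the
   field itself does not have to be null-terminated. Returns what
   snprintf returns. */
static int code_to_string(const struct record *r, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%.*s", (int)sizeof r->code, r->code);
}
```

    Here code is a character array throughout but never a string; only
    the out buffer ends up holding one.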

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Dec 6, 2006
    #11
  12. Shhnwz.a

    pete Guest

    Keith Thompson wrote:
    >
    > pete <> writes:
    > > jaysome wrote:

    > [...]
    > >> And here is an _object_ of a character array _type_ initialized to a
    > >> _string_:
    > >>
    > >> char obj[10] = "Hello!";

    > >
    > > There's actually seven different strings contained
    > > within that object.
    > >
    > > strlen(obj)
    > > strlen(obj + 6)

    >
    > Very cute. But there are actually ten of them; you missed:
    >
    > strlen(obj + 7)
    > strlen(obj + 8)
    > strlen(obj + 9)
    >
    > (The array elements not explicitly initialized are set to zero, i.e.,
    > '\0'.)


    YOU OUTCUTED ME !!!

    --
    pete
     
    pete, Dec 6, 2006
    #12
  13. Shhnwz.a

    rkk Guest

    Please refer to the c-faq.com website for more details about this. It has
    a lot of information about this and may clear up your confusion. Click the
    link below:
    http://c-faq.com/aryptr/practdiff.html

    Good luck

    pete wrote:
    > Keith Thompson wrote:
    > >
    > > pete <> writes:
    > > > jaysome wrote:

    > > [...]
    > > >> And here is an _object_ of a character array _type_ initialized to a
    > > >> _string_:
    > > >>
    > > >> char obj[10] = "Hello!";
    > > >
    > > > There's actually seven different strings contained
    > > > within that object.
    > > >
    > > > strlen(obj)
    > > > strlen(obj + 6)

    > >
    > > Very cute. But there are actually ten of them; you missed:
    > >
    > > strlen(obj + 7)
    > > strlen(obj + 8)
    > > strlen(obj + 9)
    > >
    > > (The array elements not explicitly initialized are set to zero, i.e.,
    > > '\0'.)

    >
    > YOU OUTCUTED ME !!!
    >
    > --
    > pete
     
    rkk, Dec 6, 2006
    #13
  15. "rkk" <> writes:
    > pete wrote:
    >> Keith Thompson wrote:
    >> >
    >> > pete <> writes:
    >> > > jaysome wrote:
    >> > [...]
    >> > >> And here is an _object_ of a character array _type_ initialized to a
    >> > >> _string_:
    >> > >>
    >> > >> char obj[10] = "Hello!";
    >> > >
    >> > > There's actually seven different strings contained
    >> > > within that object.
    >> > >
    >> > > strlen(obj)
    >> > > strlen(obj + 6)
    >> >
    >> > Very cute. But there are actually ten of them; you missed:
    >> >
    >> > strlen(obj + 7)
    >> > strlen(obj + 8)
    >> > strlen(obj + 9)
    >> >
    >> > (The array elements not explicitly initialized are set to zero, i.e.,
    >> > '\0'.)

    >>
    >> YOU OUTCUTED ME !!!

    >
    > please refer to c-faq.com website for more details about this. It has
    > lot of information about this & may avoid your confusion. Click the
    > link below:
    > http://c-faq.com/aryptr/practdiff.html
    >
    > Good luck


    Please don't top-post (I've corrected it here). Read the following:

    http://www.caliburn.nl/topposting.html
    http://www.cpax.org.uk/prg/writings/topposting.php

    I don't believe anyone you quoted was confused about the distinction
    between strings and character arrays. The FAQ entry you cited is
    about the distinction between arrays and pointers, which isn't really
    germane.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Dec 6, 2006
    #15