Pointer Declaration/Array definition

Discussion in 'C Programming' started by ur8x@ur8x.com, Aug 22, 2004.

  1. Guest

    Why does this declaration give undefined result:

    file1: extern char * p;
    file2: char p[10];

    Let's assume p has been initialized, now accessing p[i]...
    , Aug 22, 2004
    #1

  2. <> wrote in message
    news:cg9amc$ogh$...
    |
    | Why does this declaration give undefined result:
    |
    | file1: extern char * p;
    Allocates memory to store a pointer, which may later be changed
    to refer to any memory location.
    | file2: char p[10];
    Allocates memory for 10 characters at a fixed address.
    When the variable p is used, the array is implicitly
    converted to a pointer to the first element of the array.
    |
    | Let's assume p has been initialized, now accessing p[i]...

    What is supposed to happen if code that includes
    file1 contains a statement such as:
    p = NULL;


    hth,
    Ivan
    --
    http://ivan.vecerina.com/contact
    Ivan Vecerina, Aug 22, 2004
    #2

  3. CBFalconer Guest

    wrote:
    >
    > Why does this declaration give undefined result:
    >
    > file1: extern char * p;
    > file2: char p[10];
    >
    > Let's assume p has been initialized, now accessing p[i]...


    file1 thinks that p is a pointer to a char. file2 thinks that p
    is an array of 10 chars. This is why the "extern char *p;" should
    be in a header file that is included in both file1 and file2, and
    then the compiler would complain. This follows the simple
    principle that header files are used to export things other
    modules need to know about.
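
    A minimal single-file sketch of that header idea, with the header's
    contents inlined for brevity. The header name "p_decl.h" and the
    helper char_at() are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* --- what a shared header (hypothetically "p_decl.h") would contain --- */
extern char p[10];          /* declaration: p is an array of 10 chars */

/* --- file2.c: the one definition; it sees the declaration above --- */
char p[10] = "ABCDEFG";

/* Had the header said `extern char *p;` instead, the definition above
 * would clash with it in this translation unit, and the compiler would
 * have to issue a diagnostic (typically "conflicting types"). */

/* --- file1.c: a user of p that relies only on the declaration --- */
char char_at(size_t i)
{
    assert(i < sizeof p);   /* sizeof works: the array type is complete */
    return p[i];
}
```

    Because both files see the same declaration, a mismatch is caught at
    compile time instead of turning into undefined behaviour at run time.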

    --
    fix (vb.): 1. to paper over, obscure, hide from public view; 2.
    to work around, in a way that produces unintended consequences
    that are worse than the original problem. Usage: "Windows ME
    fixes many of the shortcomings of Windows 98 SE". - Hutchison
    CBFalconer, Aug 22, 2004
    #3
  4. -berlin.de Guest

    wrote:

    > Why does this declaration give undefined result:


    > file1: extern char * p;
    > file2: char p[10];


    Other people have already explained why this won't work: a char
    array and a char pointer are very different things with not much
    in common. I guess your confusion comes from the fact that under
    certain conditions the name of an array is treated as if it were
    a pointer to (the first element of) the array, e.g. in

    char p[ ] = "hello world";
    char *pp = p;

    But this only happens when the array is used in "value context",
    i.e. if it is used as if it had a value. Then, and only then, it
    is taken to mean (often called "it decays into") the address of
    the first element of the array.

    But in

    extern char *p;

    'p' isn't used in "value context" (the compiler doesn't even know
    that somewhere else an array of chars named 'p' was defined, since
    that's in a different source file), so the "decay to pointer" rule
    doesn't get involved.
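
    A small sketch of where the decay does and does not happen. The
    function names are made up for this example; sizeof is used here as
    one well-known context in which the array is not converted.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof applied to the array itself: no decay, so this is the array size. */
size_t size_of_array(void)
{
    char a[] = "hello";
    return sizeof a;                 /* 6: five letters plus the '\0' */
}

/* In a value context the array name yields a pointer to its first element. */
int decays_to_first_element(void)
{
    char a[] = "hello";
    char *pp = a;                    /* same as: char *pp = &a[0]; */
    assert(pp[0] == 'h');
    return pp == &a[0] && sizeof pp == sizeof(char *);
}
```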
    Regards, Jens
    --
    \ Jens Thoms Toerring ___ -berlin.de
    \__________________________ http://www.toerring.de
    -berlin.de, Aug 22, 2004
    #4
  5. Guest

    Ivan Vecerina <> wrote:
    > <> wrote in message
    > news:cg9amc$ogh$...
    > |
    > | Why does this declaration give undefined result:
    > |
    > | file1: extern char * p;
    > Allocates memory to store a pointer, which may later be changed
    > to refer to any memory location.
    > | file2: char p[10];
    > Allocates memory for 10 characters at a fixed address.
    > When the variable p is used, the array is implicitly
    > converted to a pointer to the first element of the array.
    > |
    > | Let's assume p has been initialized, now accessing p[i]...


    Yes, well, if p is NULL, accessing p[i] wouldn't make sense.
    But let's say p[] has been initialized. If the array is
    implicitly converted to a pointer to the first element, shouldn't
    pointer arithmetic get us to p + i * sizeof(char)?



    > What is supposed to happen if code that includes
    > file1 contains a statement such as:
    > p = NULL;



    > hth,
    > Ivan
    > --
    > http://ivan.vecerina.com/contact
    , Aug 22, 2004
    #5
  6. Guest

    Ok, here is what I want to know: What exactly happens when
    p[i] is evaluated, as far as accessing and dereferencing go,
    that makes the code wrong (yes, I know it should not work, I
    just want to know why)?

    Thanks.


    -berlin.de wrote:
    > wrote:


    >> Why does this declaration give undefined result:


    >> file1: extern char * p;
    >> file2: char p[10];


    > Other people already explained why this won't work, i.e. because
    > a char array and a char pointer are very different things, having
    > not much in common. I guess your confusion is coming from the
    > fact that under certain conditions the name of an array is dealt
    > with as if it would be a pointer to (the first element of) the
    > array, e.g. in


    > char p[ ] = "hello world";
    > char *pp = p;


    > But this only happens when the array is used in "value context",
    > i.e. if it is used as if it had a value. Then, and only then, it
    > is taken to mean (often called "it decays into") the address of
    > the first element of the array.


    > But in


    > extern char *p;


    > 'p' isn't used in "value context" (the compiler even doesn't know
    > that somewhere else an array of chars named 'p' was defined since
    > that's in a different source file), so the "decay to pointer" rule
    > doesn't get involved.
    > Regards, Jens
    > --
    > \ Jens Thoms Toerring ___ -berlin.de
    > \__________________________ http://www.toerring.de
    , Aug 22, 2004
    #6
  7. -berlin.de Guest

    wrote:
    > -berlin.de wrote:


    Please be so kind not to top-post.

    > Ok, here is what I want to know: What exactly happens when
    > p[i] is evaluated, as far as accessing and dereferencing go,
    > that makes the code wrong (yes, I know it should not work, I
    > just want to know why)?


    In the process of compiling and linking the symbol 'p' will
    get replaced by a certain memory address. The code in file2
    knows that at this address there's a string, e.g. "ABCDEFG".
    But the code in file1 assumes that at that address a pointer
    to char is stored. Since you have "ABCDEFG" at that address,
    the code in file1 will interpret the value stored there as
    an address like 0x41424344 (assuming you have 4-byte-wide
    addresses on a big-endian machine and the ASCII charset, so
    0x41 == 'A' etc.). But that's of course no address, just
    the bit pattern of the start of the string. If you then use
    'p[i]' it tries to dereference that address (0x41424344 + i),
    an address to which you probably have no access, and thus
    you typically get a segmentation fault.
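
    To make the bit pattern concrete, here is a rough sketch that
    combines the first four bytes of the string the way a big-endian
    machine with 4-byte addresses would. It assumes ASCII, and the
    function name is invented for this example. It deliberately never
    forms or dereferences a pointer from the result.

```c
#include <assert.h>
#include <stdint.h>

/* Combine four bytes the way a big-endian machine with 4-byte
 * addresses would read them as one word. No pointer is ever formed
 * or dereferenced here; this only shows the resulting bit pattern. */
uint32_t bytes_as_big_endian_word(const char *bytes)
{
    const unsigned char *b = (const unsigned char *)bytes;
    assert(bytes != 0);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

    With ASCII, passing "ABCDEFG" gives 0x41424344, since 'A' is 0x41,
    'B' is 0x42 and so on. That number is what file1 would treat as the
    address stored in p.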
    Regards, Jens
    --
    \ Jens Thoms Toerring ___ -berlin.de
    \__________________________ http://www.toerring.de
    -berlin.de, Aug 22, 2004
    #7
  8. <> wrote in message
    news:cgaeug$p5e$...
    | Ivan Vecerina <> wrote:
    | > <> wrote in message
    | > news:cg9amc$ogh$...
    | > |
    | > | Why does this declaration give undefined result:
    | > |
    | > | file1: extern char * p;
    | > Allocates memory to store a pointer, which may later be changed
    | > to refer to any memory location.

    | > | file2: char p[10];
    | > Allocates memory for 10 characters at a fixed address.
    | > When the variable p is used, the array is implicitly
    | > converted to a pointer to the first element of the array.
    | > |
    | > | Let's assume p has been initialized, now accessing p[i]...
    |
    | Yes, well, if p is NULL, accessing p[i] wouldn't make sense.
    | But let's say p[] has been initialized. If the array is
    | implicitly converted to a pointer to the first element, shouldn't
    | pointer arithmetic get us to p + i * sizeof(char)?

    This "implicit conversion" is performed by the compiler, when
    it knows that an array is being used as if it were a pointer.
    But the generated code and memory layout are very different.

    The assembly pseudocode for p[1] looks like:
     - if p is an array:
        1) load the address of p in register A
        2) increment register A
        3) read the byte at address A
     - if p is a pointer:
        1) load the address of p in register A
        2) load the pointer at address A into register B
        3) increment register B
        4) read the byte at address B
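
    The two step lists above correspond roughly to the following C. This
    is only a sketch with made-up names (arr, ptr and the two functions);
    what a real compiler emits depends on the target and optimizer.

```c
#include <assert.h>

static char arr[10] = "ABCDEFG";   /* the array: ten bytes of storage   */
static char *ptr = arr;            /* the pointer: holds arr's address  */

/* Array case: address of arr, plus 1, read one byte. */
char second_via_array(void)
{
    return arr[1];
}

/* Pointer case: load the value stored in ptr, add 1, read one byte. */
char second_via_pointer(void)
{
    assert(ptr != 0);
    return ptr[1];
}
```

    Both return 'B' here, but only because ptr really does hold the
    address of arr. In the file1/file2 mismatch, the "load the pointer"
    step reads the array's characters instead of an address.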

    The memory layout is what my previous comments were trying
    to explain (left quoted above).


    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
    Brainbench MVP for C++ <> http://www.brainbench.com
    Ivan Vecerina, Aug 22, 2004
    #8
  9. Guest

    > get replaced by a certain memory address. The code in file2
    > knows that at this address there's a string, e.g. "ABCDEFG".
    > But the code in file1 assumes that at that address a pointer
    > to char is stored. Since you have "ABCDEFG" at that address,
    > the code in file1 will interpret the value stored there as
    > an address like 0x41424344 (assuming you have 4-byte-wide
    > addresses on a big-endian machine and the ASCII charset, so
    > 0x41 == 'A' etc.). But that's of course no address, just
    > the bit pattern of the start of the string. If you then use
    > 'p[i]' it tries to dereference that address (0x41424344 + i),
    > an address to which you probably have no access, and thus
    > you typically get a segmentation fault.


    Excellent, so p is treated as if it were holding the address
    of the actual data intended to be read. Thanks.

    P.S. Sorry about the top-posting, I just switched my default
    editor to emacs.
    , Aug 22, 2004
    #9
  10. Guest

    Ivan Vecerina <> wrote:

    > This "implicit conversion" is performed by the compiler, when
    > it knows that an array is being used as if it were a pointer.
    > But the generated code and memory layout is very different.


    > For the array, the assembly pseudocode for p[1] looks like:
    > - if p is an array:
    > 1) load the address of p in register A
    > 2) increment register A
    > 3) read the byte at address A
    > - if p is a pointer:
    > 1) load the address of p in register A
    > 2) load the pointer at address A into register B
    > 3) increment register B
    > 4) read the byte at address B


    > The memory layout is what my previous comments were trying
    > to explain (left quoted above).


    Thanks. Referring to some other posts, is this "implicit
    conversion" also known as the "decaying convention"?
    , Aug 22, 2004
    #10
  11. Chris Torek Guest

    In article <news:cgaf5k$p5e$> <> wrote:
    >Ok, here is what I want to know: What exactly happens when
    >p[i] is evaluated, as far as accessing and dereferencing go,
    >that makes the code wrong (yes, I know it should not work, I
    >just want to know why)?


    In some cases, a picture is worth a thousand words. (Be sure to
    view this in a fixed-width font.)

    void f(void) {
        char a[6] = { '1', '2', '3' };
        char *p;
        ...
    }

    +-----------------------------------+
    | '1' | '2' | '3' |  0  |  0  |  0  |
    +-----------------------------------+


    +-------------------+         /------------->
    | <garbage address> |---------/
    +-------------------+

    The larger box represents "a", which is made up of six bytes (each
    char in C is a "C byte", always). The six bytes have known values
    because we initialized "a".

    The smaller box represents p, the pointer. We did not initialize
    it, so (assuming these are inside a function, as in the example
    code) it is full of trash. If viewed as a pointer, the result is
    unpredictable -- in this case I have drawn it as a "wild pointer"
    pointing off into the weeds somewhere.

    Now, if we set p to point to the first element of "a":

    p = &a[0];

    we get a new picture:

    +-----------------------------------+
    | '1' | '2' | '3' |  0  |  0  |  0  |
    +-----------------------------------+
       ^
       |
       +--------------------+
                            |
    +-------------------+   |
    | <valid address>  -|---+
    +-------------------+

    Now p contains an arrow pointing to &a[0].

    When you write a[i], the compiler says to itself: "aha, `a', that
    is declared as an array, and you want to do something with the
    `value' of `a' -- index it like an array, in this case -- so I will
    construct a pointer pointing to &a[0] and use that."

    This special rule about arrays is a quirk of C. Many other languages
    are very different in their treatment of arrays. There is no
    fundamental reason the C language *has* to work this way; it just
    does. That means that you simply have to memorize this rule. It
    is a thing you have to know about C that has no reason other than
    "the guy who wrote the language decided to do it that way" -- rather
    like the syntax for declarations.

    On the other hand, when you write p[i], the compiler takes the
    pointer value p already has -- here, pointing to &a[0] -- and
    follows the arrow and then "moves right" according to the number
    in "i". Moreover, if you have the variable "p", you can set it
    to point to some place other than &a[0]:

    p = &a[2];

    makes p point to the '3', and p[1] is the first 0 (or '\0' -- same
    thing) byte, while p[-2] and p[-1] now exist, naming the '1' and
    '2' in a[0] and a[1] respectively. This is because the compiler
    generates code that follows the arrow and then "moves right" as
    requested, and you have already moved right -- which lets you move
    left again, if you want to.
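
    A short sketch of the "move right, then move left again" point,
    using the same array contents as the picture. The function name
    at_offset() is hypothetical.

```c
#include <assert.h>

/* Returns p[k] where p points at a[2]; valid k runs from -2 to 3. */
char at_offset(int k)
{
    static char a[6] = { '1', '2', '3' };   /* remaining bytes are 0 */
    char *p = &a[2];                         /* p points to the '3'  */
    assert(k >= -2 && k <= 3);               /* stay inside a[0..5]  */
    return p[k];
}
```

    Here at_offset(0) is '3', at_offset(1) is 0, and at_offset(-2) and
    at_offset(-1) are '1' and '2'. Negative indices are fine as long as
    the result still lands inside the array.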

    The difference between using a pointer ("p") and using the array
    name ("a"), then, is that when you use the array name, the compiler
    has to take an extra step to *construct* the pointer it needs, just
    so that it can then follow the pointer. Curiously, this extra work
    *can* (not necessarily "does", just "can") result in faster machine
    code. The reason is that the compiler is allowed to know a lot
    more about the pointer it constructed here, *because it constructed
    it*. It is not some unknown pointer taken in off the street, with
    a mysterious and shady background. The constructed pointer has a
    solid pedigree. Of course, given a local variable like "p", a
    smart compiler can probably look around and figure out whether "p"
    has a similar pedigree -- so on *good* compilers, there tends to
    be little if any performance difference. On not-so-good compilers,
    it is hard to tell which will be faster -- the array, because the
    compiler knows about the pointer it makes, or the pointer, because
    the compiler does not have to do the extra "make a pointer" step.
    Or perhaps neither will be faster there, either.

    The moral of the "performance story" above, as it were, is: use
    whichever one is clearer to the human programmer. On a good compiler
    it will make no real difference, and on a bad one, you cannot predict
    what kind of difference it will make.

    For more on The Rule about arrays and pointers in C, see also
    <http://web.torek.net/torek/c/pa.html>.
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Aug 22, 2004
    #11
  12. <> wrote in message
    news:cgapcd$t35$...
    | Ivan Vecerina <> wrote:
    | > This "implicit conversion" is performed by the compiler, when
    | > it knows that an array is being used as if it were a pointer.
    | > But the generated code and memory layout is very different.
    ....
    | Thanks. Referring to some other posts, is this "implicit
    | conversion" also known as the "decaying convention"?

    Some like to say that arrays "decay" into pointers,
    which illustrates the fact that the conversion is
    not (easily) reversed. But I've also seen it used
    to designate the fact that function parameters declared
    as having an array type are actually treated as pointers.
    E.g.:
        int f( char param[16] );
    is interpreted by the compiler as:
        int f( char *param );
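
    One way to see that the two spellings declare the same function
    type is to assign f to a pointer-to-function that takes char *. This
    is a sketch only; call_f_as_pointer_function() is an invented name.

```c
#include <assert.h>

/* Declared with an array parameter... */
int f(char param[16])
{
    assert(param != 0);
    return param[0];
}

/* ...yet f has type int(char *), so this initialization needs no cast. */
int call_f_as_pointer_function(char *s)
{
    int (*fp)(char *) = f;
    return fp(s);
}
```

    The 16 in the parameter declaration is not checked: inside f, param
    is an ordinary char * and carries no size information.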


    hth
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
    Ivan Vecerina, Aug 22, 2004
    #12
  13. Dan Pop Guest

    In <cg9amc$ogh$> writes:


    >Why does this declaration give undefined result:
    >
    >file1: extern char * p;
    >file2: char p[10];


    Why did you expect anything else? It's the same as:

    file1: extern double c;
    file2: char c;

    All declarations of the same object must match its definition.

    If you think that there is anything special about pointers and arrays
    in this context, read the FAQ.

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email:
    Dan Pop, Aug 23, 2004
    #13