2D arrays, pointers, and pointer arithmetic: question

Discussion in 'C Programming' started by gdotone@gmail.com, Aug 24, 2012.

  1. Guest

    I'm reading "A Book on C", 4th edition. The author gives the following example when discussing two dimensional arrays:

    a[3][5], so let a[][] be of type int.


            col 1    col 2    col 3    col 4    col 5
    row 1   a[0][0]  a[0][1]  a[0][2]  a[0][3]  a[0][4]
    row 2   a[1][0]  a[1][1]  a[1][2]  a[1][3]  a[1][4]
    row 3   a[2][0]  a[2][1]  a[2][2]  a[2][3]  a[2][4]

    The author then says that the following expressions are equivalent:

    a[i][j]
    *(a[i] + j)
    (*(a + i))[j]
    *((*(a+i)) + j)
    *(&a[0][0] + 5*i + j)

    I understand that the table is just a convenient representation of the 2D array.
    In memory the 2D array is actually contiguous. OK.

    My question has to do with saying or writing out in words what is being shown when using the *, dereferencing, and &, address operators.

    Is this the correct way of saying:

    *(a[i] + j): the base address is pointed to by a[i], and we move along the contiguous memory by j intervals, each the size of an int. So we add j, and to get the value at that location we use the dereference operator.

    I believe I understand what's happening, but am I saying it right?

    (*(a + i))[j] the base address is at a, i is added to the base address, ok, I understand that we are looking at the row location here, but I'm stuck at how to say this.

    let's say addresses are given in decimal, an integer in one byte and each address points to a byte. I hope I'm saying this right?
    at address 100, a[0][0]
    at address 109, a[1][0]
    (if this thought is wrong please clear this up)

    let's also say we are starting at location 100 (decimal).
    consider a[1][0]. row 2, col 1 for the following discussion please.

    (*(a + i ))[j] --- (*(a + 1))[0]
    so, a, the start of the array is pointing to location 100
    What does it mean to add 1 to a?
    I mean, I understand that we are now at the address of the second row. I guess this should be known from the additional [0] indicating this is not just a one byte move?

    And what about the others like: *((*(a+i)) + j)
    Let me stop here, I think I'm starting to confuse myself. (ok, yes i'm already confused. :) )

    Thanks for the help everyone,

    g.
     
    , Aug 24, 2012
    #1


  2. > *(a[i] + j) the base address is pointed to by a[i] and we move along the


    Yes, but that makes a[i] sound like a pointer. I would say "a[i] is the i'th element of a 2-D array, and is therefore a 1-D array".


    > contiguous memory j (interval), so we add j, j of size int, to get the
    > value at that location we use the dereference operator.


    I would say "we then implicitly convert a[i] to a pointer-to-int, add j, and then dereference it", which I think is what you said.


    > (*(a + i))[j] the base address is at a, i is added to the base address, ok, I


    "Implicitly convert `a` into a pointer-to-array-of-5-ints", add `i` to it (which means advancing the pointer by i*5*sizeof(int) bytes, since each row of a[3][5] holds 5 ints). Then get an array-of-ints by dereferencing that pointer (this step doesn't do an actual memory lookup, it just changes the type; arrays are weird that way). Finally, give us the j'th integer in that array.


    I'll stop now, because I am supposed to be doing real work. I hope it helps.
     
    Adrian Ratnapala, Aug 24, 2012
    #2

  3. James Kuyper Guest

    On 08/24/2012 11:28 AM, wrote:
    > I'm reading "A Book on C", 4th edition. The author gives the following example when discussing two dimensional arrays:
    >
    > a[3][5], so let a[][] be of type int.
    >
    >
    >         col 1    col 2    col 3    col 4    col 5
    > row 1   a[0][0]  a[0][1]  a[0][2]  a[0][3]  a[0][4]
    > row 2   a[1][0]  a[1][1]  a[1][2]  a[1][3]  a[1][4]
    > row 3   a[2][0]  a[2][1]  a[2][2]  a[2][3]  a[2][4]
    >
    > The author then says that the following expressions are equivalent:
    >
    > a[i][j]
    > *(a[i] + j)
    > (*(a + i))[j]
    > *((*(a+i)) + j)
    > *(&a[0][0] + 5*i + j)


    <pedantic>
    This last expression takes a pointer to the first element of an array
    containing only 5 elements, and (for i > 0) attempts to dereference it
    after adding an integer greater than or equal to 5. Such dereferencing has
    undefined behavior (6.5.6p8). This interpretation of the standard is
    controversial, but IMNSHO correct. In practice, on many machines it will
    work exactly as expected. It's unlikely to fail except on systems which
    provide run-time array bounds checking, which is rare.

    Otherwise, the only way it's likely to fail is if the optimizer takes
    advantage of the fact that it has undefined behavior by performing
    optimizations that depend upon such code not being executed. For
    instance, an implementation that performs an optimization based upon the
    assumption that a[0]+i never refers to the same object as a[1]+j,
    regardless of the values of i and j, is perfectly legal despite the fact
    that such an optimization could break code like this.
    </pedantic>

    > I understand that the table is just a convenient representation of the 2D array.
    > In memory the 2D array is actually contiguous. OK.
    >
    > My question has to do with saying or writing out in words what is being shown when using the *, dereferencing, and &, address operators.
    >
    > Is this the correct way of saying:
    >
    > *(a[i] + j) the base address is pointed to by a[i] and we move along the contiguous memory j (interval), so we add j, j of size int, to get the value at that location we use the dereference operator.
    >
    > I believe I understand what's happening, but am I saying it right.


    Not quite: a[i] doesn't point at the base address. a[i] is an expression
    of array type, and in most contexts (this being one of them) gets
    automatically converted to a pointer (address) to the first element of
    the array.

    > (*(a + i))[j] the base address is at a, i is added to the base address, ok, I understand that we are looking at the row location here, but I'm stuck at how to say this.


    a is an expression of array type, which automatically gets converted
    into a pointer to the first element of that array, a[0]. Adding 'i' to
    that pointer gives the address of the 'i+1'th element of that array,
    a[i]. Dereferencing that pointer gives an expression of array type,
    referring to a[i]. Like all such expressions, in most contexts, it gets
    automatically converted to a pointer to the first element of that array,
    a[i][0]. Whenever x[j] is a valid expression, it is equivalent to
    *(x+j). Adding 'j' to that pointer gives a pointer to the 'j+1'th
    element of a[i], which is a[i][j]. Dereferencing that pointer gives the
    value of that element.

    > let's say addresses are given in decimal, an integer in one byte and each address points to a byte. I hope I'm saying this right?
    > at address 100, a[0][0]
    > at address 109, a[1][0]
    > (if this thought is wrong please clear this up)


    a[1][0] is stored at a position that is exactly 5*sizeof a[0][0] bytes
    after the position of a[0][0]. Therefore, assuming a linear address
    space (which C does NOT require), the difference in the addresses must
    be an exact multiple of 5. 109-100=9 doesn't qualify.

    > let's also say we are starting at location 100 (decimal).
    > consider a[1][0]. row 2, col 1 for the following discussion please.
    >
    > (*(a + i ))[j] --- (*(a + 1))[0]
    > so, a, the start of the array is pointing to location 100
    > What does it mean to add 1 to a?


    a + 1 => &a[0] + 1 => &a[1].
     
    James Kuyper, Aug 24, 2012
    #3
  4. On Saturday, August 25, 2012 at 12:01:58 AM UTC+8, James Kuyper wrote:
    > [...]


    In summary, one should treat int **a, int *a2[4000], and int a3[4000][4000]
    as three different things.

    The last form has long been easy to use on Unix and Linux
    systems.
     
    88888 Dihedral, Aug 24, 2012
    #4
  5. Guest

    Again thanks everyone.

    I think I have a better understanding of what is happening, what the other expressions are doing, and how to describe them.

    I created a simple program that creates a 2x3 array and explores the addresses of the expressions, in pieces and as a whole.

    #include <stdio.h>
    #include <stdlib.h>

    #define SPACE_BETWEEN_LINES "\n\n"

    int main(void)
    {
        int a[2][3];
        int i, j;

        printf ( SPACE_BETWEEN_LINES );

        for ( i = 0; i < 2; i++ )
            for ( j = 0; j < 3; j++ )
                a[i][j] = 0;

        printf ( SPACE_BETWEEN_LINES );

        for ( i = 0; i < 2; i++ )
            for ( j = 0; j < 3; j++ )
                printf ( "The address for a[%d][%d] is %p\n", i, j, &a[i][j] );

        printf ( SPACE_BETWEEN_LINES );

        for ( i = 0; i < 2; i++ )
            for ( j = 0; j < 3; j++ )
                printf ( "The address for ((*(a + %d)) + %d) is %p\n",
                         i, j, ((*(a+i)) +j) );

        printf ( SPACE_BETWEEN_LINES );

        printf ( "The starting address of a is: %p\n\n", a );

        printf ( "The second row starting address: %p\n\n", a + 1 );

        printf ( "The a[1][0] address is : %p\n\n", &a[1][0] );

        printf ( SPACE_BETWEEN_LINES );

        printf ( "*(a+0) address is: %p\n\n", *(a+0) );

        printf ( "*(a+0) + 1 address is: %p\n", *(a+0) + 1);

        printf ( "The value at *(a+0) is %d\n", **(a+0) ); /* *(*(a+0)) */

        printf ( "\n" );

        return 1;
    }

    Output:


    The address for a[0][0] is 0x7fff67ca8bf0
    The address for a[0][1] is 0x7fff67ca8bf4
    The address for a[0][2] is 0x7fff67ca8bf8
    The address for a[1][0] is 0x7fff67ca8bfc
    The address for a[1][1] is 0x7fff67ca8c00
    The address for a[1][2] is 0x7fff67ca8c04


    The address for ((*(a + 0)) + 0) is 0x7fff67ca8bf0
    The address for ((*(a + 0)) + 1) is 0x7fff67ca8bf4
    The address for ((*(a + 0)) + 2) is 0x7fff67ca8bf8
    The address for ((*(a + 1)) + 0) is 0x7fff67ca8bfc
    The address for ((*(a + 1)) + 1) is 0x7fff67ca8c00
    The address for ((*(a + 1)) + 2) is 0x7fff67ca8c04


    The starting address of a is: 0x7fff67ca8bf0

    The second row starting address: 0x7fff67ca8bfc

    The a[1][0] address is : 0x7fff67ca8bfc



    *(a+0) address is: 0x7fff67ca8bf0

    *(a+0) + 1 address is: 0x7fff67ca8bf4
    The value at *(a+0) is 0

    Ok, *(a+0) returns an address, *(a+0) + 1 returns an address.

    So, given a[][], a two dimensional array, *(a+i) will return the address of the starting row, *(a+i) + j, returns the address of row,column, &a[i][j].
    The i term added to a is being added to the base address of the array and itself, i, representing the size of a row. j then adds to that address giving the column position.

    Or something like that... :)

    Thanks again.
     
    , Aug 27, 2012
    #5
  6. James Kuyper Guest

    On 08/27/2012 04:35 PM, wrote:
    ....
    > printf ( "The address for a[%d][%d] is %p\n", i, j, &a[i][j] );


    Whenever you print a pointer using "%p", the expression you print should
    have the type 'void*'. In all cases in this code, that would require a
    cast: (void*). Your code appears to work as expected, because on your
    system, as on many others, the conversion to void* is a no-op for these
    pointer types. However there are systems where such code won't work as
    expected without the cast; it might not work at all, causing your
    program to fail.
    I didn't notice any other problems with your code, and your description
    of what you're expecting to see seems to match the standard (assuming a
    linear address space - which is very common, but NOT required).
     
    James Kuyper, Aug 27, 2012
    #6
  7. Guest

    On Monday, August 27, 2012 4:46:25 PM UTC-4, James Kuyper wrote:
    > On 08/27/2012 04:35 PM, wrote:
    > ...
    > > printf ( "The address for a[%d][%d] is %p\n", i, j, &a[i][j] );
    >
    > Whenever you print a pointer using "%p", the expression you print should
    > have the type 'void*'. In all cases in this code, that would require a
    > cast: (void*). Your code appears to work as expected, because on your
    > system, as on many others, the conversion to void* is a no-op for these
    > pointer types. However there are systems where such code won't work as
    > expected without the cast; it might not work at all, causing your
    > program to fail.
    > I didn't notice any other problems with your code, and your description
    > of what you're expecting to see seems to match the standard (assuming a
    > linear address space - which is very common, but NOT required).


    Thanks for letting me know that, and thanks for explaining that a linear address space is common but not guaranteed. The books that I'm reading don't make these points clear.

    Thanks again. With you guys' help I may actually write good code one day. These groups are the best thing since sliced bread. :)
     
    , Aug 27, 2012
    #7
  8. On Mon, 27 Aug 2012 13:35:35 -0700 (PDT), wrote:

    snip

    >I created a simple program, creating a 2X3 array and explored the address of the expressions, in piece and whole.
    >
    >#include <stdio.h>
    >#include <stdlib.h>
    >
    >#define SPACE_BETWEEN_LINES "\n\n"
    >
    >int main(void)
    >{
    > int a[2][3];
    > int i,j;
    >
    > printf ( SPACE_BETWEEN_LINES );
    >
    > for ( i = 0; i < 2; i++ )
    > for ( j = 0; j < 3; j++ )
    > a[i][j] = 0;


    10*i+j might have been a better value so when you print any element of
    the array you can tell it is the correct one.

    snip

    > printf ( "The value at *(a+0) is %d\n", **(a+0) ); /* *(*(a+0)) */


    *(a+0) is by definition the same as a[0]. a[0] is an array of 3 int.
    It doesn't make sense to talk about a single value for such an array.

    **(a+0) is same as a[0][0] (from recursive application of the above
    definition). a[0][0] is the single int you print. But it is not the
    value of a[0].

    >
    > printf ( "\n" );
    >
    > return 1;


    This is a non-portable return value from main. Usually successful
    completion returns 0. If you really wanted a non-zero value,
    EXIT_FAILURE is portable and always non-zero.

    >}
    >
    >Output:
    >
    >
    >The address for a[0][0] is 0x7fff67ca8bf0
    >The address for a[0][1] is 0x7fff67ca8bf4
    >The address for a[0][2] is 0x7fff67ca8bf8
    >The address for a[1][0] is 0x7fff67ca8bfc
    >The address for a[1][1] is 0x7fff67ca8c00
    >The address for a[1][2] is 0x7fff67ca8c04
    >
    >
    >The address for ((*(a + 0)) + 0) is 0x7fff67ca8bf0
    >The address for ((*(a + 0)) + 1) is 0x7fff67ca8bf4
    >The address for ((*(a + 0)) + 2) is 0x7fff67ca8bf8
    >The address for ((*(a + 1)) + 0) is 0x7fff67ca8bfc
    >The address for ((*(a + 1)) + 1) is 0x7fff67ca8c00
    >The address for ((*(a + 1)) + 2) is 0x7fff67ca8c04
    >
    >
    >The starting address of a is: 0x7fff67ca8bf0
    >
    >The second row starting address: 0x7fff67ca8bfc
    >
    >The a[1][0] address is : 0x7fff67ca8bfc
    >
    >
    >
    >*(a+0) address is: 0x7fff67ca8bf0
    >
    >*(a+0) + 1 address is: 0x7fff67ca8bf4
    >The value at *(a+0) is 0
    >
    >Ok, *(a+0) returns an address, *(a+0) + 1 returns an address.
    >
    >So, given a[][], a two dimensional array, *(a+i) will return the address of the starting row, *(a+i) + j, returns the address of row,column, &a[i][j].


    * is not a function, so "return" is not the best term to use. The
    expression *(a+i) and its identical twin a[i] will, with a few
    exceptions that don't apply here, both EVALUATE to the address of the
    i-th row, not the starting row. If you meant the start of row i, that
    would be correct.

    >The i term added to a is being added to the base address of the array and itself, i, representing the size of a row. j then adds to that address giving the column position.


    When a pointer and an integer are added, the value of the integer is
    scaled by the size of the object pointed to, or if you prefer, the
    size of the object type pointed to.

    Given a pointer p to type T and an integer k, the C expression
    p + k
    is mathematically equivalent to
    p + k * sizeof(T)
    or the synonymous
    p + k * sizeof *p

    Furthermore, discounting the exceptions alluded to above, an
    expression with array type is converted to the address of the first
    array element with type pointer to element type.

    So, the expression
    a
    is converted to &a[0] with type pointer to array of 3 int,
    syntactically written as int(*)[3]. Since sizeof(int) is 4 on
    your system, an array of 3 int has size 12.

    Combining these two facts tells us that the expression a+i will
    evaluate to the address i*12 bytes beyond the start of the array and
    still have type pointer to array of 3 int. The net effect is that a+i
    evaluates to the address of the i-th row.

    The expression *(a+i) dereferences this pointer and the result is an
    array of 3 int. Consequently, this expression is also converted as
    described above. The result is the address of the first element of
    the i-th row, with type pointer to int. In other words,
    &a[i][0].

    Adding j to this expression will evaluate to the address j*4 bytes
    beyond the start of the row and still have type pointer to int. In
    other words, &a[i][j].

    --
    Remove del for email
     
    Barry Schwarz, Aug 28, 2012
    #8
  9. Barry Schwarz <> writes:
    > On Mon, 27 Aug 2012 13:35:35 -0700 (PDT), wrote:

    [...]
    > * is not a function so return is not the best term to use. The
    > expression *(a+i) and its identical twin a[i] will, with a few
    > exceptions that don't apply here, both EVALUATE to the address of the
    > i-th row, not the starting row. If you meant the start of row i, that
    > would be correct.


    The usual term is "yield": functions return values, and expressions
    yield values.

    [...]

    >>The i term added to a is being added to the base address of the array
    >>and itself, i, representing the size of a row. j then adds to that
    >>address giving the column position.

    >
    > When a pointer and an integer are added, the value of the integer is
    > scaled by the size of the object pointed to, or if you prefer, the
    > size of the object type pointed to.


    Another way to look at it is that the value is *not* "scaled" by the
    size of the object pointed to; rather, the addition yields a pointer
    that points N *objects* past the object that the original pointer
    points to. (There has to be an array for this to make sense, possibly
    the 1-element array that's equivalent to a single object.)

    The standard's description doesn't talk about scaling (N1570 6.5.6p8):

    When an expression that has integer type is added to or
    subtracted from a pointer, the result has the type of the
    pointer operand. If the pointer operand points to an element
    of an array object, and the array is large enough, the result
    points to an element offset from the original element such that
    the difference of the subscripts of the resulting and original
    array elements equals the integer expression. In other words, if
    the expression P points to the i-th element of an array object,
    the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N
    has the value n) point to, respectively, the i+n-th and i−n-th
    elements of the array object, provided they exist.

    Referring to addition being "scaled" implies that pointers are
    *really* pointers to bytes. That's likely true on the machine code
    level, but it's not true in the C abstract machine; rather pointers
    *really* point to whatever type they're defined to point to.

    > Given a pointer p to type T and an integer k, the C expression
    > p + k
    > is mathematically equivalent to
    > p + k * sizeof(T)
    > or the synonymous
    > p + k * sizeof *p


    I understand what you mean, but the fact that you're using "+"
    with two very different meanings could be confusing.

    [...]

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Aug 28, 2012
    #9
  10. On 2012-08-27, James Kuyper <> wrote:
    > On 08/27/2012 04:35 PM, wrote:
    > ...
    >> printf ( "The address for a[%d][%d] is %p\n", i, j, &a[i][j] );

    >
    > Whenever you print a pointer using "%p", the expression you print should
    > have the type 'void*'. In all cases in this code, that would require a


    Is there really any reason to cast an (int *) into a (void *). The only
    reason I could imagine is to keep your compiler quiet.
     
    Adrian Ratnapala, Aug 28, 2012
    #10
  11. Eric Sosman Guest

    On 8/28/2012 3:06 PM, Adrian Ratnapala wrote:
    > On 2012-08-27, James Kuyper <> wrote:
    >> On 08/27/2012 04:35 PM, wrote:
    >> ...
    >>> printf ( "The address for a[%d][%d] is %p\n", i, j, &a[i][j] );

    >>
    >> Whenever you print a pointer using "%p", the expression you print should
    >> have the type 'void*'. In all cases in this code, that would require a

    >
    > Is there really any reason to cast an (int *) into a (void *). The only
    > reason I could imagine is to keep your compiler quiet.


    This is Question 5.17 at the comp.lang.c Frequently Asked
    Questions (FAQ) page, <http://www.c-faq.com/>.

    --
    Eric Sosman
     
    Eric Sosman, Aug 28, 2012
    #11
  12. James Kuyper Guest

    On 08/28/2012 03:06 PM, Adrian Ratnapala wrote:
    > On 2012-08-27, James Kuyper <> wrote:
    >> On 08/27/2012 04:35 PM, wrote:
    >> ...
    >>> printf ( "The address for a[%d][%d] is %p\n", i, j, &a[i][j] );

    >>
    >> Whenever you print a pointer using "%p", the expression you print should
    >> have the type 'void*'. In all cases in this code, that would require a

    >
    > Is there really any reason to cast an (int *) into a (void *). The only
    > reason I could imagine is to keep your compiler quiet.


    The standard says that the behavior is undefined. It does so because
    there are real platforms out there where void* is incompatible with
    int*. For instance, on platforms where the word size is larger than
    a byte, a void* must include information identifying the particular byte
    within a word where an object starts, because it might be required to
    point at an object smaller than a word. However, for word-aligned types,
    the pointer need only identify which word it starts at. There have been
    real machines where pointers to word-aligned types have been smaller
    than pointers to smaller types, for precisely this reason.
    It's also permissible for pointers of the same size to use different
    representations. I don't know of any examples, but I wouldn't recommend
    ruling out that possibility.

    Is that a good enough reason for you? If you're certain your code will
    never be used on such machines, you can ignore the issue - in which case
    someone, not necessarily you, will pay the price if your certainty turns
    out to be unjustified. If you do this, you'll have plenty of company.
     
    James Kuyper, Aug 28, 2012
    #12
  13. James Kuyper <> writes:
    > On 08/28/2012 03:06 PM, Adrian Ratnapala wrote:

    [...]
    >> Is there really any reason to cast an (int *) into a (void *). The only
    >> reason I could imagine is to keep your compiler quiet.

    >
    > The standard says that the behavior is undefined. It does so because
    > there are real platforms out there where void* is incompatible with
    > int*.

    [...]

    Strictly speaking, void* and int* are incompatible types regardless of
    the platform (the standard has its own definition of compatibility for
    types). The point, of course, is that they can have different
    representations.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Aug 28, 2012
    #13
  14. James Kuyper Guest

    On 08/28/2012 05:11 PM, Keith Thompson wrote:
    > James Kuyper <> writes:
    >> On 08/28/2012 03:06 PM, Adrian Ratnapala wrote:

    > [...]
    >>> Is there really any reason to cast an (int *) into a (void *). The only
    >>> reason I could imagine is to keep your compiler quiet.

    >>
    >> The standard says that the behavior is undefined. It does so because
    >> there are real platforms out there where void* is incompatible with
    >> int*.

    > [...]
    >
    > Strictly speaking, void* and int* are incompatible types regardless of
    > the platform (the standard has its own definition of compatibility for
    > types). The point, of course, is that they can have different
    > representations.


    I should have said "have incompatible representations". I didn't mean to
    bring C's definition of compatible types into the discussion.
    Incompatible representations is a sufficient, but not necessary,
    condition for types to be incompatible.
     
    James Kuyper, Aug 28, 2012
    #14
  15. In comp.lang.c, you wrote:
    >>
    >> Is there really any reason to cast an (int *) into a (void *). The only
    >> reason I could imagine is to keep your compiler quiet.

    >
    > The standard says that the behavior is undefined. It does so because
    > there are real platforms out there where void* is incompatible with
    > int*. For instance, on platforms where the word size is larger than

    <snip>

    > Is that a good enough reason for you? If you're certain your code will

    That's a good enough reason.

    I had mistakenly thought the (void *) requirement was mere pedanticism,
    because I thought pointers always had the same size unless explicitly
    modified by something like "near/far". And I agree: "always, on the
    machines you know about" is a far, far cry from "always".

    BTW: I don't know if my misconception is "frequent" enough to warrant an
    FAQ change, but it doesn't look like the FAQ covers this. Contra Eric,
    5.17 is about something else; 4.17 / 19.40 are closer but not quite
    relevant.
     
    Adrian Ratnapala, Aug 29, 2012
    #15
  16. James Kuyper Guest

    On 08/29/2012 03:20 AM, Adrian Ratnapala wrote:
    > In comp.lang.c, you wrote:
    >>>
    >>> Is there really any reason to cast an (int *) into a (void *). The only
    >>> reason I could imagine is to keep your compiler quiet.

    >>
    >> The standard says that the behavior is undefined. It does so because
    >> there are real platforms out there where void* is incompatible with
    >> int*. For instance, on platforms where the word size is larger than

    > <snip>
    >
    >> Is that a good enough reason for you? If you're certain your code will

    > That's a good enough reason.
    >
    > I had mistakenly thought the (void *) requirement was mere pedanticism,


    Almost every seemingly pedantic issue associated with the standard exists
    because there are implementations where the issue is relevant, or at
    least there were such implementations at the time the rule was created.
    There are a few that exist just because the committee didn't see any need
    to impose a requirement on future implementations just because every
    current implementation satisfies that requirement, but those are rare.

    > because I thought pointers always had the same size unless explicitly
    > modified by something like "near/far". ...


    In general, pointers to different types can, in principle, have
    different representations, alignment requirements, or even sizes. The
    word-size issue is the only one I'm aware of that actually comes up on
    real implementations, but there are probably other reasons I'm unaware of.
    There are only a few exceptions:

    Pointers to void have the same representation and alignment requirements
    as pointers to character types (6.2.5p28).

    Identically-qualified pointers to compatible types are compatible
    (6.7.6.1p2), and pointer types that differ only in the way that they are
    qualified have the same representation and alignment (6.2.5p26).

    > ... And I agree "always, on the
    > machines you know about" is a far, far cry from "always".


    Good - I'd thought your comment reflected a more casual attitude toward
    portability issues. The tone of my reply reflected my disdain for people
    with such attitudes; I apologize for that.

    > BTW: I don't know if my misconception is "frequent" enough to warrant an
    > FAQ change, but it doesn't look like the FAQ covers this. Contra Eric,
    > 5.17 is about something else;


    5.17 is also about other things, but it mentions systems which use
    multiple different (and presumably incompatible) pointer types: the
    Eclipse MV series from Data General, the HP 3000, "several of the
    machines above" (which is a little vague), and some 64-bit Cray
    machines. All three of the specific examples involve word pointers vs.
    byte pointers.
    --
    James Kuyper
     
    James Kuyper, Aug 29, 2012
    #16

  17. >> BTW: I don't know if my misconception is "frequent" enough to warrant an
    >> FAQ change, but it doesn't look like the FAQ covers this. Contra Eric,
    >> 5.17 is about something else;

    >
    > 5.17 is also about other things, but it mentions systems which use
    > multiple different (and presumably incompatible) pointer types: the


    All true, but it doesn't directly tackle my error, only the consequences
    of it. That's fine in an FAQ, since it could be that many people are
    confused about NULL pointers while not so many have specific wrong ideas
    about pointer representations.

    On the other hand, if my error *is* common, or at least if people often
    ask questions that turn out to involve pointer representations, then the
    FAQ should probably include something similar to what you just wrote.
     
    Adrian Ratnapala, Aug 29, 2012
    #17
  18. Eric Sosman Guest

    On 8/29/2012 3:20 AM, Adrian Ratnapala wrote:
    > [... about converting int* to void* for "%p" conversion ...]
    > BTW: I don't know if my misconception is "frequent" enough to warrant an
    > FAQ change, but it doesn't look like the FAQ covers this. Contra Eric,
    > 5.17 is about something else; 4.17 / 19.40 are closer but not quite
    > relevant.


    5.17 is mostly about representations of null pointers, but
    not entirely. If you'll read the entire thing, you'll find two
    specific mentions of different pointer formats, plus a link to
    further examples.

    14.17 and 19.40(d) are about `near' and `far', whose
    connection to the matter at hand eludes me.

    --
    Eric Sosman
     
    Eric Sosman, Aug 29, 2012
    #18
  19. Eric Sosman writes:
    > On 8/29/2012 3:20 AM, Adrian Ratnapala wrote:
    >> [... about converting int* to void* for "%p" conversion ...]
    >> BTW: I don't know if my misconception is "frequent" enough to warrant an
    >> FAQ change, but it doesn't look like the FAQ covers this. Contra Eric,
    >> 5.17 is about something else; 4.17 / 19.40 are closer but not quite
    >> relevant.

    >
    > 5.17 is mostly about representations of null pointers, but
    > not entirely. If you'll read the entire thing, you'll find two
    > specific mentions of different pointer formats, plus a link to
    > further examples.
    >
    > 14.17 and 19.40(d) are about `near' and `far', whose
    > connection to the matter at hand eludes me.


    `near` and `far`, which are of course non-standard, can result in
    different pointer types having different sizes. Passing a `near`
    or `far` pointer to printf with a "%p" format is likely to give
    bad results in practice.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Aug 29, 2012
    #19
  20. Keith Thompson wrote:
    > `near` and `far`, which are of course non-standard, can result in
    > different pointer types having different sizes. Passing a `near` or
    > `far` pointer to printf with a "%p" format is likely to give bad results
    > in practice.


    Using near or far pointers in any way at all invokes undefined
    behavior, so I think it's safe to say bad results are likely with or
    without attempts to printf() such pointers!
     
    Edward Rutherford, Aug 29, 2012
    #20