Addressing multidimensional arrays

Discussion in 'C Programming' started by Baboon, Dec 11, 2008.

  1. Baboon

    Baboon Guest

    Suppose I have a multidimensional array, like so

    T arr[N][M];

    and in one place of my code it makes sense to access all N*M elements
    via one single index.

    Now, if I understand the C standard correctly, even though all elements
    are continuous in memory (they are, are they not?), indices can only ever
    be 0..N-1 and 0..M-1, respectively.

    But, if I have a pointer T *p pointing at arr[0][0], and I increment p
    N*M times, and access arr along the way, would that invoke UB? I mean,
    arr is still one single object, right?

    What if I use p with i=0..M*N-1 ?
    Baboon, Dec 11, 2008
    #1
    1. Advertising

  2. Baboon

    Guest

    On Dec 11, 3:06 am, Baboon <> wrote:
    > Suppose I have a multidimensional array, like so
    >
    > T arr[N][M];
    >
    > and in one place of my code it makes sense to access all N*M elements
    > via one single index.
    >
    > Now, if I understand the C standard correctly, even though all elements
    > are continuous in memory (they are, are they not?), indices can only ever
    > be 0..N-1 and 0..M-1, respectively.
    >
    > But, if I have a pointer T *p pointing at arr[0][0], and I increment p
    > N*M times, and access arr along the way, would that invoke UB? I mean,
    > arr is still one single object, right?


    It's UB. Here's a relevant discussion via google groups.
    <http://groups.google.com/group/comp.lang.c/browse_thread/thread/
    b413ac5f3d3214ed/584ff61646b8bab6>

    > What if I use p with i=0..M*N-1 ?


    p is presumably T (*)[M]. p is T [M]. You don't have 0 - M*N - 1
    such objects. (ie p doesn't point to that many objects).
    So it's plain wrong.
    , Dec 11, 2008
    #2
    1. Advertising

  3. Baboon

    Guest

    On Dec 11, 6:46 am, Barry Schwarz <> wrote:
    > On Wed, 10 Dec 2008 17:26:41 -0800 (PST), wrote:
    > >On Dec 11, 3:06 am, Baboon <> wrote:
    > >> Suppose I have a multidimensional array, like so

    >
    > >> T arr[N][M];

    >
    > >> and in one place of my code it makes sense to access all N*M elements
    > >> via one single index.

    >
    > >> Now, if I understand the C standard correctly, even though all elements
    > >> are continuous in memory (they are, are they not?), indices can only ever
    > >> be 0..N-1 and 0..M-1, respectively.

    >
    > >> But, if I have a pointer T *p pointing at arr[0][0], and I increment p
    > >> N*M times, and access arr along the way, would that invoke UB? I mean,
    > >> arr is still one single object, right?

    >
    > >> What if I use p with i=0..M*N-1 ?

    >
    > >p is presumably T (*)[M]. p is T [M]. You don't have 0 - M*N - 1

    >
    > What part of "I have a pointer T *p" leads you to presume that p is
    > anything other than a simple T*.


    I can't recall what it was that I was thinking of at the time, most
    likely I misread something. Sorry.
    , Dec 11, 2008
    #3
  4. Baboon

    Phil Carmody Guest

    "Jujitsu Lizard" <> writes:
    > "Baboon" <> wrote in message
    > news:...
    >>
    >> Suppose I have a multidimensional array, like so
    >>
    >> T arr[N][M];
    >>
    >> and in one place of my code it makes sense to access all N*M elements
    >> via one single index.
    >>
    >> Now, if I understand the C standard correctly, even though all elements
    >> are continuous in memory (they are, are they not?), indices can only ever
    >> be 0..N-1 and 0..M-1, respectively.
    >>
    >> But, if I have a pointer T *p pointing at arr[0][0], and I increment p
    >> N*M times, and access arr along the way, would that invoke UB? I mean,
    >> arr is still one single object, right?
    >>
    >> What if I use p with i=0..M*N-1 ?

    >
    > The trick you are suggesting is done every day of the week with
    > various arrays. If you have an array a[N][M], C is guaranteed to
    > store the M*N elements contiguously in row-major order.
    >
    > The link that the other responder posted cited various obscure
    > optimizations that might break things. This is true, but it rarely
    > happens in practice. The reason is that when people are messing with
    > arrays using a single index (one dimensional) when the array is
    > defined with multiple dimensions, what normally happens is that one
    > calls a function that accepts a T * pointer. It is rare that someone
    > defines an array a[N][M] and then uses a[N-1][M] or something like
    > this in that form. It normally doesn't occur that way in the code.
    >
    > This code should be fine:
    >
    > #define M 33
    > #define N 41
    >
    > long arr[N][M];
    >
    > void f(long *p)
    > {
    > unsigned i, j;
    >
    > for (i=0; i<N; i++)
    > {
    > for (j=0; j<M; j++)
    > {
    > p[i * M + j] = 0;
    > }
    > }
    > }
    >
    > int main(void)
    > {
    > f(arr);


    arr isn't a long*

    > return(0);
    > }
    >
    > This is done every day of the week.


    People write code that fails to compile without a diagnostic
    that's telling them they're doing something they shouldn't
    every day of the week?

    Quite probably true. Doesn't make it right.

    Phil
    --
    I tried the Vista speech recognition by running the tutorial. I was
    amazed, it was awesome, recognised every word I said. Then I said the
    wrong word ... and it typed the right one. It was actually just
    detecting a sound and printing the expected word! -- pbhj on /.
    Phil Carmody, Dec 11, 2008
    #4
  5. Baboon

    jameskuyper Guest

    Jujitsu Lizard wrote:
    > "Phil Carmody" <> wrote in message
    > news:...
    > >
    > >> This code should be fine:
    > >>
    > >> #define M 33
    > >> #define N 41
    > >>
    > >> long arr[N][M];
    > >>
    > >> void f(long *p)
    > >> {
    > >> unsigned i, j;
    > >>
    > >> for (i=0; i<N; i++)
    > >> {
    > >> for (j=0; j<M; j++)
    > >> {
    > >> p[i * M + j] = 0;
    > >> }
    > >> }
    > >> }
    > >>
    > >> int main(void)
    > >> {
    > >> f(arr);

    > >
    > > arr isn't a long*
    > >
    > >> return(0);
    > >> }
    > >>
    > >> This is done every day of the week.

    > >
    > > People write code that fails to compile without a diagnostic
    > > that's telling them they're doing something they shouldn't
    > > every day of the week?
    > >
    > > Quite probably true. Doesn't make it right.

    >
    > Phil,
    >
    > In this particular case, I'd be grateful if you could be more specific. Are
    > you saying it is wrong?


    You're passing an argument of type long (*)[M] to a function where the
    corresponding parameter is declared as 'long*'. This is not a
    conversion that can be done implicitly. Since there's a prototype for
    the function in scope, this is a constraint violation; if it was a non-
    prototyped declaration it would be undefined behavior. Even if you
    made the conversion explictly, the standard fails to specify where the
    result of the conversion points.

    You can avoid all of those problems by passing arr[0] rather than arr
    to the function. Then the only problem that you have left is indexing
    an array (arr[0]) beyond it's length (M). While a conforming
    implementation of C is certainly permitted to compile such code in a
    way that it formats your hard disk, this is a very popular bit of
    undefined behavior, and as a result most compilers will handle it the
    way you expect it to be handled.
    jameskuyper, Dec 11, 2008
    #5
  6. "Jujitsu Lizard" <> writes:

    > "Phil Carmody" <> wrote in message
    > news:...
    >>
    >>> This code should be fine:
    >>>
    >>> #define M 33
    >>> #define N 41
    >>>
    >>> long arr[N][M];
    >>>
    >>> void f(long *p)
    >>> {
    >>> unsigned i, j;
    >>>
    >>> for (i=0; i<N; i++)
    >>> {
    >>> for (j=0; j<M; j++)
    >>> {
    >>> p[i * M + j] = 0;
    >>> }
    >>> }
    >>> }
    >>>
    >>> int main(void)
    >>> {
    >>> f(arr);

    >>
    >> arr isn't a long*
    >>
    >>> return(0);
    >>> }
    >>>
    >>> This is done every day of the week.

    >>
    >> People write code that fails to compile without a diagnostic
    >> that's telling them they're doing something they shouldn't
    >> every day of the week?
    >>
    >> Quite probably true. Doesn't make it right.

    >
    > In this particular case, I'd be grateful if you could be more specific. Are
    > you saying it is wrong?


    "Wrong" is a tricky word in C. Your code requires a diagnostic since
    it contains a constraint violation. This is, in one sense at least,
    as wrong as you can get, but the code might well work anyway. To be
    clear about it, arr converts to long (*)[33] not long * and there is
    no implicit conversion between these two types.

    There are lots of alternatives. You could pass &arr[0][0], arr[0] or
    even (void *)arr. All of these expressions will convert to a pointer
    of the right type, but there is a subtle question -- how much can be
    added to the resulting pointer inside f whilst staying within the
    permissions granted by 6.5.6 p8?

    When a pointer points to an array, you are permitted to use + and - to
    construct pointers that lie within (and "one past the end") of that
    array. An argument can be made that passing (void *)arr is the safest
    option despite it looking odd. If you pass arr[0] assignments to
    p[33] and higher will be undefined as far as standard C is concerned
    although most implementations will use more relaxed semantics.
    Because of the definition of [], * and & there is no real difference
    between passing &arr[0][0] and passing arr[0].

    --
    Ben.
    Ben Bacarisse, Dec 11, 2008
    #6
  7. "Jujitsu Lizard" <> writes:
    > "Phil Carmody" <> wrote in message
    > news:...
    >>
    >>> This code should be fine:
    >>>
    >>> #define M 33
    >>> #define N 41
    >>>
    >>> long arr[N][M];
    >>>
    >>> void f(long *p)
    >>> {
    >>> unsigned i, j;
    >>>
    >>> for (i=0; i<N; i++)
    >>> {
    >>> for (j=0; j<M; j++)
    >>> {
    >>> p[i * M + j] = 0;
    >>> }
    >>> }
    >>> }
    >>>
    >>> int main(void)
    >>> {
    >>> f(arr);

    >>
    >> arr isn't a long*
    >>
    >>> return(0);
    >>> }
    >>>
    >>> This is done every day of the week.

    >>
    >> People write code that fails to compile without a diagnostic
    >> that's telling them they're doing something they shouldn't
    >> every day of the week?
    >>
    >> Quite probably true. Doesn't make it right.

    >
    > Phil,
    >
    > In this particular case, I'd be grateful if you could be more specific. Are
    > you saying it is wrong?


    I think he's saying it contains a constraint violation.

    When I compile your code, I get the following diagnostic on
    the call "f(arr);":

    c.c: In function 'main':
    c.c:21: warning: passing argument 1 of 'f' from incompatible pointer type

    Your function f expects an argument of type long*. You pass it an
    argument of type long(*)[N]. My compiler merely issued a warning, but
    it would have been perfectly legal for it to reject the program.

    You can avoid the constraint violation by changing the call to either
    f(arr[0]);
    or
    f(&arr[0][0]);

    But then you're invoking undefined behavior in f, though it's likely
    to behave as you expect under most if not all implementations. Most
    compilers will allow you to treat a two-dimensional array as a
    one-dimensional array, but they're not required to do so. A compiler
    that generates code that performs bounds checking, or whose
    optimization phase is particularly aggressive, might result in
    something other than the behavior you expect.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Dec 11, 2008
    #7
  8. Baboon

    James Kuyper Guest

    Jujitsu Lizard wrote:
    > "Keith Thompson" <> wrote in message
    > news:...
    >>
    >> You can avoid the constraint violation by changing the call to either
    >> f(arr[0]);
    >> or
    >> f(&arr[0][0]);

    >
    > The second form suggested would be my preferred one. Thanks for the
    > heads up.
    >
    >> But then you're invoking undefined behavior in f, though it's likely
    >> to behave as you expect under most if not all implementations. Most
    >> compilers will allow you to treat a two-dimensional array as a
    >> one-dimensional array, but they're not required to do so. A compiler
    >> that generates code that performs bounds checking, or whose
    >> optimization phase is particularly aggressive, might result in
    >> something other than the behavior you expect.

    ....
    > The "function boundary" is "long *". The function should work correctly
    > with a pointer to a long. That implies that array operations on a
    > pointer to a long should accept any unsigned integer and generate a
    > correct memory reference.


    You may desire such a feature; that's not how C works. Unsigned integers
    and pointers are different types. While any integer can be converted to
    a pointer, the conversion never occurs implicitly, it must always be
    explicitly requested with a cast. Per 6.3.2.3p5, even when you do use a
    cast to force the conversion, "Except as previously specified, the
    result is implementation-defined, might not be correctly aligned, might
    not point to an entity of the referenced type, and might be a trap
    representation."

    > The compiler has to honor the function boundary. The function accepts a
    > pointer to long. It should treat it as if it could be ANY pointer to long.


    Not quite. For instance, consider the following example:

    long arr[3][5];

    int add_arr(int *p, int *q)
    {
    for(int row=0; row<3; row++)
    for(int col=0; col<3; col++)
    arr[row][col] += p[12];
    }

    Now, in the general case, an implementation is required to generate code
    for functions like add_arr that works correctly even if p points at a
    location within "arr". "Correctly" means, in this case, that if one of
    += operations changes the value referred to by p[12], then the next pass
    through the loop should use the updated value.

    However, in this particular case, there's no legal way to pass a value
    'p' to add_arr such that p[12] points to an element of arr. That's
    because the largest amount that you can add to any pointer that points
    to any element of 'arr' is 5; add more than that and you're in a
    different sub-array of arr, violating

    As a result, an implementation is free to perform an optimization that
    reads the value of p[12] just once, and keeps that value in a register
    for the entire loop.

    That's a rather contrived example, but its the best I could com up with
    right now.

    > The key issue in my mind, which might be crackpot, is whether the
    > compiler is allowed to optimize WITHIN a function based on what it
    > believes it knows about all function invocations.


    If it believes it correctly, yes. If the only way you can create a
    situation which doesn't match that belief is by writing code that has
    undefined behavior, it is permitted to make such optimizations - the
    undefined behavior will take the form of having those optimizations fail.
    James Kuyper, Dec 12, 2008
    #8
  9. Baboon

    jameskuyper Guest

    Jujitsu Lizard wrote:
    > "James Kuyper" <> wrote in message
    > news:XQl0l.1196$...
    > > Jujitsu Lizard wrote:
    > >> "Keith Thompson" <> wrote in message
    > >> news:...
    > >>>
    > >>> You can avoid the constraint violation by changing the call to either
    > >>> f(arr[0]);
    > >>> or
    > >>> f(&arr[0][0]);
    > >>
    > >> The second form suggested would be my preferred one. Thanks for the
    > >> heads up.
    > >>
    > >>> But then you're invoking undefined behavior in f, though it's likely
    > >>> to behave as you expect under most if not all implementations. Most
    > >>> compilers will allow you to treat a two-dimensional array as a
    > >>> one-dimensional array, but they're not required to do so. A compiler
    > >>> that generates code that performs bounds checking, or whose
    > >>> optimization phase is particularly aggressive, might result in
    > >>> something other than the behavior you expect.

    ....
    > What I was trying to say is that if you have
    >
    > long arr[3][5];
    >
    > and you call a function defined as
    >
    > void myfunc(long *q)
    >
    > and you call it as
    >
    > myfunc(&(arr[0][0]));
    >
    > within that function it does something like:
    >
    > q[14] = 0;
    >
    > that should work per the definition of C.


    "The definition of C" is the C standard, which specifies that this
    code has undefined behavior. Therefore, "work per the definition of C"
    doesn't tell you much about what such a program may do. Any behavior
    "works per the definition of C", including aborting your program.

    > ... In this case the "function
    > boundary" consists a pointer to a long. I don't believe the compiler may
    > make assumptions about _which_ pointer to long may be passed or about how
    > large the index on q[] might be. But I have to look this up.


    q[14] is defined as being equivalent to *(q+14). When an object is
    part of an array, the rules of pointer arithmetic (6.5.6p8) define
    what the range of integer values is that can be added to a pointer to
    that object. Those rules are based upon the size of the array and the
    position of the object within that array. An object that is not a part
    of an array is treated, for the purpose of these rules, as if it were
    the only element of a 1-element array.

    The key point is, those rules cannot be interpreted, in this context,
    as referring to the array 'arr'. If such an interpretation were
    possible, then the part of those rules which tells you where q+14
    points to, would mean that it points at arr[14], which is nonsense.
    The array that those rules refer to must have an element type that is
    the same as the type pointed at by the pointer. The only array
    containing arr[0][0] with that property is arr[0], not arr itself. The
    element type of arr is char[5], not char. arr[0] has a length of 5,
    not a length of 15.

    Therefore, in order to add an integer value 'i' to &arr[0][0], what
    those rules require is that 0<=i && i<=5. Otherwise, the behavior is
    undefined. Those rules also specify that, while you can add 5 to &arr
    [0][0], any attempt to dereference that pointer has undefined
    behavior.

    > >> The compiler has to honor the function boundary. The function accepts a
    > >> pointer to long. It should treat it as if it could be ANY pointer to
    > >> long.

    > >
    > > Not quite. For instance, consider the following example:
    > >
    > > long arr[3][5];
    > >
    > > int add_arr(int *p, int *q)


    The parameter "q was added at an intermediate stage when I was
    thinking about demonstrating this issue in a different fashion. I
    should have removed it when I changed my mind. Please ignore it.

    > > {
    > > for(int row=0; row<3; row++)
    > > for(int col=0; col<3; col++)
    > > arr[row][col] += p[12];
    > > }
    > >
    > > Now, in the general case, an implementation is required to generate code
    > > for functions like add_arr that works correctly even if p points at a
    > > location within "arr". "Correctly" means, in this case, that if one of +=
    > > operations changes the value referred to by p[12], then the next pass
    > > through the loop should use the updated value.
    > >
    > > However, in this particular case, there's no legal way to pass a value 'p'
    > > to add_arr such that p[12] points to an element of arr. That's because the
    > > largest amount that you can add to any pointer that points to any element
    > > of 'arr' is 5; add more than that and you're in a different sub-array of
    > > arr, violating


    Sorry: that should have ended with the citation of 6.5.6p8.

    > > As a result, an implementation is free to perform an optimization that
    > > reads the value of p[12] just once, and keeps that value in a register for
    > > the entire loop.
    > >
    > > That's a rather contrived example, but its the best I could com up with
    > > right now.

    ....
    > I do like your example. I understand it.
    >
    > If what you say is correct, I have a lot of reading to do.
    >
    > Personally, I wouldn't think twice about using the function above in the way
    > you indicated is a violation. If you are correct, that means I'm dangerous
    > and I need to retrain myself. I'm not being sarcastic there--I'm serious.
    >
    > My mental model of arrays in C is that they are defined to be in row-major
    > order JUST SO THAT YOU CAN MONKEY WITH THEM IN UNINTENDED WAYS. My
    > understanding of C is that every array is really one-dimensional and the
    > possibility of multidimensional arrays is just for programmer convenience.


    That last part, at least, is correct. In this case, 'arr' is a one-
    dimensional array whose element type is char[5]. arr[0] is a one-
    dimensional array whose element type is char. The standard refers to
    arr as a 'multidimensional' array, but the way C handles such arrays
    means that they are really just arrays of arrays. What difference does
    that distinction make? In languages with true multidimensional arrays,
    it's just as easy (syntactically, at least) to refer to a column of an
    array as to a row. There's no C syntax for referring to a column of
    arr.
    jameskuyper, Dec 12, 2008
    #9
  10. Baboon

    Tim Rentsch Guest

    Ben Bacarisse <> writes:

    > "Jujitsu Lizard" <> writes:
    >
    > > "Phil Carmody" <> wrote in message
    > > news:...
    > >>
    > >>> This code should be fine:
    > >>>
    > >>> #define M 33
    > >>> #define N 41
    > >>>
    > >>> long arr[N][M];
    > >>>
    > >>> void f(long *p)
    > >>> {
    > >>> unsigned i, j;
    > >>>
    > >>> for (i=0; i<N; i++)
    > >>> {
    > >>> for (j=0; j<M; j++)
    > >>> {
    > >>> p[i * M + j] = 0;
    > >>> }
    > >>> }
    > >>> }
    > >>>
    > >>> int main(void)
    > >>> {
    > >>> f(arr);
    > >>
    > >> arr isn't a long*
    > >>
    > >>> return(0);
    > >>> }
    > >>>
    > >>> This is done every day of the week.
    > >>
    > >> People write code that fails to compile without a diagnostic
    > >> that's telling them they're doing something they shouldn't
    > >> every day of the week?
    > >>
    > >> Quite probably true. Doesn't make it right.

    > >
    > > In this particular case, I'd be grateful if you could be more specific. Are
    > > you saying it is wrong?

    >
    > "Wrong" is a tricky word in C. Your code requires a diagnostic since
    > it contains a constraint violation. This is, in one sense at least,
    > as wrong as you can get, but the code might well work anyway. To be
    > clear about it, arr converts to long (*)[33] not long * and there is
    > no implicit conversion between these two types.
    >
    > There are lots of alternatives. You could pass &arr[0][0], arr[0] or
    > even (void *)arr. All of these expressions will convert to a pointer
    > of the right type, but there is a subtle question -- how much can be
    > added to the resulting pointer inside f whilst staying within the
    > permissions granted by 6.5.6 p8?
    >
    > When a pointer points to an array, you are permitted to use + and - to
    > construct pointers that lie within (and "one past the end") of that
    > array. An argument can be made that passing (void *)arr is the safest
    > option despite it looking odd. [...]


    Wouldn't you agree that there's a (marginally) better argument that
    the safest option is to use (void*) &arr instead? There seems no
    question that this pointer may be used to access any location within
    the confines of the object arr (or to point to the first location
    past it).
    Tim Rentsch, Dec 22, 2008
    #10
  11. Tim Rentsch <> writes:

    > Ben Bacarisse <> writes:
    >
    >> "Jujitsu Lizard" <> writes:
    >>
    >> > "Phil Carmody" <> wrote in message
    >> > news:...

    <snip>
    >> >>> #define M 33
    >> >>> #define N 41
    >> >>>
    >> >>> long arr[N][M];

    <snip>
    >> There are lots of alternatives. You could pass &arr[0][0], arr[0] or
    >> even (void *)arr. All of these expressions will convert to a pointer
    >> of the right type, but there is a subtle question -- how much can be
    >> added to the resulting pointer inside f whilst staying within the
    >> permissions granted by 6.5.6 p8?
    >>
    >> When a pointer points to an array, you are permitted to use + and - to
    >> construct pointers that lie within (and "one past the end") of that
    >> array. An argument can be made that passing (void *)arr is the safest
    >> option despite it looking odd. [...]

    >
    > Wouldn't you agree that there's a (marginally) better argument that
    > the safest option is to use (void*) &arr instead? There seems no
    > question that this pointer may be used to access any location within
    > the confines of the object arr (or to point to the first location
    > past it).


    Yes I would. Reading what I wrote again, (void *)&arr feels what I
    /intended/ to write but for some reason did not.

    --
    Ben.
    Ben Bacarisse, Dec 23, 2008
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. d[ - - ]b

    Multidimensional arrays? anything else?

    d[ - - ]b, May 18, 2004, in forum: ASP .Net
    Replies:
    2
    Views:
    1,664
    AlexS
    May 18, 2004
  2. Jay
    Replies:
    1
    Views:
    2,537
    BarryNL
    Jan 30, 2004
  3. geclinke
    Replies:
    1
    Views:
    5,028
    jackie
    Jun 18, 2004
  4. Philipp
    Replies:
    21
    Views:
    1,101
    Philipp
    Jan 20, 2009
  5. Francesco
    Replies:
    2
    Views:
    1,083
    Francesco
    Nov 6, 2009
Loading...

Share This Page