Contiguous allocation of multi-dimensional vector

Discussion in 'C++' started by nw, Aug 9, 2007.

  1. nw

    nw Guest

    Hi,

    We've been having a discussion at work and I'm wondering if anyone
here would care to offer an opinion or alternative solution. Apparently
in the C programming HPC community it is common to allocate
multidimensional arrays like so:

int *v_base = (int *) malloc(1000000*sizeof(int));
int **v = (int **) malloc(1000*sizeof(int *));

for(int i=0;i<1000;i++) {
    v[i] = v_base + (i*1000);
}

    The logic behind this is that the entire multi-dimensional array will
    be contiguous in memory and therefore it is more likely that the whole
array will be loaded into cache, improving access time.

    The following workaround I believe will simulate similar behaviour
    using the STL:

vector<int> v_base(1000000);
vector<int *> v;

for(int i=0;i<1000;i++) {
    v.push_back(&v_base[0] + (i*1000));
}

However, this will obviously break if you use insert or remove anywhere, and
    doesn't seem like a great idea. The standard way of allocating this
    two dimensional array as I understand it is:

    vector<vector<int> > v(1000,vector<int>(1000));

So I guess my question is this: Is there any advantage to allocating
the vector contiguously, or should the normal STL allocation method
place the vectors contiguously in memory anyway? Is there likely to be
much vector overhead (such as the size of the array) placed between
the 1D vector<int>s?

    My gut feeling is that the normal STL way will produce similar
    results, and the tests I've done seem to back this up, but I'd be
interested in hearing other people's opinions.

    Any suggestions gratefully received and my apologies if this has been
    asked previously or in the FAQ and I missed it.
     
    nw, Aug 9, 2007
    #1

2. joe

    joe Guest

    > From: nw
    >
    >
    > Hi,
    >
    > We've been having a discussion at work and I'm wondering if anyone
> here would care to offer an opinion or alternative solution. Apparently
> in the C programming HPC community it is common to allocate
> multidimensional arrays like so:
>
> int *v_base = (int *) malloc(1000000*sizeof(int));
> int **v = (int **) malloc(1000*sizeof(int *));
>
> for(int i=0;i<1000;i++) {
>     v[i] = v_base + (i*1000);
> }


Hmmmm, back in the day, I wouldn't have bothered with v. Any cell of
an array v[HEIGHT][WIDTH] can be calculated directly as
WIDTH * (row - 1) + column (with 1-based rows). Of course, memory was
at a premium.

    >
    > The logic behind this is that the entire multi-dimensional array will
    > be contiguous in memory and therefore it is more likely that the whole
> array will be loaded into cache, improving access time.


and there is only one allocation per array (malloc/free can be pretty
time consuming). Of course, on the other side of the fence, it may be
easier to come up with 1000 4000-byte blocks than one 4,000,000-byte
block.

    >
    > The following workaround I believe will simulate similar behaviour
    > using the STL:
    >
    > vector<int> v_base(1000000);
    > vector<int *> v;
    >
> for(int i=0;i<1000;i++) {
>     v.push_back(&v_base[0] + (i*1000));
> }
    >
> However, this will obviously break if you use insert or remove anywhere, and
    > doesn't seem like a great idea. The standard way of allocating this
    > two dimensional array as I understand it is:


This is a non-issue because you would never insert or delete in a
multidimensional array unless you were explicitly supporting ragged
arrays, and that is a different sort of problem.

    >
    > vector<vector<int> > v(1000,vector<int>(1000));
    >
> So I guess my question is this: Is there any advantage to allocating
> the vector contiguously, or should the normal STL allocation method
> place the vectors contiguously in memory anyway? Is there likely to be
> much vector overhead (such as the size of the array) placed between
> the 1D vector<int>s?
    >
    > My gut feeling is that the normal STL way will produce similar
    > results, and the tests I've done seem to back this up, but I'd be
> interested in hearing other people's opinions.
    >
    > Any suggestions gratefully received and my apologies if this has been
    > asked previously or in the FAQ and I missed it.


The best answer would depend upon what you are using the arrays for.
The problem with a vector of vectors is that you lose a lot of
locality of reference and therefore may well have more cache misses;
there is another layer of indirection; and since insert/erase is
available, you have the possibility of turning your multidimensional
array into a ragged array without even half trying.

The good news is that this is C++ and you can wrap whatever logic you
choose in a class and only allow the operations which make sense.
Personally, I would probably start off with something like:

template <typename T, size_t N, size_t M>
struct marray
{
    typedef size_t size_type;
    typedef T value_type;
    typedef T& reference;
    typedef const T& const_reference;

    reference at(size_type x, size_type y)
    {
        return m_array[x,y];
    }

    const_reference at(size_type x, size_type y) const
    {
        return m_array[x,y];
    }

private:
    T m_array[N,M];
};

And flesh it out with what I needed (iterators, row_iterators, etc).
Then I could choose whether I wanted it dynamic or not and it would be
a single allocation either way. If I needed a ragged array, well, that
would be different.

    joe
     
    joe, Aug 9, 2007
    #2

3. BobR

    BobR Guest

    nw <> wrote in message...
    >
> However, this will obviously break if you use insert or remove anywhere, and
    > doesn't seem like a great idea. The standard way of allocating this
    > two dimensional array as I understand it is:
    >
    > vector<vector<int> > v(1000,vector<int>(1000));


    Or:

{// ------------
std::vector< std::vector< int > > Array2( 3, std::vector< int >( 3 ) );

for( std::size_t x(0); x < Array2.size(); ++x ){
    std::cout<<"array.at("<<x<<") address="
        <<std::hex<<&Array2.at(x)
        <<std::dec<<std::endl;
    for( std::size_t y(0); y < Array2.at( x ).size(); ++y ){
        std::cout<<"  array.at("<<x<<").at("<<y
            <<") address="<<std::hex<<&Array2.at(x).at(y)
            <<std::dec<<std::endl;
    } // for(y)
} // for(x)
}// ------------

    To answer all your questions: test and profile it.

    --
    Bob R
    POVrookie
     
    BobR, Aug 9, 2007
    #3
4. terminator

    terminator Guest

On Aug 9, 10:07 pm, joe <> wrote:

[earlier discussion snipped]

> The good news is that this is C++ and you can wrap whatever logic you
> choose in a class and only allow the operations which make sense.
> Personally, I would probably start off with something like:
>
> template <typename T, size_t N, size_t M>
> struct marray
> {
>     typedef size_t size_type;
>     typedef T value_type;
>     typedef T& reference;
>     typedef const T& const_reference;
>
>     reference at(size_type x, size_type y)
>     {
>         return m_array[x,y];
>     }
>
>     const_reference at(size_type x, size_type y) const
>     {
>         return m_array[x,y];
>     }
>
> private:
>     T m_array[N,M];
> };
>
> And flesh it out with what I needed (iterators, row_iterators, etc).
> Then I could choose whether I wanted it dynamic or not and it would
> be a single allocation either way. If I needed a ragged array, well,
> that would be different.
>
> joe


There is no double subscripting ([x,y]) in C++; one must use two
single subscripts ([x][y]). Otherwise you are discussing it
conceptually (not syntactically): in actual C++, the comma inside the
brackets is the comma operator, so m_array[x,y] is really just
m_array[y].

I think variadic templates (to come) with a variable number of
unsigned arguments will solve the problem with no overhead (coding
time / run time / run memory):

template<typename T, unsigned... Dims> struct multiDimArr;
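
Guessing at the proposed syntax (just a sketch; no compiler accepts
this today), a recursive definition might look like:

template<typename T, unsigned... Dims>
struct multiDimArr;                        // primary template

template<typename T, unsigned N>
struct multiDimArr<T, N>                   // one dimension left
{
    T data[N];                             // plain contiguous array
};

template<typename T, unsigned N, unsigned... Rest>
struct multiDimArr<T, N, Rest...>          // peel off one dimension
{
    multiDimArr<T, Rest...> data[N];       // still one contiguous block
};

// multiDimArr<int, 10, 11, 12> would then hold 10*11*12 ints contiguously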

    regards,
    FM.
     
    terminator, Aug 9, 2007
    #4
5. joe

    joe Guest

    On Aug 9, 5:51 pm, terminator <> wrote:

> There is no double subscripting ([x,y]) in C++; one must use two
> single subscripts ([x][y]). Otherwise you are discussing it
> conceptually (not syntactically).


Arg!! I know that; somehow it didn't get typed that way though.
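
(The offending lines were of course meant to read

    return m_array[x][y];

and

    T m_array[N][M];

respectively.)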

> I think variadic templates (to come) with a variable number of
> unsigned arguments will solve the problem with no overhead (coding
> time / run time / run memory):
>
> template<typename T, unsigned... Dims> struct multiDimArr;
    >


You are correct, but it will be many years before that gets out into
the mainstream, so I wouldn't be writing too much code like that.

    joe
     
    joe, Aug 10, 2007
    #5
6. James Kanze

    James Kanze Guest

    On Aug 9, 5:49 pm, nw <> wrote:

    > We've been having a discussion at work and I'm wondering if anyone
> here would care to offer an opinion or alternative solution. Apparently
> in the C programming HPC community it is common to allocate
> multidimensional arrays like so:


    > int *v_base = (int *) malloc(1000000*sizeof(int));
    > int **v = (int **) malloc(1000*sizeof(int *));


> for(int i=0;i<1000;i++) {
>     v[i] = v_base + (i*1000);
> }


    > The logic behind this is that the entire multi-dimensional array will
    > be contiguous in memory and therefore it is more likely that the whole
> array will be loaded into cache, improving access time.


    > The following workaround I believe will simulate similar behaviour
    > using the STL:


    > vector<int> v_base(1000000);
    > vector<int *> v;


> for(int i=0;i<1000;i++) {
>     v.push_back(&v_base[0] + (i*1000));
> }


> However, this will obviously break if you use insert or remove anywhere, and
    > doesn't seem like a great idea.


    You mean that it makes it impossible for v[0] to have a
    different number of elements than v[1]. Depending on the
    application, that could be an advantage, rather than a
    disadvantage.

    > The standard way of allocating this two dimensional array as I
    > understand it is:


    > vector<vector<int> > v(1000,vector<int>(1000));


    It depends on the use. The *standard* way of handling
    multidimensional arrays in C++ is to write your own class to do
    so, with an overloaded operator[] which returns a "proxy" on
    which operator[] is also defined. (Note that in this case, int*
    would be an adequate proxy.) Then, you can do pretty much
    whatever works in the implementation; I'd probably just use a
one-dimensional vector, and calculate the indexes.
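
Roughly, the idea is something like this (just a quick sketch; the
names are invented for the example):

#include <cstddef>
#include <vector>

class Matrix
{
public:
    Matrix( std::size_t rows, std::size_t columns )
        : myColumns( columns )
        , myData( rows * columns )
    {
    }

    //  operator[] returns an int*, which serves as the proxy:
    //  applying [] to it again is ordinary pointer indexing.
    int* operator[]( std::size_t row )
    {
        return &myData[ 0 ] + row * myColumns;
    }

    int const* operator[]( std::size_t row ) const
    {
        return &myData[ 0 ] + row * myColumns;
    }

private:
    std::size_t myColumns;
    std::vector< int > myData;      //  one contiguous block
};

With that, m[ i ][ j ] compiles down to a single index calculation
into one contiguous vector, and every row necessarily has the same
length.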

    > So I guess my question is this: Is there any advantage to allocating
> the vector contiguously,


There certainly could be, for some applications. To begin with,
you'll use less total memory: if the inner dimension is 1000, the
per-vector overhead is probably not significant, but for something
like:

    std::vector< std::vector< int > > v( 1000000,
                                         std::vector< int >( 5 ) );

the difference could be enormous. (Each inner vector carries its own
bookkeeping, typically three pointers plus per-allocation overhead,
which here is comparable to the five ints it actually holds.)

> or should the normal STL allocation method
> place the vectors contiguously in memory anyway? Is there likely to be
> much vector overhead (such as the size of the array) placed between
> the 1D vector<int>s?


    Who knows? It could easily vary, even from one run to the next.

    > My gut feeling is that the normal STL way will produce similar
    > results, and the tests I've done seem to back this up, but I'd be
> interested in hearing other people's opinions.


    The real question is whether whatever method you feel most
    comfortable with is fast enough. If so, don't bother looking
    any further. Personally, for a mathematical matrix, I prefer
    the single dimension array, calculating the index. It makes it
    impossible to violate the invariant that all of the sub-arrays
    have the same size. But if that caused performance problems
    (e.g. because multiplication was very slow on my machine), I'd
    try something else.

    Just make sure that whatever you actually do is behind the
    firewall of a class definition, so you can change it at will
    without affecting client code.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Aug 11, 2007
    #6
7. terminator

    terminator Guest

    On Aug 10, 2:59 pm, joe <> wrote:
> On Aug 9, 5:51 pm, terminator <> wrote:
>
> > There is no double subscripting ([x,y]) in C++; one must use two
> > single subscripts ([x][y]).
>
> Arg!! I know that; somehow it didn't get typed that way though.
>
> > I think variadic templates (to come) with a variable number of
> > unsigned arguments will solve the problem with no overhead (coding
> > time / run time / run memory):
>
> > template<typename T, unsigned... Dims> struct multiDimArr;
>
> You are correct, but it will be many years before that gets out into
> the mainstream, so I wouldn't be writing too much code like that.
>
> joe


if your compiler handles recursive templates gracefully, you can:

template <typename elem, unsigned sz>
struct array{
    typedef elem Telement;
    enum{
        Nsize = sz,
        dim = 1
    };
    ...//etc
};

template<typename arr, unsigned n>
struct multi_array{
    typedef typename arr::Telement Telement;
    enum{
        Nsize = n * arr::Nsize,
        dim = arr::dim + 1
    };
    ...//etc
};

multi_array< multi_array< array<int,10>, 11 >, 12 > arr3D_10_11_12int;
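
(For arr3D_10_11_12int that gives Telement = int, dim = 3, and
Nsize = 12*11*10 = 1320, i.e. room for a single contiguous block of
1320 ints.)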

    regards,
    FM.
     
    terminator, Aug 14, 2007
    #7
8. joe

    joe Guest

    On Aug 14, 1:08 pm, terminator <> wrote:
    >
    > if your compiler is elegant in recursive templates you can:
    >

    [crafty recursive template removed]

Quite clever. Probably a bit of overkill for any matrix problem I
have had, but still quite clever. Seriously though, unless I were
developing a library that supported such things, I would probably
still opt for individual implementations/specializations for the
problem set I actually had rather than a recursive template solution.
In other words, unless I were developing the code to be a matrix
library that could handle any dimension of matrix, I would rather
have the two or three specializations for the matrices I actually
need than a recursive template that handles every possible kind of
matrix. The reason is twofold. 1) Debugger technology available to
me makes debugging recursive template code much more difficult than
straightforward template code (although that can be a challenge too
at times). 2) Current optimizer technology hasn't performed the
wonders for me that sophisticated template programmers posit it
should; the code actually generated isn't as good, as far as I can
tell. Your mileage may vary, but I am a simple person who prefers
simple solutions when a general solution isn't required for the
problem I have. That is not to say that I wouldn't use a third-party
library that provided such matrices, just that I wouldn't go that
route myself unless I were developing such a library. :)

    joe
     
    joe, Aug 15, 2007
    #8
