dynamic allocation vs array

Discussion in 'C++' started by Tan Thuan Seah, Sep 10, 2004.

  1. Hi all,

    I was told this in one of the university courses I was doing.

    In C we may expect good performance from:

    double a[N][N], c[N][N], d;
    for (i=0; i<N; i++)
        for (j=0; j<N; j++)
            a[i][j] = a[i][j] + c[i][j] * d;

    But this is another matter:

    double *a[N], *c[N], d;
    for (i=0; i<N; i++) {
        a[i] = (double *) malloc(N*sizeof(double));
        c[i] = (double *) malloc(N*sizeof(double));
    }

    for (i=0; i<N; i++)
        for (j=0; j<N; j++)
            a[i][j] = a[i][j] + c[i][j] * d;


    It seems that we should expect some performance hit if we use dynamic
    memory allocation of some sort. But standard ANSI C++ does not allow
    arrays whose size is determined at runtime. So is there any good
    recommendation for minimizing this performance hit, or avoiding it
    entirely through some other method? I would expect a linked list to be
    even worse. Any recommendations? Thanks.


    Thuan Seah
     
    Tan Thuan Seah, Sep 10, 2004
    #1

  2. Tan Thuan Seah

    Sam Holden Guest

    On Fri, 10 Sep 2004 14:47:41 +1000, Tan Thuan Seah <> wrote:
    > In C we may expect good performance from:
    >
    > double a[N][N], c[N][N], d;
    > for (i=0; i<N; i++)
    >     for (j=0; j<N; j++)
    >         a[i][j] = a[i][j] + c[i][j] * d;
    >
    > But this is another matter:
    >
    > double *a[N], *c[N], d;
    > for (i=0; i<N; i++) {
    >     a[i] = (double *) malloc(N*sizeof(double));
    >     c[i] = (double *) malloc(N*sizeof(double));
    > }
    >
    > It seems that we should expect some performance hit if we use dynamic
    > memory allocation of some sort. [snip]


    You are using arrays with non-constant size (assuming N is not a
    constant) in both examples. C99 allows that, but C++ does not, so
    neither example will compile as standard C++.

    What were the reasons cited for getting worse performance?

    If it's the cost of the malloc, then you are doing it at setup time
    anyway, and it's unavoidable without resorting to platform-specific
    stack allocation (such as alloca).

    If it's the extra pointer dereference, then that's pretty much
    unavoidable, though you managed to add two of them (one each for a and
    c); the "flattened" code below avoids those as well.

    If it's that the memory isn't "localised", and hence cache misses are
    more frequent, you could use "flattened" arrays:

    double *a = new double[N*N];
    double *c = new double[N*N];
    double d;

    /* ... */

    for (int i=0; i<N; i++)
        for (int j=0; j<N; j++)
            a[i*N+j] += c[i*N+j] * d;

    Have you benchmarked to see if it matters? I get an 8% difference (or
    76 milliseconds) with everything as globals; with the arrays on the
    stack it crashes, since two million doubles don't fit on the stack on
    my machine.

    Is speed or not crashing more important?
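
    For reference, a minimal timing sketch along these lines (clock()
    resolution and the numbers will vary; the arrays are globals because,
    as noted, they won't fit on the stack):

    #include <cstdio>
    #include <ctime>

    const int N = 1000;                // illustrative size
    double a[N][N], c[N][N], d = 2.0;  // globals: too big for the stack

    int main()
    {
        std::clock_t start = std::clock();

        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                a[i][j] += c[i][j] * d;

        std::clock_t stop = std::clock();
        std::printf("%.0f ms\n", 1000.0 * (stop - start) / CLOCKS_PER_SEC);
        return 0;
    }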

    --
    Sam Holden
     
    Sam Holden, Sep 10, 2004
    #2

  3. Tan Thuan Seah

    JKop Guest

    Tan Thuan Seah posted:

    > But this is another matter:
    >
    > double *a[N], *c[N], d;
    > for (i=0; i<N; i++) {
    >     a[i] = (double *) malloc(N*sizeof(double));
    >     c[i] = (double *) malloc(N*sizeof(double));
    > }
    [snip]


    You could allocate one big chunk of memory and then chop it up into
    rows, along the lines of the sketch below (untested; I haven't traced
    through exactly what your code computes).
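
    #include <cstddef>

    int main()
    {
        const std::size_t N = 1024;   // illustrative size

        // One contiguous block holds all N*N elements...
        double *block = new double[N * N];

        // ...and an array of row pointers is aimed into it, so the
        // usual a[i][j] syntax still works.
        double **a = new double*[N];
        for (std::size_t i = 0; i < N; ++i)
            a[i] = block + i * N;

        a[3][2] = 4.0;

        delete[] a;
        delete[] block;
        return 0;
    }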

    -JKop
     
    JKop, Sep 10, 2004
    #3
  4. Tan Thuan Seah

    Old Wolf Guest

    "Tan Thuan Seah" <> wrote in message news:<414131f8$>...
    > But this is another matter:
    >
    > double *a[N], *c[N], d;
    > for (i=0; i<N; i++) {
    >     a[i] = (double *) malloc(N*sizeof(double));
    >     c[i] = (double *) malloc(N*sizeof(double));
    > }
    >
    > It seems that we should expect some performance hit if we use dynamic
    > memory allocation of some sort. But standard ANSI C++ does not allow
    > arrays whose size is determined at runtime.


    Try using a vector of vectors. With a modern implementation it may not
    be as slow as you expect; certainly faster than the malloc version
    (which doesn't compile as standard C++ anyway, since the arrays a[]
    and c[] have runtime sizes). If this is still too slow, you could look
    for a custom Matrix class that someone has already written
    efficiently.
     
    Old Wolf, Sep 10, 2004
    #4
  5. Tan Thuan Seah wrote:

    > It seems that we should expect some performance hit if we use dynamic
    > memory allocation of some sort. But standard ANSI C++ does not allow
    > arrays whose size is determined at runtime. So is there any good
    > recommendation for minimizing this performance hit, or avoiding it
    > entirely through some other method? I would expect a linked list to
    > be even worse. Any recommendations? Thanks.
    [snip]



    Example code:


    #include <vector>

    int main()
    {
        using namespace std;

        int n = 8;

        vector<vector<int> > someVector(n, vector<int>(n));

        someVector[3][2] = 4;
    }



    If you have to, you may use a valarray. Also check this:

    http://www23.brinkster.com/noicys/cppfaq.htm




    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Sep 10, 2004
    #5
  6. Thanks for everyone's contributions. Actually, I am currently working
    on a project on sparse matrix ordering, which involves quite a fair
    bit of graphs and sets. I am looking for an efficient way to implement
    the sets and the graph. After Old Wolf suggested using vectors, I came
    across the set in the STL. Has anyone had good experiences using a
    combination of vectors and sets? Do they go well together?

    The last bit would be to figure out a way to store the graph. The book
    I am reading (Computer Solution of Large Sparse Positive Definite
    Systems by Alan George and Joseph W. Liu) briefly mentions storing the
    graph in two arrays. But since the input matrices will differ in size,
    I need a dynamic way to allocate the arrays, and vector seems to be a
    potential candidate, along the lines of the sketch below. I am still
    open to other suggestions though. Thanks.
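
    A minimal sketch of that two-array adjacency layout using vectors (the
    names xadj/adjncy and the example graph are just for illustration):

    #include <cstddef>
    #include <vector>

    int main()
    {
        using namespace std;

        // A triangle on vertices 0, 1, 2.
        vector<size_t> xadj(4);    // xadj[v] = start of v's neighbour list
        vector<size_t> adjncy(6);  // all neighbour lists, concatenated

        xadj[0] = 0; xadj[1] = 2; xadj[2] = 4; xadj[3] = 6;
        adjncy[0] = 1; adjncy[1] = 2;   // neighbours of vertex 0
        adjncy[2] = 0; adjncy[3] = 2;   // neighbours of vertex 1
        adjncy[4] = 0; adjncy[5] = 1;   // neighbours of vertex 2

        // The neighbours of v are adjncy[xadj[v]] .. adjncy[xadj[v+1]-1].
        for (size_t k = xadj[1]; k < xadj[2]; ++k)
            (void)adjncy[k];   // visit each neighbour of vertex 1
        return 0;
    }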

    Thuan Seah

    "Ioannis Vranos" <> wrote in message
    news:chrvrc$2lla$...
    [snip]



     
    Tan Thuan Seah, Sep 10, 2004
    #6
  7. Tan Thuan Seah

    PKH Guest

    "Tan Thuan Seah" <> wrote in message
    news:414131f8$...
    > In C we may expect good performance from:
    >
    > double a[N][N], c[N][N], d;
    > [snip]
    >
    > But this is another matter:
    >
    > double *a[N], *c[N], d;
    > for (i=0; i<N; i++) {
    >     a[i] = (double *) malloc(N*sizeof(double));
    >     c[i] = (double *) malloc(N*sizeof(double));
    > }
    [snip]


    I think operations on a[N][N] would most likely perform better than
    the dynamically allocated a[N]* version because of memory caching:
    a[N][N] has all its memory in one chunk, while the rows of a[N]* can
    be spread around in memory, hurting cache performance.

    PKH
     
    PKH, Sep 11, 2004
    #7
  8. That's my guess also. But it seems there is no way around that unless
    you know the size of the array you need in advance. I will take a look
    at the STL vector and see whether its performance is as good as an
    array's or as bad as dynamic memory allocation's.

    Thuan Seah


    "PKH" <> wrote in message
    news:yWE0d.6320$...
    >
    > [snip]
    >
    > I think operations on a[N][N] would most likely perform better than
    > the dynamically allocated a[N]* version because of memory caching:
    > a[N][N] has all its memory in one chunk, while the rows of a[N]* can
    > be spread around in memory, hurting cache performance.
    >
    > PKH
     
    Tan Thuan Seah, Sep 12, 2004
    #8
  9. Tan Thuan Seah wrote:

    > I was told this in one of the university courses I was doing.
    >
    > In C we may expect good performance from:
    >
    > double a[N][N], c[N][N], d;
    > for (size_t i = 0; i < N; ++i)
    >     for (size_t j = 0; j < N; ++j)
    >         a[i][j] = a[i][j] + c[i][j] * d;
    >
    > But this is another matter:
    >
    > double *a[N], *c[N], d;
    > for (size_t i = 0; i < N; ++i) {
    >     a[i] = (double*)malloc(N*sizeof(double));
    >     c[i] = (double*)malloc(N*sizeof(double));
    > }
    >
    > for (size_t i = 0; i < N; ++i)
    >     for (size_t j = 0; j < N; ++j)
    >         a[i][j] = a[i][j] + c[i][j] * d;
    >
    > It seems that we should expect some performance hit
    > if we use dynamic memory allocation of some sort.


    Let's try this:

    > cat main.cc

    #include <cstddef>
    #include <cstdlib>

    int main(int argc, char* argv[]) {

        if (1 < argc) {
            const size_t N = 1024;

    #ifdef YNAMIC
            double *a[N], *c[N], d = 1.0;

            a[0] = (double*)malloc(N*N*sizeof(double));
            c[0] = (double*)malloc(N*N*sizeof(double));

            for (size_t i = 1; i < N; ++i) {
                a[i] = a[0] + i*N;
                c[i] = c[0] + i*N;
            }
    #else //YNAMIC
            double a[N][N], c[N][N], d = 1.0;
    #endif//YNAMIC

            for (size_t i = 0; i < N; ++i)
                for (size_t j = 0; j < N; ++j) {
                    a[i][j] = N*i + j;
                    c[i][j] = N*i + j;
                }

            const size_t trips = atoi(argv[1]);
            for (size_t trip = 0; trip < trips; ++trip)
                for (size_t i = 0; i < N; ++i)
                    for (size_t j = 0; j < N; ++j)
                        a[i][j] += c[i][j] * d;
        }
        return EXIT_SUCCESS;
    }

    > g++ -Wall -ansi -pedantic -O2 -o main main.cc
    > time ./main 60

    3.854u 0.130s 0:03.98 100.0% 0+0k 0+0io 0pf+0w
    > g++ -DYNAMIC -Wall -ansi -pedantic -O2 -o main main.cc
    > time ./main 60

    3.865u 0.129s 0:03.99 99.7%
    > g++ --version

    g++ (GCC) 3.4.1

    On my machine, it doesn't seem to matter much
    whether the memory is allocated from automatic or free storage.

    > But standard ANSI C++ does not allow arrays
    > whose size is determined at runtime.
    > So is there any good recommendation
    > for minimizing this performance hit,
    > or avoiding it entirely through some other method?
    > I would expect a linked list to be even worse.


    Have you tested it?

    > Any recommendations?


    You may be a victim of Premature Optimization:

    http://c2.com/cgi/wiki?PrematureOptimization
     
    E. Robert Tisdale, Sep 12, 2004
    #9
  10. Tan Thuan Seah

    Old Wolf Guest

    "Tan Thuan Seah" <> wrote:
    > Thanks for everyone's contributions. Actually, I am currently working
    > on a project on sparse matrix ordering, which involves quite a fair
    > bit of graphs and sets. I am looking for an efficient way to
    > implement the sets and the graph. After Old Wolf suggested using
    > vectors, I came across the set in the STL. Has anyone had good
    > experiences using a combination of vectors and sets? Do they go well
    > together?


    'set' is usually implemented as a tree. If this suits your
    performance requirements, feel free to use it. You can always
    replace the STL set with some other tree that's hand-tuned to
    your requirements, at a later point in time.
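
    A vector of sets is a natural way to combine the two. A minimal sketch
    (the little graph here is made up for illustration):

    #include <set>
    #include <vector>

    int main()
    {
        using namespace std;

        // One neighbour set per vertex.
        vector<set<int> > adj(3);
        adj[0].insert(1);
        adj[0].insert(2);
        adj[1].insert(2);

        // Membership tests are O(log n) on a tree-based set.
        bool connected = adj[0].count(2) > 0;
        (void)connected;
        return 0;
    }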

    If you need graphs in general, you could try the Boost Graph Library
    at www.boost.org.
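
    A minimal BGL sketch (assuming Boost is installed; the vertices and
    edges are made up):

    #include <boost/graph/adjacency_list.hpp>

    int main()
    {
        using namespace boost;

        // An undirected graph with vector-backed vertex and edge lists.
        typedef adjacency_list<vecS, vecS, undirectedS> Graph;

        Graph g(3);        // three vertices
        add_edge(0, 1, g);
        add_edge(1, 2, g);
        return 0;
    }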
     
    Old Wolf, Sep 12, 2004
    #10
  11. Tan Thuan Seah wrote:
    > That's my guess also. But it seems there is no way around that
    > unless you know the size of the array you need in advance. I will
    > take a look at the STL vector and see whether its performance is as
    > good as an array's or as bad as dynamic memory allocation's.




    If you are under *severe* time constraints, you may want to use a
    valarray.


    It is also a good idea to read TC++PL3 chapter 17 for the time costs
    of each standard library container.
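
    For instance, a minimal valarray sketch of the same kind of update,
    using a flattened N*N layout (the size and values are illustrative):

    #include <cstddef>
    #include <valarray>

    int main()
    {
        using namespace std;

        const size_t N = 1024;
        valarray<double> a(N * N), c(N * N);
        double d = 2.0;

        a += c * d;   // element-wise update, no explicit loops

        // Element (i, j) of the flattened matrix is a[i*N + j].
        return 0;
    }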




    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Sep 12, 2004
    #11
  12. "E. Robert Tisdale" <> wrote in message
    news:ci0erm$bpt$...
    > Have you tested it?


    Hmm... not quite. I just based that on what I know about computers.
    Going through a pointer is always slower, isn't it? A linked list is
    built out of pointers, so I reckon it would be slow as well.

    Thuan Seah
     
    Tan Thuan Seah, Sep 12, 2004
    #12
  13. Tan Thuan Seah wrote:
    > "E. Robert Tisdale" <> wrote in message
    > news:ci0erm$bpt$...
    >
    >> Have you tested it?
    >
    > Hmm... not quite. I just based that on what I know about computers.
    > Going through a pointer is always slower, isn't it? A linked list is
    > built out of pointers, so I reckon it would be slow as well.



    Slower than built-in arrays? What would be the time cost difference
    between these two for loops?



    int main()
    {
        int array[100], *p = new int[100];

        for (unsigned i = 0; i < 100; ++i)
            array[i] = 1;

        for (unsigned i = 0; i < 100; ++i)
            p[i] = 1;

        delete[] p;
    }



    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Sep 12, 2004
    #13
