The fastest way to fill 2D dynamic array with zeros

romayankin · Feb 25, 2006

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Thanks,
-R.

Bob Hairgrove · Feb 25, 2006

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Thanks,
-R.

It would be much faster to calculate rows*columns and allocate your 2D
array in one call to new[] as if it were a 1D array, rather than doing
one new[] per column. Then you could initialize the elements with a
for loop as if it were a 1D array. I don't know if using a loop is the
fastest way, but all those new[] calls are probably going to be much
more expensive.

If you have millions or billions of doubles, you might want to split
the memory up into buckets and allocate a few threads to initialize
them in parallel. But this is not going to be platform-independent, so
I would just use a loop.

Probably the best idea is to use a std::vector<double> and treat it
like an array ... that way you won't have to call delete[] on your
arrays.

Implementing 2D arrays is also treated by Scott Meyers, among others,
in his book "More Effective C++". Google for "c++" and "2D array" and
you'll get lots of hits.

Gianni Mariani · Feb 25, 2006

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:

Look at "matrix" "C++" on google.

~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

How big are these arrays going to be ?

Victor Bazarov · Feb 25, 2006

Gianni said:
Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:

Click to expand...

Look at "matrix" "C++" on google.

~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];

Click to expand...

I didn't see anybody mention this, but why are you allocating
one less than what seems to be the "number of columns"?

for(int i = 0; i < col - 1; i++) {
mx = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the
extremely fast way to fill the 2D dynamic array with zeros. Does
anyone know something I that might help me?

Click to expand...

Yes, if you do

mx = new double[row]();

(notice the parentheses after the brackets), then the elements
of that array shall be value-initialised, and for 'double' it
means zero-initialised. Beware, not all compilers are capable
of handling that syntax (surprisingly).

How big are these arrays going to be ?

Click to expand...

Why should it matter, Gianni?

V

Axter · Feb 25, 2006

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Here's a wrapper function that will initialize a 2D array to zero, much
more efficient.
template < typename T >
T **Allocate2DArray( int nRows, int nCols)
{
T **ppi = new T*[nRows];
//parentheses after the brackets makes it initialize to zero
T *curPtr = new T [nRows * nCols]();
for( int i = 0; i < nRows; i++) {
*(ppi + i) = curPtr;
curPtr += nCols;
}
return ppi;
}

The above wrapper function only makes two calls to new, which means the
following wrapper function only needs to make two calls to delete:

template < typename T >
void Free2DArray(T** Array)
{
delete [] *Array;
delete [] Array;
}

Example usage:
double **d = Allocate2DArray<double>(x, y);
d[0][0] = 10.0;
d[1][1] = 20.0;
d[3][2] = 2345.09;
Free2DArray(d);

romayankin · Feb 25, 2006

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

How big are these arrays going to be ?

Not quite big the max value is around 1000 elements, in most cases
~100.

I didn't see anybody mention this, but why are you allocating one less than what seems to be the "number of columns"?

This is an atavism. That's how it works for me

OK, thanks everyone for the replies, but that was my fault that I
didn't emphasize that my performance problem does not involve
allocation. It is completely in setting all elements of a particular 2D
array to zero. I'm creating this array only once and then i might need
to zero it tens thousands of times.

Up until now the best option I could come up with is:
~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; i++) {
memset(mx, 0, sizeof(double)*row);
}
~~~~~~~~~~~~~~~~~~

Gianni Mariani · Feb 25, 2006

Victor said:
Gianni said:

[email protected] said:

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:

Click to expand...

Look at "matrix" "C++" on google.

~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];

Click to expand...

Click to expand...

I didn't see anybody mention this, but why are you allocating
one less than what seems to be the "number of columns"?

Good catch. I was thinking "matrix" class so I wasn't looking there.

for(int i = 0; i < col - 1; i++) {
mx = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the
extremely fast way to fill the 2D dynamic array with zeros. Does
anyone know something I that might help me?

Click to expand...

Click to expand...

Yes, if you do

mx = new double[row]();

(notice the parentheses after the brackets), then the elements
of that array shall be value-initialised, and for 'double' it
means zero-initialised. Beware, not all compilers are capable
of handling that syntax (surprisingly).

How big are these arrays going to be ?

Click to expand...

Why should it matter, Gianni?

Big (100's of cache size) sized arrays would benefit by using cache
prefetch and depending on the architecture (multiple instruction issue
and speculative execution) there are a number of things you can do to
re-order the statements in ways the compiler probably would not be able
to optimize for. These kinds of optimizations depend heavily on
architecture.

For example, in SGI's IRIX 4.x days, the memcpy routine was not
optimized for the R400x CPU's and some simple loop unrolling and the use
of "unaligned" ld/st increased performance by significant multiples (the
R4K had a single way associative cache). For the R8K and R10K, with
multiple issue, speculative execution and cache prefetch instructions,
the main loop was very different to get the most out of the CPU. In the
R10K, loop unrolling was next to useless, in the R4K loop unrolling was
a big boost.

However, if we're talking about what fits into a secondary cache, all
this is moot since big hits in performance in modern CPU's are due to
cache misses (mostly secondary cache). This also means that the speed
for "clearing out" a block of memory may be limited by the memory
bandwidth. So, the aim is to come up with an algorithm that is able to
saturate the memory interfaces.

However, if we're talking about a 4x4 matrix, that would fit in a small
number of cache lines and by the time the memory allocator handed over
the memory, it would more than likely already be in primary cache so
there would be next to no benefit to apply any of the more advanced
techniques to deal with cache misses.

All this is of course highly dependant on *your* system.

Gianni Mariani · Feb 25, 2006

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

*BZZT* Wrong.

std::vector is not slower than an array unless you have tested it. In
many cases, I have seen negligible different between std::vector and an
array when the optimizer gets done with it.

Not quite big the max value is around 1000 elements, in most cases
~100.

This is an atavism. That's how it works for me

OK, thanks everyone for the replies, but that was my fault that I
didn't emphasize that my performance problem does not involve
allocation. It is completely in setting all elements of a particular 2D
array to zero. I'm creating this array only once and then i might need
to zero it tens thousands of times.

BTW - the address arithmetic you feel is causing you performance issues
is far more likely to be less of an issue than the memory fetch for the
address you're looking for.

Up until now the best option I could come up with is:
~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; i++) {
memset(mx, 0, sizeof(double)*row);
}
~~~~~~~~~~~~~~~~~~

Axter · Feb 25, 2006

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

How big are these arrays going to be ?

Click to expand...

Not quite big the max value is around 1000 elements, in most cases
~100.

I didn't see anybody mention this, but why are you allocating one less than what seems to be the "number of columns"?

Click to expand...

This is an atavism. That's how it works for me

OK, thanks everyone for the replies, but that was my fault that I
didn't emphasize that my performance problem does not involve
allocation. It is completely in setting all elements of a particular 2D
array to zero. I'm creating this array only once and then i might need
to zero it tens thousands of times.

Up until now the best option I could come up with is:
~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; i++) {
memset(mx, 0, sizeof(double)*row);
}

If you use the method I posted, you wouldn't have to interatore through
each column of the array to set it.
With the method I posted, you can do the following:
memset(mx[0], 0, sizeof(double)*row*col);

You can just make one call to memset.
The method I posted allocates a single block for the 2D array, and then
devides it in sections to set the main array of pointers.
Since it's one continuous block, you can just use one call to memset.

Gavin Deane · Feb 25, 2006

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

std::vector provides memory management, exception safety and a clean
and simple way of passing the data between functions and/or containing
the data within an object. You must be writing a very trivial C++
program if you don't need any of that functionality.

Gavin Deane

Amadeus W. M. · Feb 26, 2006

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Thanks,
-R.

If you have to zero out repeatedly the double ** array, then I think the
fastest way is to call memset on a contiguous array (along the lines
Axter suggested). It's not just how many times you allocate/deallocate
memory. Once it's allocated, you only call memset once each time you need
to zero out the array, if the array is contiguous. And memset might be
optimized for the different architectures.

It's not good to have MANY large, contiguous arrays, because they may not
fit in memory. For instance, an image is invariably stored contiguously,
but how many images can you keep in memory simultaneously? Certainly not
a whole movie. Otherwise it's ok. Contiguous arrays of size O(1000) don't
pose any threat.

Now if you're concerned about page hits/misses, caches and virtual memory,
then you may have to unroll the loops, or resort to other tricks that
people in sci.math.num-analysis will most certainly know better.

romayankin · Feb 26, 2006

2Axter regarding your code. The thing is that my 2D array is not
rectangular. And therefore your approach will cause some unnecessary
memory overheads. But thanks for the idea i might implement it as an
"speed optimized" option.

2Gianni

BTW - the address arithmetic you feel is causing you performance issues
is far more likely to be less of an issue than the memory fetch for the
address you're looking for.

I didn't mean it would affect the speed, my saying that it involves too
much address arithmetic I meant that the code gets less crystal and
besides of that, its easier to make a mistake when you always have to
multiply iterator by the length of the row. And you right its more
likely that retrieving address using square brackets is usually less
efficient then using incrementing pointers.

So ... as far as i'm having non-rectangular 2D array the best way to
fill it with zeros is:
~~~~~~~~~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; ++i)
{
memset(mx, 0, sizeof(double)*rowLen);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~
or am I missing something again?

John Carson · Feb 26, 2006

2Axter regarding your code. The thing is that my 2D array is not
rectangular. And therefore your approach will cause some unnecessary
memory overheads. But thanks for the idea i might implement it as an
"speed optimized" option.

The code in your original post was for a rectangular array. However, if you
have an array of row lengths, then Axter's approach can easily be modified
to allocate a block equal in size to the sum of those row lengths. Pointers
can easily be initialized to point at the appropriate places.

Daniel T. · Feb 26, 2006

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

You are quite wrong. I once had a rather good C programmer insist the
same thing so we wrote a test program. Once the compiler was through
optimizing the code, the vector was actually 5% *faster* than anything
he could come up with.

Worst case, use a one dimensional array encapsulated in a class that
does the address arithmetic for you.

romayankin · Feb 26, 2006

Okey thanks, I guess then my question has been answered. Thanks for
everyone who has participated in this discussion

Copy string from 2D array to a 1D array in C	1	Nov 1, 2023
Is there a way to get a single mode using all the points within a 2D array?	2	Oct 17, 2022
Problem with assigning 1D array to a 2D array	11	Oct 12, 2012
Map 2D array to contiguous memory?	15	Jul 16, 2012
Setting array size with a variable - What does the C compiler do?	3	Feb 25, 2022
fastest way to get a double from a string	4	Nov 30, 2009
2D arrays, pointers, and pointer arithmetic: question	32	Aug 24, 2012
function of 2d array	4	Nov 7, 2006

The fastest way to fill 2D dynamic array with zeros

romayankin

Bob Hairgrove

Gianni Mariani

Victor Bazarov

Axter

romayankin

Gianni Mariani

Gianni Mariani

Axter

Gavin Deane

Amadeus W. M.

romayankin

John Carson

Daniel T.

romayankin

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads