'new'd memeroy is contiguous?

B

BobR

(e-mail address removed) wrote in message ...
I, indeed, am working in a image processing application and was trying
to figure out if a 2D array is better for dynamic allocation or a big
1-dimensional one.

Case 1: 1D case will allocate it in one go. So contguity is there
indeed.

Case 2: 2D will involve memory to be new'd (again, grammatical usage is
casual here. This is *NOT* correct english.) once for the columns and
then itertively with in a 'for' loop so as to cover the rows.
int** array2D = new int*[height];
for (int i=0; i<height; i++){
array2D = new int[width];
}

My querry is that whether this array2D is contiguous or not.

So far so good.. now which of these 2 cases will be faster to access?
Obviously the one which is contiguous. Hence this question on the
newsgroup.




What do you have against std::vector? I use a vector of 3D points(double
x,y,z) to store a '3D rock' which animates smoothly, 30k+ triangle-mesh in
OpenGL running in wxWidgets.
Vector is contiguous.

#include <iostream> // 2D example
#include <vector>
// ........
size_t Rows( 480 );
size_t Columns( 640 );
std::vector<std::vector<double> > My2Darray( Rows, Columns );
std::cout<<"My2Darray.size() ="<<My2Darray.size()<<std::endl;
std::cout<<"My2Darray.at(0).size() ="<<My2Darray.at(0).size()<<std::endl;
std::cout<<"row 7 col 9 ="<<My2Darray.at(7).at(9)<<std::endl;

Now you don't need to worry about 'new/delete[]'.

[ Sorry if this was all 'old hat' to you. Maybe it will help some newbie. ]
 
J

Jim Langston

Kaz said:
Suppose it wasn't contiguous. You get one pointer out of the allocator,
which points to uninitialized storage. How would your program find the
other pieces? With what information?


Yes, that's what I was worried about.
I, indeed, am working in a image processing application and was trying
to figure out if a 2D array is better for dynamic allocation or a big
1-dimensional one.

Case 1: 1D case will allocate it in one go. So contguity is there
indeed.


Case 2: 2D will involve memory to be new'd (again, grammatical usage is
casual here. This is *NOT* correct english.) once for the columns and
then itertively with in a 'for' loop so as to cover the rows.
int** array2D = new int*[height];
for (int i=0; i<height; i++)
{
array2D = new int[width];
}

My querry is that whether this array2D is contiguous or not.


Most likely it will not be contiguous as a group.

That is, array2D + height will probably not point to array2D[1][0]. It
could, but probably wouldn't.

So in your specific case it would be better to allocate it in one go and do
the array math.
 
K

Kaz Kylheku

Case 2: 2D will involve memory to be new'd (again, grammatical usage is
casual here. This is *NOT* correct english.)

"newed" will do, patterned after "renewed", as in library books or a
club membership.

once for the columns and
then itertively with in a 'for' loop so as to cover the rows.
int** array2D = new int*[height];
for (int i=0; i<height; i++)
{
array2D = new int[width];
}

My querry is that whether this array2D is contiguous or not.


Your new query is whether this 2D array is contiguous or not. Your
original query doesn't ask anything like this.

In general, each call to the memory allocator is an independent
request, which is satisfied by finding a sufficiently large piece of
memory somewhere.

The memory will quite likely not be contiguous. Even if the rows are
allocated sequentially at the beginning of a large extent of memory,
there will possibly be separated by memory management information.

So far so good.. now which of these 2 cases will be faster to access?
Obviously the one which is contiguous. Hence this question on the
newsgroup.

The non-contiguous case gives you flexibility that doesn't exist in the
contiguous case, because when you make your 2D array one object, the
width of each row is a compile-time constant.

You could dynamically allocate a 2D array with a number of rows and
columns determined at run-time. It would actually be a 2D array
simulated within a 1D array. To access row R and column C, you would
use the expression:

array[width * R + C]

rather than

array[R][C]

The first method performs a multiplication, addition and one memory
access. The second method, assuming that array[] is a vector of
pointers to individually allocated rows, has to displace into the row
vector to retrieve a pointer, and then displace that pointer to fetch
the element. Two memory accesses.

What is faster? The answer to that depends entirely on the compiler,
machine, the size of the array, the access pattern to that array and so
forth.

It may happen that both of these "naive" representations for a 2D array
are inefficient for doing some kinds of operations on very large
arrays, such as high resolution graphical images.

Suppose you are writing an image manipulation program. With this
program, the user will do lots of editing operations that are clustered
in a small part of the image, and in fact he or she may zoom in, so as
to only see a small part of the image.

A naive, "linear" memory allocation strategy will have the effect that
two pixels that are right above one another (same column) will be in
different virtual memory pages. To retrieve a 100x100 square of the
image for some paintbrush operation or whatever will involve accesses
to 100 different pages! And 100x100 is barely larger than a modern
icon; it's tiny. With this representation of the image data, the
program will thrash the page translation cache, and also demonstrate
poor swapping behavior in low memory conditions.

Programs like this, such GIMP, use a "tile based" memory allocation.
The image is carved into small, square tiles, which are individual 2D
arrays that are contiguous. Thus a sub-rectangle of an image occupies
only a small number of pages.

So you see, optimization issues aren't always simple when you have
various different kinds of caches for instructions, data, and virtual
memory information, existing at various different levels.

A data representation scheme that involves burning a few memory and
machine cycles on chasing a few pointers may be faster than a naive one
that just uses address arithmetic, because it exhibits better caching
and paging behavior!

Some programs will exhibit different performance if you just switch
their arrays from row major to column major representation. In some
languages that actually have true multidimensional arrays, choosing
that representation is up to the compiler or language run-time, or
perhaps influenced by optimization hints from the programmer.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,578
Members
45,052
Latest member
LucyCarper

Latest Threads

Top