a question about 1-dimension and 2-dimension array

L

Luuk

Op 9-2-2010 13:42, Dimilar Zhu schreef:
I have 10000 numbers. Now i need to choose a kind of data structure to save them.
of 1-dimension and 2-dimension array, which is faster? does it depend
on factors such as degree of order, correlation, operations?

if yu write a threaded application, than the number of threads will make
the speed.

so if you have a dual-core processor, take a 2-dimensional array


;-)
 
D

Dimilar Zhu

I have 10000 numbers. Now i need to choose a kind of data structure to save them.
of 1-dimension and 2-dimension array, which is faster? does it depend
on factors such as degree of order, correlation, operations?
 
D

dimilar

Luuk said:
Op 9-2-2010 13:42, Dimilar Zhu schreef:

btw, why do you care about speed,
you are living in the FUTURE!

just a question, yesterday my boss asked me. he is a doctor, and do not know
much about detail of computer language. but at that time i do not know how
to answer it. so I said, "it depends....blablabla".

later a guy told me that there were distinct performance result for large
scale of data if taking account of memory cache and memory access.

for 2-dimension array, it is necessary to do much multiplication to obtain the
index of element. compared to the addition of 1-dimension array, it consumes
more time.

but I did not do any experiment to verify it.
 
B

Ben Bacarisse

dimilar said:
just a question, yesterday my boss asked me. he is a doctor, and do not know
much about detail of computer language. but at that time i do not know how
to answer it. so I said, "it depends....blablabla".

Good answer, though you could have said that data structures aren't
fast or slow, it is algorithms that have certain performance
characteristics. In particular, it is the pattern of accesses that
determines what the best data structure is for a particular problem.

If this is something that needs a fuller answer, then post on
comp.programming with an outline of what it is you are trying to do
(at the highest level -- invert a matrix, find a clique in a graph
etc.) because this is not really a C question.
 
R

Richard Tobin

dimilar said:
for 2-dimension array, it is necessary to do much multiplication to obtain the
index of element.

Certainly accessing a 2d array *conceptually* involves multiplication.
But if you can straightforwardly use a 1d array instead, then quite
likely so can the compiler. For example, in

int a[n1][n2];

for(i=0; i<n1; i++)
for(j=0; j<n2; j++)
... do something with a[j] ...;

a reasonable compiler will not calculate i*n2+j from scratch each time
round the inner loop. It can compute i*n2 once each time round the
outer loop, and may not even do that.

-- Richard
 
D

dimilar

dimilar said:
for 2-dimension array, it is necessary to do much multiplication to obtain the
index of element.

Certainly accessing a 2d array *conceptually* involves
multiplication.
But if you can straightforwardly use a 1d array instead,
then quite
likely so can the compiler. For example, in

int a[n1][n2];

for(i=0; i<n1; i++)
for(j=0; j<n2; j++)
... do something with a[j] ...;


for the case you described here, multiplication is not a problem because you are
accessing elements in a certain order. but if we randomly access
many elements from a large 2d array. many multiplication is
inevitable. although I am also not sure it is a bottleneck.

however, I think it is really an algorithms dependent problem.
Thanks for you reply :)
 
M

Mark Bluemel

I have 10000 numbers. Now i need to choose a kind of data structure to save them.
of 1-dimension and 2-dimension array, which is faster? does it depend
on factors such as degree of order, correlation, operations?

As I think others have said, this really isn't that much of a C
question. It's more a general computing question, and even then you
haven't given us enough information to go on.

What do your 10000 numbers represent? How do you expect to populate
the data structure and how do you expect to access it?

If the natural representation of your data is as a two-dimensional
array, I don't see how it would be any advantage for you to flatten it
to one dimension and do the maths to access your target cells.

Equally, if the natural representation is one dimensional, why would
you do anything else?
 
M

Malcolm McLean

I have 10000 numbers. Now i need to choose a kind of data structure to save them.
of 1-dimension and 2-dimension array, which is faster? does it depend
on factors such as degree of order, correlation, operations?

In C a 2-dimensional array is stored as a flat structure continuously
in memory. So the only performance difference is (potnetially) due to
the difference between accessing array[y][x] as opposed to
array[y*width+x]. It's hard to think of circumstances where the first
would be slower, but often the second mght be slower because it is
harder for the compiler to optimise out constants.
However typically the dimensions of a 2d array are either not known at
compile time, or are undesireable to hardcode into low-level
functions. So the 1d method is usually the way to go, in C.
 
S

Stefan Ram

Dimilar Zhu said:
I have 10000 numbers. Now i need to choose a kind of data structure to save them.

If you have 10000 number you already /have/ a data structure
to hold them, otherwise you would not /have/ them.

If you want to /save/ them to a file, you do not need a data
structure but a data language (aka a file format) to
serialize (marshal, pickle, shelve) them to.
of 1-dimension and 2-dimension array, which is faster?

Arrays do not have a speed at all. Only operations
have a speed.
 
P

Phil Carmody

Luuk said:
Op 9-2-2010 13:42, Dimilar Zhu schreef:

if yu write a threaded application, than the number of threads will make
the speed.

so if you have a dual-core processor, take a 2-dimensional array

Was the smiley I snipped supposed to imply "the above's a load of
bollocks, please ignore it"?

The original question is of course malformed for many reasons: data
structures don't have speed, only operations upon data structures
have speed; what does 'save' mean - if you already 'have' the numbers,
then they're already 'saved' wherever you have them; etc. .

The only vaguely sensible answer is 'find the most likely bottlenecks,
and avoid them'. My guess would be that that's entirely dependent on
the memory architecture (caches, etc.), but that's nothing to do with
C, that's hardware configuration.

Phil
 
L

Luuk

Op 10-2-2010 10:45, Phil Carmody schreef:
Was the smiley I snipped supposed to imply "the above's a load of
bollocks, please ignore it"?


Phil

Yeah, it was,
Especially this piece:
"does it depend on factors such as degree of order, correlation,
operations?"

He was only trying to do things on 10000 numbers, and he's asking for
speed. Obviously he needs something completly different, so i'm confused
why this question was asked.
 
P

Phil Carmody

Luuk said:
Op 10-2-2010 10:45, Phil Carmody schreef:

Yeah, it was,

OK, you got me! :-D
Especially this piece:
"does it depend on factors such as degree of order, correlation,
operations?"

Well, were he to be sorting them, the speed might depend on the
degree of order. That's the problem with nutty questions, they
often leave holes large enough such that filling the gaps could
lead the answerer in many radically different directions.
He was only trying to do things on 10000 numbers, and he's asking for
speed. Obviously he needs something completly different, so i'm confused
why this question was asked.

Well, I suspect that pushes him out of the L1 cache of most
processors that have such a thing. In which case you'd definitely
want to split the job into two or more contiguous large chunks
rather than interleaving accesses, if that's possible. 10000 is
still considered a reasonably large FFT, for example, in particular
if the numbers are complex doubles, and non-aligned striding is
often used to improve cache usage.

As I say - enough holes to fill the gaps in some quite interesting
ways.

Phil
 
B

blmblm

Dimilar said:
I have 10000 numbers. Now i need to choose a kind of data structure to save them.
of 1-dimension and 2-dimension array, which is faster? does it depend
on factors such as degree of order, correlation, operations?

it depends. :)

what matter is how you access the data, for example how FORTRAN and C
store data in a matrix (i.e. 2-dimensional array) is different... in
FORTRAN it's optimized for matrix computations, while in C the matrix
components is stored column wise, that is, given

int matrix[j];

then has matrix[0][0] is at first memory location, and at the second
location we have matrix[0][1]... hence if you don't access the matrix
components that way, you risk triggering cache misses. To avoid cache
misses, you need locality of data during access, and it's a good idea to
avoid unaligned data as well.


Nitpick / terminology quibble:

Isn't this way of storing data in memory usually referred to as
"row-major order" (and contrasted with Fortran's "column-major order",
in which columns rather than rows are stored contiguously)?

Agreed, though, that the order in which one accesses elements can
make a substantial difference in performance.
Consider this inner loop

for (k=0; k < max; k++)
{
sum += a[k] * b[k][j];
}

here matrix a[][] is accessed column wise (same as C store the data),
but matrix b[][] is *not*... and you may risk stalling the cache pipeline.

Which may not matter a thing when doing disk or network I/O, as your CPU
then usually is idle most of the time anyway.... :)
 
R

Richard Tobin

Isn't this way of storing data in memory usually referred to as
"row-major order" (and contrasted with Fortran's "column-major order",
in which columns rather than rows are stored contiguously)?

I always have trouble remembering which of these terms means
which. I find it clearer to say that C's arrays are stored in
lexicographic order, that is, with the subscripts varying as
they do in an (English) index.

-- Richard
 
N

Nobody

I always have trouble remembering which of these terms means
which.

They don't mean anything until you designate one of the indices as being
the "row" and the other as the "column".

The mathematical convention is that a[i,j] refers to row i, column j. If
you translate that as a[j] in C, the result is that the array is in
row-major order (i.e. incrementing the row has a greater effect upon the
offset than incrementing the column).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top