Efficiency

Lars · Sep 22, 2008

I need to multiply to large matrices (of type double), where one is
"dense" (mostly non-zero entries) and the other is "sparse" (mostly
zero entries).

Will it be more efficient to check for non-zero entries, or does it
matter at all?

If B is "sparse":

for (i = 0; i <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = +; k <= nc-1; k++)
C[j] = C[j] + A[k]*B[k][j];
}
}
}

or

for (i = 1; 0 <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = 0; k <= nc-1; k++)
if (B[k][j] != 0)
C[j] = C[j] + A[k]*B[k][j];
}
}
}

?

Flash Gordon · Sep 22, 2008

Lars wrote, On 22/09/08 07:45:

I need to multiply to large matrices (of type double), where one is
"dense" (mostly non-zero entries) and the other is "sparse" (mostly
zero entries).

Will it be more efficient to check for non-zero entries, or does it
matter at all?

It depends on the processor and that is not the only thing that can
affect performance. I've used processors where multiply is a single
cycle (actually, multiply and accumulate was a single cycle) so doing a
test on that processor to avoid a single multiply would definitely make
it worse.

If B is "sparse":

for (i = 0; i <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = +; k <= nc-1; k++)
C[j] = C[j] + A[k]*B[k][j];
}
}
}

or

for (i = 1; 0 <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = 0; k <= nc-1; k++)
if (B[k][j] != 0)
C[j] = C[j] + A[k]*B[k][j];

I suggest looking to see if you can't re-arrange it so that the
condition is *not* in the inner loop and thus executed loads of times
but in an outer loop so it prevents execution of the loop.

Also look at whether you can store your sparse matrix more efficiently
and in such a way as to improve the execution.

For real efficiency you need to look at all sorts of processor specific
stuff such as what goes on with the cache. Those questions, however,
will be far too detailed for here and need to be asked on a group
dedicated to your processor.

Bartc · Sep 22, 2008

Lars said:
I need to multiply to large matrices (of type double), where one is
"dense" (mostly non-zero entries) and the other is "sparse" (mostly
zero entries).

Will it be more efficient to check for non-zero entries, or does it
matter at all?

If B is "sparse":

for (i = 0; i <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = +; k <= nc-1; k++)
C[j] = C[j] + A[k]*B[k][j];
}
}
}

You clearly haven't compiled the above...

Have you thought of simply... testing?

On my machine your second version testing for B[][] zero, was 40-50% faster
using gcc compiler, based on multiplying 100x200 matrices, containing /all/
zeros, a few thousand times.

Using "< nr" instead of "<= nr-1", and same for nc, also made a small
difference.

I also tried a char[][] map of the elements in B (containing 0 where B[][]
was 0.0), and this further doubled the speed (again containing all zeros) by
testing map[k][j] instead of B[k][j]. But this is only viable to build if
you will be multiplying by the same B any times. And again this was for my
machine; your machine may be totally different hence the need for your own
tests.

James Kuyper · Sep 22, 2008

Lars said:
I need to multiply to large matrices (of type double), where one is
"dense" (mostly non-zero entries) and the other is "sparse" (mostly
zero entries).

Will it be more efficient to check for non-zero entries, or does it
matter at all?

If B is "sparse":

for (i = 0; i <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = +; k <= nc-1; k++)
C[j] = C[j] + A[k]*B[k][j];
}
}
}

or

for (i = 1; 0 <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = 0; k <= nc-1; k++)
if (B[k][j] != 0)
C[j] = C[j] + A[k]*B[k][j];
}
}
}

The best way to find out is to try it, but I doubt that you'll see much
difference, and the answer to your question might be different on
different machines.

You only get significant improvement in efficiency for a sparse matrix
if you store (or at least, access) it as a sparse matrix. There are many
ways of doing so, depending upon the structure of the matrix. For
instance, a tri-diagonal matrix is a matrix where only three diagonals
of the matrix have non-zero values. Such matrices come up frequently in
certain kinds of applications. An NxN tri-diagonal matrix has only 3xN-2
non-zero elements, and only the non-zero elements need to be stored;
with negligible inefficiency, they can be stored in an Nx3 (or 3xN)
matrix, with one column (or row) for each diagonal of the full array.

Another way to store a sparse matrix is to just keep a list of the
non-zero values in the matrix, along with the row and column numbers
where that value is located.

For each way of storing a sparse matrix, there is a different way of
writing the matrix multiplication process so that it only bothers
performing those calculations that involve the non-zero elements of the
sparse matrix - not by testing for them, as your code does, but by
keeping track of where they are. That can save you a lot of processing
time if the matrix is sufficiently sparse. It can also waste a lot of
processing time if the matrix is only moderately sparse, because the
straightforward way of calculating the product of non-sparse matrices
can be implemented very efficiently, particularly on machines that have
any kind of vectorization capability.

Barry Schwarz · Sep 23, 2008

I need to multiply to large matrices (of type double), where one is
"dense" (mostly non-zero entries) and the other is "sparse" (mostly
zero entries).

Will it be more efficient to check for non-zero entries, or does it
matter at all?

If B is "sparse":

for (i = 0; i <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = +; k <= nc-1; k++)

I assume the k=+ is just a finger check.

If efficiency is that critical, why are you using <=nc-1 when <nc will
do the same without the unnecessary subtraction?

C[j] = C[j] + A[k]*B[k][j];
}
}
}

or

for (i = 1; 0 <= nr-1; i++) {
for (j = 0; j <= nc-1; j++) {
C[j] = 0.0;
for (k = 0; k <= nc-1; k++)
if (B[k][j] != 0)
C[j] = C[j] + A[k]*B[k][j];
}
}
}

?

Click to expand...

Fibonacci	0	May 13, 2023
Efficiency	0	Sep 19, 2008
Need help with this Python code.	2	Jun 13, 2023
Could you fix my code please? (i get no output after inputing the number)	1	Oct 16, 2023
How to multiply two matrices of size in using inline assembly in C++	2	Mar 3, 2024
Java matrix problem	3	Sep 10, 2023
Filter sober in c++ don't pass test	0	Dec 2, 2023
Need Help: Program to Accept 2 Matrices and Show their Sum	0	Aug 21, 2022

Efficiency

Lars

Flash Gordon

Bartc

James Kuyper

Barry Schwarz

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads