# normalising a double[][]

Discussion in 'Java' started by jimgardener, Jul 16, 2008.

1. ### jimgardenerGuest

hi
i wrote the following code to normalise an double[][]..but it takes
too much time when i try arrays with about 10,000 number of columns (i
need this for an image processing app).can someone tell me how this
code can be improved?
thanks
Jim

public void normaliseArray(double[][] temp){
double[] mvals=new double[temp.length];
for(int i=0;i<temp.length;i++){
mvals=max(temp);
}

for(int i=0;i<temp.length;i++){
for(int j=0;j<temp[0].length;j++){
temp[j]/=mvals;
}
}
}

public static void main(String[] args){
double[][] myarray=new double[][]{
{1.,3.,5.,6.,8.},
{6.,8.,9.,2.,4.},
{4.,10.,2.,1.,5.},
{7.,1.,8.,2.,9.}
};

new MyClass().normaliseArray(myarray);

/* result should be
0.1250 0.3750 0.6250 0.7500 1.0000
0.6667 0.8889 1.0000 0.2222 0.4444
0.4000 1.0000 0.2000 0.1000 0.5000
0.7778 0.1111 0.8889 0.2222 1.0000
*/

}

jimgardener, Jul 16, 2008

2. ### jimgardenerGuest

private double max(double d1,double d2){
return d1 > d2? d1:d2;
}
private double max(double[] arr){
double m=arr[0];
for(int i=0;i<arr.length;i++){
m=max(m,arr);
}
return m;
}

jim

jimgardener, Jul 16, 2008

3. ### PhilippGuest

jimgardener wrote:
> hi
> i wrote the following code to normalise an double[][]..but it takes
> too much time when i try arrays with about 10,000 number of columns (i
> need this for an image processing app).can someone tell me how this
> code can be improved?
> thanks
> Jim
>
> public void normaliseArray(double[][] temp){
> double[] mvals=new double[temp.length];
> for(int i=0;i<temp.length;i++){
> mvals=max(temp);
> }
>
> for(int i=0;i<temp.length;i++){
> for(int j=0;j<temp[0].length;j++){
> temp[j]/=mvals;
> }
> }
> }
>
>
> public static void main(String[] args){
> double[][] myarray=new double[][]{
> {1.,3.,5.,6.,8.},
> {6.,8.,9.,2.,4.},
> {4.,10.,2.,1.,5.},
> {7.,1.,8.,2.,9.}
> };
>
> new MyClass().normaliseArray(myarray);
>
> /* result should be
> 0.1250 0.3750 0.6250 0.7500 1.0000
> 0.6667 0.8889 1.0000 0.2222 0.4444
> 0.4000 1.0000 0.2000 0.1000 0.5000
> 0.7778 0.1111 0.8889 0.2222 1.0000
> */
>
> }

If your application is also filling the matrix (which it presumably
does, if it is not hard coded), you can keep track of the the maxima
for each line in a separate array while you are filling it. Thus you
do not need to find the maximum anymore. (the usual memory/speed

If you only use parts of the array and not all of it, you could delay
calculation of the normalized value to when the value is really
requested.

Just my 2cents. Phil

Philipp, Jul 16, 2008
4. ### Tom AndersonGuest

On Wed, 16 Jul 2008, rossum wrote:

> On Wed, 16 Jul 2008 00:23:39 -0700 (PDT), jimgardener
> <> wrote:
>
>> i wrote the following code to normalise an double[][]..but it takes too
>> much time when i try arrays with about 10,000 number of columns (i need
>> this for an image processing app).can someone tell me how this code can
>> be improved?

>
> The biggest speedup can usually be gained by using a more efficient
> algorithm. Have you searched for algorithms available for normalizing
> arrays? How do different open source projects approach this problem?

I struggle to see how it could be asymptotically any faster; it's going to
be at least O(n) no matter what you do, surely?

> For very large arrays you might also want to consider whether your
> computer organises its arrays in row order or column order.

This is java - arrays are one-dimensional, so this issue is moot. What
appear to be two-dimensional arrays are in fact arrays of arrays.

Note that this means that each of the inner arrays could be in completely
different places in memory. Thus, you really, really want to access them
row-by-row. If you want to have all the elements of the matrix in one
place, you have to fake up a two-dimensional array by doing the index
calculations yourself:

public final class Matrix {
private double[] elements ;
public final int width ;
public final int height ;

public Matrix(int width, int height) {
elements = new double[width * height] ;
this.width = width ;
this.height = height ;
}
public double get(int x, int y) {
return elements[index(x, y)] ;
}
public void set(int x, int y, double d) {
elements[index(x, y)] = d ;
}
private int index(int x, int y) {
// delete these two lines if you want to play fast and loose
if ((x < 0) || (x >= width)) throw new ArrayIndexOutOfBoundsException(x) ;
if ((y < 0) || (y >= height)) throw new ArrayIndexOutOfBoundsException(y) ;
return x + (width * y) ;
}
}

Pretty easy, and just as fast as a language-level two-dimensional array.

> Arranging you code to maximise cache hits can speed up array accesses.

Very good advice. For small rows, using a packed matrix, as above, will
help with this. And whether the rows are large or small, taking Patricia's
advice and doing ranging and normalising on a row in one pass is also
good.

>> public void normaliseArray(double[][] temp){
>> double[] mvals=new double[temp.length];
>> for(int i=0;i<temp.length;i++){

>
> You may obtain some speedup by avoiding calls to temp.length, you
> might want to time an alternative such as:

Any compiler worth its salt will hoist the .length.

>> mvals=max(temp);
>> }
>>
>> for(int i=0;i<temp.length;i++){
>> for(int j=0;j<temp[0].length;j++){

>

Ditto.

> Also note that your first loop here is identical to your loop above.
> You can place the contents of that loop inside the outer loop, before
> the inner loop.

Yes.

>> temp[j]/=mvals;

> You need to be *very* sure that mvals is not zero here. Division
> by zero will have a deleterious effect on the results of your
> calculation.

The only way it could be zero is if every element of the row was zero, or
less than zero. If the OP is confident that the input is never negative,
zeroes is ill-defined anyway, so getting a bunch of NaNs in that situation
is fair!

tom

--
If a scientist were to cut his ear off, no one would take it as evidence
of heightened sensibility -- Peter Medawar

Tom Anderson, Jul 16, 2008
5. ### Neil CoffeyGuest

jimgardener wrote:

> for(int i=0;i<temp.length;i++){
> for(int j=0;j<temp[0].length;j++){
> temp[j]/=mvals;
> }
> }

In addition to what other people have mentioned, try dereferencing
the first array index in the outer loop. In other words:

for(int i=0;i<temp.length;i++) {
double[] t = temp;
for(int j=0;j<temp[0].length;j++){
t[j]/=mvals;
}
}

There might be a slight gain in counting down rather than up (there
are special instructions for 'comparing with zero'), and this makes
it easy to avoid the multiple calls to ARRAYLENGTH:

for (int i = temp.length-1; i >= 0; i--) {
...
}

You could also try unrolling the inner loop (though the JIT compiler
may do this anyway) if you know that the number of colums is divisible
by a certain number.

Oh, you should possibly verify that the VM *is* actually running
your code as native code (by running in profiler mode and seeing if
time in your method is spent in the 'compiled' or 'interpreted'
section).

Neil

Neil Coffey, Jul 18, 2008
6. ### Arne VajhøjGuest

jimgardener wrote:
> i wrote the following code to normalise an double[][]..but it takes
> too much time when i try arrays with about 10,000 number of columns (i
> need this for an image processing app).can someone tell me how this
> code can be improved?

> public void normaliseArray(double[][] temp){
> double[] mvals=new double[temp.length];
> for(int i=0;i<temp.length;i++){
> mvals=max(temp);
> }
>
> for(int i=0;i<temp.length;i++){
> for(int j=0;j<temp[0].length;j++){
> temp[j]/=mvals;
> }
> }
> }

You can make a few tweaks. Math.max and multiply with
1/mvals calculated once seems as suggested seems
worth trying.

But if you have a new computer, then you can also try

It should scale well so in theory it should run 4 times

Arne

Arne Vajhøj, Jul 19, 2008
7. ### Arne VajhøjGuest

rossum wrote:
> On Wed, 16 Jul 2008 00:23:39 -0700 (PDT), jimgardener
> <> wrote:
>> i wrote the following code to normalise an double[][]..but it takes
>> too much time when i try arrays with about 10,000 number of columns (i
>> need this for an image processing app).can someone tell me how this
>> code can be improved?

> The biggest speedup can usually be gained by using a more efficient
> algorithm. Have you searched for algorithms available for normalizing
> arrays?

I find it very difficult to see how the described logic can be done
without the shown loops for a general case.

Arne

Arne Vajhøj, Jul 19, 2008
8. ### Arne VajhøjGuest

rossum wrote:
> On Fri, 18 Jul 2008 20:42:01 -0400, Arne Vajhøj <>
> wrote:
>> rossum wrote:
>>> On Wed, 16 Jul 2008 00:23:39 -0700 (PDT), jimgardener
>>> <> wrote:
>>>> i wrote the following code to normalise an double[][]..but it takes
>>>> too much time when i try arrays with about 10,000 number of columns (i
>>>> need this for an image processing app).can someone tell me how this
>>>> code can be improved?
>>> The biggest speedup can usually be gained by using a more efficient
>>> algorithm. Have you searched for algorithms available for normalizing
>>> arrays?

>> I find it very difficult to see how the described logic can be done
>> without the shown loops for a general case.

> I also find it difficult to imagine such an algorithm, but that does
> not mean that one does not exist.

It is impossible to:
- find the maximum value in an array without looking at all elements
in the array
- change all elements in an array without accessing all elements
in the array
- use a calculated value before it is calculated

This is not just lack of imagination - this is close to proof.

Arne

Arne Vajhøj, Jul 19, 2008
9. ### Daniel PittsGuest

rossum wrote:
> On Wed, 16 Jul 2008 08:51:31 -0400, Lew <com.lewscanon@lew> wrote:
>
>> rossum wrote:
>>> Why are you using a function call here? What is wrong with:
>>> m = (m > arr) ? m : arr
>>> function calls can take more time than an inline expression.

>> Or not. Function calls get inlined all the time.

> Agreed, but not always the first few times. For something as simple
> as this (i.e. assuming no NaNs or infinities) I would be inclined to
> inline it explicitly, especially as the OP already has an issue with
> timing.
>
> rossum
>

Actually, if I was truly optimizing this (without using a profiler, tsk,
tsk...)

if (arr > m) m = arr;
This ensures that the only assignments are meaningful ones.

As for the normalization routines:

public static void normalize(double[][] array) {
for (double[] line: array) {
final double maxValue = max(line);
for (int i = 0; i < line.length; ++i) {
line /= maxValue;
}
}
}

That removes an unnecessary "new double[]" to story the max values at least.

If this still isn't fast enough, use a profiler, or see if you can
achieve your ultimate goal without normalizing so much data.

If you have a multi-processor system, you can get a linear speed up by
splitting the work into different threads.

Hope this helps,
Daniel.

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

Daniel Pitts, Jul 20, 2008