normalising a double[][]

Discussion in 'Java' started by jimgardener, Jul 16, 2008.

  1. jimgardener

    jimgardener Guest

    hi
    i wrote the following code to normalise an double[][]..but it takes
    too much time when i try arrays with about 10,000 number of columns (i
    need this for an image processing app).can someone tell me how this
    code can be improved?
    thanks
    Jim

    public void normaliseArray(double[][] temp){
    double[] mvals=new double[temp.length];
    for(int i=0;i<temp.length;i++){
    mvals=max(temp);
    }

    for(int i=0;i<temp.length;i++){
    for(int j=0;j<temp[0].length;j++){
    temp[j]/=mvals;
    }
    }
    }


    public static void main(String[] args){
    double[][] myarray=new double[][]{
    {1.,3.,5.,6.,8.},
    {6.,8.,9.,2.,4.},
    {4.,10.,2.,1.,5.},
    {7.,1.,8.,2.,9.}
    };

    new MyClass().normaliseArray(myarray);

    /* result should be
    0.1250 0.3750 0.6250 0.7500 1.0000
    0.6667 0.8889 1.0000 0.2222 0.4444
    0.4000 1.0000 0.2000 0.1000 0.5000
    0.7778 0.1111 0.8889 0.2222 1.0000
    */

    }
    jimgardener, Jul 16, 2008
    #1
    1. Advertising

  2. jimgardener

    jimgardener Guest

    also two methods added

    private double max(double d1,double d2){
    return d1 > d2? d1:d2;
    }
    private double max(double[] arr){
    double m=arr[0];
    for(int i=0;i<arr.length;i++){
    m=max(m,arr);
    }
    return m;
    }

    jim
    jimgardener, Jul 16, 2008
    #2
    1. Advertising

  3. jimgardener

    Philipp Guest

    jimgardener wrote:
    > hi
    > i wrote the following code to normalise an double[][]..but it takes
    > too much time when i try arrays with about 10,000 number of columns (i
    > need this for an image processing app).can someone tell me how this
    > code can be improved?
    > thanks
    > Jim
    >
    > public void normaliseArray(double[][] temp){
    > double[] mvals=new double[temp.length];
    > for(int i=0;i<temp.length;i++){
    > mvals=max(temp);
    > }
    >
    > for(int i=0;i<temp.length;i++){
    > for(int j=0;j<temp[0].length;j++){
    > temp[j]/=mvals;
    > }
    > }
    > }
    >
    >
    > public static void main(String[] args){
    > double[][] myarray=new double[][]{
    > {1.,3.,5.,6.,8.},
    > {6.,8.,9.,2.,4.},
    > {4.,10.,2.,1.,5.},
    > {7.,1.,8.,2.,9.}
    > };
    >
    > new MyClass().normaliseArray(myarray);
    >
    > /* result should be
    > 0.1250 0.3750 0.6250 0.7500 1.0000
    > 0.6667 0.8889 1.0000 0.2222 0.4444
    > 0.4000 1.0000 0.2000 0.1000 0.5000
    > 0.7778 0.1111 0.8889 0.2222 1.0000
    > */
    >
    > }


    If your application is also filling the matrix (which it presumably
    does, if it is not hard coded), you can keep track of the the maxima
    for each line in a separate array while you are filling it. Thus you
    do not need to find the maximum anymore. (the usual memory/speed
    tradeoff)

    If you only use parts of the array and not all of it, you could delay
    calculation of the normalized value to when the value is really
    requested.

    Just my 2cents. Phil
    Philipp, Jul 16, 2008
    #3
  4. jimgardener

    Tom Anderson Guest

    On Wed, 16 Jul 2008, rossum wrote:

    > On Wed, 16 Jul 2008 00:23:39 -0700 (PDT), jimgardener
    > <> wrote:
    >
    >> i wrote the following code to normalise an double[][]..but it takes too
    >> much time when i try arrays with about 10,000 number of columns (i need
    >> this for an image processing app).can someone tell me how this code can
    >> be improved?

    >
    > The biggest speedup can usually be gained by using a more efficient
    > algorithm. Have you searched for algorithms available for normalizing
    > arrays? How do different open source projects approach this problem?


    I struggle to see how it could be asymptotically any faster; it's going to
    be at least O(n) no matter what you do, surely?

    > For very large arrays you might also want to consider whether your
    > computer organises its arrays in row order or column order.


    This is java - arrays are one-dimensional, so this issue is moot. What
    appear to be two-dimensional arrays are in fact arrays of arrays.

    Note that this means that each of the inner arrays could be in completely
    different places in memory. Thus, you really, really want to access them
    row-by-row. If you want to have all the elements of the matrix in one
    place, you have to fake up a two-dimensional array by doing the index
    calculations yourself:

    public final class Matrix {
    private double[] elements ;
    public final int width ;
    public final int height ;

    public Matrix(int width, int height) {
    elements = new double[width * height] ;
    this.width = width ;
    this.height = height ;
    }
    public double get(int x, int y) {
    return elements[index(x, y)] ;
    }
    public void set(int x, int y, double d) {
    elements[index(x, y)] = d ;
    }
    private int index(int x, int y) {
    // delete these two lines if you want to play fast and loose
    if ((x < 0) || (x >= width)) throw new ArrayIndexOutOfBoundsException(x) ;
    if ((y < 0) || (y >= height)) throw new ArrayIndexOutOfBoundsException(y) ;
    return x + (width * y) ;
    }
    }

    Pretty easy, and just as fast as a language-level two-dimensional array.

    > Arranging you code to maximise cache hits can speed up array accesses.


    Very good advice. For small rows, using a packed matrix, as above, will
    help with this. And whether the rows are large or small, taking Patricia's
    advice and doing ranging and normalising on a row in one pass is also
    good.

    >> public void normaliseArray(double[][] temp){
    >> double[] mvals=new double[temp.length];
    >> for(int i=0;i<temp.length;i++){

    >
    > You may obtain some speedup by avoiding calls to temp.length, you
    > might want to time an alternative such as:


    Any compiler worth its salt will hoist the .length.

    >> mvals=max(temp);
    >> }
    >>
    >> for(int i=0;i<temp.length;i++){
    >> for(int j=0;j<temp[0].length;j++){

    >
    > The same idea might help you here for both loops.


    Ditto.

    > Also note that your first loop here is identical to your loop above.
    > You can place the contents of that loop inside the outer loop, before
    > the inner loop.


    Yes.

    >> temp[j]/=mvals;

    > You need to be *very* sure that mvals is not zero here. Division
    > by zero will have a deleterious effect on the results of your
    > calculation.


    The only way it could be zero is if every element of the row was zero, or
    less than zero. If the OP is confident that the input is never negative,
    you don't need worry about this. I think the normalisation of a row of
    zeroes is ill-defined anyway, so getting a bunch of NaNs in that situation
    is fair!

    tom

    --
    If a scientist were to cut his ear off, no one would take it as evidence
    of heightened sensibility -- Peter Medawar
    Tom Anderson, Jul 16, 2008
    #4
  5. jimgardener

    Neil Coffey Guest

    jimgardener wrote:

    > for(int i=0;i<temp.length;i++){
    > for(int j=0;j<temp[0].length;j++){
    > temp[j]/=mvals;
    > }
    > }


    In addition to what other people have mentioned, try dereferencing
    the first array index in the outer loop. In other words:

    for(int i=0;i<temp.length;i++) {
    double[] t = temp;
    for(int j=0;j<temp[0].length;j++){
    t[j]/=mvals;
    }
    }

    There might be a slight gain in counting down rather than up (there
    are special instructions for 'comparing with zero'), and this makes
    it easy to avoid the multiple calls to ARRAYLENGTH:

    for (int i = temp.length-1; i >= 0; i--) {
    ...
    }

    You could also try unrolling the inner loop (though the JIT compiler
    may do this anyway) if you know that the number of colums is divisible
    by a certain number.

    Oh, you should possibly verify that the VM *is* actually running
    your code as native code (by running in profiler mode and seeing if
    time in your method is spent in the 'compiled' or 'interpreted'
    section).

    Neil
    Neil Coffey, Jul 18, 2008
    #5
  6. jimgardener

    Arne Vajhøj Guest

    jimgardener wrote:
    > i wrote the following code to normalise an double[][]..but it takes
    > too much time when i try arrays with about 10,000 number of columns (i
    > need this for an image processing app).can someone tell me how this
    > code can be improved?


    > public void normaliseArray(double[][] temp){
    > double[] mvals=new double[temp.length];
    > for(int i=0;i<temp.length;i++){
    > mvals=max(temp);
    > }
    >
    > for(int i=0;i<temp.length;i++){
    > for(int j=0;j<temp[0].length;j++){
    > temp[j]/=mvals;
    > }
    > }
    > }


    You can make a few tweaks. Math.max and multiply with
    1/mvals calculated once seems as suggested seems
    worth trying.

    But if you have a new computer, then you can also try
    doing it in 2 threads (dual core) or 4 threads (quad core).

    It should scale well so in theory it should run 4 times
    as fast on a quad core with 4 threads.

    Arne
    Arne Vajhøj, Jul 19, 2008
    #6
  7. jimgardener

    Arne Vajhøj Guest

    rossum wrote:
    > On Wed, 16 Jul 2008 00:23:39 -0700 (PDT), jimgardener
    > <> wrote:
    >> i wrote the following code to normalise an double[][]..but it takes
    >> too much time when i try arrays with about 10,000 number of columns (i
    >> need this for an image processing app).can someone tell me how this
    >> code can be improved?

    > The biggest speedup can usually be gained by using a more efficient
    > algorithm. Have you searched for algorithms available for normalizing
    > arrays?


    I find it very difficult to see how the described logic can be done
    without the shown loops for a general case.

    Arne
    Arne Vajhøj, Jul 19, 2008
    #7
  8. jimgardener

    Arne Vajhøj Guest

    rossum wrote:
    > On Fri, 18 Jul 2008 20:42:01 -0400, Arne Vajhøj <>
    > wrote:
    >> rossum wrote:
    >>> On Wed, 16 Jul 2008 00:23:39 -0700 (PDT), jimgardener
    >>> <> wrote:
    >>>> i wrote the following code to normalise an double[][]..but it takes
    >>>> too much time when i try arrays with about 10,000 number of columns (i
    >>>> need this for an image processing app).can someone tell me how this
    >>>> code can be improved?
    >>> The biggest speedup can usually be gained by using a more efficient
    >>> algorithm. Have you searched for algorithms available for normalizing
    >>> arrays?

    >> I find it very difficult to see how the described logic can be done
    >> without the shown loops for a general case.

    > I also find it difficult to imagine such an algorithm, but that does
    > not mean that one does not exist.


    It is impossible to:
    - find the maximum value in an array without looking at all elements
    in the array
    - change all elements in an array without accessing all elements
    in the array
    - use a calculated value before it is calculated

    This is not just lack of imagination - this is close to proof.

    Arne
    Arne Vajhøj, Jul 19, 2008
    #8
  9. jimgardener

    Daniel Pitts Guest

    rossum wrote:
    > On Wed, 16 Jul 2008 08:51:31 -0400, Lew <com.lewscanon@lew> wrote:
    >
    >> rossum wrote:
    >>> Why are you using a function call here? What is wrong with:
    >>> m = (m > arr) ? m : arr
    >>> function calls can take more time than an inline expression.

    >> Or not. Function calls get inlined all the time.

    > Agreed, but not always the first few times. For something as simple
    > as this (i.e. assuming no NaNs or infinities) I would be inclined to
    > inline it explicitly, especially as the OP already has an issue with
    > timing.
    >
    > rossum
    >

    Actually, if I was truly optimizing this (without using a profiler, tsk,
    tsk...)

    if (arr > m) m = arr;
    This ensures that the only assignments are meaningful ones.

    As for the normalization routines:

    public static void normalize(double[][] array) {
    for (double[] line: array) {
    final double maxValue = max(line);
    for (int i = 0; i < line.length; ++i) {
    line /= maxValue;
    }
    }
    }

    That removes an unnecessary "new double[]" to story the max values at least.

    If this still isn't fast enough, use a profiler, or see if you can
    achieve your ultimate goal without normalizing so much data.

    If you have a multi-processor system, you can get a linear speed up by
    splitting the work into different threads.

    Hope this helps,
    Daniel.

    --
    Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
    Daniel Pitts, Jul 20, 2008
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Web learner

    from List <double> to double[]

    Web learner, Apr 25, 2006, in forum: ASP .Net
    Replies:
    3
    Views:
    462
  2. sb
    Replies:
    4
    Views:
    297
    Alberto Barbati
    Feb 19, 2004
  3. Jacek Dziedzic
    Replies:
    5
    Views:
    370
    Old Wolf
    Apr 8, 2004
  4. ferran
    Replies:
    9
    Views:
    3,006
    Kevin Goodsell
    Apr 12, 2004
  5. Sydex
    Replies:
    12
    Views:
    6,452
    Victor Bazarov
    Feb 17, 2005
Loading...

Share This Page