JNI to optimize a Java application using native mathematical libraries

Discussion in 'Java' started by dimitri.ognibene@gmail.com, Apr 2, 2006.

  1. Guest

    Hi,
    I've built an application (neural networks and simulation, with a lot of
    visualization) in Java, and now my computational needs are exceeding the
    power of my system.
    I usually use the observer pattern to listen for changes to the data
    model; the visualization only redraws when needed and isn't
    synchronized with the numerical computation.
    However, I read matrix data inside my paint methods.

    Can I use JNI and an optimized native library to implement the
    numerical computation?
    If I do, will my visualization code be preserved and reused?
    Will my system perform better?

    I think much of the answer depends on the granularity of the JNI
    interface and the data exchanged between the two systems, but I don't
    have the experience to give you all the relevant data without a little
    preliminary help.

    So this only describes the problem in general. I'll give whatever
    details you tell me are needed.

    Thanks
    , Apr 2, 2006
    #1

  2. Thanks Gordon,
    I plan to use the Intel Math Kernel Library to re-implement the neural
    network code. I will need to extract all the data (connection weights and
    activation values) every time the GUI is refreshed.
    So you say:
    >1) make as few, as "large" calls to native methods as possible.
    >2) try to reduce the amount of data you have to copy back and forth.

    I plan to have only calls to train and evaluate,
    but I want the native library to keep the results, because they will
    be used over and over again in the calculation, while Java only needs
    to send new data and read data to display results.
    >pass only primitives (or arrays of primitives) to your native
    >methods in order to reduce their dependency on invoking Java methods
    >or JNI accessors to get the job done.

    I plan to pass only multi-dimensional double arrays, so I have no
    problem here.

    I still wonder whether in this situation I would gain more, and more
    easily, by rewriting everything in C++, by using something like IBM's
    Ninja classes to implement the calculation, or by finding an existing
    interface (JNI or NIO) to optimized mathematical libraries.
    Any further suggestions?
    Thanks, Dimitri
    Dimitri Ognibene, Apr 3, 2006
    #2

  3. On 2 Apr 2006 14:08:59 -0700, wrote:
    > Will my system perform better?
    >
    > I think much of the answer depends on the granularity of the JNI
    > interface and the data exchanged between the two systems...


    You have the right idea. JNI is not a guarantee of better performance,
    but you can increase its chances of improving your performance by
    following a few simple rules:

    - make as few, as "large" calls to native methods as possible.

    - try to reduce the amount of data you have to copy back and forth.

    - pass only primitives (or arrays of primitives) to your native
    methods in order to reduce their dependency on invoking Java methods
    or JNI accessors to get the job done.

    - If you need to return results from a method, use the return value
    like it was intended, i.e. avoid writing void methods that pass
    results back through object reference arguments or by updating
    fields in the calling object.

    Note that CPU bound calculations aren't necessarily much faster in
    native code than in Java, so the potential gain is easily lost if you
    aren't careful.
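
    The rules above could translate into a native interface along these
    lines. This is only a sketch: the class name NativeNet, its methods and
    the library name "nativenet" are hypothetical, and the C/C++ side is not
    shown. State stays on the native side behind an opaque handle, each call
    does a large amount of work, only primitives and primitive arrays cross
    the boundary, and results come back through return values.

```java
/**
 * Hypothetical coarse-grained JNI facade; the native implementation is not shown.
 * System.loadLibrary("nativenet") (assumed library name) must be called before use.
 */
final class NativeNet {
    private NativeNet() {}

    /** Builds the network on the native side and returns an opaque handle to it. */
    static native long create(int inputs, int hidden, int outputs);

    /** Runs many training epochs in a single call on flat arrays; returns the final error. */
    static native double train(long handle, double[] samples, double[] targets, int epochs);

    /** Returns a copy of all connection weights, for example for a GUI refresh. */
    static native double[] readWeights(long handle);

    /** Frees the state held on the native side. */
    static native void destroy(long handle);
}
```

    With an interface of this shape the Java side crosses the JNI boundary a
    handful of times per training run rather than once per element.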

    /gordon

    --
    [ do not email me copies of your followups ]
    g o r d o n + n e w s @ b a l d e r 1 3 . s e
    Gordon Beaton, Apr 3, 2006
    #3
  4. "Dimitri Ognibene" <> wrote in message
    news:...
    > Thanks Gordon,
    > I plan to use the Intel Math Kernel Library to re-implement the neural
    > network code. I will need to extract all the data (connection weights and
    > activation values) every time the GUI is refreshed.
    > So you say:
    >>1) make as few, as "large" calls to native methods as possible.
    >>2) try to reduce the amount of data you have to copy back and forth.
    >
    > I plan to have only calls to train and evaluate,
    > but I want the native library to keep the results, because they will
    > be used over and over again in the calculation, while Java only needs
    > to send new data and read data to display results.
    >>pass only primitives (or arrays of primitives) to your native
    >>methods in order to reduce their dependency on invoking Java methods
    >>or JNI accessors to get the job done.
    >
    > I plan to pass only multi-dimensional double arrays, so I have no
    > problem here.
    >
    > I still wonder whether in this situation I would gain more, and more
    > easily, by rewriting everything in C++, by using something like IBM's
    > Ninja classes to implement the calculation, or by finding an existing
    > interface (JNI or NIO) to optimized mathematical libraries.
    > Any further suggestions?
    > Thanks, Dimitri
    >


    Err, I sincerely doubt multidimensional arrays, mathematics, neural net
    computation and such are considerably faster in C++ than in Java. Have
    you actually profiled your code and checked where the hotspots are? Have you
    tried running your code in the server VM? Your original post suggests you
    overdesigned parts of your code (i.e. implemented the observer pattern).
    Basically, I don't think a direct C++ port will be considerably faster than
    your Java code. Claiming C++ magically makes your code 5 times as fast
    is very 1998.
    Remon van Vliet, Apr 3, 2006
    #4
  5. Hi Remon,
    I think so too. I've already profiled a lot, and I'll do it again,
    but the Intel libraries are very fast and scale to multiple cores, so I
    see them as a possible solution. My application is currently a
    simulator running on a desktop, and I hope to parallelize it as soon as
    possible, but there will be big synchronization and performance issues.
    Do you have any suggestions for visualizing data without "overdesigned"
    patterns like observer? My system is pretty complex, with several
    asynchronous neural networks that interact, and I don't want to fill my
    code with things like redraw calls. I dislike observable.update too, but
    it is the least evil I've found. Another problem is making as few copies
    of data as possible: I need some buffers to let the components
    interact, and to avoid corrupting my data by mistake (why are there no
    final arrays in Java? :-(
    These are the first points I will optimize, and I've seen that JNI will
    increase array copies...
    If you have any suggestions, please let me know.
    Dimitri
    Dimitri Ognibene, Apr 4, 2006
    #5
  6. Roedy Green Guest

    On Mon, 3 Apr 2006 23:22:56 +0200, "Remon van Vliet"
    <> wrote, quoted or indirectly quoted someone who
    said :

    >Claiming C++ magically makes your code 5 times as fast
    >is very 1998.


    Recent benchmarks posted here show the reverse. Java compiler
    technology is now ahead of C++.
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
    Roedy Green, Apr 4, 2006
    #6
  7. James Westby Guest

    Re: JNI to optimize a Java application using native mathematical libraries

    Roedy Green wrote:
    > On Mon, 3 Apr 2006 23:22:56 +0200, "Remon van Vliet"
    > <> wrote, quoted or indirectly quoted someone who
    > said :
    >
    >> Claiming C++ magically makes your code 5 times as fast
    >> is very 1998.

    >
    > recent benchmarks posted here show the reverse. Java compiler
    > technology is now ahead of C++.


    Matlab has recently deprecated the tool that automatically converts .m
    files to mex files (kind of JNI for Matlab) as they say that the
    run-time optimisation performed in Matlab removes the need to do this.

    However I have just finished moving some Matlab code from an m file to
    C. The code basically computes a random walk, and so involves numerical
    integration of a function involving gaussians. I moved from using
    Matlab's randn (very useful to have a generator with gaussian pdf built
    in) to using the GNU scientific library. This has speeded up my code a
    staggering amount, and makes it reasonable to run the experiments now.

    I'm not sure how the randn function is implemented in Matlab, so I'm not
    sure where the optimisations are coming from, but they impress me
    nonetheless. I also think that Matlab's JIT is not a Java JIT.

    I'm not trying to say Java is slow, and I certainly don't believe it is.
    This function is where the code previously spent 99% of its time, and
    was executed approximately 2.5 million times per run, needing to
    generate 50 million random numbers in that time, and this is just the
    simple test while I'm developing the code, the real numbers will
    probably be thousands of times bigger. I realise that this is the
    comment usually made about optimisation, that it should be done at
    exactly this point, and I agree with the arguments. The optimisation I
    did involved switching to highly developed code using rigorously studied
    algorithms, far far far better than I could ever have implemented myself.

    If I have time then I will be porting a lot of this code over to Java,
    and if I do I will post some measurements of how Sun's HotSpot fares on
    this code, as I assume it would be a perfect candidate for their
    optimisations (Short loops, but enough in them to give the CPU something
    to do between branches, highly predictable branching, few memory
    requirements, though I wouldn't exactly call it a real-world application).


    James
    James Westby, Apr 4, 2006
    #7
  8. Roedy Green Guest

    On Tue, 04 Apr 2006 03:23:56 GMT, James Westby <>
    wrote, quoted or indirectly quoted someone who said :

    > I also think that Matlab's JIT is not a Java JIT.


    If you can collect all the jars, try compiling it with Jet and see
    what sort of speed you get.
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
    Roedy Green, Apr 4, 2006
    #8
  9. Thanks James,

    I've come to exactly the same conclusions. My specific problem is not
    whether Java is faster than C++, but how to use optimized libraries like
    the GNU Scientific Library (which I didn't know about until your answer,
    thanks) or Intel MKL to replace my own non-optimized code, which I can't
    really optimize any further. In my lab we used the MKL exp function
    instead of the standard C implementation and obtained a 5x speedup in our
    neural network code, even though we can't link statically!
    Now, I'm sure that those libraries are faster than any code I'll ever
    write, but I don't know whether it is worth interfacing them with my
    already written (130 classes) Java system. I suspect, as I said in my
    first post, that in my specific application using an external library
    through JNI will make the system much more complex (and difficult to
    maintain and debug) and only a little faster.
    I would be happier to find a good Java math library, even if it is not as
    good as MKL, so that I don't have to write boring JNI code, array-copying
    methods and so on.
    If I understand correctly, you are translating your entire Matlab
    simulation to C. I don't want to do that at the moment. My coworker does,
    but I'm the software engineer in the lab, so much of the effort and many
    of the decisions are mine. I would like to find a compromise, using a good
    math library to optimize the code where, as you said, 90% of the CPU time
    is spent: random, Gaussian, sin and similar functions, and perhaps matrix
    multiplication. Java multidimensional arrays don't perform well;
    I've tried optimizing with code like:

    double[][] matrix1 = new double[50][900];
    double[] vector2 = new double[900];
    for (int i = 0; i < 50; i++) {
        final double[] matrixRow = matrix1[i]; // look up the row once per outer iteration
        for (int j = 0; j < 900; j++) {
            .....
        }
    }

    But only a small speedup is gained.
    And I don't have the time to find optimization tricks, so a library
    would perhaps be a better solution.
    If you have any suggestions, please let me know.
    Dimitri Ognibene, Apr 4, 2006
    #9
  10. Thanks Gordon,
    I already knew this page. It looks outdated, but it contains several
    interesting links, like the Colt project. However, the library links
    I've found look outdated too. Has numerical computation in Java
    disappeared? If you have ever used one of these libraries or have any
    other insight, please let me know.
    P.S. I've found a project on SourceForge that interfaces Java to GSL,
    http://sourceforge.net/projects/gsl-java. Has anyone ever used it? It
    looks outdated too :(

    Good work
    Dimitri
    Dimitri Ognibene, Apr 4, 2006
    #10
  11. On 4 Apr 2006 00:50:28 -0700, Dimitri Ognibene wrote:
    > And I don't have the time to find optimization tricks.. so a library
    > perhaps would be a better solution..


    Do any of these libraries help?

    http://math.nist.gov/javanumerics/

    --
    [ do not email me copies of your followups ]
    g o r d o n + n e w s @ b a l d e r 1 3 . s e
    Gordon Beaton, Apr 4, 2006
    #11
  12. Chris Uppal Guest

    wrote:

    > I usually use the observer pattern to listen for changes to the data
    > model; the visualization only redraws when needed and isn't
    > synchronized with the numerical computation.
    > However, I read matrix data inside my paint methods.


    I suspect that you should use /more/ copying of data, not less. Your
    simulation engine will run best if it can ignore the possibility that something
    else is reading the same data. So it runs at full speed on one thread (not
    doing any synchronisation). At extremely long intervals by computer
    standards -- roughly once a second, say -- it makes a copy of the current state
    of the simulation, and saves it. The test for whether to do that is in the
    outermost loop of the simulation, and so that will have negligible effect on
    the overall speed. When it determines that it is time to make a copy, it does
    so, and then (and only then) uses a synchronised method to save the new
    description of the state.

    The GUI meantime (running on a different thread) updates the screen display at
    regular intervals. To do that it uses a synchronised method to get the most
    recent copy and refreshes from that. It keeps that copy around so that it can
    repaint() itself as necessary.

    Depending on how you've structured your existing code, making the copy may be
    almost trivial. Note that you will have no display-related code in the
    simulation engine at all (not even triggering notifications for any Observers).
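
    A minimal sketch of the scheme described above, assuming the simulation
    state is a single double[]. The names SnapshotHolder and Simulation, and
    the placeholder stepOnce(), are made up for the illustration. The
    simulation thread computes without any locking, copies its state at
    long intervals, and only then makes one short synchronized call.

```java
/** Holds the most recently published snapshot; the only shared, synchronized object. */
final class SnapshotHolder {
    private double[] latest = new double[0];

    /** Called by the simulation thread with a copy it will never modify again. */
    synchronized void publish(double[] copy) {
        latest = copy;
    }

    /** Called by the GUI thread; the returned array must be treated as read-only. */
    synchronized double[] latest() {
        return latest;
    }
}

/** Toy simulation; a real step function would replace stepOnce(). */
final class Simulation implements Runnable {
    private final double[] state;
    private final SnapshotHolder holder;
    private final long intervalNanos;
    private final int steps;

    Simulation(double[] initialState, SnapshotHolder holder, long intervalNanos, int steps) {
        this.state = initialState.clone();
        this.holder = holder;
        this.intervalNanos = intervalNanos;
        this.steps = steps;
    }

    private void stepOnce() {
        for (int i = 0; i < state.length; i++) {
            state[i] += 1.0; // placeholder for the real computation
        }
    }

    @Override
    public void run() {
        long nextSnapshot = System.nanoTime() + intervalNanos;
        for (int step = 0; step < steps; step++) {
            stepOnce(); // no locking while computing
            if (System.nanoTime() - nextSnapshot >= 0) {
                holder.publish(state.clone()); // copy first, then one short synchronized call
                nextSnapshot += intervalNanos;
            }
        }
        holder.publish(state.clone()); // make the final state visible
    }
}
```

    The GUI side would call holder.latest() from its own timer, keep the
    returned array, and paint from it, never touching the live state.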


    > Can I use JNI and an optimized native library to implement the
    > numerical computation?
    > If I do, will my visualization code be preserved and reused?
    > Will my system perform better?


    A lot depends on how much work you do in each call to JNI. If you are just
    doing something trivial like generating the next random number, then almost
    certainly not. The cost of crossing the JNI barrier is pretty high, and will
    swamp the gains from using (say) Intel's maths libraries. On the other hand,
    if you have some slow operation (like matrix multiplication) where the time
    taken is high, and -- more importantly -- the time required to copy any
    necessary data across the JNI barrier is small in comparison[*] then using the
    native libraries may help you.

    ([*] Copy is O(N) but if the operation is, say, O(N**2) then you can ignore the
    cost of the copy.)

    It may be that your code is dominated by a small number of slow operations
    which can be implemented quickly in an external library. For instance it may
    be that array multiplication dominates the time, and that the Intel library has
    a particularly well-tuned implementation of that. If that applies then you may
    see big gains by using JNI for array multiplication. If not then you'll have
    to rewrite your code so that the bulk of the implementation /is/ in C/C++ if
    you want to take advantage of Intel's libraries -- e.g. make each step of your
    simulation into a single call to JNI.

    BTW, the way that Java represents 2D arrays is not efficient, and is probably
    incompatible with what an external library would expect. If you represent a
    logically two-dimensional array of doubles as a double[][] then each access
    will require two indirections. A better scheme (albeit quite a bit more work)
    is to represent it as a single double[] and use arithmetic combinations of the
    row/column coordinates to find each element. The external library will expect
    to find the data in this format anyway, so by using it internally you minimise
    the messing around (and perhaps the copying too) when you cross the JNI
    barrier. For instance from one of your later posts in this thread:

    > double[][] matrix1 = new double[50][900];
    > for (int i = 0; i < 50; i++) {
    >     final double[] matrixRow = matrix1[i];
    >     for (int j = 0; j < 900; j++) {
    >         ....}}


    becomes:

    double[] matrix = new double[50 * 900];
    for (int i = 0; i < 50; i++)
    {
        int start = i * 900;    // index of the first element of row i
        int end = start + 900;  // one past the last element of row i
        for (int j = start; j < end; j++)
        {
            double elem = matrix[j];
            ...
        }
    }

    Some people have reported seeing useful speedups using that technique (not huge
    but useful), but the main reason for using it is so that highly tuned external
    implementations of the array operation can work on the data more-or-less
    directly.
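
    As a rough illustration of the flat layout, here is a small wrapper
    with a matrix-vector product. The class FlatMatrix is invented for the
    example, and the row-major order is an assumption: some native
    libraries expect column-major data, so check the convention of the one
    you call before settling on a layout.

```java
/** Logically two-dimensional matrix stored row by row in a single double[]. */
final class FlatMatrix {
    final int rows;
    final int cols;
    final double[] data; // element (r, c) lives at index r * cols + c

    FlatMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    double get(int r, int c) {
        return data[r * cols + c];
    }

    void set(int r, int c, double value) {
        data[r * cols + c] = value;
    }

    /** Computes y = A * x with a single array indirection per element access. */
    double[] times(double[] x) {
        if (x.length != cols) {
            throw new IllegalArgumentException("expected vector of length " + cols);
        }
        double[] y = new double[rows];
        for (int r = 0; r < rows; r++) {
            int start = r * cols; // offset of row r in the flat array
            double sum = 0.0;
            for (int c = 0; c < cols; c++) {
                sum += data[start + c] * x[c];
            }
            y[r] = sum;
        }
        return y;
    }
}
```

    The data field is exactly the kind of array that can be handed to a
    native method in one piece, without flattening it first.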

    -- chris
    Chris Uppal, Apr 4, 2006
    #12
  13. Thank you for your general advice.
    My simulation code is built from many components, and not all of them
    change state at every step, so an update method is useful for me.
    Another problem is that I'm afraid of modifying data by mistake inside
    the model step. I had very little time, so I'm not sure about some pieces
    of code, and therefore I copy my data between model components. Yes, it's
    my mistake, and I will remove the superfluous copies as soon as possible.
    Do you know of any pre-compiler tool that can detect write
    violations, like the const keyword in C++?
    Another thing I can't use is synchronization with the GUI, because it is
    not important when displaying large data sets; only global data, a few
    doubles, are synchronized, and obviously copied as arguments of the updates.
    >A lot depends on how much work you do in each call to JNI. If you are just
    >doing something trivial like generating the next random number, then almost
    >certainly not. The cost of crossing the JNI barrier is pretty high, and will
    >swamp the gains from using (say) Intel's maths libraries. On the other hand,
    >if you have some slow operation (like matrix multiplication) where the time
    >taken is high, and -- more importantly -- the time required to copy any
    >necessary data across the JNI barrier is small in comparison[*] then using the
    >native libraries may help you.

    Do you know where I can find some resources on the performance of the JNI
    barrier?
    I have a 900x400 matrix, but its elements are the results of the
    previous computation, so I would like to leave a copy in the native library
    and only extract a copy when I need one...
    I'm starting to think that it is easier to rewrite everything in C++ with
    MKL and Qt... if I obtain less than a 2x speedup.

    I was thinking of unwinding the matrix operations as you suggested, but I
    have some operations, like applying moving 2D filters, that are a little
    complex. Now that I've seen them work in the simple non-unwound form,
    perhaps I can optimize them. Can you suggest a way to measure the
    actual speedup?

    Thanks,
    Dimitri
    Dimitri Ognibene, Apr 4, 2006
    #13
  14. James Westby Guest

    Re: JNI to optimize a Java application using native mathematical libraries

    Dimitri Ognibene wrote:
    > Thanks James,
    >
    > I've come to exactly the same conclusions. My specific problem is not
    > whether Java is faster than C++, but how to use optimized libraries like
    > the GNU Scientific Library (which I didn't know about until your answer,
    > thanks) or Intel MKL to replace my own non-optimized code, which I can't
    > really optimize any further. In my lab we used the MKL exp function
    > instead of the standard C implementation and obtained a 5x speedup in our
    > neural network code, even though we can't link statically!
    > Now, I'm sure that those libraries are faster than any code I'll ever
    > write, but I don't know whether it is worth interfacing them with my
    > already written (130 classes) Java system. I suspect, as I said in my
    > first post, that in my specific application using an external library
    > through JNI will make the system much more complex (and difficult to
    > maintain and debug) and only a little faster.


    That is a problem that you should avoid if possible. There is a
    trade-off between the speedup you can get and the increased complexity
    in maintaining the code.

    > I would be happier to find a good Java math library, even if it is not as
    > good as MKL, so that I don't have to write boring JNI code, array-copying
    > methods and so on.
    > If I understand correctly, you are translating your entire Matlab
    > simulation to C. I don't want to do that at the moment. My coworker does,
    > but I'm the software engineer in the lab, so much of the effort and many
    > of the decisions are mine. I would like to find a compromise, using a good
    > math library to optimize the code where, as you said, 90% of the CPU time
    > is spent: random, Gaussian, sin and similar functions, and perhaps matrix
    > multiplication. Java multidimensional arrays don't perform well;
    > I've tried optimizing with code like:

    I've only moved one small part to C, the bit that was taking all the
    time when I profiled the code. Have you done that? I don't know of any maths
    libraries in Java; I would like to know if there are any good ones.

    >
    > double[][] matrix1 = new double[50][900];
    > double[] vector2 = new double[900];
    > for (int i = 0; i < 50; i++) {
    >     final double[] matrixRow = matrix1[i];
    >     for (int j = 0; j < 900; j++) {
    >         ....}}
    >

    Matrix multiplication is a slow operation, and probably a good candidate
    for optimisation, either by you or by swapping the code for something
    specialised (BLAS springs to mind, but I can only find a small mention
    of jBLAS by Google).

    > But only a small speedup is gained.
    > And I don't have the time to find optimization tricks, so a library
    > would perhaps be a better solution.
    > If you have any suggestions, please let me know.
    >


    James
    James Westby, Apr 4, 2006
    #14
  15. Thanks James, I'll take a look at the jBLAS API. If it is developed by
    Google it should be useful and up to date, I hope.
    Thanks
    Dimitri
    Dimitri Ognibene, Apr 4, 2006
    #15
  16. James Westby Guest

    Re: JNI to optimize a Java application using native mathematical libraries

    Dimitri Ognibene wrote:
    > Thanks James, I'll take a look at the jBLAS API. If it is developed by
    > Google it should be useful and up to date, I hope.
    > Thanks
    > Dimitri
    >

    No, it was a Google search. But it looked like the API was not fully
    developed yet, so it probably won't be very useful.

    James
    James Westby, Apr 4, 2006
    #16
  17. Chris Uppal Guest

    Dimitri Ognibene wrote:

    > Do you know of any pre-compiler tool that can detect write
    > violations, like the const keyword in C++?


    No. Sorry ;-)


    > Do you know where I can find some resources on the performance of the JNI
    > barrier?


    Not offhand, and it varies according to what you are doing anyway. You'll have
    to measure it yourself.

    FWIW, I recently measured that on this 1.5 GHz WinXP box, the time taken for a
    JNI call to a native method declared as:

    static native int nothing(int i);

    is about 30 nanoseconds on a 1.5.0 JVM. The actual implementation is:

    JNIEXPORT jint JNICALL
    Java_Test_nothing(JNIEnv *e, jclass c, jint i)
    {
        return i;
    }

    so presumably the time is almost all JNI overhead. Other JNI operations have
    different overheads.
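
    A measurement of this kind can be sketched in plain Java. The helper
    CallTimer below is hypothetical, not the harness used for the figure
    above. It assumes you pass the call under test as a function, for
    example Test::nothing for the native method shown, and that you also
    time an equivalent pure-Java method as a baseline, since the loop and
    the functional-interface indirection have a cost of their own.

```java
/** Rough per-call timer; results are indicative only and vary between runs. */
final class CallTimer {
    private CallTimer() {}

    /** Returns the average wall-clock time per call of f, in nanoseconds. */
    static double nanosPerCall(java.util.function.IntUnaryOperator f, int warmup, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        int sink = 0;
        for (int i = 0; i < warmup; i++) {
            sink += f.applyAsInt(i); // give the JIT a chance to compile the call path
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += f.applyAsInt(i);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Integer.MIN_VALUE) {
            System.out.println(sink); // use the result so the loops are not optimized away
        }
        return (double) elapsed / iterations;
    }
}
```

    Subtracting the baseline from the native figure gives an estimate of
    the cost of the JNI transition itself.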


    > I have a 900x400 matrix, but its elements are the results of the
    > previous computation, so I would like to leave a copy in the native library
    > and only extract a copy when I need one...


    Given my point that you are probably not copying /enough/, I doubt if this is
    the right way to go.


    > I was thinking of unwinding the matrix operations as you suggested, but I
    > have some operations, like applying moving 2D filters, that are a little
    > complex. Now that I've seen them work in the simple non-unwound form,
    > perhaps I can optimize them. Can you suggest a way to measure the
    > actual speedup?


    Just try it. If the re-write is too difficult to be feasible as an experiment,
    then it's probably too complex to use for production purposes.

    -- chris
    Chris Uppal, Apr 6, 2006
    #17
