To increase speed on manipulating big arrays in Java with minimal bounds checking, ...


Casey Hawthorne

To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

I suppose I'm thinking of some sort of 2D or 3D (or more) iterator?

Maybe some sort of APL like functions?
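To make the question concrete, the kind of thing I have in mind is a 2D view over a single flat array, where each get/set still pays a bounds check on the computed index (a sketch with made-up names, row-major layout assumed):

```java
// Sketch: a 2D "array" backed by one flat array (row-major).
// One index computation, and one bounds check, per access.
public class Flat2D {
    private final double[] data;
    private final int cols;

    public Flat2D(int rows, int cols) {
        this.data = new double[rows * cols];
        this.cols = cols;
    }

    public double get(int r, int c) { return data[r * cols + c]; }

    public void set(int r, int c, double v) { data[r * cols + c] = v; }

    public static void main(String[] args) {
        Flat2D m = new Flat2D(3, 4);
        m.set(2, 3, 42.0);
        System.out.println(m.get(2, 3)); // prints 42.0
    }
}
```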
 

Chris Uppal

Casey said:
To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

You could always write custom array operations in JNI. Otherwise the answer is
no (not without a change to the JVM design, which isn't going to happen).

OTOH, it is /said/ (I find it plausible, but I can't confirm it from personal
knowledge) that the big-name JVMs' JITs are pretty aggressive about removing
bounds checks in the generated code, so there might not be much to gain from
special operators.
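For what it's worth, the loop shape matters: a canonical counted loop over the array's own length is the form those JITs are said to handle best. A sketch, not a guarantee that any particular VM eliminates the check:

```java
// Sketch: a loop shape a JIT can typically prove in-bounds.
// Iterating 0 <= i < a.length makes every a[i] provably legal,
// so the per-access check can (in principle) be hoisted or dropped.
public class SumLoop {
    static double sum(double[] a) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) { // canonical counted loop
            s += a[i];                       // check provably redundant
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sum(new double[] {1, 2, 3})); // prints 6.0
    }
}
```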

-- chris
 

Thomas Hawtin

Casey said:
To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

A good runtime will pull bounds checking out of inner loops.

This sort of low level optimisation is not something to be concerned about.

Tom Hawtin
 

Robert Klemme

Chris said:
You could always write custom array operations in JNI. Otherwise the
answer is no (not without a change to the JVM design, which isn't
going to happen).

I'm not sure whether array accesses via JNI actually bypass bounds checks.
You probably would have to copy the Java array into a native array,
perform the algorithm on it, and copy the results back. That sounds quite
inefficient to me.
OTOH, it is /said/ (I find it plausible, but I can't confirm it from
personal knowledge) that the big-name JVMs' JITs are pretty
aggressive about removing bounds checks in the generated code, so
there might not be much to gain from special operators.

Also: what do you gain by a specific operator to be used on boundaries?
If you *know* you're at a boundary why then even bother to check? Bounds
checking is there to prevent accidental violations - something that would
lead to undefined behavior or even crashes if omitted.

Kind regards

robert
 

Chris Uppal

Robert said:
[me]
You could always write custom array operations in JNI. Otherwise the
answer is no (not without a change to the JVM design, which isn't
going to happen).

I'm not sure whether array accesses via JNI actually bypass bounds checks.

It does. The way the JNI interfaces work (for arrays of primitive types --
which are the only kind worth considering in this context) the API gives you
(at its choice) either a pointer to the data (which you have to return
explicitly) or a pointer to a temporary copy of the data (which you also have
to return). In either case access to the individual elements is at full memory
speed. There is no need to issue a JNI call for each array access.

Also note that it is very unlikely that the overhead of copying (if it happens
at all) would matter. Copying is O(n), but unless the operation itself were
significantly more expensive than O(n) there would be no point in optimising
it.

-- chris
 

Chris Uppal

Thomas said:
This sort of low level optimisation is not something to be concerned
about.

Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

Which is a perfectly reasonable scenario.

-- chris

([*] for any of several possible values of "too", eg:
- three times slower than the legacy application we are trying to replace.
- it takes a week to run, but we need to run it once per day.
- it takes 120 millisecs to run, but I need to refresh the frame 25
times/second.
- it takes several hours to run, so any significant speedup would be useful
- ...
)
 

Thomas Hawtin

Chris said:
Thomas said:
This sort of low level optimisation is not something to be concerned
about.

Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

Which is a perfectly reasonable scenario.

I'd start with optimisations that reduce the amount of data involved, and
the number of accesses to it, rather than focusing on made-up worries.

I'm not seeing how array bounds checking comes into this. This kind of
optimisation is handled by the runtime.

Tom Hawtin
 

Patricia Shanahan

Chris said:
Thomas Hawtin wrote:

This sort of low level optimisation is not something to be concerned
about.


Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

and you have investigated the slowness and found that it is due to
bounds checking.
 

Robert Klemme

Chris said:
Robert said:
[me]
You could always write custom array operations in JNI. Otherwise
the answer is no (not without a change to the JVM design, which
isn't going to happen).

I'm not sure whether array accesses via JNI actually bypass bounds
checks.

It does. The way the JNI interfaces work (for arrays of primitive
types -- which are the only kind worth considering in this context)
the API gives you (at its choice) either a pointer to the data (which
you have to return explicitly) or a pointer to a temporary copy of
the data (which you also have to return). In either case access to
the individual elements is at full memory speed. There is no need to
issue a JNI call for each array access.

Ah, thanks for the heads up.
Also note that it is very unlikely that the overhead of copying (if
it happens at all) would matter. Copying is O(n), but unless the
operation itself were significantly more expensive than O(n) there
would be no point in optimising it.

Erm, strictly speaking copying becomes relatively cheaper (and thus
negligible) if the op is worse than O(n), doesn't it? Anyway, even if
the op is O(n) you could neglect copying, from a theoretical perspective
(O(n) = O(2n)). It might be different in practice if the arrays are huge
(memory allocation overhead) or if they are small and there are many invocations.

Kind regards

robert
 

TechBookReport

Casey said:
To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

I suppose I'm thinking of some sort of 2D or 3D (or more) iterator?

Maybe some sort of APL like functions?

If you're looking for commercial solutions then SmartArrays might be of
interest. It's very APL-like and supports Java and .NET, from what I've
read in the past. The authors have a long history of development of APL
interpreters and systems.
 

Patricia Shanahan

Robert Klemme wrote:
...
Erm, strictly speaking copying becomes relatively cheaper (and thus
negligible) if the op is worse than O(n), doesn't it? Anyway, even if
the op is O(n) you could neglect copying, from a theoretical perspective
(O(n) = O(2n)). It might be different in practice if the arrays are huge
(memory allocation overhead) or if they are small and there are many invocations.
...

For any finite problem size, O(n) can be worse than O(n^2).
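Because big-O hides the constant factors: a linear step costing 1000*n operations is dearer than a quadratic step costing n*n operations for every n below 1000. A throwaway illustration (numbers made up):

```java
// Illustration only: constant factors decide at finite sizes.
public class Constants {
    public static void main(String[] args) {
        int n = 100;
        long linearCost = 1000L * n;        // "O(n)" with constant 1000
        long quadraticCost = (long) n * n;  // "O(n^2)" with constant 1
        // For every n below 1000 the "slower" asymptotic class wins.
        System.out.println(linearCost > quadraticCost); // prints true
    }
}
```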

Patricia
 

Robert Klemme

Patricia said:
Robert Klemme wrote:
...

For any finite problem size, O(n) can be worse than O(n^2).

Does the term O(n) make sense for finite problems? Big O is just about
asymptotic values. And, yes, as I said, in practice the overhead may
indeed count. :)

Kind regards

robert
 

Chris Uppal

Patricia said:
This sort of low level optimisation is not something to be concerned
about.


Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

and you have investigated the slowness and found that it is due to
bounds checking.

I think you have misunderstood me. I was talking about the circumstances under
which one would get into that kind of investigation in the first place, not the
circumstances under which one would actually attempt to optimise. As you say,
there's no point in optimising something that isn't actually slowing you
down[*].

([*] Although I should add that quite often the simplest way of measuring the
potential gains from an optimisation is just to try it, or a simplified
version of it)

-- chris
 

Ingo R. Homann

Hi,

Robert said:
Does the term O(n) make sense for finite problems?

Of course:
Big O is just about
asymptotic values.

Yes. But most (or even all?) algorithms are defined to *theoretically*
run with an unlimited 'n', although *in practice* 'n' is always finite,
i.e. limited and not 'infinity'.
And, yes, as I said, in practice the overhead may
indeed count. :)

That's it!

Ciao,
Ingo
 

Robert Klemme

Ingo said:
Yes. But most (or even all?) algorithms are defined to *theoretically*
run with an unlimited 'n', although *in practice* 'n' is always
finite, i.e. limited and not 'infinity'.

Yep, but in that case you can't work with O; you have to be more
specific (i.e. know the constants). That's why IMHO it doesn't make sense
here - at least you'll have to take O's with a grain of salt and look a
bit closer.

That's it!

Yep, I think we agree here. Just wanted to make the distinction clear.

robert
 

Roedy Green

To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

Only in theory is every reference bounds checked. The JVM is free to
check the bounds at the beginning of a loop to make sure there cannot
possibly be trouble.

for (String s : stringarray )
has no index to check in the loop.

In Jet you can turn off some checking, possibly bound checks, but not
in any Sun JVM.
 

ldv

Roedy said:
In Jet you can turn off some checking, possibly bound checks, but not
in any Sun JVM.

This feature will be disabled in Excelsior JET 4.0 as it is not
compliant with the Java spec. But sophisticated JVMs like Sun HotSpot,
BEA JRockit or Excelsior JET remove bounds checking when it is possible
to compute the range of the index expression before execution. So what
you should do is write your code so that the JVM can compute those
ranges (and can also determine, before the loop, that the objects and
arrays used in it are not null).
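For example (a sketch; the exact analysis is VM-specific, and the names are made up):

```java
// Sketch: JIT-friendly vs JIT-hostile array access patterns.
public class Ranges {
    static int[] field = {1, 2, 3, 4};

    static long friendly() {
        int[] a = field;              // hoist: 'a' is provably non-null and fixed
        long s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];                // index range derivable from the loop bounds
        }
        return s;
    }

    static long hostile(int[] idx) {
        long s = 0;
        for (int i = 0; i < idx.length; i++) {
            s += field[idx[i]];       // indirect index: range unknown, check stays
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(friendly()); // prints 10
    }
}
```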

LDV
 
