To increase speed on manipulating big arrays in Java with minimal bounds checking, ...


Casey Hawthorne

To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

I suppose I'm thinking of some sort of 2D or 3D (or more) iterator?

Maybe some sort of APL like functions?
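To make the question concrete, the kind of thing I have in mind is a 2D view over a single flat array, where each get/set still pays a bounds check on the computed index (a sketch with made-up names, row-major layout assumed):

```java
// Sketch: a 2D "array" backed by one flat array (row-major).
// One index computation, and one bounds check, per access.
public class Flat2D {
    private final double[] data;
    private final int cols;

    public Flat2D(int rows, int cols) {
        this.data = new double[rows * cols];
        this.cols = cols;
    }

    public double get(int r, int c) { return data[r * cols + c]; }

    public void set(int r, int c, double v) { data[r * cols + c] = v; }

    public static void main(String[] args) {
        Flat2D m = new Flat2D(3, 4);
        m.set(2, 3, 42.0);
        System.out.println(m.get(2, 3)); // prints 42.0
    }
}
```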
 

Chris Uppal

Casey said:
To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

You could always write custom array operations in JNI. Otherwise the answer is
no (not without a change to the JVM design, which isn't going to happen).

OTOH, it is /said/ (I find it plausible, but I can't confirm it from personal
knowledge) that the big-name JVMs' JITs are pretty aggressive about removing
bounds checks in the generated code, so there might not be much to gain from
special operators.
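For what it's worth, the loop shape matters: a canonical counted loop over the array's own length is the form those JITs are said to handle best. A sketch, not a guarantee that any particular VM eliminates the check:

```java
// Sketch: a loop shape a JIT can typically prove in-bounds.
// Iterating 0 <= i < a.length makes every a[i] provably legal,
// so the per-access check can (in principle) be hoisted or dropped.
public class SumLoop {
    static double sum(double[] a) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) { // canonical counted loop
            s += a[i];                       // check provably redundant
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sum(new double[] {1, 2, 3})); // prints 6.0
    }
}
```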

-- chris
 

Thomas Hawtin

Casey said:
To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

A good runtime will pull bounds checking out of inner loops.

This sort of low level optimisation is not something to be concerned about.

Tom Hawtin
 

Robert Klemme

Chris said:
You could always write custom array operations in JNI. Otherwise the
answer is no (not without a change to the JVM design, which isn't
going to happen).

I'm not sure whether array accesses via JNI actually bypass bounds checks.
You probably would have to copy the Java array into a native array,
perform the algorithm on it, and copy the results back. That sounds quite
inefficient to me.
OTOH, it is /said/ (I find it plausible, but I can't confirm it from
personal knowledge) that the big-name JVMs' JITs are pretty
aggressive about removing bounds checks in the generated code, so
there might not be much to gain from special operators.

Also: what do you gain by a specific operator to be used on boundaries?
If you *know* you're at a boundary why then even bother to check? Bounds
checking is there to prevent accidental violations - something that would
lead to undefined behavior or even crashes if omitted.

Kind regards

robert
 

Chris Uppal

Robert said:
[me]
You could always write custom array operations in JNI. Otherwise the
answer is no (not without a change to the JVM design, which isn't
going to happen).

I'm not sure whether array accesses via JNI actually bypass bounds checks.

It does. The way the JNI interfaces work (for arrays of primitive types --
which are the only kind worth considering in this context) the API gives you
(at its choice) either a pointer to the data (which you have to return
explicitly) or a pointer to a temporary copy of the data (which you also have
to return). In either case access to the individual elements is at full memory
speed. There is no need to issue a JNI call for each array access.

Also note that it is very unlikely that the overhead of copying (if it happens
at all) would matter. Copying is O(n), but unless the operation itself were
significantly more expensive than O(n) there would be no point in optimising
it.

-- chris
 

Chris Uppal

Thomas said:
This sort of low level optimisation is not something to be concerned
about.

Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

Which is a perfectly reasonable scenario.

-- chris

([*] for any of several possible values of "too", eg:
- three times slower than the legacy application we are trying to replace.
- it takes a week to run, but we need to run it once per day.
- it takes 120 millisecs to run, but I need to refresh the frame 25
times/second.
- it takes several hours to run, so any significant speedup would be useful
- ...
)
 

Thomas Hawtin

Chris said:
Thomas said:
This sort of low level optimisation is not something to be concerned
about.

Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

Which is a perfectly reasonable scenario.

I'd start with optimisations that reduce the amount of data involved, and
the number of accesses to it, rather than focusing on made-up worries.

I'm not seeing how array bounds checking comes into this. This kind of
optimisation is handled by the runtime.

Tom Hawtin
 

Patricia Shanahan

Chris said:
Thomas Hawtin wrote:

This sort of low level optimisation is not something to be concerned
about.


Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

and you have investigated the slowness and found that it is due to
bounds checking.
 

Robert Klemme

Chris said:
Robert said:
[me]
You could always write custom array operations in JNI. Otherwise
the answer is no (not without a change to the JVM design, which
isn't going to happen).

I'm not sure whether array accesses via JNI actually bypass bounds
checks.

It does. The way the JNI interfaces work (for arrays of primitive
types -- which are the only kind worth considering in this context)
the API gives you (at its choice) either a pointer to the data (which
you have to return explicitly) or a pointer to a temporary copy of
the data (which you also have to return). In either case access to
the individual elements is at full memory speed. There is no need to
issue a JNI call for each array access.

Ah, thanks for the heads up.
Also note that it is very unlikely that the overhead of copying (if
it happens at all) would matter. Copying is O(n), but unless the
operation itself were significantly more expensive than O(n) there
would be no point in optimising it.

Erm, strictly speaking copying becomes relatively cheaper (and thus
negligible) if the op is worse than O(n), doesn't it? Anyway, even if
the op is O(n) you could neglect copying, from a theoretical perspective
(O(n) = O(2n)). It might be different in practice if the arrays are huge
(memory allocation overhead) or if they are small and there are many invocations.

Kind regards

robert
 

TechBookReport

Casey said:
To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

I suppose I'm thinking of some sort of 2D or 3D (or more) iterator?

Maybe some sort of APL like functions?

If you're looking for commercial solutions then SmartArrays might be of
interest. It's very APL-like and supports Java and .NET, from what I've
read in the past. The authors have a long history of development of APL
interpreters and systems.
 

Patricia Shanahan

Robert Klemme wrote:
...
Erm, strictly speaking copying becomes relatively cheaper (and thus
negligible) if the op is worse than O(n), doesn't it? Anyway, even if
the op is O(n) you could neglect copying, from a theoretical perspective
(O(n) = O(2n)). It might be different in practice if the arrays are huge
(memory allocation overhead) or if they are small and there are many invocations.
...

For any finite problem size, O(n) can be worse than O(n^2).
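Because big-O hides the constant factors: a linear step costing 1000*n operations is dearer than a quadratic step costing n*n operations for every n below 1000. A throwaway illustration (numbers made up):

```java
// Illustration only: constant factors decide at finite sizes.
public class Constants {
    public static void main(String[] args) {
        int n = 100;
        long linearCost = 1000L * n;        // "O(n)" with constant 1000
        long quadraticCost = (long) n * n;  // "O(n^2)" with constant 1
        // For every n below 1000 the "slower" asymptotic class wins.
        System.out.println(linearCost > quadraticCost); // prints true
    }
}
```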

Patricia
 

Robert Klemme

Patricia said:
Robert Klemme wrote:
...

For any finite problem size, O(n) can be worse than O(n^2).

Does the term O(n) make sense for finite problems? Big O is just about
asymptotic values. And, yes, as I said, in practice the overhead may
indeed count. :)

Kind regards

robert
 

Chris Uppal

Patricia said:
This sort of low level optimisation is not something to be concerned
about.


Unless, of course, you are manipulating large amounts of data and the
application is running too[*] slow.

and you have investigated the slowness and found that it is due to
bounds checking.

I think you have misunderstood me. I was talking about the circumstances under
which one would get into that kind of investigation in the first place, not the
circumstances under which one would actually attempt to optimise. As you say,
there's no point in optimising something that isn't actually slowing you
down[*].

([*] Although I should add that quite often the simplest way of measuring the
potential gains from an optimisation is just to try it, or a simplified
version of it)

-- chris
 

Ingo R. Homann

Hi,

Robert said:
Does the term O(n) make sense for finite problems?

Of course:
Big O is just about
asymptotic values.

Yes. But most (or even all?) algorithms are defined to *theoretically*
run with an unlimited 'n', although *in practice* 'n' is always finite,
i.e. limited and not 'infinity'.
And, yes, as I said, in practice the overhead may
indeed count. :)

That's it!

Ciao,
Ingo
 

Robert Klemme

Ingo said:
Yes. But most (or even all?) algorithms are defined to *theoretically*
run with an unlimited 'n', although *in practice* 'n' is always
finite, i.e. limited and not 'infinity'.

Yep, but in that case you can't work with O; you have to be more
specific (i.e. know the constants). That's why IMHO it doesn't make sense
here - at least you'll have to take O's with a grain of salt and look a
bit closer.

That's it!

Yep, I think we agree here. Just wanted to make the distinction clear.

robert
 

Roedy Green

To increase speed on manipulating big arrays in Java with minimal
bounds checking, would it be possible to have some operators that only
do bounds checking at the boundaries of the array?

Only in theory is every reference bounds checked. The JVM is free to
check the bounds at the beginning of a loop to make sure there cannot
possibly be trouble.

for (String s : stringarray )
has no index to check in the loop.

In Jet you can turn off some checking, possibly bound checks, but not
in any Sun JVM.
 

ldv

Roedy said:
In Jet you can turn off some checking, possibly bound checks, but not
in any Sun JVM.

This feature will be disabled in Excelsior JET 4.0 as it is not
compliant with the Java spec. But sophisticated JVMs like Sun HotSpot,
BEA JRockit or Excelsior JET remove bounds checking when it is possible
to compute the range of the index expression before execution. So what
you should do is write your code so that the JVM can compute those
ranges (and can also determine, before the loop, that the objects and
arrays used in it are not null).
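For example (a sketch; the exact analysis is VM-specific, and the names are made up):

```java
// Sketch: JIT-friendly vs JIT-hostile array access patterns.
public class Ranges {
    static int[] field = {1, 2, 3, 4};

    static long friendly() {
        int[] a = field;              // hoist: 'a' is provably non-null and fixed
        long s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];                // index range derivable from the loop bounds
        }
        return s;
    }

    static long hostile(int[] idx) {
        long s = 0;
        for (int i = 0; i < idx.length; i++) {
            s += field[idx[i]];       // indirect index: range unknown, check stays
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(friendly()); // prints 10
    }
}
```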

LDV
 
