how big can an array be before it becomes unmanageable?

  • Thread starter Robert W via JavaKB.com
  • Start date
R

Robert W via JavaKB.com

In theory, how big can an array be before it becomes unmanageable? I know
this has some basis on the machine spec. However it is the proportions that I
am unsure about. Since an array initialises all it is values to 0 if they are
not specified at creation.

How much memory would a large empty array take?

For example:

Lets say I am writing a game engine (I'm not but for arguments sake) and I
have decided to use an array for collision detection. A moving object in the
game queries an array to see if that space is occupied. If it is then it
invokes a method that carries out the desired reaction (explodes, prevents
the objects movement, etc). The planned platform is to have 256 megs of free
RAM

Now if the game creates a 3d int array of [1000][1000][1000].

how much free RAM would remain, if any?

I would appreciate any thoughts...

Rob
 
A

Andrey Kuznetsov

In theory, how big can an array be before it becomes unmanageable?

1. It is better to use multiple small arrays as one big array (easy paging).
2. Note that java does not true 3d arrays. it is rather arrays of arrays of
arrays.
 
T

Thomas Fritsch

Robert said:
How much memory would a large empty array take?
You can find out this with java's built-in profiler.
(Call java -agentlib:hprof=help for help about it)

Run the example below with
java -agentlib:hprof=heap=sites Test

public class Test {
public static void main(String args[]) {
int intArray[] = new int[1000000];
Object objArray[] = new Object[1000000];
}
}

You'll get a "java.hprof.txt" file containing:

percent live alloc'ed stack class
rank self accum bytes objs bytes objs trace name
1 49.25% 49.25% 4000016 1 4000016 1 300237 java.lang.Object[]
2 49.25% 98.50% 4000016 1 4000016 1 300235 int[]
3 0.10% 98.61% 8520 1 8520 1 300096 byte[]
 
K

Kenneth Patrick Turvey

1. It is better to use multiple small arrays as one big array (easy paging).
2. Note that java does not true 3d arrays. it is rather arrays of arrays of
arrays.

I've read this before, but I really don't understand why it would be true
in a modern language implementation. With virtual memory paging is
handled by the operating system. Other than working for locality of
reference, why should it have any impact on the programmer?

Does it?
 
T

Thomas Fritsch

Thomas said:
You can find out this with java's built-in profiler.
(Call java -agentlib:hprof=help for help about it)

Run the example below with
java -agentlib:hprof=heap=sites Test
Prior to Java 1.5 the calls are:
java -Xrunhprof:heap=sites Test
java -Xrunhprof:help
 
I

Ingo R. Homann

Hi,
I've read this before, but I really don't understand why it would be true
in a modern language implementation. With virtual memory paging is
handled by the operating system. Other than working for locality of
reference, why should it have any impact on the programmer?

Does it?

IMHO, you should normally *not* do that kind of optimization, since it
is a premature optimization.

E.g. modern JVMs remove the bounds-checks within a loop, if it can be
assured from outside, that the index will not go out of scope:

for(int i=0;i<is.length;i++) {
is=...; // no bound-check
}

This is much harder for the JVM when "optmizing" like Andrey suggested,
so that his version might be *slower*!

Of course, this all depends on the JVM you use (and the JVMs that will
come in future (*)). So, if you didn't profile (on different JVMs!) that
this is *really* a bottleneck, do never do such optmizations!

Ciao,
Ingo

(*) Some years ago, instantiating Objects was expensive so that it could
be a good idea to recycle Objects. Today, instantiating is no longer
much expensive, but recycling Objects may slow down the GC very much.
What once was a good optimization now slows down the applications and -
even worse - makes code harder to maintain!
 
T

Thomas Hawtin

Kenneth said:
I've read this before, but I really don't understand why it would be true
in a modern language implementation. With virtual memory paging is
handled by the operating system. Other than working for locality of
reference, why should it have any impact on the programmer?

If you were using lots of different, medium-sized arrays then having the
header in a different page than the element you are accessing may cause
twice the faults. Reducing the number of arrays so the headers will
remain cached is probably a better idea. Java isn't nice when it swaps
anyway.

There may be some advantage in avoiding very large arrays to make it
easier on the garbage collector to move them around.

Tom Hawtin
 
A

Andrey Kuznetsov

E.g. modern JVMs remove the bounds-checks within a loop, if it can be
assured from outside, that the index will not go out of scope:

for(int i=0;i<is.length;i++) {
is=...; // no bound-check
}

This is much harder for the JVM when "optmizing" like Andrey suggested, so
that his version might be *slower*!


not really

for(int j = 0; j < arr.length; j++) {
is = arr[j];
for(int i=0;i<is.length;i++) {
is=...; // no bound-check
}
}
 
I

Ingo R. Homann

Hi,

Andrey said:
E.g. modern JVMs remove the bounds-checks within a loop, if it can be
assured from outside, that the index will not go out of scope:

for(int i=0;i<is.length;i++) {
is=...; // no bound-check
}

This is much harder for the JVM when "optmizing" like Andrey suggested, so
that his version might be *slower*!


not really

for(int j = 0; j < arr.length; j++) {
is = arr[j];
for(int i=0;i<is.length;i++) {
is=...; // no bound-check
}
}


After reading your first posting again, I must admit that I had totally
misunderstood you. I thought you were suggesting an optimization (often
used in former times) like the following:

// old code:
int[][] iss=new int[256][256];
for(int x=0;x<iss.length;x++) {
int is=iss[x];
for(int y=0;y<is.length;y++) {
is[y]=...
}
}

// "optimized", new code:
int[] iss=new int[256*256];
for(int x=0;x<iss.length;x++) {
for(int y=0;y<is.length;y++) {
iss[x<<8|y]=...
}
}

I strongly recommend *not* to use this "optimization" generally.

(Although in this special case, when iterating over the *complete*
array, there were things to optmize.)

I promise to read postings more carefully! :)

Ciao,
Ingo
 
R

Roedy Green

How much memory would a large empty array take?

A full one and an empty one take the same amount of space. The objects
are extra. An array of references takes about 4 bytes per slot, 8 on a
64 bit machine.

An array of ints are 4 bytes per slot, shorts 2, bytes 1. doubles 8,
floats 4. See http://mindprod.com/jgloss/primitive.html

Arrays work fine until they are too big to fit in real memory. Then it
depends how you use them. If you hop all over you will be swapping
like mad. If you use them sequentially it will be far less painful.

Do some experiments to learn the properties.
 
R

Roedy Green

It is nice to have process in small batches, but whether the batches
are separate arrays or one big one makes no difference to access time.
Arrays are swapped on page boundaries, not object boundaries.

However, there another advantage to smaller arrays, they are easier to
find space for to allocate. You can often allocate many small arrays
in the various holes where you are forced to GC to create a giant
hole for a big array.
 
R

Roedy Green

IMHO, you should normally *not* do that kind of optimization, since it
is a premature optimization.

The operational word is PREMATURE, doing optimisation before its
time. This does NOT mean closing your eyes to performance issues
indefinitely.

It means merely avoiding dicking about with fiddly optimisations until
you know exactly where they are needed.

So many people seem to think that even knowing which coding technique
is faster is some sort of impure thought, and telling others some sort
of corruption of the young.

If you can make code fast by algorithm choice or simplicity or by
choices that take no additional effort, there is no need to postpone
it. NOW is the appropriate time.
 
T

Thomas Hawtin

Roedy said:
However, there another advantage to smaller arrays, they are easier to
find space for to allocate. You can often allocate many small arrays
in the various holes where you are forced to GC to create a giant
hole for a big array.

Conversely, when collected a big array will leave a nice big, easy to
fill hole. Little arrays may fragment the heap.

Of course (on Sun's JVM) this only matters when the arrays get to the
tenured generation, as eden is allocated in a stack-like fashion. Large
arrays may get their earlier if they are too big for the survivor space.

My main concern about lots of small object is with the GC pauses.

Tom Hawtin
 
R

Robert W via JavaKB.com

Roedy said:
Do some experiments to learn the properties.

Thank you for all your messages. I think I have a better idea now of the
whole subject.

Rob.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,682
Members
48,796
Latest member
Greg L.

Latest Threads

Top