How to initialize a big (String-)Array fast?


Patrick

Hello,

is there a way to initialize a big String-Array faster than just with "new"?

Ex.:
String result[] = new String[64000000];

takes over a second on a Pentium 3 800 MHz.

For bigger values it takes even longer.

Is there a way to do this faster?

Greetings and TIA,
Patrick
 

Frank

Patrick said:
Hello,

is there a way to initialize a big String-Array faster than just with "new"?

Ex.:
String result[] = new String[64000000];

takes over a second on a Pentium 3 800 MHz.

For bigger values it takes even longer.

Is there a way to do this faster?

Greetings and TIA,
Patrick

You're allocating at least 256 MB of memory (64 million references at
4 bytes each) before a single String exists. That's bound to take some
time, especially if there's swapping involved.
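To see how the time scales with the array length on your own machine, a rough timing sketch could look like the following. The class and method names are made up for illustration, and a single-shot measurement like this is noisy (JIT warm-up, GC), so treat the numbers as indicative only.

```java
class AllocationTiming {
    // Returns the elapsed nanoseconds for allocating a String[] of the given length.
    static long timeAllocation(int length) {
        long start = System.nanoTime();
        String[] result = new String[length];
        long elapsed = System.nanoTime() - start;
        // Use the array so the allocation is less likely to be optimized away.
        if (result.length != length) {
            throw new IllegalStateException("unexpected array length");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Doubling the length each round shows roughly how the cost grows.
        for (int length = 1_000_000; length <= 8_000_000; length *= 2) {
            long micros = timeAllocation(length) / 1_000;
            System.out.println(length + " elements: " + micros + " us");
        }
    }
}
```

The JVM has to hand out the memory and make every slot read as null, so the cost grows with the length no matter how the array is created.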

Perhaps you should just tell us what you're trying to accomplish
instead; someone may have a better approach.
 

Michael Borgwardt

Patrick said:
is there a way to initialize a big String-Array faster than just with "new"?

Ex.:
String result[] = new String[64000000];

takes over a second on a Pentium 3 800 MHz.

For bigger values it takes even longer.

Is there a way to do this faster?

Sure. Redesign your program not to require such a godawful amount of memory.

It's completely pointless anyway, since you can't even HAVE enough
memory on a Pentium 3 machine to actually put *content* into that array.
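A back-of-the-envelope estimate makes the point concrete. The per-object sizes below are assumptions for a typical 32-bit JVM (they are not measured and vary by VM and version), and the class name is hypothetical.

```java
class StringFootprintEstimate {
    // All three sizes are assumptions for a typical 32-bit JVM, not measured values.
    static final long ASSUMED_REFERENCE_BYTES = 4;          // one slot in the String[]
    static final long ASSUMED_STRING_OBJECT_BYTES = 24;     // object header plus fields
    static final long ASSUMED_EMPTY_CHAR_ARRAY_BYTES = 16;  // backing array of an empty String

    // Estimated bytes for `count` distinct empty Strings plus their array slots.
    static long estimateBytes(long count) {
        return count * (ASSUMED_REFERENCE_BYTES
                + ASSUMED_STRING_OBJECT_BYTES
                + ASSUMED_EMPTY_CHAR_ARRAY_BYTES);
    }

    public static void main(String[] args) {
        long total = estimateBytes(64_000_000L);
        long twoGiB = 1L << 31;
        System.out.println(total + " bytes estimated; exceeds 2 GiB: " + (total > twoGiB));
    }
}
```

Under these assumptions the total comes to about 2.8 GB even if every String is empty, which is more than the 2 GiB of address space a 32-bit process commonly gets (that limit is itself an assumption and depends on the OS).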
 

Tim

Perhaps you'll want a memory mapped file so you can have a buffer and
the OS can futz with allocating RAM and swapping.
http://javaalmanac.com/egs/java.nio/CreateMemMap.html?l=rel
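In case the link goes away, a minimal sketch of the idea using java.nio looks roughly like this. The class name, the temp file, and the 1 MiB size are placeholders for illustration; the calls themselves (RandomAccessFile.getChannel, FileChannel.map) are standard API.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

class MappedBufferSketch {
    // Maps `size` bytes of the file read/write; the OS pages the contents in and out on demand.
    static MappedByteBuffer map(File file, int size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapped", ".bin"); // hypothetical scratch file
        file.deleteOnExit();
        MappedByteBuffer buffer = map(file, 1 << 20); // 1 MiB
        buffer.putInt(0, 42);
        System.out.println(buffer.getInt(0));
    }
}
```

Note that this gives you a buffer of bytes, not an array of String objects, so you would still need your own encoding on top of it.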

I think Linux has some kernel function for allocating memory clear or
initialized with the over-write bits. Windows probably does too. No
idea if Java supports this level of functionality. Anybody know?

So, I don't know any tricks to allocate large RAM cleared in Java. Any
luck finding out how?
 

John C. Bollinger

Tim said:
Perhaps you'll want a memory mapped file so you can have a buffer and
the OS can futz with allocating RAM and swapping.
http://javaalmanac.com/egs/java.nio/CreateMemMap.html?l=rel

Not relevant. The problem Michael was pointing to is that a Pentium 3
(or 4) cannot even address enough memory to handle distinct Strings (of
any content, including empty) for all 64 million positions of the array
the OP presented. There is no OS magic for addressing more memory
than the supported address space of the processor, even ignoring such
piddling questions as limits on the size of supported memory-mapped
blocks (64MB in some common implementations -- barely more than enough
for 64 million undelimited one-character strings).

Tim said:
I think Linux has some kernel function for allocating memory clear or
initialized with the over-write bits. Windows probably does too. No
idea if Java supports this level of functionality. Anybody know?

So, I don't know any tricks to allocate large RAM cleared in Java. Any
luck finding out how?

Java specifies how the contents of new arrays are to be initialized. It
is the JVM's job to handle the details, and a quality JVM implementation
will make use of available features of the platform to do so as
efficiently as possible while remaining within the constraints of
allowable behavior.
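For reference arrays the specified initial value of every element is null, so no explicit fill loop is needed after "new". A small check (class and method names are just for illustration):

```java
class DefaultInitDemo {
    // Returns true if every element of the array is null.
    static boolean allNull(String[] array) {
        for (String s : array) {
            if (s != null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // No fill loop: the language guarantees each element starts out as null.
        String[] result = new String[1_000];
        System.out.println(allNull(result));
    }
}
```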


John Bollinger
(e-mail address removed)
 

Tim

I was replying to Patrick's original post so maybe I should have
bypassed Michael's response. Didn't yet see the "Threads" option in
Google's new beta DejaNews interface.

Doesn't Java run on some 64-bit processors? I too was surprised 1.5 did
not integrate 64-bit addressing. OSes incorporated this in the mid-90s
so Java not having it is questionable. As always, when the processor
cannot access available RAM then the "OS magic" for accessing secondary
memory (disk, swap, pagefile) comes to bear. In this case, create a
class that implements Collection but serves Strings. Back it with File
or RandomAccessFile. You can even amortize the initialization by
initializing on access and make it "faster" in total time used because
you will not allocate Strings unless they are used.
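A rough sketch of what I mean, not a tested production class. The fixed record size, the 62-byte limit, and the class name are all arbitrary choices for illustration, and it assumes that never-written gaps in the file read back as zero bytes (true on the common filesystems I know of, but not something the Java API promises).

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.AbstractList;

class FileBackedStringList extends AbstractList<String> implements Closeable {
    private static final int MAX_BYTES = 62;              // hypothetical per-string limit
    private static final int RECORD_SIZE = 2 + MAX_BYTES; // 2-byte marker plus payload

    private final RandomAccessFile file;
    private final int size;

    FileBackedStringList(String path, int size) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
        this.size = size; // nothing is written up front
    }

    @Override public int size() { return size; }

    @Override public String get(int index) {
        checkIndex(index);
        try {
            long offset = (long) index * RECORD_SIZE;
            if (offset + 2 > file.length()) return null; // beyond anything written so far
            file.seek(offset);
            int marker = file.readUnsignedShort();       // stored as byte length + 1
            if (marker == 0) return null;                // slot was never set
            byte[] bytes = new byte[marker - 1];
            file.readFully(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public String set(int index, String value) {
        String previous = get(index);
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > MAX_BYTES) throw new IllegalArgumentException("string too long");
        try {
            file.seek((long) index * RECORD_SIZE);
            file.writeShort(bytes.length + 1);
            file.write(bytes);
            return previous;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
    }

    @Override public void close() throws IOException { file.close(); }
}
```

Slots cost nothing until they are set, at the price of a disk seek per access.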

Of course, the simple fix is to allocate an array of array of Strings
and make accessors.
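Something along these lines, with chunks allocated on first write. The chunk size and names are arbitrary picks for the sketch. It spreads out the allocation cost but does nothing about the memory the Strings themselves would need.

```java
class ChunkedStringArray {
    private static final int CHUNK_SIZE = 1 << 16; // hypothetical chunk length

    private final String[][] chunks;
    private final int length;

    ChunkedStringArray(int length) {
        this.length = length;
        // Only the outer array of chunk references is allocated up front.
        int chunkCount = (int) (((long) length + CHUNK_SIZE - 1) / CHUNK_SIZE);
        this.chunks = new String[chunkCount][];
    }

    int length() { return length; }

    String get(int index) {
        checkIndex(index);
        String[] chunk = chunks[index / CHUNK_SIZE];
        // A chunk that was never written behaves like a run of nulls.
        return chunk == null ? null : chunk[index % CHUNK_SIZE];
    }

    void set(int index, String value) {
        checkIndex(index);
        int c = index / CHUNK_SIZE;
        if (chunks[c] == null) {
            chunks[c] = new String[CHUNK_SIZE]; // allocate the chunk on first write
        }
        chunks[c][index % CHUNK_SIZE] = value;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```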

John, about 64MB memory-mapped blocks, I guess this is like on 64MB
RAM machines? As for this, RandomAccessFile, at least with 1.4.1_05, barfs
if you try to access more than the given amount of RAM. TIJ refers to
this limit correctly as length().
http://www.eastons.org/tij/TIJ314.htm#Index1408. This memory is not
GC'ed either AFAICT! Anybody know if the filename given can be used by
other code (non-Java) to access the same RAM? That's one thing cool I
remember about Winders createFile.

Happy coding,
TimJowers
 

John C. Bollinger

Tim said:
Doesn't Java run on some 64-bit processors?

Yes.

Tim said:
I too was surprised 1.5 did
not integrate 64-bit addressing. OSes incorporated this in the mid-90s
so Java not having it is questionable.

Java is not concerned with addressing at all. Not 64-bit addressing,
not 32-bit addressing, not 24-bit addressing. Not 3417-bit addressing.
It is designed to be platform independent, so I don't expect it ever
to be concerned with such details.

Tim said:
As always, when the processor
cannot access available RAM then the "OS magic" for accessing secondary
memory (disk, swap, pagefile) comes to bear. In this case, create a
class that implements Collection but serves Strings. Back it with File
or RandomAccessFile. You can even amortize the initialization by
initializing on access and make it "faster" in total time used because
you will not allocate Strings unless they are used.

What you describe is not "OS magic", by which I was specifically
referring to some (nonexistent) means for a processor to address more
memory than its address space permits. The array presented by the OP
could not have been effectively used in any JVM I am aware of on a
Pentium 3 (processor specified by the OP) even if the OS applied all
available _virtual_ memory to the problem, regardless of the amount of
RAM, swap space, and disk available to the system. It is conceivable
that a VM might back an array with secondary storage as you describe,
but no VM I am aware of does so, probably because the implementation
would greatly complicate an otherwise relatively simple language
feature, and because the resulting performance would fall far short of
user expectations.

Yes, the OP could use a disk-backed Collection. Depending on his needs
that might be an appropriate solution (though I doubt it), but that's
rather a different thing from an array, and not what he asked about.

Tim said:
Of course, the simple fix is to allocate an array of array of Strings
and make accessors.

No, that's not a fix at all. The point is that the memory consumed by
the Strings themselves (if all distinct) would be in excess of the
amount of memory addressable by the processor in question.

Tim said:
John, about 64MB memory-mapped blocks, I guess this is like on 64MB
RAM machines?

No, this is a kernel-level issue. OS kernels can, and at least some do,
place limits on the size of a memory-mapped block. 64MB is a limit that
at one time, at least, was reasonably common. That does not prevent
mapping several segments of the same file, and if you're careful you may
even be able to map them to contiguous addresses, but it's an added
complication of relying on memory mapping files. The amount of RAM in
the system is not relevant at all.
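In Java terms, mapping a file as several segments might be sketched as follows. The class, the helper names, and the segment size are hypothetical, and Java gives you no control over whether the segments land at contiguous addresses, so the helpers translate a file offset into a (segment, position) pair instead.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

class SegmentedMapping {
    // Maps the first `totalSize` bytes of the file as consecutive segments of at most `segmentSize` bytes.
    static MappedByteBuffer[] mapInSegments(File file, long totalSize, int segmentSize)
            throws IOException {
        int count = (int) ((totalSize + segmentSize - 1) / segmentSize);
        MappedByteBuffer[] segments = new MappedByteBuffer[count];
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            for (int i = 0; i < count; i++) {
                long position = (long) i * segmentSize;
                long size = Math.min(segmentSize, totalSize - position);
                segments[i] = channel.map(FileChannel.MapMode.READ_WRITE, position, size);
            }
        }
        return segments;
    }

    // Reads one byte at an absolute file offset by picking the right segment.
    static byte get(MappedByteBuffer[] segments, int segmentSize, long offset) {
        return segments[(int) (offset / segmentSize)].get((int) (offset % segmentSize));
    }

    // Writes one byte at an absolute file offset by picking the right segment.
    static void put(MappedByteBuffer[] segments, int segmentSize, long offset, byte value) {
        segments[(int) (offset / segmentSize)].put((int) (offset % segmentSize), value);
    }
}
```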
Tim said:
As for this, RandomAccessFile, at least with 1.4.1_05, barfs
if you try to access more than the given amount of RAM. TIJ refers to
this limit correctly as length().
http://www.eastons.org/tij/TIJ314.htm#Index1408.

I'm not sure what I was supposed to find at that URL, but it says
nothing relevant about the length() of a RandomAccessFile. My reference
for Java platform APIs is generally the API docs. In any case,
RandomAccessFile has almost nothing to do with RAM -- it is an interface
to a file on some filesystem accessible to the VM. The filesystem, OS,
and disk hardware will place limits on the maximum size of such a file.
The length() is the current length of the file, which is the limit for
reading but not for seeking or writing (by which the length can be
extended).
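A small sketch of that behavior, using only documented RandomAccessFile calls (the class and method names are mine, and the offset of 100 is arbitrary):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class LengthDemo {
    // Returns {length at start, length after seek, result of read at that offset, length after write}.
    static long[] demo(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);               // start from an empty file
            long before = raf.length();
            raf.seek(100);                  // seeking past the end is allowed
            long afterSeek = raf.length();  // the seek alone does not change the length
            int readAtEnd = raf.read();     // nothing to read beyond length(), so this is -1
            raf.seek(100);
            raf.write(7);                   // writing past the end extends the file
            long afterWrite = raf.length();
            return new long[] {before, afterSeek, readAtEnd, afterWrite};
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("length-demo", ".bin"); // hypothetical scratch file
        file.deleteOnExit();
        for (long value : demo(file)) {
            System.out.println(value);
        }
    }
}
```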
Tim said:
This memory is not
GC'ed either AFAICT!

The contents of the file are not memory in the sense that Java uses the
term. The file may exist prior to execution of a Java program that
accesses it, and it is reasonable to suppose that it might be desirable
for the file to persist after the program completes. Of course the
*file* is not GC'd. The associated RandomAccessFile *object*, on the
other hand, is subject to GC just like any other object.

Tim said:
Anybody know if the filename given can be used by
other code (non-Java) to access the same RAM?

*It is not RAM.* Since it is a file on the filesystem, yes, other
programs may access it before, during, and after your Java program's use
of it. HOWEVER, you cannot safely assume that writes to the file (from
Java or otherwise) are immediately committed to disk, so multiple
programs simultaneously accessing the file may not have consistent views
of its contents.


John Bollinger
(e-mail address removed)
 
