"Virtual memory" framework for Java

zeus

Hi all,
I am dealing with very large in-memory data structures. The data
structures are accessed at different frequencies, varying between every
30 seconds and every few minutes. My first try was to zip the information
to a file when it was not needed, but that caused very high I/O wait on a
Solaris machine. What I really want is something like virtual memory: I
want to be oblivious to where portions of the data structure are stored
(in memory or on disk); whenever I need to access the data, I want it to
be loaded into memory.
Do you know of such a framework for "virtual memory" in Java?

Thanks
 
Thomas Hawtin

zeus said:
I am dealing with very large in-memory data structures. The data
structures are accessed at different frequencies, varying between every
30 seconds and every few minutes. My first try was to zip the information
to a file when it was not needed, but that caused very high I/O wait on a
Solaris machine. What I really want is something like virtual memory: I
want to be oblivious to where portions of the data structure are stored
(in memory or on disk); whenever I need to access the data, I want it to
be loaded into memory.
Do you know of such a framework for "virtual memory" in Java?

Use java.nio to memory map a file.
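For example, a minimal sketch (the file name and mapping size here are
just placeholders, not anything from your setup):

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MapDemo {
    public static void main(String[] args) throws Exception {
        // Map the first 16 MiB of a scratch file; the OS pages the
        // contents in and out for you, which is exactly the "virtual
        // memory" behaviour you described.
        try (RandomAccessFile raf = new RandomAccessFile("data.bin", "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_WRITE, 0, 16 * 1024 * 1024);
            buf.putInt(0, 42);                 // write through the mapping
            System.out.println(buf.getInt(0)); // read it back
        }
    }
}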

Be careful what you map though. If the file goes away or is truncated,
you can get an asynchronous exception (InternalError), and on Linux the
JVM actually crashes.

Tom Hawtin
 
Ingo R. Homann

Hi,

Thomas said:
Use java.nio to memory map a file.

Which class of the package do you mean exactly? I can find nothing appropriate.

My idea would be to use the java.lang.ref package to achieve some kind
of virtual memory.

Ciao,
Ingo
 
zeus

Thomas said:
Use java.nio to memory map a file.

Be careful what you map though. If the file goes away or is truncated,
you can get an asynchronous exception (InternalError), and on Linux the
JVM actually crashes.

Tom Hawtin

Can you elaborate a bit on how a memory-mapped file would be used for
virtual memory as I described?
 
Ingo R. Homann

Hi zeus,
Can you elaborate a bit on how a memory-mapped file would be used for
virtual memory as I described?

Again: I'm not sure the MappedByteBuffer is appropriate for what you
want to do, since AFAICS it only maps bytes (and other primitive types)
and not complete objects!

I think something like java.lang.ref will work better, since it lets you
serialize and deserialize complete objects.

Ciao,
Ingo
 
Antti S. Brax

Again: I'm not sure if the MappedByteBuffer is apropriate for what you
want to do, since AFAICS it only maps bytes (and other native types) and
not complete Objects!

Objects can be serialized to bytes.
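For example, a round trip through a byte array (the bytes are what you
would then put into the mapped buffer):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class SerDemo {
    public static void main(String[] args) throws Exception {
        // Serialize any Serializable object to a byte array...
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject("any Serializable object");
        }
        byte[] bytes = bos.toByteArray();

        // ...and deserialize it again from those bytes.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes))) {
            System.out.println(in.readObject());
        }
    }
}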
 
Antti S. Brax

I am dealing with very large in-memory data structures.

What does very large mean?
The data structures are accessed at different frequencies, varying
between every 30 seconds and every few minutes. My first try was to zip
the information to a file when it was not needed, but that caused very
high I/O wait on a Solaris machine.

Did it cause _I/O wait_ or wait on the _compression algorithm_
used in ZIP files? Do you have several ZIP files or one large one?
How did you organize the ZIP file? Did you try different
directory hierarchies? Have you tried plain files instead of
ZIP files?

You are not giving us enough information to help you
efficiently.
 
Ingo R. Homann

Hi Antti,
Objects can be serialized to bytes.

OK, but isn't this overhead? With a MappedByteBuffer, an object must be
deserialized *every time* you want to access it, not only when it is
stored to disk!

Ciao,
Ingo
 
Antti S. Brax

Hi Antti,


OK, but isn't this overhead? With a MappedByteBuffer, an object must be
deserialized *every time* you want to access it, not only when it is
stored to disk!

Of course it is. But I was just pointing out an error in your
statement. Anyway, this discussion is useless since the original
poster did not specify that he is serializing objects (his use
of ZIP files suggests otherwise).
 
Ingo R. Homann

Hi,
Of course it is. But I was just pointing out an error in your
statement.

I meant that it is not adequate, not that it is impossible.

(-; Note that we are both "Erbsenzähler" (*), as we say in Germany. ;-)
Anyway, this discussion is useless since the original
poster did not specify that he is serializing objects (his use
of ZIP files suggests otherwise).

Well, I guess zipping a serialized object is more straightforward than
zipping a non-serialized object, which would have to be serialized
before zipping... :)

Anyway, you are right, the discussion is useless as long as the OP
doesn't tell us a bit more about his problem...

Ciao,
Ingo

(*) That is a nicer word for "Korinthenkacker", which (according to
http://dict.leo.org) means "nitpicker", which - to my ears -
doesn't sound so nice... or am I wrong? What do you say in Finland?
 
zeus

A lot to comment on.
1. When I say huge data structures, I mean many structures of varying
size, from very small up to a few megabytes for the larger ones. And
when I say a lot, I mean thousands of such separate structures.
2. I don't see how weak (or other) references can help here. As long as
the structures are referenced they remain in memory, and when they lose
their last reference they will be removed by the GC anyway. What I want
is to load and unload them from memory as needed.
3. When I say some are accessed frequently, I mean every 30 seconds, and
this applies mainly to the small data structures. The larger structures
are accessed every 10-15 minutes.
4. I tried the zipping on the larger, less frequently accessed data
structures, and it caused the OS to get stuck in I/O wait while zipping
and unzipping (GZIPInputStream/GZIPOutputStream).
5. Serialization and deserialization are very expensive; is there any
clean way to avoid them?
 
Ingo R. Homann

Hi zeus,
A lot to comment on.
1. When I say huge data structures, I mean many structures of varying
size, from very small up to a few megabytes for the larger ones. And
when I say a lot, I mean thousands of such separate structures.

OK. A "good" solution should be able to deal with this.
2. I don't see how weak (or other) references can help here. As long as
the structures are referenced they remain in memory, and when they lose
their last reference they will be removed by the GC anyway. What I want
is to load and unload them from memory as needed.

Well, AFAIK there *are* ways to get informed when a WeakRef is
'removed'. IIRC, ReferenceQueue does this (though working with finalize,
which works much better than it is said to, may work as well).
So when the WeakRef is removed, you store the object to disk, and
when someone tries to access it afterwards, you reload it. I think this
should work!
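A rough sketch of that idea. One caveat: by the time a weak or soft
reference is enqueued its referent is already unreachable, so it can no
longer be serialized at that point; this sketch therefore writes each
object through to disk on put and keeps the in-memory copy only behind a
SoftReference, reloading on a cache miss. The class name and file layout
are made up purely for illustration:

import java.io.*;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

public class SoftCache<V extends Serializable> {
    private final Map<String, SoftReference<V>> cache = new HashMap<>();
    private final File dir;

    public SoftCache(File dir) {
        this.dir = dir;
        dir.mkdirs();
    }

    // Write-through: persist immediately, keep the hot copy softly reachable.
    public void put(String key, V value) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new FileOutputStream(new File(dir, key)))) {
            out.writeObject(value);
        }
        cache.put(key, new SoftReference<V>(value));
    }

    // Reload from disk if the GC has cleared the soft reference.
    @SuppressWarnings("unchecked")
    public V get(String key) throws IOException, ClassNotFoundException {
        SoftReference<V> ref = cache.get(key);
        V v = (ref != null) ? ref.get() : null;
        if (v == null) {
            File f = new File(dir, key);
            if (!f.exists()) {
                return null;
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new FileInputStream(f))) {
                v = (V) in.readObject();
            }
            cache.put(key, new SoftReference<V>(v));
        }
        return v;
    }
}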
3. When I say some are accessed frequently, I mean every 30 seconds, and
this applies mainly to the small data structures. The larger structures
are accessed every 10-15 minutes.

It should be no great problem to count how often an object is accessed,
and to preferentially suspend to disk the objects that are accessed at a
low frequency.
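One cheap way to approximate this uses recency of access as a stand-in
for frequency: a LinkedHashMap in access order evicts the least recently
used entry first. The suspendToDisk hook here is hypothetical; it could
delegate to the write-through cache sketched above:

import java.util.LinkedHashMap;
import java.util.Map;

class LruIndex<K, V> extends LinkedHashMap<K, V> {
    private final int maxInMemory;

    LruIndex(int maxInMemory) {
        super(16, 0.75f, true); // true = iterate in access order
        this.maxInMemory = maxInMemory;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxInMemory) {
            suspendToDisk(eldest.getKey(), eldest.getValue());
            return true; // drop the in-memory copy
        }
        return false;
    }

    private void suspendToDisk(K key, V value) {
        // hypothetical hook: persist the entry before it is evicted
    }
}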
4. I tried the zipping on the larger, less frequently accessed data
structures, and it caused the OS to get stuck in I/O wait while zipping
and unzipping (GZIPInputStream/GZIPOutputStream).

That problem should not occur with such a self-implemented solution.
5. Serialization and deserialization are very expensive; is there any
clean way to avoid them?

You can implement your own methods for this, but I cannot say to what
extent that would be faster than Java's built-in serialization.
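For example, java.io.Externalizable lets you hand-roll the wire format
yourself, which can beat default serialization for simple, field-heavy
classes. A sketch (the Point class is just an illustration, not anything
from your problem):

import java.io.*;

public class Point implements Externalizable {
    private int x, y;

    public Point() {}                       // required public no-arg constructor
    public Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // write exactly the two ints, no class metadata per field
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        x = in.readInt();
        y = in.readInt();
    }
}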

Ciao,
Ingo
 
