Disk Backed Collection/DB for Extremely Large Datasets - Best Option

nicholas.wakefield

Hi,

I have too many objects (10M+) to keep in a HashMap; I run out of
memory. So I've been experimenting with Sleepycat Java, HSQLDB, XXL and
my own version of a disk-backed collection. So far I've found persisting
objects in HSQLDB to be the most scalable.

However, I was wondering what other people's experiences are, as I'm not
totally keen on the idea of the database overhead (logs etc.). My
goal is to be able to look up objects by key as fast as possible,
with support for many gigabytes of data.
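
For readers following along, here is a minimal sketch of what a hand-rolled disk-backed collection of this kind can look like. It is not the poster's actual code and not the API of any library named in the thread. It assumes String keys, Serializable values, and an index of file offsets that still fits in memory even when the objects themselves do not. The name DiskBackedMap and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.RandomAccessFile;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration only.
// Values live in an append-only file; only key -> file offset stays in memory.
class DiskBackedMap<V extends Serializable> implements AutoCloseable {
    private final RandomAccessFile file;
    private final Map<String, Long> offsets = new HashMap<>();

    DiskBackedMap(File path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
        this.file.setLength(0); // sketch only: always start from an empty file
    }

    // Serializes the value and appends it as [int length][bytes].
    // Overwriting a key leaves the old record behind as dead space.
    void put(String key, V value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        byte[] bytes = buffer.toByteArray();
        long offset = file.length();
        file.seek(offset);
        file.writeInt(bytes.length);
        file.write(bytes);
        offsets.put(key, offset);
    }

    // Seeks to the recorded offset and deserializes a single record.
    @SuppressWarnings("unchecked")
    V get(String key) throws IOException, ClassNotFoundException {
        Long offset = offsets.get(key);
        if (offset == null) {
            return null;
        }
        file.seek(offset);
        byte[] bytes = new byte[file.readInt()];
        file.readFully(bytes);
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (V) in.readObject();
        }
    }

    int size() {
        return offsets.size();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws Exception {
        File path = File.createTempFile("disk-backed-map", ".dat");
        path.deleteOnExit();
        try (DiskBackedMap<String> map = new DiskBackedMap<>(path)) {
            map.put("user:1", "first value");
            map.put("user:2", "second value");
            System.out.println(map.get("user:2"));
        }
    }
}
```

Each lookup costs one seek plus one deserialization, which is roughly the baseline an embedded database has to beat. The sketch leaves out what the databases add: persistence of the index across restarts, caching, space reclamation and crash safety.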
 
Roedy Green

> However, I was wondering what other people's experiences are, as I'm not
> totally keen on the idea of the database overhead (logs etc.). My
> goal is to be able to look up objects by key as fast as possible,
> with support for many gigabytes of data.

See http://mindprod.com/jgloss/pod.html
--
Bush crime family lost/embezzled $3 trillion from Pentagon.
Complicit Bush-friendly media keeps mum. Rumsfeld confesses on video.
http://www.infowars.com/articles/us/mckinney_grills_rumsfeld.htm

Canadian Mind Products, Roedy Green.
See http://mindprod.com/iraq.html photos of Bush's war crimes
 
nicholas.wakefield

Thanks for that. I looked at some of the ones listed. The db4o benchmarks
show it won't be as fast as HSQLDB for my usage model (lots of reads),
and the GPL license doesn't work for me in this case.

Any other recommendations?
 
nicholas.wakefield

I just tried Perst, and it is almost twice as fast as HSQLDB for reads.
I'm going to do some more evaluation, but if anyone has any information on
the fastest way to use Perst, I would appreciate it. So far I've
just been using createMap as my index.
 
Bill Karwin

> Thanks for that. I looked at some of the ones listed. The db4o benchmarks
> show it won't be as fast as HSQLDB for my usage model (lots of reads),
> and the GPL license doesn't work for me in this case.
>
> Any other recommendations?

Try Hibernate. I haven't used it yet, but it's gaining a lot of
popularity as a good Java object-to-DBMS persistence framework. It is
licensed under the LGPL, which (as I understand it; I am not a lawyer)
permits redistribution without requiring your product to be open source.

http://www.hibernate.org/

Regards,
Bill K.
 
Scott Ellsworth

> I have too many objects (10M+) to keep in a HashMap; I run out of
> memory. So I've been experimenting with Sleepycat Java, HSQLDB, XXL and
> my own version of a disk-backed collection. So far I've found persisting
> objects in HSQLDB to be the most scalable.

I might also look into SQLite for an embedded solution.

In general, a database is usually the way to go if you have millions of
objects to store. Usually, the designers of such things will do
a better job than you will on your own, and if you use the proper tools
to wrap the database, you can swap in different backends without
having to completely redo your program logic.
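
As a rough illustration of the wrapping idea, here is a minimal sketch in which the program logic depends only on a small interface, so the storage backend can be replaced without touching callers. All names here (ObjectStore, InMemoryStore, CustomerLookup) are hypothetical, and the in-memory backend is only a stand-in; an HSQLDB, SQLite or Perst backend would be another class implementing the same interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical interface: the only storage API the program logic sees.
interface ObjectStore<K, V> {
    void put(K key, V value);
    Optional<V> get(K key);
}

// Stand-in backend for small data sets and tests.
// A database-backed class would implement the same two methods.
class InMemoryStore<K, V> implements ObjectStore<K, V> {
    private final Map<K, V> map = new HashMap<>();

    @Override
    public void put(K key, V value) {
        map.put(key, value);
    }

    @Override
    public Optional<V> get(K key) {
        return Optional.ofNullable(map.get(key));
    }
}

// Program logic written against the interface, not a concrete backend.
class CustomerLookup {
    private final ObjectStore<String, String> store;

    CustomerLookup(ObjectStore<String, String> store) {
        this.store = store;
    }

    String describe(String id) {
        return store.get(id)
                    .map(name -> id + " -> " + name)
                    .orElse(id + " not found");
    }

    public static void main(String[] args) {
        ObjectStore<String, String> store = new InMemoryStore<>();
        store.put("42", "Ada");
        CustomerLookup lookup = new CustomerLookup(store);
        System.out.println(lookup.describe("42"));
        System.out.println(lookup.describe("7"));
    }
}
```

Switching backends then means changing the one line that constructs the store, which also makes it easier to benchmark candidates against each other with the same calling code.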

Scott
 
