Searching a disk-backed Map

Arne Vajhøj

Roedy said:
I rolled my own something similar. It is not that much code to handle
the random quotations you see on my website.

What you could do is write a class that uses a HashMap internally just
to hold the keys, with values that are offsets into a sequential file.

When you build the Map, you write the objects out with writeUTF or
writeObject, and record the size/offset of the stream before the
write.

Then to lookup in the Map, you look up the key, get the offset, seek
and do a read. You don't even need to know the length.
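
A minimal sketch of the approach described above, not Roedy's actual code. The class name OffsetIndexedStore and its methods are hypothetical. It assumes String keys and values, that each value fits the 65535-byte encoded limit of writeUTF, and that the index is held only in memory and is not rebuilt on restart:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: keys and offsets stay in memory, values live in the file. */
final class OffsetIndexedStore implements AutoCloseable {
    private final Map<String, Long> offsets = new HashMap<>();
    private final RandomAccessFile file;

    OffsetIndexedStore(String path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        file.setLength(0); // assumption: start empty, since the index is not persisted
    }

    /** Appends the value and records where it starts. */
    void put(String key, String value) throws IOException {
        long offset = file.length(); // offset of the stream before the write
        file.seek(offset);
        file.writeUTF(value);        // writes a 2-byte length prefix, then the data
        offsets.put(key, offset);    // an overwritten key leaves its old value as dead space
    }

    /** Looks up the offset, seeks, and reads; no stored length is needed. */
    String get(String key) throws IOException {
        Long offset = offsets.get(key);
        if (offset == null) {
            return null;
        }
        file.seek(offset);
        return file.readUTF();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

The read side does not need a stored length because writeUTF prefixes each record with its own length. Swapping in writeObject would need an ObjectOutputStream wrapped around the file, which this sketch leaves out.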

This is pretty fast, especially when the drive/OS does read caching.
If you wanted to make it even faster, you could put the objects in a
NIO memory mapped file, but that limits your file size. You also
might put the file on a fast flash drive.
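
The memory-mapped variant might look roughly like this. It is a sketch under assumptions, not anything from the original post: the class name MappedStore, the fixed capacity parameter, and the 4-byte length prefix are all made up for illustration. A single mapping cannot exceed Integer.MAX_VALUE bytes, which is the file size limit mentioned above:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: values stored in one memory-mapped region of fixed capacity. */
final class MappedStore {
    private final Map<String, Integer> offsets = new HashMap<>();
    private final MappedByteBuffer buffer;
    private int end = 0; // next free position in the mapped region

    MappedStore(Path path, int capacity) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
        }
    }

    /** Appends a length-prefixed value; throws BufferOverflowException when full. */
    void put(String key, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer view = buffer.duplicate(); // independent position, shared content
        view.position(end);
        view.putInt(bytes.length);
        view.put(bytes);
        offsets.put(key, end);
        end = view.position();
    }

    /** Reads the length prefix at the recorded offset, then the value bytes. */
    String get(String key) {
        Integer offset = offsets.get(key);
        if (offset == null) {
            return null;
        }
        ByteBuffer view = buffer.duplicate();
        view.position(offset);
        byte[] bytes = new byte[view.getInt()];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Whether this beats plain seek-and-read depends on the access pattern and on how much of the file the OS already caches, so it would need measuring.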

I would write such a beast to your specs for $50 US.

I think Stefan is capable of writing his own code.

Arne
 
Arne Vajhøj

Stefan said:
I have this crazy idea to write applications that are based
on Java SE only and not require any additional library.
I know that this idea might not be very pragmatic or reasonable.

/If/ Derby would finally be included in Java SE, I would
love to use it.

It has been since 1.6.

Arne
 
Arne Vajhøj

Tom said:
And if you don't believe me - how about Oracle?

http://www.oracle.com/technology/products/berkeley-db/je/index.html

Relational databases are the most sophisticated tool available to the
developer for data storage and analysis. Most persisted object data is
never analyzed using ad-hoc SQL queries; it is usually simply retrieved
and reconstituted as Java objects. The overhead of using a sophisticated
analytical storage engine is wasted on this basic task of object
retrieval. The full analytical power of the relational model is not
required to efficiently persist Java objects. In many cases, it is
unnecessary overhead. In contrast, Berkeley DB Java Edition does not have
the overhead of an ad-hoc query language like SQL, and so does not incur
this penalty.

The result is faster storage, lower CPU and memory requirements, and a
more efficient development process.

That software is freeware; if i was going to implement a disk-backed
map, it's where i'd start.

I am not sure that I agree with the argument.

It is very common to:
- do SQL based reporting based on data stored via ORM
- load objects not by id but by criteria on other fields

Arne
 
Tom Anderson

Arne said:
I am not sure that I agree with the argument.

It is very common to:
- do SQL based reporting based on data stored via ORM
- load objects not by id but by criteria on other fields

Neither of which are required to implement a disk-backed map.

tom
 
Arne Vajhøj

Tom said:
Neither of which are required to implement a disk-backed map.

No.

But I was commenting on the big block of text that you chose
not to quote.

When you remove the text people comment on, it becomes very
hard to understand the replies.

Arne
 
