SymbolTable and string dictionnary

S

Sigfried

I need to do some lookup on a set of strings with associated data. I've
tried HashMap and TreeMap (wich are slower).

I'd like to get more performance, so i've read about Ternary Search Tree
which are supposed to be equally fast with hashmap on successful searchs.

So is there faster than that ? I mean at least 30 % faster than HashMap
? Is there a digital search tree java implementation (with benchmark?)
around ? I didn't find any on google.
 
M

Mark Space

Sigfried said:
So is there faster than that ? I mean at least 30 % faster than HashMap
? Is there a digital search tree java implementation (with benchmark?)
around ? I didn't find any on google.

No, I think HashMap is as fast as it gets, regardless of language.
There may be some tricks you can plan for a specific application, but
you didn't let us in on what your app is doing.
 
R

Roedy Green

I need to do some lookup on a set of strings with associated data. I've
tried HashMap and TreeMap (wich are slower).

I'd like to get more performance, so i've read about Ternary Search Tree
which are supposed to be equally fast with hashmap on successful searchs.

So is there faster than that ? I mean at least 30 % faster than HashMap
? Is there a digital search tree java implementation (with benchmark?)
around ? I didn't find any on google.

For HashMap to work well it needs plenty of breathing room. Increase
the capacity till you find the optimum setting.

HashMap itself has almost no overhead. About the only thing it does is
call hash and chains of collisions. If the capacity is high enough
there won't be many collisions.

Think about tuning hash -- caching, computing a faster way...

--
Roedy Green Canadian Mind Products
http://mindprod.com
"Humanity is conducting an unintended, uncontrolled, globally pervasive experiment
whose ultimate consequences could be second only to global nuclear war."
~ Environment Canada (The Canadian equivalent of the EPA on global warming)
 
S

Sigfried

Mark Space a écrit :
No, I think HashMap is as fast as it gets, regardless of language. There
may be some tricks you can plan for a specific application, but you
didn't let us in on what your app is doing.

You are right. Since i know the total number of elements in the HashMap,
i can pass the initial capacity, and i can hope that there won't be any
collision at all. It's sad there is no public method to check collisions
count.

So String caches the hashCode, and there is one array access in the
table so it can't be faster than that.

So i would need to be sure that there is no collision by using a more
open implementation, which can specify the hash method (the same string
can belongs to several symbol table).
 
R

Roedy Green

You are right. Since i know the total number of elements in the HashMap,
i can pass the initial capacity, and i can hope that there won't be any
collision at all. It's sad there is no public method to check collisions
count.

if you know the keys ahead of time, i.e. they don't change, you can
construct a hashtable with zero collisions.

Sorry I don't have an URL for the algorithm. It is quite old. It
constructs a special purpose hash function.

--
Roedy Green Canadian Mind Products
http://mindprod.com
"Humanity is conducting an unintended, uncontrolled, globally pervasive experiment
whose ultimate consequences could be second only to global nuclear war."
~ Environment Canada (The Canadian equivalent of the EPA on global warming)
 
T

Tom Anderson

Roedy Green a écrit :

Thnaks but i don't know the keys at compile time. I load them once from
data and then i don't update the hashmap.

You could still set up the perfect hash at runtime. It might not be worth
it, though - it all depends on how expensive the construction is, how much
you save on each lookup by doing it, and how many times you do lookups, of
course.

tom
 
S

Sigfried

Tom Anderson a écrit :
You could still set up the perfect hash at runtime. It might not be

The question is how ? I only know i will have a bunch of strings as keys...
 
T

Tom Anderson

Tom Anderson a écrit :

The question is how ? I only know i will have a bunch of strings as keys...

There is an algorithm which sets up the hashtable - it takes a set of keys
as input, and gives you some kind of configuration variables for the table
as output. Traditionally, you run this when you write the code. Instead,
you run it after you've loaded the keys from the data.

tom
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,011
Latest member
AjaUqq1950

Latest Threads

Top