SymbolTable and string dictionnary

Discussion in 'Java' started by Sigfried, Nov 26, 2008.

  1. Sigfried

    Sigfried Guest

    I need to do some lookup on a set of strings with associated data. I've
    tried HashMap and TreeMap (wich are slower).

    I'd like to get more performance, so i've read about Ternary Search Tree
    which are supposed to be equally fast with hashmap on successful searchs.

    So is there faster than that ? I mean at least 30 % faster than HashMap
    ? Is there a digital search tree java implementation (with benchmark?)
    around ? I didn't find any on google.
     
    Sigfried, Nov 26, 2008
    #1
    1. Advertising

  2. Sigfried

    Mark Space Guest

    Sigfried wrote:

    > So is there faster than that ? I mean at least 30 % faster than HashMap
    > ? Is there a digital search tree java implementation (with benchmark?)
    > around ? I didn't find any on google.


    No, I think HashMap is as fast as it gets, regardless of language.
    There may be some tricks you can plan for a specific application, but
    you didn't let us in on what your app is doing.
     
    Mark Space, Nov 26, 2008
    #2
    1. Advertising

  3. Sigfried

    Roedy Green Guest

    On Wed, 26 Nov 2008 17:12:40 +0100, Sigfried <>
    wrote, quoted or indirectly quoted someone who said :

    >I need to do some lookup on a set of strings with associated data. I've
    >tried HashMap and TreeMap (wich are slower).
    >
    >I'd like to get more performance, so i've read about Ternary Search Tree
    >which are supposed to be equally fast with hashmap on successful searchs.
    >
    >So is there faster than that ? I mean at least 30 % faster than HashMap
    >? Is there a digital search tree java implementation (with benchmark?)
    >around ? I didn't find any on google.


    For HashMap to work well it needs plenty of breathing room. Increase
    the capacity till you find the optimum setting.

    HashMap itself has almost no overhead. About the only thing it does is
    call hash and chains of collisions. If the capacity is high enough
    there won't be many collisions.

    Think about tuning hash -- caching, computing a faster way...

    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    "Humanity is conducting an unintended, uncontrolled, globally pervasive experiment
    whose ultimate consequences could be second only to global nuclear war."
    ~ Environment Canada (The Canadian equivalent of the EPA on global warming)
     
    Roedy Green, Nov 26, 2008
    #3
  4. Sigfried

    Sigfried Guest

    Mark Space a écrit :
    > Sigfried wrote:
    >
    >> So is there faster than that ? I mean at least 30 % faster than
    >> HashMap ? Is there a digital search tree java implementation (with
    >> benchmark?) around ? I didn't find any on google.

    >
    > No, I think HashMap is as fast as it gets, regardless of language. There
    > may be some tricks you can plan for a specific application, but you
    > didn't let us in on what your app is doing.


    You are right. Since i know the total number of elements in the HashMap,
    i can pass the initial capacity, and i can hope that there won't be any
    collision at all. It's sad there is no public method to check collisions
    count.

    So String caches the hashCode, and there is one array access in the
    table so it can't be faster than that.

    So i would need to be sure that there is no collision by using a more
    open implementation, which can specify the hash method (the same string
    can belongs to several symbol table).
     
    Sigfried, Nov 27, 2008
    #4
  5. Sigfried

    Roedy Green Guest

    On Thu, 27 Nov 2008 09:18:01 +0100, Sigfried <>
    wrote, quoted or indirectly quoted someone who said :

    >
    >You are right. Since i know the total number of elements in the HashMap,
    >i can pass the initial capacity, and i can hope that there won't be any
    >collision at all. It's sad there is no public method to check collisions
    >count.


    if you know the keys ahead of time, i.e. they don't change, you can
    construct a hashtable with zero collisions.

    Sorry I don't have an URL for the algorithm. It is quite old. It
    constructs a special purpose hash function.

    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    "Humanity is conducting an unintended, uncontrolled, globally pervasive experiment
    whose ultimate consequences could be second only to global nuclear war."
    ~ Environment Canada (The Canadian equivalent of the EPA on global warming)
     
    Roedy Green, Nov 27, 2008
    #5
  6. Sigfried

    Roedy Green Guest

    On Thu, 27 Nov 2008 20:49:54 -0800, Patricia Shanahan <>
    wrote, quoted or indirectly quoted someone who said :

    >The search term for this is "perfect hash".


    see http://mindprod.com/jgloss/perfecthash.html
    --
    Roedy Green Canadian Mind Products
    http://mindprod.com
    "Humanity is conducting an unintended, uncontrolled, globally pervasive experiment
    whose ultimate consequences could be second only to global nuclear war."
    ~ Environment Canada (The Canadian equivalent of the EPA on global warming)
     
    Roedy Green, Nov 28, 2008
    #6
  7. Sigfried

    Sigfried Guest

    Roedy Green a écrit :
    > On Thu, 27 Nov 2008 20:49:54 -0800, Patricia Shanahan <>
    > wrote, quoted or indirectly quoted someone who said :
    >
    >> The search term for this is "perfect hash".

    >
    > see http://mindprod.com/jgloss/perfecthash.html


    Thnaks but i don't know the keys at compile time. I load them once from
    data and then i don't update the hashmap.
     
    Sigfried, Nov 28, 2008
    #7
  8. Sigfried

    Tom Anderson Guest

    On Fri, 28 Nov 2008, Sigfried wrote:

    > Roedy Green a écrit :
    >> On Thu, 27 Nov 2008 20:49:54 -0800, Patricia Shanahan <>
    >> wrote, quoted or indirectly quoted someone who said :
    >>
    >>> The search term for this is "perfect hash".

    >>
    >> see http://mindprod.com/jgloss/perfecthash.html

    >
    > Thnaks but i don't know the keys at compile time. I load them once from
    > data and then i don't update the hashmap.


    You could still set up the perfect hash at runtime. It might not be worth
    it, though - it all depends on how expensive the construction is, how much
    you save on each lookup by doing it, and how many times you do lookups, of
    course.

    tom

    --
    Would you like to remember more?
     
    Tom Anderson, Nov 28, 2008
    #8
  9. Sigfried

    Sigfried Guest

    Tom Anderson a écrit :
    > On Fri, 28 Nov 2008, Sigfried wrote:
    >
    >> Roedy Green a écrit :
    >>> On Thu, 27 Nov 2008 20:49:54 -0800, Patricia Shanahan <>
    >>> wrote, quoted or indirectly quoted someone who said :
    >>>
    >>>> The search term for this is "perfect hash".
    >>>
    >>> see http://mindprod.com/jgloss/perfecthash.html

    >>
    >> Thnaks but i don't know the keys at compile time. I load them once
    >> from data and then i don't update the hashmap.

    >
    > You could still set up the perfect hash at runtime. It might not be


    The question is how ? I only know i will have a bunch of strings as keys...

    > worth it, though - it all depends on how expensive the construction is,
    > how much you save on each lookup by doing it, and how many times you do
    > lookups, of course.
    >
    > tom
    >
     
    Sigfried, Dec 1, 2008
    #9
  10. Sigfried

    Tom Anderson Guest

    On Mon, 1 Dec 2008, Sigfried wrote:

    > Tom Anderson a écrit :
    >> On Fri, 28 Nov 2008, Sigfried wrote:
    >>
    >>> Roedy Green a écrit :
    >>>> On Thu, 27 Nov 2008 20:49:54 -0800, Patricia Shanahan <>
    >>>> wrote, quoted or indirectly quoted someone who said :
    >>>>
    >>>>> The search term for this is "perfect hash".
    >>>>
    >>>> see http://mindprod.com/jgloss/perfecthash.html
    >>>
    >>> Thnaks but i don't know the keys at compile time. I load them once from
    >>> data and then i don't update the hashmap.

    >>
    >> You could still set up the perfect hash at runtime. It might not be

    >
    > The question is how ? I only know i will have a bunch of strings as keys...


    There is an algorithm which sets up the hashtable - it takes a set of keys
    as input, and gives you some kind of configuration variables for the table
    as output. Traditionally, you run this when you write the code. Instead,
    you run it after you've loaded the keys from the data.

    tom

    --
    Caps lock is like cruise control for cool.
     
    Tom Anderson, Dec 2, 2008
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. NOBODY
    Replies:
    2
    Views:
    817
    Thomas Weidenfeller
    Oct 17, 2003
  2. Sophie Alléon

    newbie question about dictionnary ?

    Sophie Alléon, Sep 5, 2003, in forum: Python
    Replies:
    9
    Views:
    320
    Bengt Richter
    Sep 5, 2003
  3. Bob Ippolito

    Re: Identity dictionnary

    Bob Ippolito, Feb 29, 2004, in forum: Python
    Replies:
    1
    Views:
    316
    Jean-Sebastien Roy
    Feb 29, 2004
  4. Famille Delorme

    Dictionnary vs Class for configuration

    Famille Delorme, Apr 30, 2004, in forum: Python
    Replies:
    7
    Views:
    336
  5. Sandman
    Replies:
    7
    Views:
    211
    Anno Siegel
    Aug 3, 2004
Loading...

Share This Page