Is there a hash algorithm with direct access to hash elements andreference count?

B

Bo Peng

Dear list,

I am looking for a way to store a large amount of unique sequences that
will be accessed by objects. The most important operations are:

1. Direct access to the sequences (from pointers stored in each object).
Access through key lookup is not acceptable.

2. Given a new sequence, determine if it is already in the factory of
sequences. If so, increase the reference count of the existing sequence
and use it; Otherwise, put the new sequence into the factory.

3. If no object is using a sequence, the sequence should be removed from
the factory.

For the purpose of 2, I think I need a hash table algorithm. I will need
reference count and direct access to sequences for 1 and 3. Is there an
existing library that provide such facility? STL hash does not seem to
work (but I am not quite sure).

Of course, any implementation suggestion is welcome and performance is
at the highest priority here.

Many thanks in advance.
Bo
 
G

Greg

Bo said:
Dear list,

I am looking for a way to store a large amount of unique sequences that
will be accessed by objects. The most important operations are:

1. Direct access to the sequences (from pointers stored in each object).
Access through key lookup is not acceptable.

2. Given a new sequence, determine if it is already in the factory of
sequences. If so, increase the reference count of the existing sequence
and use it; Otherwise, put the new sequence into the factory.

3. If no object is using a sequence, the sequence should be removed from
the factory.

For the purpose of 2, I think I need a hash table algorithm. I will need
reference count and direct access to sequences for 1 and 3. Is there an
existing library that provide such facility? STL hash does not seem to
work (but I am not quite sure).

Of course, any implementation suggestion is welcome and performance is
at the highest priority here.

Sounds like a std::tr1::unordered_multiset would meet your
requirements. Although whether the unordered_multiset uses
reference-counting, multiple copies or some other mechanism to maintain
the count of duplicate entries is not specified by the TR1 draft. And
in fact, a unordered_multiset client is completely insulated from that
implementation detail.

Greg
 
P

Pete Becker

Greg said:
Although whether the unordered_multiset uses
reference-counting, multiple copies or some other mechanism to maintain
the count of duplicate entries is not specified by the TR1 draft.

A multiset can hold multiple entries that compare equal, but equal
entries aren't necessarily the same. It depends on the equality
operator. So, no, a multiset, unordered or otherwise, can't use
reference counting.
 
B

Bo Peng

Pete said:
A multiset can hold multiple entries that compare equal, but equal
entries aren't necessarily the same. It depends on the equality
operator. So, no, a multiset, unordered or otherwise, can't use
reference counting.

Then I have to write everything from scratch? The storage and reference
count parts seem to be easy. I just need something that will not
re-locate itself during add/remove. What I am lacking is a quick way to
look up a sequence. A linear 'memcmp' is not acceptable but a hash
function is also difficult to define and implement. Any suggestion?

Bo
 
G

Greg

Bo said:
Then I have to write everything from scratch? The storage and reference
count parts seem to be easy. I just need something that will not
re-locate itself during add/remove. What I am lacking is a quick way to
look up a sequence. A linear 'memcmp' is not acceptable but a hash
function is also difficult to define and implement. Any suggestion?

You could use ref-counted shared_ptr's (again from the std::tr1
library) to hold pointers to the sequences in the set. The set can be
either an ordered, std::set (and would require a routine to return the
"lesser" of two sequence pointers) or an unordered,
std::tr1::unordered_set (which would require a routine to determine
whether two sequence pointers are equivalent). Inserting a pointer into
the set would also have to decrement the pointer's refcount. Futhermore
when a shared_ptr's refcount reached 0, there would have to be some
logic for the shared_ptr to remove itself from its container.

Greg
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,482
Members
44,901
Latest member
Noble71S45

Latest Threads

Top