Should I implement this using string hash or multimap..

J

Jonay Aloat

I need to implement the following. Shoul I use multimap or write a string
hash class? ie

Brand Product
==========================
Samson Television
Samsung Television
Samsung VCR
Samsung DVD Player
Samsunk Television

The brand is the key and the product is the data.

If I use multimap, then I can implement it using something like this

typedef std::multimap<string, string> mmap;
typedef std::multimap<string, string>::iterator mapIter;

mmap product_map;
product_map.insert( pair<string, string> string a(Samsung), string
b(Television));
..
..
..
To find the product in brand, all I have to do is iterate using lower and
upper bound.

mapIter lowerB = product_map.lower_bound("Samsung");
mapIter upperB = product_map.upper_bound("Samsung");

for (lowerB; lowerB != upperB; ++lowerB) {
string p = (*lowerB).second;
if ( strcmp (p.c_str(), "VCR") == 0 ) ) {
//do something
} else {
//do something
}
}

Assume that I have a lot of data. Is this implementation efficient? I know
using string as the key is not a good idea. Is there a better way to do
this? Is the performance better using hash? How do I implent the hash
function.

Thanks in advance..

-Jonny
 
J

Jerry Coffin

[ ... ]
To find the product in brand, all I have to do is iterate using lower and
upper bound.

mapIter lowerB = product_map.lower_bound("Samsung");
mapIter upperB = product_map.upper_bound("Samsung");

You can combine those using equal_range:

std::pair<mapIter, mapIter> range = product_map.equal_range("Samsung");

[ ... ]
Assume that I have a lot of data. Is this implementation efficient? I know
using string as the key is not a good idea. Is there a better way to do
this? Is the performance better using hash? How do I implent the hash
function.

multimap lookups take approximately log2(N) operations. Hashing is
normally about linear on the length of a key string. IOW, the two depend
on entirely different factors, making it difficult to predict which will
have better performance except in extreme cases.

Predicting which will be faster requires something much more specific
than "a lot of data". To different people, that might indicate sizes
that vary by several orders of magnitude. OTOH, both hash tables and
multimaps are oriented primarily toward in-memory collections, and
unless you're working on some sort of supercomputer, chances are that
you don't have enough memory for a large enough collection to really
strongly favor a hash table.

My advice would be to start with a multimap since you already seem
comfortable with that, and only worry about using a hash table in the
unlikely event that profiling shows the multimap is inadequate.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top