Strategy for locating strings based on initial characters

  • Thread starter =?ISO-8859-1?Q?Ney_Andr=E9_de_Mello_Zunino?=
  • Start date
?

=?ISO-8859-1?Q?Ney_Andr=E9_de_Mello_Zunino?=

Hello.

Given a sorted collection of strings, what would a good (the best?)
strategy be to allow fast access to an item, based on a search substring
which should match the beginning of the searched item? The actual goal
is to implement a functionality similar to that found in help indices,
where one can locate an item by gradually typing its initial characters.
I expect that some kind of tree structure be present in the solution,
but I am not sure.

Since I am multiposting this to comp.programming and comp.lang.c++, I
would like to make it on-topic on the latter as well by asking whether
the STL provides any algorithms that could help here, assming, for
instance, that the strings are stored in a std::set or sorted std::vector.

Thank you,
 
C

CBFalconer

Ney said:
Given a sorted collection of strings, what would a good (the best?)
strategy be to allow fast access to an item, based on a search
substring which should match the beginning of the searched item?
The actual goal is to implement a functionality similar to that
found in help indices, where one can locate an item by gradually
typing its initial characters. I expect that some kind of tree
structure be present in the solution, but I am not sure.

Look up "trie".
 
P

Prog37

Ney said:
Hello.

Given a sorted collection of strings, what would a good (the best?)
strategy be to allow fast access to an item, based on a search substring
which should match the beginning of the searched item? The actual goal
is to implement a functionality similar to that found in help indices,
where one can locate an item by gradually typing its initial characters.
I expect that some kind of tree structure be present in the solution,
but I am not sure.

Since I am multiposting this to comp.programming and comp.lang.c++, I
would like to make it on-topic on the latter as well by asking whether
the STL provides any algorithms that could help here, assming, for
instance, that the strings are stored in a std::set or sorted std::vector.

Thank you,

I needed something similar at work. My goal was to lookup an object
based on an object name. My implementation used the STL map<key,object>,
because the map stores object sorted and is therefore 2ln(n) for lookups.

The map needs a key that supports the < operator in order to sort the
objects. So I created a class called SortableString that implemented
the < operator using the c function strcmp.

So my template instantiation looked like
map<SortableString, Object*> objectMap;

I believe this sort of implementation would work for what you want to
do. First you would populate the map with a set of SortableStrings with
their text set to the topics of your help articles. The objects would
be pointers to your articles.

When the user types in a partial string you would create a temporary
SortableString object with its text set to the what the user typed in.
You can then use the lower_bound() function to locate the first element
greater than your temporary string (which should be the closest match
amoung all your subject headers. Psuedocode below.

class SortableString {
SortableString(char* string);
~SortableString();
bool operator < (SortableString& other);
};

map<SortableString, ArticleObject*> subjectHeaderMap;


bool AddSubject(char* subject, ArticleObject* article)
{
SortableString* newSubject = new SortableString(subject);
bool rvalue = objectHeaderMap.insert(map<SortableString*,
ArticleObject*>::value_type(newSubject,article).second;
return rvalue;
}

bool MatchSubject(char *test_subject, SortableString* found_subject,
ArticleObject** found_article)
{
bool match_found = false;
SortableString* newSubject = new SortableString(subject);
map<SortableString, ArticleObject*>::Iterator iter;

iter = objectHeaderMap.lower_bound(newSubject);
if (iter != objectHeaderMap.end())
{
*found_subject = iter->first;
*found_article = iter->second;
match_found = true;
}

delete newSubject;
return match_found;
}
 
C

Chris Dollin

Ney said:
Hello.

Given a sorted collection of strings, what would a good (the best?)
strategy be to allow fast access to an item, based on a search substring
which should match the beginning of the searched item? The actual goal
is to implement a functionality similar to that found in help indices,
where one can locate an item by gradually typing its initial characters.
I expect that some kind of tree structure be present in the solution,
but I am not sure.

Modified binary search should do the job. (Modified, because you're
looking for "first possible match" and "last ditto", and because it
*might* be useful to work incrementally, although I strongly suspect
that would be an unnecessary optimisation.)
 
C

Chris \( Val \)

| Hello.
|
| Given a sorted collection of strings, what would a good (the best?)
| strategy be to allow fast access to an item, based on a search substring
| which should match the beginning of the searched item? The actual goal
| is to implement a functionality similar to that found in help indices,
| where one can locate an item by gradually typing its initial characters.
| I expect that some kind of tree structure be present in the solution,
| but I am not sure.
|
| Since I am multiposting this to comp.programming and comp.lang.c++, I
| would like to make it on-topic on the latter as well by asking whether
| the STL provides any algorithms that could help here, assming, for
| instance, that the strings are stored in a std::set or sorted std::vector.

As an idea, I would start by looking into the following:
std::string::find_first_of( Criteria, StartPos );

std::string offers quite a few other members that may also help.

Cheers,
Chris Val
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,902
Latest member
Elena68X5

Latest Threads

Top