data structure/algorithm problem

N

nan.li.g

Hello, all,
This is probably not so much related to C++ (Sorry in advance). But
I did get this question during a C++ job interview.
You are asked to design some data structure and algorith for the
following function:

List getNames ( List nameList, String prefix );

What it does is: given a name list and a prefix, return all the
names in the name list that start with prefix.
For example, if nameList = { "abc", "abdf", "cde" } and prefix =
"ab", the function should return { "abc", "abdf" }.

I did not do well on that question. But I would like to see some
good designs.

Thanks a lot,
Nan
 
J

John Fullman

Well my choice for the List is an AVL Binary Tree. A tree is used to
avoid excessive string comparing. The string compare you use does not
effect performance really. As long as the compare halts at the end of
the prefix. This means a special compare is required. ( StartsWith() )
I would also return a const ref to avoid copying the tree on return.
Copying an entire collection is expensive on the CPU and memory.
Something like this:

class Tree<typename T>
{
public:
Tree()
{
LeftNode = 0;
RightNode = 0;
}
Tree<T>* Traverse(const T& item)
{
if( item > Value)
return LeftNode;
else
return RightNode;
}
T Value;
private:
Tree<T>* LeftNode;
Tree<T>* RightNode;
};

const Tree<String>& getNames( const Tree<String>& nameList, String
prefix)
{
Tree<T>* current = &nameList;
while(current != 0 && !current.StartsWith(prefix))
nameList.Traverse(prefix);
if(current == 0) //No results
return Tree<String>();
return *current;
}

I'm too lazy to code the StartsWith() and the rest of the AVL tree...
I'll leave that to you.
Hoe that helps.
 
K

Kai-Uwe Bux

Hello, all,
This is probably not so much related to C++ (Sorry in advance). But
I did get this question during a C++ job interview.
You are asked to design some data structure and algorith for the
following function:

List getNames ( List nameList, String prefix );

What it does is: given a name list and a prefix, return all the
names in the name list that start with prefix.
For example, if nameList = { "abc", "abdf", "cde" } and prefix =
"ab", the function should return { "abc", "abdf" }.

I did not do well on that question. But I would like to see some
good designs.

Thanks a lot,
Nan


Here is a simple linear search:


#include <string>
#include <list>
#include <algorithm>
#include <functional>
#include <iterator>
#include <iostream>

template < typename RangeOne, typename RangeTwo >
bool is_prefix ( RangeOne pattern, RangeTwo text ) {
if ( text.size() < pattern.size() ) {
return( false );
}
return( pattern.end()
==
std::mismatch
( pattern.begin(), pattern.end(), text.begin() ).first );
}


// this really should be in std
template < typename IterIn, typename IterOut, typename Predicate >
IterOut copy_if ( IterIn from, IterIn to, IterOut where, Predicate p ) {
while ( from != to ) {
if ( p( *from ) ) {
*where = *from;
++where;
}
++from;
}
return( where );
}

typedef std::string String;
typedef std::list< String > List;


List getNames ( List const & nameList, String const & prefix ) {
List result;
copy_if( nameList.begin(), nameList.end(),
std::back_inserter( result ),
std::bind1st(
std::ptr_fun( is_prefix< String, String > ), prefix ) );
return( result );
}

int main ( void ) {
List l;
l.push_back ( String( "abcd" ) );
l.push_back ( String( "acd" ) );
l.push_back ( String( "abcd" ) );
l.push_back ( String( "acd" ) );
l.push_back ( String( "ab" ) );
l.push_back ( String( "abcd" ) );
l.push_back ( String( "ac" ) );
l.push_back ( String( "ab" ) );
l.push_back ( String( "abx" ) );
String p = "ab";
l = getNames( l, p );
std::copy( l.begin(), l.end(),
std::eek:stream_iterator< String >( std::cout, "\n" ) );
}







Here is something that might pay if searches for different prefixes are done
but the list ist not changed:

#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>

typedef std::string String;
typedef std::vector< String > List;

std::string succ ( std::string str ) {
while ( str.size() > 0 ) {
++ *str.rbegin();
if ( 0 == *str.rbegin() ) {
str.erase( str.size() - 1 );
} else {
break;
}
}
return( str );
}

List getNames ( List const &nameList, String prefix ) {
// this one assumes the list to be sorted
List::const_iterator from
= std::lower_bound( nameList.begin(), nameList.end(), prefix );
prefix = succ( prefix );
List::const_iterator to
=
prefix.size() > 0 ?
std::lower_bound( nameList.begin(), nameList.end(), prefix )
:
nameList.end();
List result;
std::copy( from, to, std::back_inserter( result ) );
return( result );
}

int main ( void ) {
List l;
l.push_back ( String( "abcd" ) );
l.push_back ( String( "acd" ) );
l.push_back ( String( "abcd" ) );
l.push_back ( String( "acd" ) );
l.push_back ( String( "ab" ) );
l.push_back ( String( "abcd" ) );
l.push_back ( String( "ac" ) );
l.push_back ( String( "ab" ) );
l.push_back ( String( "abx" ) );
String p = "ab";
std::sort( l.begin(), l.end() );
List range = getNames( l, p );
std::copy( range.begin(), range.end(),
std::eek:stream_iterator< String >( std::cout, "\n" ) );
}




Best

Kai-Uwe Bux
 
N

Nan Li

Thanks for the ideas. Here is my STL solution. The STL set ( based on
some balanced binary search tree, probably red-black tree) certainly
helps here. Another tricky part is to compute the upper bound of all
strings starting with the prefix.


#include <string>
#include <set>
#include <iostream>

using namespace std;

typedef set<string> StrSet;

static const int BASE = 27;

int toNum( const string& s )
{
int num = 0;
for ( int i=0; i< s.length(); i++ )
num = num * BASE + (s - 'a' + 1);
return num;
}

void toStr( int num, string* s )
{
if ( s ) {
if ( num / BASE > 0 )
toStr( num / BASE, s );
if ( num % BASE > 0 )
s->push_back( 'a' + ( num % BASE ) - 1 );
}
}

string plus_one( const string& s )
{
int t_num = toNum( s ) + 1;
string t;
toStr( t_num , &t );
return t;
}

int main()
{
int name_num;
cin >> name_num;


StrSet directory;
for ( int i = 0; i < name_num; i++ ) {
string name;
cin >> name;
directory.insert( name );
}

string prefix;
cin >> prefix;
StrSet::iterator iter1 = directory.lower_bound( prefix );

string prefixNext = plus_one( prefix );
StrSet::iterator iter2;
if ( prefixNext == "a" )
iter2 = directory.end();
else
iter2 = directory.lower_bound( prefixNext );

for ( StrSet::iterator iter = iter1 ; iter != iter2; ++iter )
cout << *iter << endl;

return 0;
}

input:

10
alexandrea
alexandria
alexandrina
alexia
alexina
alexis
aaren
aaron
abbey
zzeppelin
zz

output:
zzeppelin

input:
10
alexandrea
alexandria
alexandrina
alexia
alexina
alexis
aaren
aaron
abbey
zzeppelin
ale

output:
alexandrea
alexandria
alexandrina
alexia
alexina
alexis
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,763
Messages
2,569,562
Members
45,038
Latest member
OrderProperKetocapsules

Latest Threads

Top