# is there any FAST ALGORITHM about Forward Maximum Match of string?

Discussion in 'C++' started by hn.ft.pris@gmail.com, Dec 15, 2006.

1. ### Guest

"Forward maximum match" mean that there is a dictionary contains
thousands of items which are single words. For example, "kid", "nice",
"best"...
And here get a string "kidnicebs", the forward maximum match algorithm
need to find the longest string that matches the string in dictionary.
Because the longest string in the dictionary has a length of 4, so the
algoritm pick the first four characters of the string(that's why it's
called forward):

kidn-------not find in dictionary
then, first three: kid --------find in dictionay

then kid is picked out as a segment, then the algorithm continue with
"nicebs"...

I'm wondering is there any fast algorithm to deal with this demand,
what kind of data structure should I use? Thanks for helping me.

, Dec 15, 2006

2. ### Joe SeighGuest

wrote:
> "Forward maximum match" mean that there is a dictionary contains
> thousands of items which are single words. For example, "kid", "nice",
> "best"...
> And here get a string "kidnicebs", the forward maximum match algorithm
> need to find the longest string that matches the string in dictionary.
> Because the longest string in the dictionary has a length of 4, so the
> algoritm pick the first four characters of the string(that's why it's
> called forward):
>
> kidn-------not find in dictionary
> then, first three: kid --------find in dictionay
>
> then kid is picked out as a segment, then the algorithm continue with
> "nicebs"...
>
> I'm wondering is there any fast algorithm to deal with this demand,
> what kind of data structure should I use? Thanks for helping me.
>

Firstly, don't multi-post to c.l.c++ and c.l.c++.m. You're off topic
for both. Posting to comp.programming and setting followup to same.

You can do it with having to examine each character in the input
string only once. There may be a temptation to try to exploit some
sort of skip ahead based on the length of the longest string found
so far but I suspect that number of potential target words nullifies
any advantage of skip ahead since it's likely to only be 1. I'll
let others have fun with this.

btw, this isn't a homework problem, is it?

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Joe Seigh, Dec 15, 2006