Help with string matching algorithm

Tomislav Kralj

Hello,

I would be so grateful to ANYONE who can help me!

I'm writing a program in C++.
Part of my program has to compare two strings and produce a number
(range: 0-1) that represents the similarity between those two
strings.

Which algorithm should I use?
I searched the Web, but there are a million of them, and I don't know
which to use.
I don't need a solution in C++, just a hint about which algorithm to use
to implement this concept.

Thanks!
 
Todd Burch

Tomislav said:
Which algorithm to use?

strcmp()

Todd
 
Robert Klemme

2007/8/9, Tomislav Kralj said:
Which algorithm to use?

There is no general answer to your question. It depends on what you
want to do with the result. There must be some requirements or at
least more information about the nature of your problem. There is no
general definition of the term "similarity" for text strings - it
really depends on the application case.

Kind regards

robert
 
Olivier Renaud

Tomislav Kralj wrote:
Which algorithm to use?

Your problem is similar to finding the edit distance between two strings.
Have a look at http://en.wikipedia.org/wiki/Edit_distance and
http://en.wikipedia.org/wiki/String_metrics.
I don't know which one could give a result in the range [0,1], however.
 
Olivier Renaud

Olivier Renaud wrote:
I don't know which one could give a result in the range [0,1], however.

I think using the Levenshtein distance this way should do the trick:
levenshtein_distance(a, b) / max(a.size, b.size)

since the Levenshtein distance is at most the length of
the longer string.
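
Since the original question was about C++, here is a minimal sketch of that idea. Note that the quotient above is a normalized distance (0 for identical strings), so the sketch returns 1 minus the quotient to get a similarity. The function names levenshtein_distance and similarity are just illustrative, the comparison is byte-wise (so multi-byte encodings such as UTF-8 are compared per byte, not per character), and treating two empty strings as fully similar is an assumption made to avoid dividing by zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance via dynamic programming, keeping only two rows.
// Counts the minimum number of single-byte insertions, deletions and
// substitutions needed to turn a into b.
std::size_t levenshtein_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) {
        prev[j] = j;  // turning an empty prefix of a into b[0..j) takes j insertions
    }
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // turning a[0..i) into an empty string takes i deletions
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Similarity in [0, 1]: 1.0 for identical strings, 0.0 when every
// position has to be edited.
double similarity(const std::string& a, const std::string& b) {
    const std::size_t longest = std::max(a.size(), b.size());
    if (longest == 0) {
        return 1.0;  // assumption: two empty strings count as identical
    }
    return 1.0 - static_cast<double>(levenshtein_distance(a, b)) /
                     static_cast<double>(longest);
}
```

For example, "kitten" and "sitting" are 3 edits apart and the longer string has 7 characters, so similarity("kitten", "sitting") comes out at 1 - 3/7, roughly 0.57. Whether this matches what "similar" means for a given application is a separate question, as noted earlier in the thread.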
 
