How to implement a word suggester in a spell checker

Aguilera

Hello,
I have written a spell checker in Java and would like to add a word
suggester to it as well.
At the moment my program can catch the incorrect words in the
String,
but I would like it to suggest an alternative correct word for every
misspelled word.
Does anyone know how I could make my program suggest a word when it
finds an incorrect word?
 
Jeffrey Palm

Aguilera said:
Hello,
I have written a spell checker in Java and would like to add a word
suggester to it as well.
At the moment my program can catch the incorrect words in the
String,
but I would like it to suggest an alternative correct word for every
misspelled word.
Does anyone know how I could make my program suggest a word when it
finds an incorrect word?

This is quite vague. Is this a GUI application, a command-line tool, or
something else?

If it's a GUI, highlight the word. For a command-line version, simply
print a message.

Jeff
 
Matthew Zimmer

Hmm...I've never personally written a spell checker or a word suggester,
but I suspect that the concept for how they do it might go something
like this:

Given a dictionary D of correct words
and given a sentence S made up of words W,
for any W that is spelled wrong, find a word in D that is "close".

Now, the real question is what "close" means. The only thing that I
can think of would be to compare the letters in the wrong word to every
word in your dictionary and give some value of "closeness" to it. For
example, every common letter between the two in the same order could
give you a point. Then just show the top 10 closest words and let
the user pick. Unfortunately, it's early and I can't seem to think of
any good algorithm that would make this quick, as the brute-force method
could be painfully slow if you have a large dictionary. I suspect there
is some way to represent the words in the dictionary mathematically such
that a comparison for closeness is very quick.
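
The brute-force idea above can be sketched roughly as follows. This is only one reading of "every common letter in the same order gives you a point": it scores two words by the length of their longest common subsequence. The class name ClosenessSuggester, the method names, and the tie-break on length difference are assumptions made for illustration, not something from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ClosenessSuggester {

    // One point for every letter the two words share in the same relative
    // order, i.e. the length of their longest common subsequence.
    static int commonLettersInOrder(String a, String b) {
        int[][] table = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    table[i][j] = table[i - 1][j - 1] + 1;
                } else {
                    table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
                }
            }
        }
        return table[a.length()][b.length()];
    }

    // Brute force: score every dictionary word and keep the best `limit`.
    // Ties are broken by the smaller difference in length (an assumption).
    static List<String> suggest(String misspelled, List<String> dictionary, int limit) {
        List<String> ranked = new ArrayList<>(dictionary);
        ranked.sort(Comparator
                .comparingInt((String w) -> -commonLettersInOrder(misspelled, w))
                .thenComparingInt(w -> Math.abs(w.length() - misspelled.length())));
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("banana", "problem", "program");
        System.out.println(suggest("progrrame", dictionary, 2));
    }
}
```

Each comparison costs time proportional to the product of the two word lengths, and every dictionary word is visited, so this is exactly the slow approach the post warns about. It is mainly useful as a baseline.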
 
Erich Reimberg N.

Hi,

You can tell how close one word is to another by counting how many letters
have to change for one word to become the other. For example:

Father is 1 letter away from Fater, or
mispelled is 2 letters away from mipselled

This way, for every incorrect word in your text, you can suggest all the
dictionary words that are at most N steps (or letters) away from it.
You should adjust N so that the suggestions are likely to include the word
the author actually meant.

Erich
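
Counting letter changes this way is usually called the Levenshtein (edit) distance. A minimal sketch, assuming single-letter insertions, deletions and substitutions each cost one step; the class and method names are made up for illustration. Under that assumption the two examples above come out as 1 and 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class EditDistance {

    // Levenshtein distance: the minimum number of single-letter insertions,
    // deletions and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building b's prefix from an empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting all of a's prefix
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // All dictionary words at most maxDistance steps from the misspelled word.
    static List<String> withinDistance(String misspelled,
                                       Collection<String> dictionary,
                                       int maxDistance) {
        List<String> result = new ArrayList<>();
        for (String word : dictionary) {
            if (distance(misspelled, word) <= maxDistance) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distance("father", "fater"));
        System.out.println(distance("mispelled", "mipselled"));
        System.out.println(withinDistance("fater",
                Arrays.asList("father", "later", "mother"), 1));
    }
}
```

Swapped neighbouring letters count as two steps here. A variant that treats a swap as a single step (Damerau-Levenshtein) may suit typing mistakes better.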
 
Tim Ward

Matthew Zimmer said:
Now, the real question is what "close" means.

I think you need two sources of information:

(a) list/description of common spelling mistakes
(b) list/description of common typing mistakes

and try to come up with some way of encoding them into some combination of
algorithms and tables. Obviously you can tap into this process at any point,
from the raw research data to using someone else's complete solution.

For example, you could just fire up Word and use that.
 
GaryM

Now, the real question is what "close" means.

Just a throw-in suggestion. There is an algorithm called Soundex
which derives a code for spoken English words. It is good for mangled
words, so long as the target word and the entered word sound the same.
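
For reference, a compact sketch of the common American Soundex rules: keep the first letter, map the following consonants to digits, drop vowels, and pad to four characters. The class name and the decision to return an empty string for input without letters are assumptions for illustration.

```java
import java.util.Locale;

final class Soundex {

    // Digit for each letter A..Z; '0' marks letters that carry no code
    // (vowels plus H, W and Y).
    private static final String CODES = "01230120022455012623010202";

    static String encode(String word) {
        String letters = word.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (letters.isEmpty()) {
            return ""; // assumption: no letters means no code
        }
        StringBuilder out = new StringBuilder();
        out.append(letters.charAt(0)); // the first letter is kept as-is
        char last = CODES.charAt(letters.charAt(0) - 'A');
        for (int i = 1; i < letters.length() && out.length() < 4; i++) {
            char c = letters.charAt(i);
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            char code = CODES.charAt(c - 'A');
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // a vowel resets this, so a repeated code counts again
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Robert"));
        System.out.println(encode("Rupert"));
    }
}
```

Two words are treated as sounding alike when their codes are equal, so a suggester could offer every dictionary word whose code matches that of the misspelled word.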
 
Phillip Lord

Aguilera> Hello, I have written a spell checker in Java and would like
Aguilera> to add a word suggester to it as well. At the moment my
Aguilera> program can catch the incorrect words in the String,
Aguilera> but I would like it to suggest an alternative correct word
Aguilera> for every misspelled word. Does anyone know how I
Aguilera> could make my program suggest a word when it finds an
Aguilera> incorrect word?


1) Try not to. It is always best to use someone else's code for a thing
such as this.

2) If you choose to, the phonetic algorithms (based on the sounds of
words) generally work better than lexical algorithms (based on the
letters in the word). Have a look at the "aspell" web page. It has
a link to the algorithms (and implementations of them) that it
uses.

Phil
 
Michael Borgwardt

Phillip said:
2) If you choose to, the phonetic algorithms (based on the sounds of
words) generally work better than lexical algorithms (based on the
letters in the word).

I think that neither will do well alone, since a phonetic algorithm
will often not do well on typos.
 
William Brogden

Aguilera said:
Hello,
I have written a spell checker in Java and would like to add a word
suggester to it as well.
At the moment my program can catch the incorrect words in the
String,
but I would like it to suggest an alternative correct word for every
misspelled word.
Does anyone know how I could make my program suggest a word when it
finds an incorrect word?

You might try a phonetic lookup, based on the idea that people tend to
make misspellings that make phonetic sense. Naturally this takes a big
dictionary.
Here is a simple example that uses the "Metaphone" algorithm:
http://www.wbrogden.com/java/Phonetic/index.html
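
The lookup side of such a scheme might look something like the sketch below: encode every dictionary word once, store the words under their code, and then a suggestion is a single map lookup rather than a scan of the whole dictionary. The encoder is passed in as a function so that a real Metaphone implementation could be plugged in. The crudeKey method here is only a hypothetical stand-in, NOT Metaphone, and all names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

final class PhoneticIndex {

    private final Function<String, String> encoder;
    private final Map<String, List<String>> wordsByCode = new HashMap<>();

    // Encode each dictionary word once and group the words by their code.
    PhoneticIndex(Function<String, String> encoder, Collection<String> dictionary) {
        this.encoder = encoder;
        for (String word : dictionary) {
            wordsByCode.computeIfAbsent(encoder.apply(word), k -> new ArrayList<>())
                       .add(word);
        }
    }

    // Every dictionary word whose code equals the code of the misspelled word.
    List<String> suggest(String misspelled) {
        return wordsByCode.getOrDefault(encoder.apply(misspelled),
                                        Collections.emptyList());
    }

    // Placeholder key, NOT Metaphone: lower-case, fold "ph" to "f" and
    // "ck" to "k", drop vowels and y, then collapse repeated letters.
    static String crudeKey(String word) {
        String s = word.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
        s = s.replace("ph", "f").replace("ck", "k");
        s = s.replaceAll("[aeiouy]", "");
        return s.replaceAll("(.)\\1+", "$1");
    }

    public static void main(String[] args) {
        PhoneticIndex index = new PhoneticIndex(PhoneticIndex::crudeKey,
                Arrays.asList("phonetic", "program", "problem"));
        System.out.println(index.suggest("fonetic"));
        System.out.println(index.suggest("progrrame"));
    }
}
```

The quality of the suggestions depends entirely on the encoder, which is why a proper phonetic algorithm is worth using in place of the placeholder.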
 
Phillip Lord

Michael> I think that neither will do well alone, since a phonetic
Michael> algorithm will often not do well on typos.

The links on the aspell page cover the combination of the two.

Phil
 
William Brogden

GaryM said:
Just a throw-in suggestion. There is an algorithm called Soundex
which derives a code for spoken English words. It is good for mangled
words, so long as the target word and the entered word sound the same.

Unfortunately, Soundex depends on getting the first letter right. This
falls apart on many sound-alikes. Metaphone is better because it attempts
to match the sound rather than the letter.

Bill
 
