string combination algorithm for a "word descrambler"

G

gdogg1587

Greetings.

I'm working on a program that will "descramble" words. Think of a word
scramble game where there is a list of characters, and several blank
spaces to denote the word(s) that you are to come up with from the list.
i.e.(* denotes a space): _ _ _ _ _ * _ _ _ _ _ _

characters: .hwodlelrlo

answer: hello world.

So, implemented as a c/c++ program, you would have something like:
<begin pseudo-c code>
char letters[11] = {'.','h','w','o','d','l','e','l','r','l','o'};

char PossibleWordOne[100][5];
char PossibleWordTwo[100][6];
//WordOneLength = 5;
//WordTwoLength = 6;
/* Ultimately, the goal is to not know the number of words or their
lengths or even the scrambled characters before execution, but this will
work for the example. I got the 100 because there should be no way of
getting more than 100 possible 'real' words*/


/* In order to make sure the combination of characters is a real word,
verify() will check it against a word list and return True or False Swap
will be a simple function (I hope it'll be simple anyways) that will swap
the characters into the next combination.*/

Permu()
{
static int counter = 1;
static int possibles = 0;
while (counter != 11!) /* Since there are 11! possible combinations */
if (!(Verify(Swap(letters*))
{ counter++;
Permu();
}
else
{
PossibleWordOne[possibles][] = letters*;
possibles++;
counter++;
Permu();
}

<end pseudo-c code>

Well, what do guys think? Any suggestions or sample code would be greatly
appreciated. In order to optimize performance, I would like to load the
word list directly into memory, and preferably have the words categorized
by length. This way, I can find the possible words for the longest word
first, and then go to the next longest. For example, let's say that I find
two possible combinations for the five-letter word. Then, I can seperate
them into two cases to find the four-letter word. Each case would have a
char array with 5 less chars than the original, which should make it a lot
faster to find the second word. Then let's say that only case two turns up
a possible combination for the four letter word. case one could then be
disguarded and you could move on to the next word... and on and on.
Anyways, that's what I've got so far... Like I said any suggestions would
be greatly appreciated.

Thanks,
Gaines
 
D

David Hilsee

gdogg1587 said:
Greetings.

I'm working on a program that will "descramble" words. Think of a word
scramble game where there is a list of characters, and several blank
spaces to denote the word(s) that you are to come up with from the list.
i.e.(* denotes a space): _ _ _ _ _ * _ _ _ _ _ _

characters: .hwodlelrlo

answer: hello world.

So, implemented as a c/c++ program, you would have something like:
<begin pseudo-c code>
char letters[11] = {'.','h','w','o','d','l','e','l','r','l','o'};

char PossibleWordOne[100][5];
char PossibleWordTwo[100][6];
//WordOneLength = 5;
//WordTwoLength = 6;
/* Ultimately, the goal is to not know the number of words or their
lengths or even the scrambled characters before execution, but this will
work for the example. I got the 100 because there should be no way of
getting more than 100 possible 'real' words*/


/* In order to make sure the combination of characters is a real word,
verify() will check it against a word list and return True or False Swap
will be a simple function (I hope it'll be simple anyways) that will swap
the characters into the next combination.*/

Permu()
{
static int counter = 1;
static int possibles = 0;
while (counter != 11!) /* Since there are 11! possible combinations */
if (!(Verify(Swap(letters*))
{ counter++;
Permu();
}
else
{
PossibleWordOne[possibles][] = letters*;
possibles++;
counter++;
Permu();
}

<end pseudo-c code>

Well, what do guys think? Any suggestions or sample code would be greatly
appreciated. In order to optimize performance, I would like to load the
word list directly into memory, and preferably have the words categorized
by length. This way, I can find the possible words for the longest word
first, and then go to the next longest. For example, let's say that I find
two possible combinations for the five-letter word. Then, I can seperate
them into two cases to find the four-letter word. Each case would have a
char array with 5 less chars than the original, which should make it a lot
faster to find the second word. Then let's say that only case two turns up
a possible combination for the four letter word. case one could then be
disguarded and you could move on to the next word... and on and on.
Anyways, that's what I've got so far... Like I said any suggestions would
be greatly appreciated.

I've written a similar program before, just for a laugh. To determine the
"legal" words that one can make, there's a simple, efficient approach that
doesn't take too much time to implement. Simply scan the entire list of
"legal" words, testing if it is a subset of the scrambled letters. This
tends to be pretty quick (virtually instantaneous on my machine), as there
are only around 180,000 words in the English language. Then, for each word
you can make that is of an appropriate length, examine each pair (or
whatever size tuple) to see if, together, they "use up" all of the scrambled
letters. Of course, you may have multiple correct answers. For other
programs, it may pay to try to do something "smarter", but, in my
experience, nothing fancy is required for a program like this. However, it
might get out of hand if you had a lot of letters that could make many
words, and there are many words in the sentence.
 
G

gdogg1587

Hmm... That's a pretty good idea. 18,000 is a lot less than 11! And a whole
lot less than 100!. And, I could categorize the word list by length of word
to make it even faster! That's great! Thanks.
 
G

gdogg1587

Hmm... That's a pretty good idea. 18,000 is a lot less than 11! And a whole
lot less than 100!. And, I could categorize the word list by length of word
to make it even faster! That's great! Thanks.
 
Joined
Mar 3, 2021
Messages
241
Reaction score
30
Sorry for the late response! Start a new thread and we'll work through it. I just wrote a C++ version so I could be more familiar with the problem. David Hilsee's answer is a pretty good starting point. First, find a word list (I use ENABLE). Read it into an array, throwing out any words longer than your list of letters. Next, for each word, throw it out if it contains any invalid letters. Lastly, check if it contains less-than-or-equal-to the number of each letter you have. If it passes both tests, add it to your list of valid words. Then, sort it, etc, to pick whatever you want to return.

But, again, start a new thread and we can get into the nitty gritty.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,904
Latest member
HealthyVisionsCBDPrice

Latest Threads

Top