[SUMMARY] Word Blender (#108)

R

Ruby Quiz

I'm almost embarrassed to admit that I originally turned this quiz down. I
thought it would be too similar to the Scrabble Stems problem we did long, long
ago. Ben politely explained that he felt it was different enough though, and
then sent some guy named Guido after me in a parking lot one night. That
convinced me to actually work the problem, and I had a change of heart. I think
we can tell from the popularity of the problem that Ben is smarter than I am, so

Many solvers chose to build the entire game, so we will take a look at that
approach. Though the quiz doesn't explicitly call for it, most programmers
chose to add scoring or a time limit to their solution to spice up the action.
I went with the time limit and we will examine my own code below.

The first step is to get a word list, of course. There were several approaches
to this process, since manipulating every word in the dictionary each time could
get a little slow. My answer to this was just to cache the word list after I
had built it once and reuse that for all future runs. Here's the code that

# game date cache
CACHE_FILE = ".game_words"

if File.exist? CACHE_FILE # load from cache
word_list = File.open(CACHE_FILE) { |file| Marshal.load(file) }
else # build word list
# prepare data structure
words_by_signature = Hash.new { |words, sig| words[sig] = Array.new }

File.foreach(ARGV.shift || "/usr/share/dict/words") do |word|
word.downcase!
word.delete!("^a-z")

next unless word.length.between? 3, 6

(words_by_signature[word.split("").sort.join] << word).uniq!
end

# prepare recursive signature search
def choices( sig,
seen = Hash.new { |all, cur| all[cur] = true; false },
&blk )
sig.length.times do |i|
shorter = sig[0...i] + sig[(i+1)...sig.length]
unless seen[shorter]
blk[shorter]
choices(shorter, seen, &blk) unless shorter.length == 3
end
end
end

# prepare game data structure
word_list = Hash.new

# build game choices
words_by_signature.keys.grep(/\A.{6}\Z/) do |possible|
word_list[possible] = words_by_signature[possible]

choices(possible) do |shorter_signature|
if words_by_signature.include? shorter_signature
word_list[possible].push(*words_by_signature[shorter_signature])
end
end
end

File.open(CACHE_FILE, "w") { |file| Marshal.dump(word_list, file) }
end

# ...

This uses Marshal to build a trivial word cache with only a few lines of code.
If the cache exists, we slurp it back in. Otherwise we build the cache and save
it for future runs.

To build the cache, I pluck all three to six letter words out of the indicated
dictionary file and build a word list containing all six letter words linked to
smaller words using the same letters.

The main trick used in this recursive grouping of words is the use of
"signatures." A word's signature is just the sorted order of the letters in the
word: aejms for james, for example. Comparing signatures makes it trivial to
find words that use the same letters, since their signatures will be the same.

Using the signatures, the choices() method just removes one character at a time
recursing through the word list. This allows me to find all of the smaller
words that can be formed using the same letters.

If you wanted to generate the entire list of games as the quiz suggested, the
above is all you need. Each key is one possible round with the values being the
words that can be matched in that round.

I wanted to play though and built a full game interface. My interface requires
Unix, because those are the tricks I know. Here's the start of that code:

# ...

require "io/wait"

### game interface (requires Unix) ###
TERMINAL_STATE = `stty -g`
system "stty raw -echo cbreak"
at_exit { system "stty #{TERMINAL_STATE}" }
clear = `clear`

# a raw mode savvy puts
def out(*args) print(*(args + ["\r\n"])) end

# for easy selection
words = word_list.keys

# ...

This setup code memorizes the original state of the user's terminal, modifies
that state to raw mode so I can read individual characters as they are pressed,
arranges to have the terminal settings restored at exit, grabs the escape
sequence we can use to clear the terminal, and builds a puts() like method that
works with raw mode. This code doesn't really have much to do with Ruby. I'm
just shelling out to standard Unix utilities here.

We're now ready for the game event loop:

# ...

rounds = 0
loop do
# select letters
letters = current = words[rand(words.size)]
while word_list.include? letters
letters = letters.split("").sort_by { rand }.join
end
letters.gsub!(/.(?=.)/, '\0 ')

# round data
matches = Array.new
current_match = String.new
start = Time.now
message = nil
last_update = start - 1

# ...

I begin by selecting a word for the round and scrambling that word's letters
until they are no longer a real word. Then I setup some variables to hold data
for the round like whether or not the user has found a six letter word and
should advance as well as any feedback message I am currently showing the user
and the last time I refreshed the screen.

Now we start the round event loop:

# ...

# round event loop
until Time.now >= start + 2 * 60
# game display
if last_update <= Time.now - 1
print clear

out " Time left: #{120 - (Time.now - start).round} seconds"
out " Your words: #{matches.join(', ')}"
out
unless message.nil?
out message
out
end
print current_match
\$stdout.flush

last_update = Time.now
end

# ...

The round event loop runs for two minutes and this first bit is responsible for
drawing the screen. After clearing the screen, it prints the letters, time
left, words found, and any feedback message to the screen. Note that I update
the screen each second, assuming no other activity, so the user will notice the
clock counting down.

Here's the input processing:

# ...

# input handler
char = \$stdin.getc
case char
when ?a..?z, ?A..?Z # read input
current_match << char.chr.downcase
message = nil
last_update = start - 1
when ?\b, 127 # backspace/delete
current_match = current_match[0..-2]
message = nil
last_update = start - 1
when ?\r, ?\n # test entered word
if word_list[current].include? current_match
matches << current_match
matches = matches.sort_by { |word| [word.size, word] }
if not advance and current_match.length == 6
message = "You will advance to the next round!"
else
message = nil
end
else
message = "Unknown word."
end
current_match = String.new
last_update = start - 1
end
end
end

# ...

This just checks to see if there is data waiting on STDIN. We don't want to
read from it without checking as that could block the application waiting for
input. The ready?() method used here is added by the io/wait library and will
return true if there is data waiting. The rest of the code just handles the
input we get. Letters are recorded, we support backspace, and a carriage-return
tells us to try the current word, set some feedback, and refresh the display.

When the round is done, we finish out the game loop:

# ...

# round results
print clear
missed = word_list[current] - matches
unless missed.empty?
out "Other words using \"#{letters}:\""
out missed.sort_by { |word| [word.size, word] }.join(", ")
out
end
rounds += 1

out "You made #{matches.size} word#{'s' if matches.size != 1}, ",
"including at least one six letter word. Nice work."
out "Press any key to begin the next round."

\$stdin.getc
else
out "You made #{matches.size} word#{'s' if matches.size != 1}, ",
"but failed to find a six letter word."

break # end game
end
end

# game results
out "You completed #{rounds} round#{'s' if rounds != 1}. ",
"Thanks for playing."

The above code just prints missed words and results. If you found a six letter
word, the code will loop to a new round. Otherwise it will break out of the
program.

A big thank you to Ben (and Guido!) for convincing me to try the quiz and to all
those that took the time to play with it.

Tomorrow we'll try a problem that has been making the rounds, literally...

M

Martin DeMello

The main trick used in this recursive grouping of words is the use of
"signatures." A word's signature is just the sorted order of the letters in the
word: aejms for james, for example. Comparing signatures makes it trivial to
find words that use the same letters, since their signatures will be the same.

Using the signatures, the choices() method just removes one character at a time
recursing through the word list. This allows me to find all of the smaller
words that can be formed using the same letters.

I used a different signature method - I mapped each unique letter in
the target word to a prime, and then used the product of the primes of
all the letters in a word as its signature. That way, a is contained
in b if signature(b) % signature(a) == 0, and you can generate the
signatures via each_byte, array lookup and integer multiplication (no
need for split, sort or string deletion).

martin

J

James Edward Gray II

I used a different signature method - I mapped each unique letter in
the target word to a prime, and then used the product of the primes of
all the letters in a word as its signature. That way, a is contained
in b if signature(b) % signature(a) == 0, and you can generate the
signatures via each_byte, array lookup and integer multiplication (no
need for split, sort or string deletion).

Very interesting. I've never seen that before. I like it.

My version of "signatures" comes out of Programming Peals, if my
memory is right. Just FYI.

James Edward Gray II

B

Ben Bleything

I used a different signature method - I mapped each unique letter in
the target word to a prime, and then used the product of the primes of
all the letters in a word as its signature. That way, a is contained
in b if signature(b) % signature(a) == 0, and you can generate the
signatures via each_byte, array lookup and integer multiplication (no
need for split, sort or string deletion).

Holy crap. That is incredibly cool. Thanks for sharing this technique!

Ben

M

Martin DeMello

Very interesting. I've never seen that before. I like it.

The numbers overflow 32 bits in the general case, I think, but for
this restricted problem it works very nicely.

martin

P

Phrogz

Martin said:
The numbers overflow 32 bits in the general case, I think, but for
this restricted problem it works very nicely.

Specifically (I was wondering) you can use 9 primes and still be under
32 bits, 15 primes and still be under 64 bits.

module Enumerable
def product; inject(){ |n,p| p*n }; end
end

first9 = primes[0..8]
puts "First 9 primes:\n%s\nproduct: %d (%d bits)" %
[
first9.inspect,
first9.product,
first9.product.to_s(2).length
]

first15 = primes[0..14]
puts "\nFirst 15 primes:\n%s\nproduct: %d (%d bits)" %
[
first15.inspect,
first15.product,
first15.product.to_s(2).length
]

#=> First 9 primes:
#=> [2, 3, 5, 7, 11, 13, 17, 19, 23]
#=> product: 223092870 (28 bits)

#=> First 15 primes:
#=> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
#=> product: 614889782588491410 (60 bits)

25 primes puts you at 121 bits; 26 primes at 128 bits on the nose.

B

Ben Bleything

I'm almost embarrassed to admit that I originally turned this quiz down. I
thought it would be too similar to the Scrabble Stems problem we did long, long
ago. Ben politely explained that he felt it was different enough though, and
then sent some guy named Guido after me in a parking lot one night. That
convinced me to actually work the problem, and I had a change of heart. I think
we can tell from the popularity of the problem that Ben is smarter than I am, so

Coupla things:

1) Gianni, not Guido. I can understand the confusion, though, as it was
dark and probably hurt a bunch.

2) I didn't manage to write a solution to the quiz, despite the fact that
I had at least a month's time more than most everyone else, so none of
I wanted to play though and built a full game interface. My interface requires
Unix, because those are the tricks I know. Here's the start of that code:

This setup code memorizes the original state of the user's terminal, modifies
that state to raw mode so I can read individual characters as they are pressed,
arranges to have the terminal settings restored at exit, grabs the escape
sequence we can use to clear the terminal, and builds a puts() like method that
works with raw mode. This code doesn't really have much to do with Ruby. I'm
just shelling out to standard Unix utilities here.

<snip awesomeness>

Further proof of #2 above.

Ben

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Members online

No members online now.