[QUIZ] Word Chains (#44)

R

Ruby Quiz

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Our own Dave Thomas has also posted some Ruby code challenges on his blog in the
past. There are several interesting problems there:

http://codekata.org/

This week's Ruby Quiz is one of my favorite problems posted by Dave.

Given two words of equal length as command-line arguments, build a chain of
words connecting the first to the second. Each word in the chain must be in the
dictionary and every step along the chain changes exactly one letter from the
previous word.

Again, your program should accept input as two command-line arguments. Your
program should also allow a -d command-line switch followed by a dictionary file
to use, but feel free to choose a sensible default for your system. The result
should be a word chain starting with the first word and ending with the last
printed to STDOUT, one word per line. Print an error message if a chain cannot
be found.

Bonus points for finding the shortest chain and/or a small execution time.

Here's an example:

$ time ruby -d word_chain.rb duck ruby
Loading dictionary...
Building chain...
duck
ruck
rusk
ruse
rube
ruby

real 0m27.551s
user 0m27.402s
sys 0m0.113s
 
F

Francois Paul

does anyone have a working link for the wordlist.txt dictionary that
Dave links (broken) to from codekata?
 
J

James Edward Gray II

does anyone have a working link for the wordlist.txt dictionary
that Dave links (broken) to from codekata?

Not that one specifically, but I usually use SCOWL from:

http://wordlist.sourceforge.net/

Although I see I was lazy with the program used in the quiz and just
read from /usr/share/dict/words.

Another word list that has been used in past Ruby Quiz challenges is
the 12dicts package from:

http://prdownloads.sourceforge.net/wordlist/12dicts-4.0.zip

Hope that helps.

James Edward Gray II
 
S

Simon Kröger

Dominik said:
How about another bonus challenge:

If your program can find the shortest chain between two words, then
extend/modify it to find a longest shortest chain for all words with n
letters in the dictionary.
In other words: find two words for which the shortest chain is at least
as long as the shortest chain between any other word pair.

For example:

$ ruby longest_shortest.rb -d /usr/share/dict/words 4
idol
idyl
odyl
odal
oral
orad
brad
bead
beak
beck
back
bach
each
etch
utch
utah
utas
upas

And another challenge, explain all the words your program emits..

who/what/where is utas? or utch? even 'orad' seems totaly foreign
to me - but than perhaps thats normal as i'm just a german guy.

Back to your question, yes sounds more like a challenge (in terms
of acceptable runtime). I would like to suggest finding chains
between words of different lengths.

like this:

refugees
refugee
refuges
refutes
reputes
reputed
deputed
debuted
debated
rebated
related
relayed
delayed
decayed
decoyed
decoded
decodes
decides
deciles
defiles
refiles
refile
refine
repine
rapine
raping
rating
bating
basing
basin
basis
basts
bests
beats
beams
seams
shams
shame

don't get me wrong, i just think your dictionary is a bit over
the top...

cheers

Simon
 
D

Dominik Bathon

And another challenge, explain all the words your program emits..

Yes, I also wondered what some of those mean, but they are all in =20
/usr/share/dict/words on my system (Gentoo)...
who/what/where is utas? or utch? even 'orad' seems totaly foreign
to me - but than perhaps thats normal as i'm just a german guy.

So am I.
Back to your question, yes sounds more like a challenge (in terms
of acceptable runtime). I would like to suggest finding chains
between words of different lengths.

like this:

refugees
refugee
refuges
refutes
reputes
reputed
deputed
debuted
debated
rebated
related
relayed
delayed
decayed
decoyed
decoded
decodes
decides
deciles
defiles
refiles
refile
refine
repine
rapine
raping
rating
bating
basing
basin
basis
basts
bests
beats
beams
seams
shams
shame

Good idea.

Dominik
 
G

Gavin Kistner

My fun extra credit (that I may or may not get to) is to find the
longest chain for a given starting word :)
 
J

James Edward Gray II

--Apple-Mail-5-899604133
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=US-ASCII;
delsp=yes;
format=flowed

Begin forwarded message:
From: "Max Rabenmann" <[email protected]>
Date: August 27, 2005 3:26:14 PM CDT
To: (e-mail address removed)
Subject: Please Forward: Ruby Quiz Submission


Hi,

I produced a solution for rubyquiz #44.
It may not be the fastest solution, but it works.

Thank you.

greetings from germany

_________________________________________________________________
Sie suchen E-Mails, Dokumente oder Fotos? Die neue MSN Suche
Toolbar mit Windows-Desktopsuche liefert in sekundenschnelle
Ergebnisse. Jetzt neu! http://desktop.msn.de/ Jetzt gratis downloaden!

--Apple-Mail-5-899604133
Content-Transfer-Encoding: 7bit
Content-Type: application/octet-stream;
x-unix-mode=0666;
name="wordmorph.rb"
Content-Disposition: attachment;
filename=wordmorph.rb

#!/usr/bin/ruby
# # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# written by /flashdrv[1-3n]k/ @ Freenode
# (e-mail address removed)
# -
# 3.2GHz P4 @ 2.4GHz
# -
# british english dictionary with 206298 lines
# % time ruby wordmorph.rb -d /usr/share/dict/british-english-large duck ruby
# duck
# luck
# lucy
# luby
# ruby
# ruby wordmorph.rb -d /usr/share/dict/british-english-large duck ruby
# 576,89s user 107,22s system 77% cpu 14:39,37 total
# -
# british english dictionary with 49794 lines
# % time ruby wordmorph.rb -d /usr/share/dict/british-english-small duck ruby
# duck
# duct
# duet
# dues
# rues
# rubs
# ruby
# ruby wordmorph.rb -d /usr/share/dict/british-english-small duck ruby
# 58,82s user 11,87s system 81% cpu 1:26,63 total
# # # # # # # # # # # # # # # # # # # # # # # # # # # # #

# difference between same sized strings in substitutions
def kosten(s1, s2)
c = 0
(c...s1.length).each do |x|
c += 1 if s1[x,1] != s2[x,1]
end
c
end

def nono
puts "Words are not linked in #{$*[1]}"
exit
end

# legal input
INPUT = $*[2..3]

if /^[A-Za-z]+$/.match(INPUT.to_s) and ((INPUT.to_s.size % 2) == 0) and INPUT[0] != INPUT[1]
SIZE = INPUT[0].size and File.exists?($*[1])
DICTFILEREAD = File.open($*[1], 'r').read
else
puts "Usage: wordmorph -d dictionary string1 string2\n\nConditions:\n a) strings must be the same size\n b) strings must be different\n c) strings consist of letters only\n d) dictionary must exist"
exit
end

# dijkstra
# --------

# get all words with the same size
vertex = DICTFILEREAD.scan(/^[A-Za-z]{#{SIZE}}(?=\n)/).map {|x| x.downcase}

# route impossible?
nono if vertex.include?(INPUT[0]) == false

unbenutzt = Array.new(vertex)
vertex.delete(INPUT[1])

# initialize vertexes and edges
distanz = Hash.new
vorgaenger = Hash.new
vertex.each do |v|
distanz[v] = 99
vorgaenger[v] = nil
end
distanz[INPUT[1]] = 0
vorgaenger[INPUT[1]] = INPUT[1]

# calculate costs
while unbenutzt.size > 0 do
u = unbenutzt.sort {|x,y| distanz[x] <=> distanz[y] }[0]
unbenutzt.delete(u)
unbenutzt.each do |v|
kost = kosten(u, v) == 1 ? 1 : 99
if (distanz + kost) < distanz[v]
distanz[v] = distanz + kosten(u, v)
vorgaenger[v] = u
break if v == INPUT[0]
end
end
end

# route impossible?
nono if vorgaenger[INPUT[0]] == nil

# remove start node and unchecked nodes
vorgaenger.delete_if do |key, value|
key == INPUT[1] or value == nil
end

# print route
puts i=INPUT[0].downcase
while (i = vorgaenger) != nil
puts i
end

--Apple-Mail-5-899604133
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=US-ASCII;
format=flowed





--Apple-Mail-5-899604133--
 
G

Gavin Kistner

Here's my solution. It's most surely not the fastest, but it can be
fast.

Despite the availability of HighLine and others, I still am not adept
at setting up command-line processing. It just takes too long for
such a menial task. So mine takes four arguments, in order:
* The first word
* The second word
* A max recursion depth (more below)
* The path to a dictionary file
(I used 12dicts's "2of12inf" in my testing. My own /usr/share/dict/
words didn't even have the word "rats" in it, and was filled with all
sorts of stupid password-testing non-words.)

Rather than learn Djikstra's algorithm, I stubbornly yet-again chose
to come up with my own shortest-path algorithm. Mine does a single-
ended depth-first search, but uses a global variable to keep track of
the shortest path found so far, and stops any paths which surpass
that limit.

(Due to some fencepost error in my thinking, if it finds a 10-item
chain, it will keep finding other 10-item chains until it finds a 9-
item chain. I was too lazy to think about the problem and figure out
where it lay; I kept trying "empirical programming" ("Let's see what
happens when I change this from >= to >...") but never found the
right combination to get it pared down properly. This is why I know
mine will not be the fastest :)

Before I implemented the $max_depth aspect, my algorithm would
merrily dive 1600 levels deep and overflow the stack. By starting off
with a reasonable maximum depth, I both prevent that and (with a good
guess at what chain length I might be able to find) can massively
speed things up.

For example, using a pair of words that is particularly hard for my
algorithm to find the chain for:
[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 12
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt,
going no deeper than 12
...
425.621661 seconds elapsed

[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 10
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt,
going no deeper than 10
...
17.003073 seconds elapsed

[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 8
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt,
going no deeper than 8
...
3.069903 seconds elapsed

[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 6
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt,
going no deeper than 6
...
(no path found)
1.527844 seconds elapsed



I kept meaning to try a solution that does a double-ended 'blossom',
spreading out from the two end words and seeing if any of the
available words in each word intersect. Because I had limited
programming time, I didn't get to try this out; I was concerned that
it would be a memory hog. Possibly it would have been uber fast.

I played around with sets versus hashes versus arrays in various
spots of my code, but (even with profiling) couldn't find an area
that made much of a speed difference. The algorithm is king.

I just realized that I'm only keeping track of already-visited nodes
per recursion path...I probably could speed things up quite a bit by
globally reducing the pool of available links for ALL paths as a word
is visited. Damn.

Maybe next shortest-path quiz we get I'll finally look up Djikstra's
(Dijkstra's?) algorithm and try to win the speed race. It's just less
fun (but more effective) to look up someone else's research and
implement it; far more fun to play with (broken) algorithms on my own :)


#!/usr/local/bin/ruby

class String
# Finds all words in _legal_words_ which may be achieved by varying
# one letter of the receiving string.
#
# Legal words must respond to the #include? method
def variations( legal_words )
letters = 'abcdefghijklmnopqrstuvwxyz'.split('')
my_characters = self.split( '' )
all_variations = []
my_characters.each_with_index{ |orig_char,i|
letters.each{ |new_char|
if new_char != orig_char
test_word = self.dup
test_word[ i ] = new_char
all_variations << test_word if legal_words.include?
( test_word )
end
}
}
all_variations
end

# Finds and returns an array that links the current word to
_dest_word_,
# where each link in the chain is a 'variation' of the previous link.
#
# Legal words is a common pool of words to
def link_to( dest_word, legal_words, forbidden_words=[], depth=1 )
return nil if (depth+1) >= $max_depth

#if $DEBUG
#puts "#{'.'*depth}link from '#{self}' (d:#{depth}; max:#
{$max_depth})"
#end

# find the words that I can link to, that haven't already been seen
links = self.variations( legal_words ) - forbidden_words

if links.include?( dest_word )
#Sweet, I have a direct route!
[ self ]
else
path = nil
links.each{ |iword|
seen_words = forbidden_words.dup << self
if test_path = iword.link_to
(dest_word,legal_words,seen_words,depth+1)
total_length = depth + test_path.length + 1
return nil if total_length > $max_depth
path = test_path
$max_depth = total_length
puts "FOUND PATH of length #{total_length}" if $DEBUG
end
}
if path
path << self
else
nil
end
end
end
end



start_word = ARGV[0] || 'crave'
end_word = ARGV[1] || 'primp'
$max_depth = Integer( ARGV[2] || 10 )
dict = ARGV[3] || '/usr/share/dict/words'
#dict = ARGV[3] || './12dicts-4/2of12inf.txt'



desired_length = start_word.length
unless end_word.length == desired_length
msg = "Error: '#{start_word}' and '#{end_word}' are not the same
length"
msg << "(#{start_word.length} vs. #{end_word.length})"
raise msg
end

print "Finding chain between #{start_word} and #{end_word} using #
{dict}, "
puts "going no deeper than #{$max_depth}"

# Hash seems to be a hair faster than Set for my purposes
hash_path = "#{dict.gsub( /[^\w\d]/, '_' )}_#{desired_length}.marshall"

# Load/generate words of the right length
avail_words = {}
if File.exists?( hash_path )
File.open( hash_path, 'r' ){ |f|
avail_words = Marshal.load( f )
}
else
File.open( dict, 'r' ){ |f|
w = f.read.split(/[\r\n]+/)
# No capital words, or words ending with % (12dicts)
w.reject!{ |word| word.length != desired_length or /[^a-z]/ =~
word }
w.each{ |word| avail_words[ word ] = true }
}
File.open( hash_path, 'wb' ){ |f|
f << Marshal.dump( avail_words )
}
end

puts "#{avail_words.length} words with #{desired_length} letters"

unless avail_words.include?( end_word )
raise "Error: '#{end_word}' is not included in #{dict}"
end

#Because Math is Hard
$max_depth += 1

start_time = Time.new
if path = start_word.link_to( end_word, avail_words )
path.reverse!
path << end_word
puts path.join( "\n" )
else
puts "(no path found)"
end
end_time = Time.new
puts " ", "#{end_time-start_time} seconds elapsed"
 
G

Gavin Kistner

warn "Loading dictionary..." if $DEBUG
WordChain.load_words(dictionary_file, start)
warn "Building chain..." if $DEBUG
puts WordChain.new(start, finish)

What's the benefit to using #warn rather than #puts here? stderr
versus stdout?
 
J

James Edward Gray II

What's the benefit to using #warn rather than #puts here? stderr
versus stdout?

I can redirect them separately. For example, I can throw the chain
into a file, but still see the progress messages:

$ ruby -d word_chain.rb lead gold > chain.txt
Loading dictionary...
Building chain...
$ cat chain.txt
lead
load
goad
gold

Hope that makes sense.

James Edward Gray II
 
D

Dominik Bathon

My solution is attached or at <http://tinyurl.com/dpvot> (needs data:-u= rl
support in your browser)

I really liked how the interface turned out:

| puts WordChains.new(words).from("ruby").shortest_path_to("duck")
|
| WordChains.new(words).from("ruby").each { |path|
| puts path.join(", ")
| }

Is there some kind of priority queue in the stdlib or available as a ge=
m?

There is nothing in the stdlib currently, but there is rbtree and a RCR t=
o =20
include it in the stdlib:

http://www.rcrchive.net/rcr/show/306

Dominik
 
J

Jannis Harder

--Apple-Mail-3-918284045
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=US-ASCII;
format=flowed

Hi,

I implemented a bidirectional breadth-first search. I'm using regexps
to speed up dictionary actions.

Extra features:
-v Debug output.
-s Single line output.
Morphing of more than two words.
Support for non alphabetic chars.


--Apple-Mail-3-918284045
Content-Transfer-Encoding: 7bit
Content-Type: text/x-ruby-script;
x-unix-mode=0644;
name="wordmorph.rb"
Content-Disposition: attachment;
filename=wordmorph.rb

#!/usr/bin/env ruby

class Array
def concat!(other)
self.push(*other)
end
end

module WordMorph
module Helper
def variation_regexp(word)
out_res=[]
(0...word.size).each do |index|
out_res << /#{Regexp.escape(word[0,index])}.#{Regexp.escape(word[index+1..-1])}/
end
Regexp.union(*out_res)
end
end

class Morpher
include Helper
def initialize(words,word_list)
@words=words.map{|word|word.downcase.tr("^a-z","")}

if @words.map{|e|e.size}.uniq.size != 1
raise "Word size has to be the same for all words"
end

@[email protected]
@word_list = WordList.new(word_list).resize!(@size)
end

def morph
out = []
([email protected]).each do |index|
out.concat! morph_two(*(@words[index,2]))
end
out
end

#private
def morph_two(a,b=nil)
unless b
return [a]
end
warn "Morphing #{a} to #{b}." if $DEBUG
a_paths = [[a]]
b_paths = []
word_list = @word_list.dup
word_list.drop(a,b)
current_paths = a_paths
next_paths = b_paths

loop do

# connection found?
b_lasts = b_paths.map{|e|e.last}
a_paths.each do |a_item|
re = /^#{variation_regexp(a_item.last)}$/
b_paths.each_index do |b_index|
if b_lasts[b_index] =~ re
warn "Done morphing #{a} to #{b}." if $DEBUG
return a_item + b_paths[b_index].reverse[0..-2]
end
end
end

# connections left?
if a_paths.empty? or b_paths.empty?
raise "No connection"
end

# next step
current_paths.map! do |path|
last = path.last
variations = word_list.drop_variations(last)
word_list.to_s
variations.map do |e|
path.dup << e
end
end
current_paths.replace(
current_paths.inject([]) do |result,i|
result.concat!(i)
end
)

#next time use the other paths
current_paths,next_paths = next_paths,current_paths
if $DEBUG
if current_paths == b_paths
warn "a:"
warn a_paths.map{|e|e.join" -> "}.join("\n")
else
warn "b:"
warn b_paths.map{|e|e.join" <- "}.join("\n")
end
end

end

end
end

class WordList
include Helper

def to_s
@words.dup
end

def dup
WordList.new(@words.dup,false)
end

def initialize(source,cleanup=true)
case source
when IO
@words = source.read
when Array
@words = source.join("\n")
else
@words = source.to_s
end
cleanup! if cleanup
end

def drop(*words)
words.flatten.each do |word|
@words.sub!(/^#{Regexp.escape word}\n/,'')
end
words
end

def resize!(length)
#@words.gsub!(/^(.{0,#{length-1}}|.{#{length+1},})\n/,'')
@words = @words.scan(/^.{#{length}}\n/).join
self
end

def drop_variations(word)
out = []
@words.gsub!(/^(#{variation_regexp(word)})\n/) do
out << $1
''
end
out
end


def cleanup!
@words.tr!("\r |;:,.\t_","\n")
@words.gsub!(/\n{2,}/,"\n")
if @words[-1] != ?\n
@words << "\n"
end
end
end
end
if __FILE__ == $0
dict = STDIN

if ARGV.include?'-d'
file = ARGV.slice!(ARGV.index('-d'),2)[1]
dict = File.open(file)
end

if ARGV.delete '-v'
$DEBUG=true
end
if ARGV.delete '-s'
single_line=true
end

words = ARGV

out = WordMorph::Morpher.new(words,dict).morph
if single_line
puts out.join(' ')
else
puts out
end
dict.close
end

--Apple-Mail-3-918284045
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=US-ASCII;
format=flowed


Jannis Harder

puts"\e[2J\e[0;11r";$>.sync=m="\e[C";c='/,-=<>*+.:&%$'.split'';k=[!1]*25
z=",rekcah ybuR rehtona tsuJ".reverse;while k.index(!1);i=-1;print"\eM"*
7,"\e[H",k.map{|q|q ?" ":c[rand(13)]},"\e[6H",k.map{|q|u=z[i+=1,1];q ?u:
m},"\n",k.map{|q|q ?" ":m};k[rand(25)]=sleep 0.1;end;puts"\e[2J\e[r"+z#J


--Apple-Mail-3-918284045--
 
L

Levin Alexander

Dominik Bathon said:
There is nothing in the stdlib currently, but there is rbtree and a RCR= =20
to include it in the stdlib:

http://www.rcrchive.net/rcr/show/306

I tried to build it but it fails the unit tests

The check for "rb_block_proc" and the later warnings seems to suggest tha=
t =20
there is something wrong with my system. Any hints?


Thank You,
Levin

$ ruby -v
ruby 1.8.3 (2005-06-23) [i486-linux]
$ ruby extconf.rb
checking for allocation framework... yes
checking for rb_obj_init_copy()... no
checking for rb_block_proc()... no
checking for rb_yield_values()... no
checking for rb_marshal_dump()... no
checking for rb_marshal_load()... no
checking for inline keyword... __inline
creating Makefile
$ make
gcc -fPIC -Wall -g -O2 -fPIC -std=3Dc89 -pedantic -Wall -Wno-long-long =
-I. =20
-I/usr/lib/ruby/1.8/i486-linux -I/usr/lib/ruby/1.8/i486-linux -I. =20
-DNDEBUG -DHAVE_OBJECT_ALLOCATE -Dinline=3D__inline -c dict.c
gcc -fPIC -Wall -g -O2 -fPIC -std=3Dc89 -pedantic -Wall -Wno-long-long =
-I. =20
-I/usr/lib/ruby/1.8/i486-linux -I/usr/lib/ruby/1.8/i486-linux -I. =20
-DNDEBUG -DHAVE_OBJECT_ALLOCATE -Dinline=3D__inline -c rbtree.c
rbtree.c:58: warning: static declaration for `rb_block_proc' follows =20
non-static
rbtree.c:66: warning: static declaration for `rb_yield_values' follows =20
non-static
rbtree.c:1486: warning: static declaration for `rb_marshal_dump' follows =
=20
non-static
rbtree.c:1492: warning: static declaration for `rb_marshal_load' follows =
=20
non-static
gcc -shared -L"/usr/lib" -o rbtree.so dict.o rbtree.o -lruby1.8 =20
-lpthread -ldl -lcrypt -lm -lc
$ ruby -v test.rb
Loaded suite test
Started
......................test.rb:819: warning: rb_f_lambda() is deprecated; =
=20
use rb_block_proc() instead
..................test.rb:121: warning: rb_f_lambda() is deprecated; use =
=20
rb_block_proc() instead
test.rb:126: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
test.rb:57: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
test.rb:59: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
test.rb:139: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
test.rb:154: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
........test.rb:179: warning: rb_f_lambda() is deprecated; use =20
rb_block_proc() instead
test.rb:185: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
test.rb:201: warning: block supersedes default value argument
test.rb:564: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
....test.rb:483: warning: rb_f_lambda() is deprecated; use rb_block_proc(=
) =20
instead
...test.rb:574: warning: rb_f_lambda() is deprecated; use rb_block_proc()=
=20
instead
..test.rb:649: warning: rb_f_lambda() is deprecated; use rb_block_proc() =
=20
instead
test.rb:653: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
Ftest.rb:23: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
test.rb:326: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
Ftest.rb:581: warning: rb_f_lambda() is deprecated; use rb_block_proc() =
=20
instead
test.rb:609: warning: RBTree#readjust() uses current comparison block, us=
e =20
RBTree#readjust(nil) to set default comparison block
test.rb:613: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
F.test.rb:623: warning: rb_f_lambda() is deprecated; use rb_block_proc()=
=20
instead
test.rb:624: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
...test.rb:145: warning: rb_f_lambda() is deprecated; use rb_block_proc()=
=20
instead
test.rb:310: warning: rb_f_lambda() is deprecated; use rb_block_proc() =20
instead
...test.rb:468: warning: rb_f_lambda() is deprecated; use rb_block_proc()=
=20
instead
.......
Finished in 0.1 seconds.

1) Failure:
test_merge(RBTreeTest) [test.rb:425]:
<#<RBTree: {["a", "A"]=3D>nil,
["b", "B"]=3D>nil,
["c", "C"]=3D>nil,
["d", "D"]=3D>nil,
["e", "E"]=3D>nil},
default=3Dnil,
cmp_proc=3Dnil>> expected but was
<#<RBTree: {["e", "E"]=3D>nil}, default=3Dnil, cmp_proc=3Dnil>>.

2) Failure:
test_pp(RBTreeTest) [test.rb:664]:
<"#<RBTree: {\"a\"=3D>\"A\", \"b\"=3D>\"B\"}, default=3Dnil, cmp_proc=3Dn=
il>\n"> =20
expected but was
<"#<RBTree: {[\"a\", \"A\"]=3D>nil, [\"b\", \"B\"]=3D>nil}, default=3Dnil=
, =20
cmp_proc=3Dnil>\n">.

3) Failure:
test_reject(RBTreeTest) [test.rb:382]:
<#<RBTree: {["c", "C"]=3D>nil, ["d", "D"]=3D>nil}, default=3Dnil, cmp_pro=
c=3Dnil>> =20
expected but was
<nil>.

84 tests, 283 assertions, 3 failures, 0 errors
 
A

Alexandru Popescu

#: Jannis Harder changed the world a bit at a time by saying on 8/28/2005 8:39 PM :#
Hi,

I implemented a bidirectional breadth-first search. I'm using regexps
to speed up dictionary actions.

Extra features:
-v Debug output.
-s Single line output.
Morphing of more than two words.
Support for non alphabetic chars.

Sorry to come in the middle of the solution postings but I was wondering if anybody can point me to
an equivalent of linux 'time' for windows.

thanks in advance,
:alex |.::the_mindstorm::.|
 
F

flashdrvnk

Alexandru said:
#: Jannis Harder changed the world a bit at a time by saying on
8/28/2005 8:39 PM :#


Sorry to come in the middle of the solution postings but I was
wondering if anybody can point me to an equivalent of linux 'time' for
windows.

thanks in advance,
:alex |.::the_mindstorm::.|
You could use ruby's Time class to determine the duration.
Besides, time is mostly a built-in of the shell.

Kind regards.
 
G

Gavin Kistner

For example, using a pair of words that is particularly hard for my
algorithm to find the chain for:
[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 12
Finding chain between crate and primp using ./
12dicts-4/2of12inf.txt, going no deeper than 12
...
425.621661 seconds elapsed

[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 10
Finding chain between crate and primp using ./
12dicts-4/2of12inf.txt, going no deeper than 10
...
17.003073 seconds elapsed

[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 8
Finding chain between crate and primp using ./
12dicts-4/2of12inf.txt, going no deeper than 8
...
3.069903 seconds elapsed

[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 6
Finding chain between crate and primp using ./
12dicts-4/2of12inf.txt, going no deeper than 6
...
(no path found)
1.527844 seconds elapsed

After I posted my solution, I took a shower, scribbled a few diagrams
on the foggy glass walls and figured out how to massively speed
things up. I hence post my new solution. It's still a unidirectional
depth-first search, but it keeps track of visitation depth on nodes,
preventing it from revisiting words that were previously reached more
quickly.

For long chains, it's a MASSIVE speed up. For short ones, it's "only"
a 2x improvement.

Compare results to the above:

[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 20
Chain between 'crate' and 'primp', no longer than 20 links:
...
--> 15.59s (after loading dictionary)

[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 12
Chain between 'crate' and 'primp', no longer than 12 links:
...
--> 2.76s (after loading dictionary)

[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 10
Chain between 'crate' and 'primp', no longer than 10 links:
...
--> 2.54s (after loading dictionary)

[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 8
Chain between 'crate' and 'primp', no longer than 8 links:
...
--> 1.96s (after loading dictionary)

[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 6
Chain between 'crate' and 'primp', no longer than 6 links:
(no such chain exists)
--> 0.78s (after loading dictionary)


I've finally been able to discover the much-sought-after link between
kittens and rabies! (My previous code was unable to find this chain
after leaving it for about half an hour.)

[Sliver:~/Desktop/12dicts] gkistner$ ruby wordchain-gk2.rb kitten
rabies 20
Searching in 8121 words with 6 letters
Chain between 'kitten' and 'rabies', no longer than 20 links:
kitten
bitten
batten
batted
tatted
totted
tooted
booted
boobed
bobbed
robbed
rubbed
rubber
rubier
rubies
rabies
--> 29.08s (after loading dictionary)



Now I actually feel like my code has a chance of competing with the
'big boys'. Yay!
James, could you run some speed comparisons of various solutions in
your summary analysis?


#!/usr/local/bin/ruby
require 'set'

class String
LETTERS = ('a'..'z').to_a
# Finds and returns an array that links the current word to
_dest_word_,
# where each link in the chain is a word that differs from the
previous
# only by a single character.
#
# The _visitation_map_ parameter is a hash containing all legal
words as
# keys, and that should be initialized with the values mapping
to the
# deepest depth allowable.
def chain_to( dest_word, visitation_map, depth=1 )
return nil if depth > $max_length

# Find variations on myself which haven't been reached by a
shorter path
# and update the visitation map at the same time
links = Set.new
0.upto( self.length-1 ){ |i|
old_char = self[ i ]
LETTERS.each{ |new_char|
if new_char != old_char
test_word = self.dup
test_word[ i ] = new_char
#Following returns nil if the word isn't in the
dictionary
shortest_path = visitation_map[ test_word ]
if shortest_path && shortest_path > depth
#I've gotten to this word faster than anyone
else
#Put my score in the high score board, and
use this word again
visitation_map[ test_word ] = depth
links << test_word
end
end
}
}

path_from_me = nil
if links.include?( dest_word )
#Sweet, I have a direct route!
path_from_me = [ self ]
else
links.each{ |test_word|
path = test_word.chain_to( dest_word,
visitation_map, depth + 1 )
if path
total_length = depth + path.length + 1
#Only use the found path if it's shorter than
one found already
if total_length <= $max_length
warn "Found a chain of length #
{total_length}" if $DEBUG
path_from_me = path
$max_length = total_length
end
end
}
if path_from_me
path_from_me.unshift( self )
end
end
path_from_me
end

end

start_word = ARGV[0] || 'crave'
end_word = ARGV[1] || 'primp'
$max_length = Integer( ARGV[2] || start_word.length * 3 )
dict = ARGV[3] || '/usr/share/dict/words'
#dict = ARGV[3] || '2of12inf.txt'


desired_length = start_word.length
unless end_word.length == desired_length
msg = "Error: '#{start_word}' and '#{end_word}' are not the same
length"
msg << "(#{start_word.length} vs. #{end_word.length})"
raise msg
end

# Load words of the right length
avail_words = {}
File.open( dict, 'r' ){ |f|
w = f.read.split(/[\r\n]+/)
# No capital words, or words ending with % (12dicts)
w.reject!{ |word| word.length != desired_length or /[^a-z]/ =~
word }
w.each{ |word| avail_words[ word ] = $max_length }
}
avail_words[ start_word ] = 1

puts "Searching in #{avail_words.length} words with #{desired_length}
letters"

unless avail_words.include?( end_word )
raise "Error: '#{end_word}' is not included in #{dict}"
end

print "Chain between '#{start_word}' and '#{end_word}', "
puts "no longer than #{$max_length} links:"

start_time = Time.new
if path = start_word.chain_to( end_word, avail_words )
puts path.join( "\n" )
puts end_word
else
puts "(no such chain exists)"
end
end_time = Time.new
puts "--> %.2fs (after loading dictionary)\n " % [ end_time-start_time ]
 
L

Levin Alexander

Gavin Kistner said:
I've finally been able to discover the much-sought-after link between =20
kittens and rabies! (My previous code was unable to find this chain =20
after leaving it for about half an hour.)

An interesting chain to stress-test your algorithms is "sahara" <-> =20
"cloudy" (46 links)
(in my dictionary at least)

$ time ruby word_chains.rb sahara cloudy
sahara
samara
tamara
tamera
tamers
tapers
capers
caters
waters
wavers
savers
severs
levers
levees
levies
bevies
belies
belied
belted
bolted
boated
coated
crated
craned
cranes
cranks
cracks
crocks
croaks
creaks
creeks
creels
cruels
cruets
crusts
crests
chests
cheats
cleats
bleats
bloats
floats
flouts
clouts
clouds
cloudy

real 0m28.984s
user 0m9.243s
sys 0m0.755s
 
S

Simon Kröger

Heya dear quizers,

here is my solution.
This is a breadth-first algorithm from both sides.

It may be interesting that it doesn't search the dictinary
for words simmilar do the given ones (or the reachables) but
generates them (rotating each character of the word from a to
z) and looks up the result in a big Set of words (the $dict).

Because the lookup in a hashtable (what Set uses) is constant
the algorithm doesn't slow down with bigger dictinaries. It
does slow down with longer words, but thats quite normal.

Half of the lines are for option handling, search_words and
find_chain are the only interresting methods.

I would realy like to see some speed comparisons (on a large
dict :) ).

One more thing, this doesn't use recursion, so doing realy
long chains shoudn't be a problem (except for runtime of
course)

cheers

Simon

-----------------------------------------------------------

require 'set'
require 'getoptlong'

def search_words words, known
hash= Hash.new
words.each do |word|
(s = word.dup).size.times do |i|
26.times do |c| s = ?a + (s + 1 - ?a) % 26
hash = word if $dict.include?(s) && !known
end
end
end
hash
end

def find_chain word_a, word_b
reachable_a = (temp_a = {word_a => 0}).dup
reachable_b = (temp_b = {word_b => 0}).dup
found = nil

loop do
reachable_a.merge!(temp_a = search_words(temp_a.keys, reachable_a))
break if found=temp_a.inject(nil){|r, w|reachable_b[w[0]]? w[0]:r}
warn "reachable a: #{reachable_a.size}" if $verbose

reachable_b.merge!(temp_b = search_words(temp_b.keys, reachable_b))
break if found=temp_b.inject(nil){|r, w|reachable_a[w[0]]? w[0]:r}
warn "reachable b: #{reachable_b.size}" if $verbose

if temp_a.size == 0 || temp_b.size == 0
puts "now way!";
exit 0
end
end

word, list = found, [found]
while (word = reachable_a[word]) && word != 0
list.insert(0, word)
end
word = found
while (word = reachable_b[word]) && word != 0
list.push(word)
end
list
end

def usage
puts "#{$PROGRAM_NAME} [-v|--verbose] [-h|--help]"+
"[-d|--dictionary] {word1} {word2}"
exit
end

opts = GetoptLong.new(
[ "--dictionary", "-d", GetoptLong::REQUIRED_ARGUMENT ],
[ "--verbose", "-v", GetoptLong::NO_ARGUMENT ],
[ "--help", "-h", GetoptLong::NO_ARGUMENT ])

$verbose, dictfile = nil, 'words.txt'
opts.each do |opt, arg|
usage if opt == "--help"
$verbose = true if opt == "--verbose"
if opt == "--dictionary"
dictfile = arg
usage if dictfile.empty?
end
end
usage if ARGV.size != 2

$dict = Set.new
open(dictfile){|f| f.each{|w|$dict << w.chomp}}

puts find_chain(ARGV[0], ARGV[1])

-----------------------------------------------------------
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,023
Latest member
websitedesig25

Latest Threads

Top