[QUIZ] Crossword Solver (#132)

Discussion in 'Ruby' started by Ruby Quiz, Jul 27, 2007.

  1. Ruby Quiz

    Ruby Quiz Guest

    The three rules of Ruby Quiz:

    1. Please do not post any solutions or spoiler discussion for this quiz until
    48 hours have passed from the time on this message.

    2. Support Ruby Quiz by submitting ideas as often as you can:

    http://www.rubyquiz.com/

    3. Enjoy!

    Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
    on Ruby Talk follow the discussion. Please reply to the original quiz message,
    if you can.

    -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    by Eugene Kalenkovich

    Write a Ruby crossword solver. Randomly fill a crossword template with words
    from a dictionary. Use any dictionary you want (/usr/share/dict/words, text of
    pickaxe, etc.).

    Template is provided in a file formatted very similarly to one in quiz #10
    (http://www.rubyquiz.com/quiz10.html), with "X" substituted with "#". Simple
    template example:

    _ _ _ _ _

    _ # _ # _

    _ _ _ _ _

    _ # _ # _

    _ _ _ _ _

    Format output any way you want, just make it readable. Each run of the program
    with big enough dictionary should give a different solution. Example results for
    a simple template:

    F U G U E

    U U R

    D E I G N

    G D S

    E J E C T



    R E S T S

    I K T

    N I E C E

    D W E

    S U S A N

    Bonus 1 (simple). Allow partially pre-filled templates:

    # # _ # # # # M

    _ _ _ _ _ _ # A

    # # _ # # _ # T

    R U B Y Q U I Z

    U # _ # # _ # #

    B # _ _ _ _ _ _

    Y # # # # _ # #

    One of result variants:

    U M

    P A S C A L A

    A O T

    R U B Y Q U I Z

    U L N

    B Y E A G E R

    Y E

    Bonus 2: Avoid combinatorial explosion. Fill a big template within reasonable
    time:

    _ _ _ _ # _ _ _ _ _ _ _ _ _ # _ _ _ _

    _ # _ # _ # _ # _ # _ # _ # _ # _ # _

    _ _ _ _ _ _ _ _ _ # _ _ _ _ _ _ _ _ _

    _ # _ # _ # _ # _ _ _ # _ # _ # _ # _

    # _ _ _ _ # _ # _ # _ # _ # _ _ _ _ #

    _ # _ # # # _ _ _ _ _ _ _ # # # _ # _

    _ _ _ _ _ _ # # _ # _ # # _ _ _ _ _ _

    _ # _ # # _ # _ _ _ _ _ # _ # # _ # _

    _ _ _ _ _ _ _ _ # _ # _ _ _ _ _ _ _ _

    _ # # _ # _ # _ _ _ _ _ # _ # _ # # _

    _ _ _ _ _ _ _ _ # _ # _ _ _ _ _ _ _ _

    _ # _ # # _ # _ _ _ _ _ # _ # # _ # _

    _ _ _ _ _ _ # # _ # _ # # _ _ _ _ _ _

    _ # _ # # # _ _ _ _ _ _ _ # # # _ # _

    # _ _ _ _ # _ # _ # _ # _ # _ _ _ _ #

    _ # _ # _ # _ # _ _ _ # _ # _ # _ # _

    _ _ _ _ _ _ _ _ _ # _ _ _ _ _ _ _ _ _

    _ # _ # _ # _ # _ # _ # _ # _ # _ # _

    _ _ _ _ # _ _ _ _ _ _ _ _ _ # _ _ _ _



    C H A O L E N I E N T L Y C O I L

    Y V L M D O E O K A

    S H A R E A B L E O E D I P A L L Y

    T I A R N U N G E A S

    F L I P Y T T E C O H N

    I A O L I V I E R O I

    R E B U F F F M A S H M A N

    R L A B Y T E S D A T

    I N E R T I A L Y I S S U A N C E

    T U R A S P E N O D N

    A L T E R E R S E U P R O O T E D

    T O S E A S E S B O I

    E X T A N T B N S U N T A N

    S T U T R E C H T A G

    W E E P N A L R D E L L

    P R U T M A O U A L M

    R E I N S E R T S S O M E W H E R E

    O N H U O E A N R A

    S A G S B E R N A D I N E U S E D
     
    Ruby Quiz, Jul 27, 2007
    #1
    1. Advertising

  2. Re: Crossword Solver (#132)

    Where can I get a file with the dictionary list? I would rather not
    have to make my own.
    --
    Posted via http://www.ruby-forum.com/.
     
    Lloyd Linklater, Jul 27, 2007
    #2
    1. Advertising

  3. Ruby Quiz

    Hans Fugal Guest

    Re: Crossword Solver (#132)

    Lloyd Linklater wrote:
    > Where can I get a file with the dictionary list? I would rather not
    > have to make my own.


    /usr/share/dict/words on most any *NIX.
     
    Hans Fugal, Jul 27, 2007
    #3
  4. Re: Crossword Solver (#132)

    Ball, Donald A Jr (Library) wrote:
    > http://wordlist.sourceforge.net/ is a good resource, especially if
    > you're on Windows and don't have /usr/share/dict/words


    Perfect! I got this, then chose the 6 files that were straight
    wordlists. I wrote a program to read each file, change it to uppercase,
    get rid of non alpha characters (which are no good for the puzzle), get
    rid of all duplicates and write the new (sorted) list to a new file. It
    has something like 108k words. nice!

    Thanks all!
    --
    Posted via http://www.ruby-forum.com/.
     
    Lloyd Linklater, Jul 27, 2007
    #4
  5. Ruby Quiz wrote:
    >
    > Write a Ruby crossword solver. Randomly fill a crossword template with words
    > from a dictionary. Use any dictionary you want (/usr/share/dict/words, text of
    > pickaxe, etc.).
    >


    This solution requires Gecode/R ( http://gecoder.rubyforge.org/ ).

    == Overview

    The problem is modelled as a matrix of cells where each cell can be
    assigned a number representing the letters a..z and #. The letters are
    converted to integers in order to use integer variables to represent
    them. Similarly each word is converted to a base 26 number with the
    converted letters as digits. Hence allowing the words to be represented
    as integers as well.

    The constraint that sequences of cells longer than one letter must form
    words is then added by constraining that the linear combination of the
    involved cells forming the corresponding base 26 number, with each
    variable as one digit, must be equal to one of the words converted to a
    number.

    == Weaknesses

    It has two major problems
    * The words converted to numbers have to fit in the range of an integer
    variable, which only allows words of up to 6 characters in my case. This
    can probably be helped by using multiple numbers to represent each word.
    * There's no randomness in the exploration. Gecode itself does support
    custom branching (which would allow such randomness), but Gecode/R does not.

    In some cases the exploration will take a long time. I'm unsure whether
    it's because of flaws in the model or because of the problem's difficulty.

    == Code

    require 'enumerator'
    require 'rubygems'
    require 'gecoder'

    # The base we use when converting words to and from numbers.
    BASE = ('a'..'z').to_a.size
    # The offset of characters compared to digits in word-numbers.
    OFFSET = 'a'[0]
    # The range of integers that we allow converted words to be in. We are
    # only using the unsigned half, we could use both halves, but it would
    complicate
    # things without giving a larger allowed word length.
    ALLOWED_INT_RANGE = 0..Gecode::Raw::Limits::Int::INT_MAX
    # The maximum length of a word allowed.
    MAX_WORD_LENGTH = (Math.log(ALLOWED_INT_RANGE.last) /
    Math.log(BASE)).floor

    # Describes an immutable dictionary which represents all contained words
    # as numbers of base BASE where each digit is the corresponding letter
    # itself converted to a number of base BASE.
    class Dictionary
    # Creates a dictionary from the contents of the specified dictionary
    # file which is assumed to contain one word per line and be sorted.
    def initialize(dictionary_location)
    @word_arrays = []
    File.open(dictionary_location) do |dict|
    previous_word = nil
    dict.each_line do |line|
    word = line.chomp.downcase
    # Only allow words that only contain the characters a-z and are
    # short enough.
    next if previous_word == word or word.size > MAX_WORD_LENGTH or
    word =~ /[^a-z]/
    (@word_arrays[word.length] ||= []) << self.class.s_to_i(word)
    previous_word = word
    end
    end
    end

    # Gets an enumeration containing all numbers representing word of the
    # specified length.
    def words_of_size(n)
    @word_arrays[n] || []
    end

    # Converts a string to a number of base BASE (inverse of #i_to_s ).
    def self.s_to_i(string)
    string.downcase.unpack('C*').map{ |x| x - OFFSET }.to_number(BASE)
    end

    # Converts a number of base BASE back to the corresponding string
    # (inverse of #s_to_i ).
    def self.i_to_s(int)
    res = []
    loop do
    digit = int % BASE
    res << digit
    int /= BASE
    break if int.zero?
    end
    res.reverse.map{ |x| x + OFFSET }.pack('C*')
    end
    end

    class Array
    # Computes a number of the specified base using the array's elements
    # as digits.
    def to_number(base = 10)
    inject{ |result, variable| variable + result * base }
    end
    end

    # Models the solution to a partially completed crossword.
    class Crossword < Gecode::Model
    # The template should take the format described in RubyQuiz #132 . The
    # words used are selected from the specified dictionary.
    def initialize(template, dictionary)
    @dictionary = dictionary

    # Break down the template and create a corresponding square matrix.
    # We let each square be represented by integer variable with domain
    # -1...BASE where -1 signifies # and the rest signify letters.
    squares = template.split(/\n\s*\n/).map!{ |line| line.split(' ') }
    @letters = int_var_matrix(squares.size, squares.first.size,
    -1...BASE)

    # Do an initial pass, filling in the prefilled squares.
    squares.each_with_index do |row, i|
    row.each_with_index do |letter, j|
    unless letter == '_'
    # Prefilled letter.
    @letters[i,j].must == self.class.s_to_i(letter)
    end
    end
    end

    # Add the constraint that sequences longer than one letter must form
    # words. @words will accumelate all word variables created.
    @words = []
    # Left to right pass.
    left_to_right_pass(squares, @letters)
    # Top to bottom pass.
    left_to_right_pass(squares.transpose, @letters.transpose)

    branch_on wrap_enum(@words), :variable => :largest_degree,
    :value => :min
    end

    # Displays the solved crossword in the same format as shown in the
    # quiz examples.
    def to_s
    output = []
    @letters.values.each_slice(@letters.column_size) do |row|
    output << row.map{ |x| self.class.i_to_s(x) }.join(' ')
    end
    output.join("\n\n").upcase.gsub('#', ' ')
    end

    private

    # Parses the template from left to right, line for line, constraining
    # sequences of two or more subsequent squares to form a word in the
    # dictionary.
    def left_to_right_pass(template, variables)
    template.each_with_index do |row, i|
    letters = []
    row.each_with_index do |letter, j|
    if letter == '#'
    must_form_word(letters) if letters.size > 1
    letters = []
    else
    letters << variables[i,j]
    end
    end
    must_form_word(letters) if letters.size > 1
    end
    end

    # Converts a word from integer form to string form, including the #.
    def self.i_to_s(int)
    if int == -1
    return '#'
    else
    Dictionary.i_to_s(int)
    end
    end

    # Converts a word from string form to integer form, including the #.
    def self.s_to_i(string)
    if string == '#'
    return -1
    else
    Dictionary.s_to_i(string)
    end
    end

    # Constrains the specified variables to form a word contained in the
    # dictionary.
    def must_form_word(letter_vars)
    raise 'The word is too long.' if letter_vars.size > MAX_WORD_LENGTH
    # Create a variable for the word with the dictionary's words as
    # domain and add the constraint.
    word = int_var @dictionary.words_of_size(letter_vars.size)
    letter_vars.to_number(BASE).must == word
    @words << word
    end
    end

    puts 'Reading the dictionary...'
    dictionary = Dictionary.new(ARGV.shift || '/usr/share/dict/words')
    puts 'Please enter the template (end with ^D)'
    template = ''
    loop do
    line = $stdin.gets
    break if line.nil?
    template << line
    end
    puts 'Building the model...'
    model = Crossword.new(template, dictionary)
    puts 'Searching for a solution...'
    puts((model.solve! || 'Failed').to_s)

    __END__

    --
    Andreas Launila
     
    Andreas Launila, Jul 29, 2007
    #5
  6. "Ruby Quiz" <> wrote in message
    >
    > Write a Ruby crossword solver. Randomly fill a crossword template with
    > words
    > from a dictionary. Use any dictionary you want (/usr/share/dict/words,
    > text of
    > pickaxe, etc.).
    >


    My solution, slightly optimized for performance version of the code that was
    used originally to generate examples for this quiz:

    class CrossTempl
    def initialize(templ)
    @tmpl=templ.collect { |line| line.split(//) }
    @words=[]
    @tmpl.each_index do |i|
    @tmpl.each_with_index do |char, j|
    if char!='#'
    if (j==0||@tmpl[j-1]=='#') && !@tmpl[j+1].nil? &&
    @tmpl[j+1]!='#'
    @words << [i,j,0] if self.template(i,j,0).include?('_')
    end
    if (i==0||@tmpl[i-1][j]=='#') && !@tmpl[i+1].nil? &&
    @tmpl[i+1][j]!='#'
    @words << [i,j,1] if self.template(i,j,1).include?('_')
    end
    end
    end
    end
    @words.sort! {|a,b| a[0]+a[1]<=>b[0]+b[1]}
    end

    def template(i,j,dir)
    str=''
    chr=@tmpl[j]
    while !chr.nil? && chr!='#'
    str<<chr
    i+=dir
    j+=1-dir
    break if @tmpl.nil?
    chr=@tmpl[j]
    end
    str
    end

    def [](idx)
    ar=@words[idx]
    return nil if ar.nil?
    template(ar[0],ar[1],ar[2])
    end

    def []=(idx,word)
    ar=@words[idx]
    i=ar[0]; j=ar[1];
    word.split(//).each do |chr|
    @tmpl[j]=chr
    i+=ar[2]
    j+=1-ar[2]
    end
    end

    def to_s
    @tmpl.each do |line|
    lstr=line.to_s
    puts lstr.gsub(/#/,' ').gsub(/(.)/) {"#{$&} "}
    end
    end

    end

    $words=[]
    File.open("/usr/share/dict/words") {|f| f.readlines}.each do |word|
    w=word.upcase.delete("^A-Z")
    l=w.length
    $words[l]||=[]
    $words[l]<<w
    end
    $words.each {|ar| ar.uniq! if ar }

    tfile=ARGV[0]
    if tfile.nil?
    STDERR.puts "please provide a template file"
    exit 1
    end

    tmpl=File.read(tfile).split(/\n/).collect{|line| line.gsub(/\s/,'').upcase}
    $ct=CrossTempl.new(tmpl)

    def findWord(i)
    pattern=$ct
    return true if pattern.nil?
    choices=$words[pattern.length].grep(/\A#{pattern.tr("_", ".")}\Z/).sort_by
    { rand }
    until choices.empty?
    guess=choices.pop
    $ct=guess
    puts $ct
    return true if findWord(i+1)
    break if i>0 && rand(2)==1 # fighting incorrect word sorting
    # by un-ballancing search
    end
    $ct=pattern
    return false
    end

    if findWord(0)
    puts $ct
    else
    puts "O-ops"
    end
     
    Eugene Kalenkovich, Jul 31, 2007
    #6
  7. Re: Crossword Solver (#132)

    On Jul 27, 8:10 am, Ruby Quiz <> wrote:
    > The three rules of Ruby Quiz:


    Here is my solution. :D

    http://pastie.caboo.se/83968

    I think there is still a little bug in which search cycles, but it's
    highly improbable (got it once in ~50 runs) Will try to fix it.
     
    Rubén Medellín, Aug 1, 2007
    #7
  8. Andreas Launila wrote:
    > Ruby Quiz wrote:
    >> Write a Ruby crossword solver. Randomly fill a crossword template with words
    >> from a dictionary. Use any dictionary you want (/usr/share/dict/words, text of
    >> pickaxe, etc.).
    >>

    >
    > This solution requires Gecode/R ( http://gecoder.rubyforge.org/ ).
    >


    I thought I would update this solution to use the recently introduced
    tuple constraints. The updated version no longer has a limitation on
    word-lengths and seems to scale well with both the template size and
    dictionary size.

    The model is almost the same as before. Each square is represented by an
    integer variable that can be assigned a number representing one of the
    letters a..z and #. The difference is that a different type of
    constraint is used to force sequences of cells, longer than one letter,
    to form words. The model now uses tuple constraints, which each
    constrain an enumeration of integer variables (e.g. letters that should
    form words) to take the value of one of many tuples (e.g. all tuples
    representing words of the correct length).

    The code requires Gecode/R 0.8.1 to run, which can be installed, along
    with the Gecode 2.1.1 dependency, using

    gem install gecoder-with-gecode


    == Code

    require 'enumerator'
    require 'rubygems'
    require 'gecoder'

    # The alphabet used.
    ALPHABET = ('a'..'z').to_a

    # Describes an immutable dictionary which represents all contained words
    # as an array of integers where each integer represents a letter. Each
    # integer used is between 0 and ALPHABET.size - 1.
    class Dictionary
    # Creates a dictionary from the contents of the specified dictionary
    # file which is assumed to contain one word per line and be sorted.
    def initialize(dictionary_location)
    @word_arrays = []
    File.open(dictionary_location) do |dict|
    previous_word = nil
    # A regexp that matches strings not in our alphabet.
    not_in_alphabet = /[^#{ALPHABET.join}]/
    dict.each_line do |line|
    word = line.chomp.downcase
    # Only allow words that are in our alphabet.
    next if previous_word == word or not_in_alphabet.match(word)
    (@word_arrays[word.length] ||= []) <<
    self.class.s_to_i_array(word)
    previous_word = word
    end
    end
    end

    # Gets an enumeration containing all arrays of integers representing
    # word of the specified length.
    def words_of_size(n)
    @word_arrays[n] || []
    end

    # Converts a string to an array of integers (inverse
    # of #i_array_to_s ).
    def self.s_to_i_array(string)
    if @c_to_i_map.nil?
    @c_to_i_map =
    Hash[*ALPHABET.zip((0...ALPHABET.size).to_a).flatten]
    end
    @c_to_i_map.values_at *string.scan(/./)
    end

    # Converts an array of integers back to the corresponding string
    # (inverse of #s_to_i_array ).
    def self.i_array_to_s(int_array)
    if @i_to_c_map.nil?
    @i_to_c_map = ALPHABET
    end
    @i_to_c_map.values_at *int_array
    end
    end

    # Models the solution to a partially completed crossword.
    class Crossword < Gecode::Model
    # The template should take the format described in RubyQuiz #132 . The
    # words used are selected from the specified dictionary.
    def initialize(template, dictionary)
    @dictionary = dictionary

    # Break down the template and create a corresponding square matrix.
    # We let each square be represented by an integer variable with
    # domain -1...BASE where -1 signify # and the rest signify letters.
    squares = template.split(/\n\s*\n/).map!{ |line| line.split(' ') }
    @letters =
    int_var_matrix(squares.size, squares.first.size,
    -1...ALPHABET.size)

    # Do an initial pass, filling in the prefilled squares.
    squares.each_with_index do |row, i|
    row.each_with_index do |letter, j|
    unless letter == '_'
    # Prefilled letter.
    @letters[i,j].must == self.class.s_to_i_array(letter).first
    end
    end
    end

    # Add the constraint that sequences longer than one letter must form
    # words.
    # Left to right pass.
    left_to_right_pass(squares, @letters)
    # Top to bottom pass.
    left_to_right_pass(squares.transpose, @letters.transpose)

    # Branch on intersections first.
    branch_on @letters, :variable => :largest_degree, :value => :min
    end

    # Displays the solved crossword in the same format as shown in the
    # quiz examples.
    def to_s
    output = []
    @letters.values.each_slice(@letters.column_size) do |row|
    output << row.map{ |x| self.class.i_array_to_s([x]) }.join(' ')
    end
    output.join("\n\n").upcase.gsub('#', ' ')
    end

    private

    # Parses the template from left to right, line for line, constraining
    # sequences of two or more subsequent squares to form a word in the
    # dictionary.
    def left_to_right_pass(template, variables)
    template.each_with_index do |row, i|
    letters = []
    row.each_with_index do |letter, j|
    if letter == '#'
    must_form_word(letters) if letters.size > 1
    letters = []
    else
    letters << variables[i,j]
    end
    end
    must_form_word(letters) if letters.size > 1
    end
    end

    # Converts a word from integer form to string form, including the #.
    def self.i_array_to_s(int_array)
    if int_array == [-1]
    return '#'
    else
    Dictionary.i_array_to_s(int_array)
    end
    end

    # Converts a word from string form to integer form, including the #.
    def self.s_to_i_array(string)
    if string == '#'
    return [-1]
    else
    Dictionary.s_to_i_array(string)
    end
    end

    # Constrains the specified variables to form a word contained in the
    # dictionary.
    def must_form_word(letters)
    words_of_right_size = @dictionary.words_of_size(letters.size)
    if words_of_right_size.empty?
    raise RuntimeError, "There are no words of size #{letters.size}."
    end
    # Add the tuple constraint. Use propagation kind :memory if you run
    # out of memory.
    wrap_enum(letters).must_be.in words_of_right_size, :kind => :speed
    end
    end

    puts 'Reading the dictionary'
    dictionary = Dictionary.new(ARGV.shift || '/usr/share/dict/words')
    puts 'Enter the template (end with ^D)'
    template = ''
    loop do
    line = $stdin.gets
    break if line.nil?
    template << line
    end
    puts 'Building the model'
    model = Crossword.new(template, dictionary)
    puts 'Searching for a solution'
    puts((model.solve! || 'Failed').to_s)

    __END__

    --
    Andreas Launila
     
    Andreas Launila, Apr 24, 2008
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ruby Quiz

    [QUIZ] crossword.rb (#10)

    Ruby Quiz, Dec 3, 2004, in forum: Ruby
    Replies:
    2
    Views:
    122
    Gavin Kistner
    Dec 3, 2004
  2. Lyndon Samson

    Idea for Ruby Quiz - Su Doku solver

    Lyndon Samson, Apr 23, 2005, in forum: Ruby
    Replies:
    6
    Views:
    105
    James Edward Gray II
    Apr 24, 2005
  3. Ruby Quiz

    [QUIZ] Sodoku Solver (#43)

    Ruby Quiz, Aug 19, 2005, in forum: Ruby
    Replies:
    39
    Views:
    362
    Josef 'Jupp' SCHUGT
    Aug 25, 2005
  4. Ruby Quiz

    [SUMMARY] Crossword Solver (#132)

    Ruby Quiz, Aug 2, 2007, in forum: Ruby
    Replies:
    0
    Views:
    138
    Ruby Quiz
    Aug 2, 2007
  5. Martin Rinehart

    80 columns wide? 132 columns wide?

    Martin Rinehart, Oct 31, 2008, in forum: Javascript
    Replies:
    16
    Views:
    181
    John W Kennedy
    Nov 13, 2008
Loading...

Share This Page