[QUIZ] Scrabble Stems (#12)

Discussion in 'Ruby' started by Ruby Quiz, Dec 17, 2004.

  1. Ruby Quiz

    Ruby Quiz Guest

    The three rules of Ruby Quiz:

    1. Please do not post any solutions or spoiler discussion for this quiz until
    48 hours have passed from the time on this message.

    2. Support Ruby Quiz by submitting ideas as often as you can:

    http://www.grayproductions.net/ruby_quiz/

    3. Enjoy!

    -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    by Martin DeMello

    In Scrabble parlance, a 'bingo stem' is defined as a set of six
    letters that combines with a large fraction of the alphabet to anagram
    to valid seven letter words (a 'bingo' is a move using all seven tiles
    on your rack). For instance, one of the more prolific stems, SATIRE,
    combines with twenty letters, A, B, C, D, E, F, G, H, I, K, L, M, N,
    O, P, R, S, T, V, W (forming words like ASTERIA, BAITERS, RACIEST
    ...).

    Write a program that, given a word list and a cutoff n, finds all 6
    letter stems that combine with n or more letters, sorted in order of
    how many distinct letters they combine with.
    Ruby Quiz, Dec 17, 2004
    #1
    1. Advertising

  2. When talking of 6 letter stems, do I have to find all the 6 letter
    stems that also have a meaning by themselves? (Like Satire for
    example). Or may I find even nonsense stems?


    On Fri, 17 Dec 2004 22:45:39 +0900, Ruby Quiz <> wrote:
    > The three rules of Ruby Quiz:
    >
    > 1. Please do not post any solutions or spoiler discussion for this quiz until
    > 48 hours have passed from the time on this message.
    >
    > 2. Support Ruby Quiz by submitting ideas as often as you can:
    >
    > http://www.grayproductions.net/ruby_quiz/
    >
    > 3. Enjoy!
    >
    > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    >
    > by Martin DeMello
    >
    > In Scrabble parlance, a 'bingo stem' is defined as a set of six
    > letters that combines with a large fraction of the alphabet to anagram
    > to valid seven letter words (a 'bingo' is a move using all seven tiles
    > on your rack). For instance, one of the more prolific stems, SATIRE,
    > combines with twenty letters, A, B, C, D, E, F, G, H, I, K, L, M, N,
    > O, P, R, S, T, V, W (forming words like ASTERIA, BAITERS, RACIEST
    > ...).
    >
    > Write a program that, given a word list and a cutoff n, finds all 6
    > letter stems that combine with n or more letters, sorted in order of
    > how many distinct letters they combine with.
    >
    >
    Giovanni Intini, Dec 18, 2004
    #2
    1. Advertising

  3. On Dec 18, 2004, at 6:16 AM, Giovanni Intini wrote:

    > When talking of 6 letter stems, do I have to find all the 6 letter
    > stems that also have a meaning by themselves? (Like Satire for
    > example). Or may I find even nonsense stems?


    I don't believe the stems meaning or lack thereof is significant.

    James Edward Gray II
    James Edward Gray II, Dec 18, 2004
    #3
  4. Giovanni Intini <> wrote:
    > When talking of 6 letter stems, do I have to find all the 6 letter
    > stems that also have a meaning by themselves? (Like Satire for
    > example). Or may I find even nonsense stems?


    No, all six letter stems count. The "meaningful" rearrangement is just a
    handy way to refer to the stem.

    martin
    Martin DeMello, Dec 18, 2004
    #4
  5. Ruby Quiz

    Glenn Parker Guest

    Ruby Quiz wrote:
    >
    > In Scrabble parlance, a 'bingo stem' is defined as a set of six
    > letters that combines with a large fraction of the alphabet to anagram
    > to valid seven letter words.
    >
    > Write a program that, given a word list and a cutoff n, finds all 6
    > letter stems that combine with n or more letters, sorted in order of
    > how many distinct letters they combine with.


    If this quiz is really just about Scrabble, can we assume the input word
    list will contain only 7-letter words?

    If I filter my dictionary (2of4brif.txt) down to just 7-letter words, I
    can handle this task in less than an hour of wall-clock time on a 2GHz
    PC. If I allow all words with 7 or more letters, I don't know if I'll
    be able to finish the processing in under a week's time.

    I've certainly learned a few things about Ruby performance tuning along
    the way.

    --
    Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>
    Glenn Parker, Dec 18, 2004
    #5
  6. On Dec 18, 2004, at 3:42 PM, Glenn Parker wrote:

    > Ruby Quiz wrote:
    >> In Scrabble parlance, a 'bingo stem' is defined as a set of six
    >> letters that combines with a large fraction of the alphabet to anagram
    >> to valid seven letter words.
    >> Write a program that, given a word list and a cutoff n, finds all 6
    >> letter stems that combine with n or more letters, sorted in order of
    >> how many distinct letters they combine with.

    >
    > If this quiz is really just about Scrabble, can we assume the input
    > word list will contain only 7-letter words?


    I filtered out words less than and more than seven letters.

    > If I filter my dictionary (2of4brif.txt) down to just 7-letter words,
    > I can handle this task in less than an hour of wall-clock time on a
    > 2GHz PC. If I allow all words with 7 or more letters, I don't know if
    > I'll be able to finish the processing in under a week's time.
    >
    > I've certainly learned a few things about Ruby performance tuning
    > along the way.


    Hmmm, I would say that it doesn't HAVE to take that long. My program
    runs in under 30 seconds. ;)

    James Edward Gray II
    James Edward Gray II, Dec 18, 2004
    #6
  7. Ruby Quiz

    Glenn Parker Guest

    James Edward Gray II wrote:
    > Hmmm, I would say that it doesn't HAVE to take that long. My program
    > runs in under 30 seconds. ;)


    Thanks, stripping it down to handle only 7-letter words made a big
    difference. It runs in well under 4 seconds now. :)

    --
    Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>
    Glenn Parker, Dec 18, 2004
    #7
  8. From the original post it looked like you could have anything longer than 6.
    Giovanni Intini, Dec 18, 2004
    #8
  9. On Dec 18, 2004, at 5:49 PM, Giovanni Intini wrote:

    > From the original post it looked like you could have anything longer
    > than 6.


    In the game of Scrabble, you are limited to seven "tiles". You get 50
    points when you can play all seven at once.

    Technically, that could be in longer words (using letters already on
    the board). I didn't get the impression this quiz was concerned with
    that though.

    James Edward Gray II
    James Edward Gray II, Dec 19, 2004
    #9
  10. Ruby Quiz

    Glenn Parker Guest

    James Edward Gray II wrote:
    > On Dec 18, 2004, at 5:49 PM, Giovanni Intini wrote:
    >
    >> From the original post it looked like you could have anything longer
    >> than 6.

    >
    > In the game of Scrabble, you are limited to seven "tiles". You get 50
    > points when you can play all seven at once.


    The quiz alluded to Scrabble, but I have to agree that the phrasing was
    still ambiguous about the length of the input words.

    --
    Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>
    Glenn Parker, Dec 19, 2004
    #10
  11. Giovanni Intini <> wrote:
    > From the original post it looked like you could have anything longer than 6.


    My bad, then - I meant six letter stems that form seven letter words
    when combined with at least n out of the 26 letters in the alphabet.
    There are seven-to-make-eight stems too, but as James noted, the quiz is
    not concerned with them (it's a trivial extension, though).

    martin
    Martin DeMello, Dec 19, 2004
    #11
  12. Ruby Quiz

    Carlos Guest

    Re: [QUIZ][SOLUTION] Scrabble Stems (#12)

    [Ruby Quiz <>, 2004-12-17 14.45 CET]
    > In Scrabble parlance, a 'bingo stem' is defined as a set of six
    > letters that combines with a large fraction of the alphabet to anagram
    > to valid seven letter words (a 'bingo' is a move using all seven tiles
    > on your rack). For instance, one of the more prolific stems, SATIRE,
    > combines with twenty letters, A, B, C, D, E, F, G, H, I, K, L, M, N,
    > O, P, R, S, T, V, W (forming words like ASTERIA, BAITERS, RACIEST
    > ...).
    >
    > Write a program that, given a word list and a cutoff n, finds all 6
    > letter stems that combine with n or more letters, sorted in order of
    > how many distinct letters they combine with.


    Here is a possible (I hope correct) solution. SATIRE combines only with 16
    letters according to my dictionary.

    DICT = "/usr/share/dict/words"
    CUTOFF = ARGV[0].to_i

    STEMS = {}

    File.open(DICT) do |f|
    f.each do |word|
    word.chomp!
    next if word.length != 7
    word.downcase!
    letters = word.split(//).sort!
    uniques = letters.uniq
    word = letters.join
    uniques.each do |letter|
    stem = word.sub(/#{letter}/, "")
    (STEMS[stem] ||= {})[letter] = 1
    end
    end
    end

    result = STEMS.delete_if { |k,v| v.size < CUTOFF }.
    sort_by { |k,v| v.size }.
    reverse!.
    collect! { |k,v| [k, v.size] }

    result.each do |stem, combining| puts "#{stem} #{combining}" end
    Carlos, Dec 19, 2004
    #12
  13. Ruby Quiz

    Glenn Parker Guest

    Re: [QUIZ] [SOLUTION] Scrabble Stems (#12)

    --------------030707030609070201020205
    Content-Type: text/plain; charset=ISO-8859-1; format=flowed
    Content-Transfer-Encoding: 7bit

    > by Martin DeMello
    >
    > In Scrabble parlance, a 'bingo stem' is defined as a set of six
    > letters that combines with a large fraction of the alphabet to anagram
    > to valid seven letter words (a 'bingo' is a move using all seven tiles
    > on your rack). For instance, one of the more prolific stems, SATIRE,
    > combines with twenty letters, A, B, C, D, E, F, G, H, I, K, L, M, N,
    > O, P, R, S, T, V, W (forming words like ASTERIA, BAITERS, RACIEST
    > ...).
    >
    > Write a program that, given a word list and a cutoff n, finds all 6
    > letter stems that combine with n or more letters, sorted in order of
    > how many distinct letters they combine with.


    Here's my solution. My earlier performance problems resulted from
    several very inefficient attempts at generating unique letter
    combinations, i.e. list all unique sets of "N letters taken 6 at a
    time". Reducing the problem to "7 letters taken 6 at a time" made that
    part trivial, but I kept plugging at the more general problem.

    The dictionary I used was 2of4brif.txt, available as part of the 12Dicts
    package at http://prdownloads.sourceforge.net/wordlist/12dicts-4.0.zip

    On my 2GHz P4, I get 5319 stems with a cutoff of 3 in 4.062 seconds. A
    version optimized for 7-letter words (not shown) takes barely over 2
    seconds.

    If I swap line 47 in for line 48, so that all words of 7 letters or more
    are processed, I get 167,227 stems with a cutoff of 3 in 2196.453
    seconds (~37 min.).

    --
    Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>

    --------------030707030609070201020205
    Content-Type: text/plain;
    name="bingo1.rb"
    Content-Transfer-Encoding: 7bit
    Content-Disposition: inline;
    filename="bingo1.rb"

    #!/usr/bin/env ruby

    class Combiner
    include Enumerable

    def initialize(pick, from)
    # How many objects should be returned in each set
    @pick = pick
    # List of objects from which to produce combinations
    @from = from
    end

    def each
    unpicked = @from.length - @pick
    # Indices of objects in @from to return as a set.
    selectors = (0...@pick).to_a
    loop do
    yield selectors.collect {|i| @from}
    # Advance the selectors array to the next combination.
    s = @pick - 1
    s -= 1 while selectors == s + unpicked
    break if s < 0
    value = selectors
    s.upto(@pick - 1) do |i|
    selectors = (value += 1)
    end
    end
    end
    end

    class BingoStems
    STEMLENGTH = 6

    def initialize(filename, cutoff)
    @filename = filename
    @cutoff = cutoff
    @stems = {}
    end

    def find
    start_time = Time.now
    # Collect stems for each word in the word list
    File.open(@filename) do |file|
    file.each do |word|
    word.chomp!
    # Use the next line alternative to process longer words.
    #next if word.length <= STEMLENGTH
    next if word.length != (STEMLENGTH + 1)
    word.downcase!
    Combiner.new(STEMLENGTH, word.split(//).sort).each do |stem|
    (@stems[stem.join('')] ||= {})[word] = true
    end
    end
    end
    # Discard stems with less than @cutoff word combos
    @stems.delete_if {|k, v| v.length < @cutoff}
    puts "found #{@stems.length} stems in #{Time.now - start_time} seconds"
    self
    end

    def report(stream = $stdout, verbose = false)
    keys = @stems.keys.sort_by {|a| @stems[a].length}
    keys.each do |key|
    stream.puts "#{key}: #{@stems[key].length}"
    next unless verbose
    @stems[key].each_key do |word|
    stream.puts " " + word
    end
    end
    stream.flush
    end

    end

    CUTOFF = ARGV[0].to_i
    DICT = ARGV[1]
    BingoStems.new(DICT, CUTOFF).find.report(File.open('output.txt', 'w'), true)

    --------------030707030609070201020205--
    Glenn Parker, Dec 19, 2004
    #13
  14. Ruby Quiz

    Dennis Ranke Guest

    Re: [QUIZ][SOLUTION] Scrabble Stems (#12)

    Carlos wrote:
    > Here is a possible (I hope correct) solution. SATIRE combines only with 16
    > letters according to my dictionary.


    Very simmilar to my solution: (it finds 18 combining letters for SATIRE
    using the ENABLE2K word list)

    hash = Hash.new {|h, k| h[k] = 0}
    File.foreach(ARGV[0] || 'WORD.LST') do |line|
    line.strip!
    if line.size == 7
    letters = line.downcase.scan(/./).sort.join
    7.times do |i|
    hash[letters[0, i] + letters[(i+1)..-1]] |= 1 << (letters - 97)
    end
    end
    end

    cutoff = (ARGV[1] || '15').to_i
    count = {}
    hash.each do |k, v|
    v = (v & 0x5555555) + ((v>>1) & 0x5555555)
    v = (v & 0x3333333) + ((v>>2) & 0x3333333)
    v = (v & 0xf0f0f0f) + ((v>>4) & 0xf0f0f0f)
    v = (v & 0x0ff00ff) + ((v>>8) & 0x0ff00ff)
    v = (v & 0x000ffff) + ((v>>16) & 0x000ffff)
    count[k] = v if v >= cutoff
    end

    count.keys.sort_by {|k| count[k]}.each do |letters|
    printf "%s: (%d) ", letters, count[letters]
    combi = hash[letters]
    26.times do |i|
    print((i+97).chr) if combi == 1
    end
    puts
    end
    Dennis Ranke, Dec 19, 2004
    #14
  15. Re: [QUIZ] [SOLUTION] Scrabble Stems (#12)

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    This is my solution for _meaningful_ stems (words in the dictionary):
    #Usage: <Dictionary> <Stemsize> <Cutoff>

    #On my machine:
    #8728 stems
    #13969 possible bingos
    #1141 stems have more than 6 bingos
    #stemsize is 6
    #bingosize is 7
    #brute force needs 3169957232 comparisons
    #I need 226928 hash lookups
    #22.270484 sec (searching bingos)
    #6.235895 sec (reading dictionary)
    #28.506379 sec (total)
    #0.00326608375343721 sec / stem

    stemsize = (ARGV[1] || 6).to_i
    minbingos = (ARGV[2] || 6).to_i

    alphabeth=("a".."z").to_a;
    stems = []
    bingowords = []

    results = {}

    start1 = Time.new
    File.foreach(File.expand_path(ARGV[0]||"~/dict")) do |line|
    chomped = line.chomp.downcase
    stems << chomped if chomped.size == stemsize
    bingowords << chomped if chomped.size == stemsize+1
    end

    start2 = Time.new

    sbingos = bingowords.map{|word| word.split("").sort.join}
    sbingohash={}

    sbingos.each_index do |index|
    sbingohash[sbingos[index]]=bingowords[index]
    end



    stems.each do |stem|

    bingosfound = [];
    xbingo = stem.split("")
    alphabeth.each do |char|
    stbingo = (xbingo+[char]).sort.join
    fbingo = sbingohash[stbingo];
    bingosfound << fbingo if fbingo
    end


    if bingosfound.size >= minbingos
    results[stem] = bingosfound

    end

    end
    out =""
    results.to_a.sort_by{|e|-(e[1].size)}.each do |result|
    out << "#{result[0]} #{result[1].size} #{result[1].join","}\n"
    end
    done = Time.new;

    puts "\nOrdered:\n\n"
    puts out
    puts "#{stems.size} stems\n#{bingowords.size} possible
    bingos\n#{results.size} stems have more than #{minbingos} bingos"
    puts "stemsize is #{stemsize}\nbingosize is #{stemsize+1}"
    puts "brute force needs #{stems.size*alphabeth.size*bingowords.size}
    comparisons"
    puts "I need #{stems.size*alphabeth.size} hash lookups"
    puts "#{(done-start2).to_s} sec (searching bingos)"
    puts "#{(start2-start1).to_s} sec (reading dictionary)"
    puts "#{(done-start1).to_s} sec (total)"
    puts "#{((done-start1).to_f/stems.size).to_s} sec / stem"








    Jannis Harder

    "jp6iSZmkLp5ISZlEiW5C".unpack("m")[0].unpack("C*").map{|x|x.chr}.join.
    unpack("B*")[0].scan(/.{24}/){i=7;$&.scan(/..../){print\
    "\e[3#{i-=1};1;40m ";$&.each_byte{|z|print" #"[z-?0,1]*2}};puts"\e[0m"}
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.6 (Darwin)

    iD8DBQFBxfY65YRWfc27RzQRAkJXAKCSR8rsHajHIrFORwy1mstEbQ4XOwCeIscl
    Y4O3UkTlCLMZnD0F27a21cg=
    =aRbb
    -----END PGP SIGNATURE-----
    Jannis Harder, Dec 19, 2004
    #15
  16. Ruby Quiz

    Carlos Guest

    Re: [QUIZ][SOLUTION] Scrabble Stems (#12)

    [Julius Plenz <>, 2004-12-19 21.37 CET]
    > * Carlos <> [2004-12-19]:
    > > DICT = "/usr/share/dict/words"

    >
    > ....
    >
    > > stem = word.sub(/#{letter}/, "")


    Oops... no need to interpolate. This line should be

    stem = word.sub(letter, "")

    as in Martin DeMello's solution.

    Thanks.
    Carlos, Dec 19, 2004
    #16
  17. Re: [SOLUTION] Scrabble Stems (#12)

    Here's my solution. It doesn't really have any clever optimizations,
    but it runs in seconds.

    James Edward Gray II

    #!/usr/bin/env ruby

    unless ARGV.size >= 2 and ARGV[1] =~ /^[1-9]|1\d|2[0-6]$/
    puts "Usage: #$0 WORD_LIST_FILE(S) MINIMUM_STEM_LIMIT"
    exit
    end

    $limit = ARGV.pop.to_i
    $stems = { }

    while line = ARGF.gets
    line.chomp!
    line.tr!("^a-zA-Z", "")
    line.downcase!

    next unless line.length == 7

    word = line.split("")
    word.each_index do |i|
    stem = word.dup
    stem.delete_at(i)
    stem = stem.sort.join

    if $stems.include?(stem)
    $stems[stem] << word unless $stems[stem].include?(word)
    else
    $stems[stem] = [ word ]
    end
    end
    end

    $stems.each_pair do |stem, letters|
    next if letters.size < $limit

    puts stem
    puts "\t#{letters.size}: #{letters.sort.join}"
    end
    James Edward Gray II, Dec 20, 2004
    #17
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Charlie Cosse

    Tux Math Scrabble v2.2

    Charlie Cosse, Jul 29, 2003, in forum: Python
    Replies:
    0
    Views:
    438
    Charlie Cosse
    Jul 29, 2003
  2. Replies:
    6
    Views:
    1,359
  3. Martin DeMello

    [QUIZ][SOLUTION] Scrabble Stems (#12)

    Martin DeMello, Dec 19, 2004, in forum: Ruby
    Replies:
    1
    Views:
    90
    Andrew Johnson
    Dec 19, 2004
  4. Ruby Quiz

    [SUMMARY] Scrabble Stems (#12)

    Ruby Quiz, Dec 23, 2004, in forum: Ruby
    Replies:
    0
    Views:
    99
    Ruby Quiz
    Dec 23, 2004
  5. Replies:
    0
    Views:
    364
Loading...

Share This Page