[QUIZ] Posix Pangrams (#97)

R

Ruby Quiz

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Martin DeMello

A pangram is a sentence containing every letter of the alphabet at least once (a
famous example in English being "the quick brown fox jumps over the lazy dog").
For maximum style points a pangram should read smoothly, and have both as few
repeated letters as possible (ideally zero), and as few words as possible.

This quiz extends the idea to the posix utilities[1] - write a program to find
pangrammatic collections of posix utilities that (1) use the fewest utilities
and (2) have the minimum number of repeated letters. In either case, break ties
on the other criterion; that is, your first solution should also have as few
repeated letters as possible, and your second one should use as few utilities as
possible.

[1] http://www.unix.org/version3/apis/cu.html has a complete list
 
M

Martin Coxall

Interesting. I had a high school chemistry teacher who had us play
similar games with symbols of the periodic table.

It's really all about trying to make swearwords, isn't it? Copper is
your friend.

On the unix front, I nominate 'mingetty'.

Martin
 
D

David Vallner

--------------enig2FD4F54F3DF30327EF6B2A44
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Martin said:
It's really all about trying to make swearwords, isn't it? Copper is
your friend.
=20
On the unix front, I nominate 'mingetty'.
=20

My innocent eyes!

*hurries to cover the pickle jar with small white round objects floating
in formaldehyde with a cloth*

David Vallner


--------------enig2FD4F54F3DF30327EF6B2A44
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (MingW32)

iD8DBQFFJxgSy6MhrS8astoRAhNMAJ9RpiAvS9aKGIkoKoN+3/0mF0ofqQCfYN0N
fxuu1sqtY/i26cJKFwIdnkM=
=cfHc
-----END PGP SIGNATURE-----

--------------enig2FD4F54F3DF30327EF6B2A44--
 
H

Hal Fulton

Christian said:
We use to play that because the lessons are so boring. :)

Longest English words made up from chemical symbols:

THErMoPHOsPHOrEsCEnCe
NONRePReSEnTaTIONAlISm

Longest German words made up from chemical symbols:

INFLaTiONSINdIKAtOReN
AuSLaNdSINVEsTiTIONeN
KNOTeNReCHNErAPPLiKAtION


I love that kind of wordplay. I appreciate your
sharing it.


Thanks,
Hydrogen Aluminum
 
M

Morton Goldberg

Here is my solution. It's not what I planned. I intended to submit an
algorithm that would find all the minimal Posix pangrams, and I
worked out a rather nice (IMO) recursive algorithm to do just that.
But Ruby doesn't allocate enough stack space for that algorithm to
run to completion <sigh>. So I've settled on an algorithm that finds
just one minimal pangram, but uses pseudo-random numbers to produce a
different pangram each time it is run.

My best pangram so far is

[
"fg", "jobs", "qhold", "stty", "umask",
"unexpand", "vi", "write", "zcat"
]

Regards, Morton

<code>
#! /usr/bin/env ruby -w
#
# Created by Morton Goldberg on October 08, 2006.
#
# Ruby Quiz 97 -- POSIX Pangrams
# quiz_97.3.rb

class Array

def shuffle!
n = size - 1
while n > 0
k = rand(n)
self[k], self[n] = self[n], self[k]
n -= 1
end
self
end

end

# Removed c99, fort77, and m4 -- the complication they add is IMHO
# unedifying.
WORDS = %w[
admin alias ar asa at awk basename batch bc bg cal cat cd cflow
chgrp chmod chown cksum cmp comm command compress cp crontab csplit
ctags cut cxref date dd delta df diff dirname du echo ed env ex
expand
expr false fc fg file find fold fuser gencat get getconf getopts
grep hash head iconv id ipcrm ipcs jobs join kill lex link ln locale
localedef logger logname lp ls mailx make man mesg mkdir mkfifo more
mv newgrp nice nl nm nohup od paste patch pathchk pax pr printf
prs ps
pwd qalter qdel qhold qmove qmsg qrerun qrls qselect qsig qstat qsub
read renice rm rmdel rmdir sact sccs sed sh sleep sort split strings
strip stty tabs tail talk tee test time touch tput tr true tsort tty
type ulimit umask unalias uname uncompress unexpand unget uniq
unlink
uucp uudecode uuencode uustat uux val vi wait wc what who write
xargs
yacc zcat
]

# Return true if _wds_ is a pangram.
def pangram?(wds)
wds.join.split(//).uniq.size == 26
end

# Return array giving pangram statistics:
# [<words>, <total-chars>, <repeated-chars>]
def stats(pan)
tmp = pan.join.split(//)
[pan.size, tmp.size, tmp.size - tmp.uniq.size]
end

# Given a pangram, return list of pangrams, where each panaram in the
list
# is derived from the given one by removing one word.
def diminish(pan)
result = pan.collect do |item|
rest = pan - [item]
rest if pangram?(rest)
end
result.compact.shuffle!
end

# Given a list of pangrams return a minimal pangram that can be derived
# from it.
def find_minimal(pans)
pan = pans.pop
reduced = diminish(pan)
return pan if reduced.empty?
find_minimal(reduced)
end

# Find a minimal pangram.
pangram = find_minimal([WORDS])
p pangram # =>
# [
# "fg", "jobs", "qhold", "stty", "umask",
# "unexpand", "vi", "write", "zcat"
# ]
p stats(pangram) # => [9, 39, 13]
</code>
 
M

Morton Goldberg

I woke up this morning with the realization that the code I posted
yesterday could be considerably simplified. Here is the simpler version.

Regards, Morton

<code>
#! /usr/bin/env ruby -w
#
# Created by Morton Goldberg on October 09, 2006.
#
# Ruby Quiz 97 -- POSIX Pangrams
# quiz_97.4.rb

# Removed c99, fort77, and m4 -- the complication they add is IMHO
# unedifying.
WORDS = %w[
admin alias ar asa at awk basename batch bc bg cal cat cd cflow
chgrp chmod chown cksum cmp comm command compress cp crontab csplit
ctags cut cxref date dd delta df diff dirname du echo ed env ex
expand
expr false fc fg file find fold fuser gencat get getconf getopts
grep hash head iconv id ipcrm ipcs jobs join kill lex link ln locale
localedef logger logname lp ls mailx make man mesg mkdir mkfifo more
mv newgrp nice nl nm nohup od paste patch pathchk pax pr printf
prs ps
pwd qalter qdel qhold qmove qmsg qrerun qrls qselect qsig qstat qsub
read renice rm rmdel rmdir sact sccs sed sh sleep sort split strings
strip stty tabs tail talk tee test time touch tput tr true tsort tty
type ulimit umask unalias uname uncompress unexpand unget uniq
unlink
uucp uudecode uuencode uustat uux val vi wait wc what who write
xargs
yacc zcat
]

# Return true if _wds_ is a pangram.
def pangram?(wds)
wds.join.split(//).uniq.size == 26
end

# Return array giving pangram statistics:
# [<words>, <total-chars>, <repeated-chars>]
def stats(pan)
tmp = pan.join.split(//)
[pan.size, tmp.size, tmp.size - tmp.uniq.size]
end

# Given a pangram, return a pangram derived from it by removing one
word.
def remove_one(pan)
result = pan.collect do |item|
diff = pan - [item]
diff if pangram?(diff)
end
result.compact!
result[rand(result.size)] unless result.empty?
end

# Given a pangram return a minimal pangram derived from it.
def find_minimal(pan)
nxt = remove_one(pan)
return pan unless nxt
find_minimal(nxt)
end

# Find a minimal pangram.
pangram = find_minimal(WORDS)
p pangram # =>
# [
# "expr", "getconf", "jobs", "mv", "qdel",
# "type", "unlink", "what", "zcat"
# ]
p stats(pangram) # => [9, 39, 13]
</code>
 
C

camerooni

I haven't found a solution with fewer repeated characters, but

"zcat stty jobs cxref newgrp iconv cksum qhold"

uses one fewer utility
 
M

Martin Coxall

I haven't found a solution with fewer repeated characters, but

"zcat stty jobs cxref newgrp iconv cksum qhold"

Isn't it supposed to use every letter?

Martin
 
J

James Edward Gray II

Isn't it supposed to use every letter?

Which letters are missing?

$ ruby -e 'puts(("a".."z").to_a - "zcat stty jobs cxref newgrp iconv
cksum qhold".split(""))'
$

James Edward Gray II
 
M

Martin Coxall

Which letters are missing?
$ ruby -e 'puts(("a".."z").to_a - "zcat stty jobs cxref newgrp iconv
cksum qhold".split(""))'
$

Aha! So it does.

I'm just jealous that yours is smaller than mine.

Martin
 
C

camerooni

After a few optimization passes, I've finally figured out how to get
the optimal POSIX pangrams in about 30 minutes on a fast, modern
machine. I am still relatively new to Ruby so this application is
surely not as concise as it could be. To run it, save the two files,
create a posix-utilities.txt with the utilities, one per line, and run
the unit test.

-- pangrams.rb

#

# A set of routines to find minimal pangrams from an input set of
words. A pangram is a set of

# words that contain all the letters of the alphabet. Unless there are
some properties of

# pangrams that I am unaware of, finding the minimal pangrammatic
subset of a set of words

# is in a family of constraint-satisfaction problems that is
NP-complete, as in, there is

# not a way to find _the_ minimal subset that is guaranteed to run in
polynomial time with respect

# to the size of the dataset.

#

# The most common ways of finding excellent, but not provably perfect
solutions are to either

# use a non-deterministic algorithm, or to try to prune the number of
subsets that you search

# as much as possible, and backtrack as often as possible, quitting
after a pre-determined
# quality threshold has been reached.
#

# The Pangrams class below implements both of these algorithms,
providing methods to yield a

# specified number of random pangrams, or to do a backtracking search
to yield successively

# better pangrams.

#

# In order not to waste time trying non-pangrams, the strategy for
building pangrams is to

# build a list by finding the least common letter not already there,
and then searching for pangrams

# using the utilities containing the least-common missing letter. If at
any point the string

# is longer or has more repeated characters than the known minimums
then we stop searching and

# backtrack. So we abort early and try to limit the number of decisions
to try at each step.

#

# The search algorithm does not run in polynomial time, but for 160
elements, it can search the

# problem space in about 30 minutes on a fast, modern machine
(dual-core xeon). For larger datasets

# (like aspell) the non-deterministic approach would be more practical.

#

# Two minimal POSIX pangrams are:

# Minimal number of utilities: "zcat stty jobs cxref newgrp iconv
cksum qhold"

# Minimal repeated letters: "zcat tty jobs lex awk mv uniq chgrp df"
(4 repeats)

# (This is one of the ones that Christian found, in a different
order)
#



# Takes a list of words and creates a histogram of letters. The
histogram can then

# be queried for pangramness and repeated letter counts.

class LetterHistogram

# Initializes with a set of words

def initialize(words=nil)

@hist = Hash.new(0)

@total_letters = 0

words.each {|word| add word} unless words.nil?

end



# Adds a word to the pangram

def add(word)

word.scan(/./) {|l| @hist[l] = @hist[l] + 1 if ('a'..'z') === l}

@total_letters += word.size

end



# Returns true if this list of words has a pangram

def pangram?

return @hist.size == 26

end



# Returns the number of repeated letter

def repeats

@total_letters - @hist.size

end



# Returns a list of missing letters. Used in conjunction with the
WordLetterMap below to

# find a good word to add to the list to make a pangram.

def missing_letters

missing = Array.new

('a'..'z').each {|l| missing << l if @hist[l] == 0}

missing

end

end



# Contans a map of Letters => Words containing that letter

class WordLetterMap

def initialize(words)

@map = Hash.new

words.each {|w| add w}

end



# Adds a new word

def add(word)

word.scan(/./) do |l|

if ('a'..'z') === l

@map[l] = Array.new unless @map.has_key? l

@map[l] = @map[l] << word

end

end

end



# Return a list of words containing the least common missing letter

# Used to limit the number of choices as we search the minimal
pangram space

def least_common(words=nil, histogram=nil)

histogram = LetterHistogram.new words if histogram.nil?

min_words = nil

histogram.missing_letters.each do |l|

new_words = @map[l] - words

if min_words.nil? || min_words.size > words.size

min_words = new_words

end

end



return min_words

end

end



# Holds a list of words and generates minimal pangrams, both
non-deterministically,

# and by searching, backtracking to avoid non-minimal branches

class Pangrams

# Exception to throw when we have returned the maximum number of
pangrams

class AllDone < Exception

end



# Initialize with a word list

def initialize(words=nil)

if words.nil?

@words = Array.new

else

@words = words

end

end



# Adds a word to the set of words to produce pangrams

def add(word)

@words[@words.size] = word

end



# The number of words in the set of words

def size

@words.size

end



# Loads a list of words from a file

def self.from_file(filename)

p = Pangrams.new

File.new(filename).each_line {|line| p.add line.strip}

return p

end



# Yields randomly generated minimal pangrams to the passed block

def random(count,&block)

@word_letters = WordLetterMap.new @words

(0..count).each {|i| random_pangram([],&block)}

end



# Searches for good minimal pangrams, yielding the ones it finds to
the block

def search(max_count=0,&block)

@min_size = size

@min_repeats = 1000

@max_count = max_count

@count = 0

@word_letters = WordLetterMap.new @words

begin

pangram_search([],&block)

rescue AllDone

# Quit gracefully

end

end



private

# searches the pangram space by finding all words containing the
least common letter until

# a pangram has been built, passing each found pangram to the block.
The algorithm tries to

# be smart searching the word space by choosing the least common
letters first, making the

# overall space smaller. It also tries make smart use of backtracking
to avoid searching

# non-lucrative branches.

def pangram_search(words, &block)

# Bust out if we've found enough pangrams

raise AllDone.new if @max_count != 0 && @count > @max_count



h = LetterHistogram.new words



# If we already have more words or more repeats, then no need to
look any

# further, we should backtrack and try something else.

return if words.size >= @min_size && h.repeats >= @min_repeats



# This pangram is somehow minimal, so pass to the block

if h.pangram?

@min_size = words.size if words.size < @min_size

@min_repeats = h.repeats if h.repeats < @min_repeats

@count += 1

yield words,h

return

end



# No pangram yet, find children and descend

new_words = @word_letters.least_common words,h

new_words.each {|w| pangram_search words + [w], &block}

end



# Builds a pangram by finding the least common letter that is not
represented in the

# passed-in array, and recursing until a pangram is built. This
algorithm should build

# minimal pangrams almost all of the time (i.e. removing a word from
the set will make

# the word list non-pangrammatic and will yield very good pangrams,
but probably not the

# most optimal ones.

def random_pangram(words, &block)

# Do we already have a pangram? If so, pass to block and quit

h = LetterHistogram.new words

if h.pangram?

yield words,h

return

end



# Be non-deterministic, and descend on a random word

new_words = @word_letters.least_common words,h

new_word = new_words[rand(new_words.size)]

random_pangram words + [new_word], &block

end

end


-- test_pangrams.rb

require 'test/unit'

require 'pangrams'



# Unit test harness to test the histogram and the frequency map code,
and also to exercise the pangram
# generation code. The tests assume a list of posix words, one per
line, in 'posix-words.txt'

class TestPangrams < Test::Unit::TestCase

def test_histogram

l = LetterHistogram.new ['moon']

assert_equal false, l.pangram?

assert_equal 1, l.repeats



pangram = %w{the quick brown fox jumps over the lazy dog}

l = LetterHistogram.new pangram

assert_equal true, l.pangram?



l = LetterHistogram.new ['qwertyuiopasdfghjklzxcvbn'] # m is
missing

m = l.missing_letters

assert_equal 1, m.size

assert_equal 'm', m.first

end



def test_random_search

puts "-- Testing Random Pangram Generation --"



p = Pangrams.from_file 'posix-words.txt'

min_repeats_length = p.size

min_repeats = 1000

min_repeats_pangram = nil



min_size = 160

min_size_repeats = 1000

min_size_pangram = nil



count = 0



p.random(1000) do |p, hist|

repeats = hist.repeats

size = p.size

if repeats < min_repeats || repeats == min_repeats && size <
min_repeats_length

min_repeats_pangram = p

min_repeats = repeats

min_repeats_length = p.size

puts "New min-repeats pangram: #{min_repeats_pangram.join ' '}
with #{min_repeats} repeats"

$stdout.flush

end

if size < min_size || size == min_size && repeats <
min_size_repeats

min_size = size

min_size_repeats = repeats

min_size_pangram = p

puts "New min-size pangram: #{min_size_pangram.join ' '} with
#{min_size} words"

$stdout.flush

end

end

end



def test_backtracking_search

puts "--Testing backtracking search--"



p = Pangrams.from_file 'posix-words.txt'

min_repeats_length = p.size

min_repeats = 1000

min_repeats_pangram = nil



min_size = 160

min_size_repeats = 1000

min_size_pangram = nil



p.search(50) do |p, hist|

repeats = hist.repeats

size = p.size

if repeats < min_repeats || repeats == min_repeats && size <
min_repeats_length

min_repeats_pangram = p

min_repeats = repeats

min_repeats_length = p.size

puts "New min-repeats pangram: #{min_repeats_pangram.join ' '}
with #{min_repeats} repeats"

$stdout.flush

end

if size < min_size || size == min_size && repeats <
min_size_repeats

min_size = size

min_size_repeats = repeats

min_size_pangram = p

puts "New min-size pangram: #{min_size_pangram.join ' '} with
#{min_size} words"

$stdout.flush

end

end

end

end
 
K

Ken Bloom

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Martin DeMello

A pangram is a sentence containing every letter of the alphabet at least once (a
famous example in English being "the quick brown fox jumps over the lazy dog").
For maximum style points a pangram should read smoothly, and have both as few
repeated letters as possible (ideally zero), and as few words as possible.

This quiz extends the idea to the posix utilities[1] - write a program to find
pangrammatic collections of posix utilities that (1) use the fewest utilities
and (2) have the minimum number of repeated letters. In either case, break ties
on the other criterion; that is, your first solution should also have as few
repeated letters as possible, and your second one should use as few utilities as
possible.

[1] http://www.unix.org/version3/apis/cu.html has a complete list

Finding the pangram with the minimum number of words is NP-Hard. The
reduction is from Minimum Set Cover[1]:

MINIMUM SET COVER:
INSTANCE: Collection C of subsets of a finite set S.
SOLUTION: A set cover for S, i.e., a subset C' of C such that
every element in S belongs to at least one member of C'.
MEASURE: Cardinality of the set cover, i.e., size(C')

Reduction:
Each element in S becomes mapped to a unique letter in the alphabet used
by pangrams. (Note that in general, we can play Pangrams in any
langauage, the alphabet can be of length we want) Each subset K (which is
an element of C) becomes a word, made up of the letters corresponding to
the elements of K. Finding a pangram with the minimum number of words in
this setup would give a solution to the set-cover instance.

I suspect that the other optimization problem here may also be NP-Hard.
I'm not sure (in all of the 5 minutes that I'm writing this post) how to do
a simple reduction to that problem though.

--Ken Bloom

[1] http://www.ensta.fr/~diam//ro/online/viggo_wwwcompendium/node146.html
 
M

Martin Coxall

I suspect that the other optimization problem here may also be NP-
Hard.
I'm not sure (in all of the 5 minutes that I'm writing this post)
how to do
a simple reduction to that problem though.

True, but in this case the set to cover was small enough to make it
computationally feasible without having to rewrite it in Fortran. Phew.

Martin
 
K

Ken Bloom

True, but in this case the set to cover was small enough to make it
computationally feasible without having to rewrite it in Fortran. Phew.

Martin

There's also the matter of program structure. When dealing with an
NP-Hard problem, you know that any solution geared to do a heuristic
search quickly is not guaranteed to find the minimum.

And if you do naive searches of the entire solution space, the
number of POSIX utilities involved here is not all that trivial.
 
J

James Edward Gray II

There's also the matter of program structure. When dealing with an
NP-Hard problem, you know that any solution geared to do a heuristic
search quickly is not guaranteed to find the minimum.

We haven't talked a lot about this in the past, but I take a pretty
relaxed approach to tough quiz problems. Find a good heuristic and
get close. That's fine with me.

To me, Ruby Quiz is all about keeping the brain sharp. I think
programmers are often susceptible to the "That's not possible!"
mentality and we have to fight against that. MJD once gave a neat
speech about how when Alchemy was as old as our profession is now,
they were still actively trying to turn lead into gold but we seem
content to bury ourselves in rules about what we can't do.

Was this week's quiz solvable beyond a shadow of a doubt in all
cases? No. Does that mean we shouldn't do it? No way. The
solution I talked up in the summary finds a darn good pangram in
under a second! It's within 7 characters of perfect. It maybe
guesswork, but it's damn good guesswork.

I have a great respect for those who can think outside the box like
that. I think we should cultivate that skill in ourselves.

Even at work, I'm the company's resident language geek, so I often
see a problem and launch into a detailed plan of how we can solve it
to death. Luckily, my two far more practical coworkers usually just
politely point out the simple fix that is plenty good enough for our
purposes. If it wasn't for them I would never accomplish a complete
anything.

I'm all about thinking outside the box. I think we should all learn
to love the guesswork!

There's my two cents on this issue.

James Edward Gray II
 
M

Martin Coxall

We haven't talked a lot about this in the past, but I take a pretty
relaxed approach to tough quiz problems. Find a good heuristic and
get close. That's fine with me.

I think it's fine. Now if you'd said "find a solution and then prove
using the symbolic algebra of your choice that it's optimal" would be
a different story.

Next Ruby Quiz: Prove that P = NP, unless it doesn't, in which case
prove that P != NP.

Martin
 
T

Timothy Goddard

Ruby said:
The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Martin DeMello

A pangram is a sentence containing every letter of the alphabet at least once (a
famous example in English being "the quick brown fox jumps over the lazy dog").
For maximum style points a pangram should read smoothly, and have both as few
repeated letters as possible (ideally zero), and as few words as possible.

This quiz extends the idea to the posix utilities[1] - write a program to find
pangrammatic collections of posix utilities that (1) use the fewest utilities
and (2) have the minimum number of repeated letters. In either case, break ties
on the other criterion; that is, your first solution should also have as few
repeated letters as possible, and your second one should use as few utilities as
possible.

[1] http://www.unix.org/version3/apis/cu.html has a complete list

Here's my solution to the problem. I'm not quite sure why it doesn't
find the better result already found by Ezwan as I expected it to be
optimal - please point out my error if you see it. It may well be that
making it optimal would also make it converge much more slowly though.

This uses an A* search algorithm to find the sequence with the least
number of total letters that forms a pangram (= least number of
repeated characters). The script is enormous but on my laptop (1.4GHz
Celeron M processor, 768MB memory) it manages to find the following
solution in under 0.2 seconds:

Words: qalter jobs find chgrp awk uux mv zcat tty
Length: 34
Cost: 34
Pangram? true
Unused Letters: (0)

I'll break the code into parts to make it easier to read:

#########################################
# Raw Data #
#########################################

LETTERS = (?a..?z)
WORDS = %w{admin alias ar asa at awk basename batch bc bg c99 cal cat
cd cflow chgrp chmod chown cksum cmp comm command compress cp crontab
csplit ctags cut cxref date dd delta df diff dirname du echo ed env ex
expand expr FALSE fc fg file find fold fort77 fuser gencat get getconf
getopts grep hash head iconv id ipcrm ipcs jobs join kill lex link ln
locale localedef logger logname lp ls m4 mailx make man mesg mkdir
mkfifo more mv newgrp nice nl nm nohup od paste patch pathchk pax pr
printf prs ps pwd qalter qdel qhold qmove qmsg qrerun qrls qselect qsig
qstat qsub read renice rm rmdel rmdir sact sccs sed sh sleep sort split
strings strip stty tabs tail talk tee test time touch tput tr TRUE
tsort tty type ulimit umask unalias uname uncompress unexpand unget
uniq unlink uucp uudecode uuencode uustat uux val vi wait wc what who
write xargs yacc zcat}

#########################################
# Statistics #
#########################################

# This is all the letters used by a particular word
USED_LETTERS = Hash.new do |h, word|
letters = word.downcase.split(//)
letters.uniq!
h[word] = letters.map {|l| l[0]}.select {|l| LETTERS.include?(l)}
end

# This is the length in letters of a word (auto-caching)
WORD_LENGTH = Hash.new do |h, word|
length = 0
word.downcase.each_byte do |byte|
length += 1 if LETTERS.include?(byte)
end
h[word] = length
end

# This is the minimum number of letters that must be included to add
# a particular letter into the sequence. It is the length (letters
only)
# of the shortest word containing this letter.
LETTER_COSTS = {}
WORDS.each do |word|
letters = Hash.new{0}
word.each_byte {|byte| letters[byte] += 1}
letters.each do |letter, count|
if !LETTER_COSTS[letter] || LETTER_COSTS[letter] >
WORD_LENGTH[word]
LETTER_COSTS[letter] = WORD_LENGTH[word]
end
end
end

#########################################
# Pangram Class - Search Tree States #
#########################################

class Pangram

include Comparable

def initialize(words = [])
@words = words
end

def children
remaining = WORDS - @words
remaining.map do |word|
Pangram.new(@words + [word])
end
end

def unused_letters
unless @unused
@unused = LETTERS.to_a
@words.each do |word|
@unused -= USED_LETTERS[word]
end
end
@unused
end

def length
unless @length
@length = 0
@words.each do |word|
@length += WORD_LENGTH[word]
end
end
@length
end

def complete?
unused_letters.length == 0
end

# Estimate of the cost to reach the end goal.
# Must always over-estimate
def heuristic
unused_letters.inject(0) {|sum, letter| sum + LETTER_COSTS[letter]}
end

def cost
unless @cost
@cost = length + heuristic
end
@cost
end

def <=>(other)
self.cost <=> other.cost
end

def stats
%~
Words: #{@words.join(" ")}
Length: #{length}
Cost: #{self.length + self.heuristic}
Pangram? #{complete?}
Unused Letters: #{unused_letters.map{|l| l.chr}.join(", ")}
(#{unused_letters.length})
~.strip
end
end

#########################################
# Ordered Queue for A* Search #
#########################################

class OrderedQueue
def initialize(array = [])
@Queue = array.sort
end

# Add items to the queue
# This is optimised for large queues
def add(items)
sorted_items = items.sort
idx = 0
loop do
break if idx >= @queue.length - 1
qitem = @Queue[idx]
while !sorted_items.empty? && sorted_items.first < qitem
@queue.insert(idx, sorted_items.shift)
idx += 1
end
idx += 1
end
sorted_items.each {|item| @Queue << item}
end

def pop
@queue.shift
end

def empty?
@queue.length == 0
end
end

#########################################
# Search implementation #
#########################################

if $0 == __FILE__
# Start queue with the empty initial pangram
options = OrderedQueue.new([Pangram.new])
loop do
# Check for exhaustion
if options.empty?
puts "No solution could be found"
exit 1
end

# Get the best current option
option = options.pop

# Check for completion
if option.complete?
puts option.stats
break
end

# Add children to options
options.add(option.children)
end
end
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,596
Members
45,143
Latest member
SterlingLa
Top