R

#### Ruby Quiz

very dependent on the estimate function used to gage distance remaining to the

goal. The quiz recommended the Manhattan Distance function for this role, but

that's not very well suited to the kind of work we are doing here. Daniel

Martin suggested an alternate implementation producing better results.

Let's have a look at Daniel's code below.

Daniel first posted a set of unit tests to verify code correctness. I won't go

into all of those tests here, but let me show a couple. As always, in unit

testing it's a good idea to begin by verifying functionality in some trivial

case:

# ...

def test_simple_horizontal

map = %q(@..X)

solution = @solver.do_quiz_solution(map)

assert_paths_equal(%q(####), solution)

end

# ...

This test ensures that the solver can walk a straight path, with no obstacles

between the starting point and the goal.

Daniel's tests increase in difficulty, of course, and the final test is the one

Daniel used to show how the Manhattan Distance gets into trouble:

# ...

def test_bad_distance

map = %q(@.*..

..~..

..^.X).sub(/ +/,'')

solution = @solver.do_quiz_solution(map)

assert_paths_equal(

%q(#?#..

.?~#.

..^.#), solution, "Got tripped up by manhattan distance")

end

# ...

Here we see a neat feature of Daniel's custom assert_paths_equal() assertion

(not shown). The question marks allow any output in those cells. The reasoning

is that there are sometimes multiple correct paths. Here the first move must be

to one of the question marks, but either will produce the same length path.

The real test here is to ensure that the path goes over the forrest. The

Manhattan Distance function can cause algorithms to favor the mountain route.

Let's move on to the actual code now.

One of the challenges the quiz didn't spend a lot of time on is that A* really

needs a priority queue to manage the paths it is examining, always keeping the

best path so far first in line. Daniel took the easy way out of this problem

and just resorted the paths after each insert:

# I suppose someone would think I should use a heap here.

# I've found that the built-in sort method is much faster

# than any heap implementation in ruby. As a plus, the logic

# is easier to follow.

class PriorityQueue

def initialize

@list = []

end

def add(priority, item)

# Add @list.length so that sort is always using Fixnum comparisons,

# which should be fast, rather than whatever is comparison on `item'

@list << [priority, @list.length, item]

@list.sort!

self

end

def <<(pritem)

add(*pritem)

end

def next

@list.shift[2]

end

def empty?

@list.empty?

end

end

The add() method is all of the magic here. When an item is add()ed, Daniel

actually adds a three element Array to the queue and resorts the entire queue.

The first element of the Array ensures that items are sorted by priority. When

priorities match, the second item breaks ties by add order. The item never

factors into the sort. When the item is retrieved with next(), the two extra

sort fields are discarded.

I have to side-step a moment here and address Daniel's first comment. To be

clear, he is saying that using Ruby's sort() can be faster than a pure Ruby

heap, since sort() is implemented in C. If both elements are correctly

implemented in C, the heap should definitely out perform resorting. Finally,

the Ruby heap can also be faster, as soon as significant input is involved:

#!/usr/bin/env ruby -w

require "sorted_array" # Daniel's PriorityQueue

require "heap" # pure Ruby Heap, from Ruby Quiz #40

require "benchmark"

DATA = Array.new(5_000) { rand(5_000) }.freeze

Benchmark.bmbm(10) do |results|

results.report("sorted_array:") do

queue = PriorityQueue.new

DATA.each { |n| queue.add(n, n) }

queue.next until queue.empty?

end

results.report("ruby_heap:") do

queue = Heap.new

DATA.each { |n| queue.insert(n) }

queue.extract until queue.empty?

end

end

# >> Rehearsal -------------------------------------------------

# >> sorted_array: 33.950000 0.020000 33.970000 ( 33.972859)

# >> ruby_heap: 0.450000 0.000000 0.450000 ( 0.449776)

# >> --------------------------------------- total: 34.420000sec

# >>

# >> user system total real

# >> sorted_array: 33.990000 0.010000 34.000000 ( 34.016562)

# >> ruby_heap: 0.440000 0.000000 0.440000 ( 0.437217)

OK, let's get back to Daniel's code:

require 'enumerator'

class Astar

def do_quiz_solution(puzzle)

@terrain = []

instr = ""

puzzle.each {|rowstr|

next if rowstr =~ /^\s*$/

rowstr.gsub!(/[^[email protected]~X*^]/,'')

instr += rowstr

instr += "\n"

row = []

rowstr.scan(/[[email protected]~X*^]/) {|terrain|

case terrain

when /[[email protected]]/; row << 1

when /[*]/; row << 2

when /\^/; row << 3

when /~/; row << nil

end

}

xind = rowstr.index('X')

aind = rowstr.index('@')

if (aind)

@start = [@terrain.length, aind]

end

if (xind)

@goal = [@terrain.length, xind]

end

@terrain << row

}

if do_find_path

outarr = instr.split(/\n/)

@path.each{|row,col| outarr[row][col]='#'}

return outarr.join("\n")

else

return nil

end

end

# ...

This method manages the quiz process. First, an Array is created to hold the

costs of each cell and a String to hold the passed map. The each() iterator

cleans and dissects the input, filling both structures as it goes. It also

locates the @start and @goal coordinates while it works.

With the setup out of the way, a hand-off is made to do_find_path() which is the

A* implementation. If a path is found, the indicated coordinates are marked on

the original map String and returned. Otherwise, nil is returned.

Here's the heart of the matter:

# ...

def do_find_path

been_there = {}

pqueue = PriorityQueue.new

pqueue << [1,[@start,[],1]]

while !pqueue.empty?

spot,path_so_far,cost_so_far = pqueue.next

next if been_there[spot]

newpath = [path_so_far, spot]

if (spot == @goal)

@path = []

newpath.flatten.each_slice(2) {|i,j| @path << [i,j]}

return @path

end

been_there[spot] = 1

spotsfrom(spot).each {|newspot|

next if been_there[newspot]

tcost = @terrain[newspot[0]][newspot[1]]

newcost = cost_so_far + tcost

pqueue << [newcost + estimate(newspot), [newspot,newpath,newcost]]

}

end

return nil

end

# ...

This is the A* algorithm in Daniel's code. It begins by establishing a Hash to

track where it has been and a PriorityQueue to manage the paths being

considered. From there the code dives into the search loop which terminates

when the the PriorityQueue is empty?(), indicating that all move options have

been exhausted.

At each step through the loop, where we are, the path so far, and the cumlative

cost are pulled out of the PriorityQueue. We skip ahead if we've been on this

spot before. (The first path would have been shorter, since the PriorityQueue

keeps them sorted.) The new path is built, including our new location, and we

check to see if the @goal has been reached. When it has, the path is converted

from to a list of coordinates and returned.

If we're not at the goal, we need to extend the path one step in every

direction. Here the helper method spotsfrom() generates the choices to iterate

over. If we've been there before we can skip it, otherwise the new cost is

calculated via the other helper method estimate() and added, with the new path,

to the PriorityQueue.

Let's have a look at those helper methods:

# ...

def estimate(spot)

# quiz statement version

# (spot[0] - @goal[0]).abs + (spot[1] - @goal[1]).abs

# my version

[(spot[0] - @goal[0]).abs, (spot[1] - @goal[1]).abs].max

end

def spotsfrom(spot)

retval = []

vertadds = [0,1]

horizadds = [0,1]

if (spot[0] > 0) then vertadds << -1; end

if (spot[1] > 0) then horizadds << -1; end

vertadds.each{|v| horizadds.each{|h|

if (v != 0 or h != 0) then

ns = [spot[0]+v,spot[1]+h]

if (@terrain[ns[0]] and @terrain[ns[0]][ns[1]]) then

retval << ns

end

end

}}

retval

end

end

The estimate() method is Daniel's improvement to distance calculation. Since

diagonal moves are allowed to shorten two distances at once, we really just need

to consider the longer distance, vertically or horizontally, to the goal.

The other helper builds a list of neighbors by adding combinations of -1, 0, and

1 as offsets to the current location coordinates. The @terrain Array is used to

reality check these coordinates as in-bounds and walkable, before they are added

to the returned Array.

The final solution is just an instance creation and interface method call away:

if __FILE__ == $0

puts Astar.new.do_quiz_solution(ARGF)

end

My thanks to all the pathfinders who fiddled around with the quiz and to Daniel

Martin especially for helping to clarify intent.

Ruby Quiz for this week: Seek James out at RubyConf and introduce yourself!