[SUMMARY] A* (#98)


R

Ruby Quiz

As was brought up in the discussion, the effectiveness of the A* algorithm is
very dependent on the estimate function used to gage distance remaining to the
goal. The quiz recommended the Manhattan Distance function for this role, but
that's not very well suited to the kind of work we are doing here. Daniel
Martin suggested an alternate implementation producing better results.

Let's have a look at Daniel's code below.

Daniel first posted a set of unit tests to verify code correctness. I won't go
into all of those tests here, but let me show a couple. As always, in unit
testing it's a good idea to begin by verifying functionality in some trivial
case:

# ...

def test_simple_horizontal
map = %q(@..X)
solution = @solver.do_quiz_solution(map)
assert_paths_equal(%q(####), solution)
end

# ...

This test ensures that the solver can walk a straight path, with no obstacles
between the starting point and the goal.

Daniel's tests increase in difficulty, of course, and the final test is the one
Daniel used to show how the Manhattan Distance gets into trouble:

# ...

def test_bad_distance
map = %q(@.*..
..~..
..^.X).sub(/ +/,'')
solution = @solver.do_quiz_solution(map)
assert_paths_equal(
%q(#?#..
.?~#.
..^.#), solution, "Got tripped up by manhattan distance")
end

# ...

Here we see a neat feature of Daniel's custom assert_paths_equal() assertion
(not shown). The question marks allow any output in those cells. The reasoning
is that there are sometimes multiple correct paths. Here the first move must be
to one of the question marks, but either will produce the same length path.

The real test here is to ensure that the path goes over the forrest. The
Manhattan Distance function can cause algorithms to favor the mountain route.

Let's move on to the actual code now.

One of the challenges the quiz didn't spend a lot of time on is that A* really
needs a priority queue to manage the paths it is examining, always keeping the
best path so far first in line. Daniel took the easy way out of this problem
and just resorted the paths after each insert:

# I suppose someone would think I should use a heap here.
# I've found that the built-in sort method is much faster
# than any heap implementation in ruby. As a plus, the logic
# is easier to follow.
class PriorityQueue
def initialize
@list = []
end
def add(priority, item)
# Add @list.length so that sort is always using Fixnum comparisons,
# which should be fast, rather than whatever is comparison on `item'
@list << [priority, @list.length, item]
@list.sort!
self
end
def <<(pritem)
add(*pritem)
end
def next
@list.shift[2]
end
def empty?
@list.empty?
end
end

The add() method is all of the magic here. When an item is add()ed, Daniel
actually adds a three element Array to the queue and resorts the entire queue.
The first element of the Array ensures that items are sorted by priority. When
priorities match, the second item breaks ties by add order. The item never
factors into the sort. When the item is retrieved with next(), the two extra
sort fields are discarded.

I have to side-step a moment here and address Daniel's first comment. To be
clear, he is saying that using Ruby's sort() can be faster than a pure Ruby
heap, since sort() is implemented in C. If both elements are correctly
implemented in C, the heap should definitely out perform resorting. Finally,
the Ruby heap can also be faster, as soon as significant input is involved:

#!/usr/bin/env ruby -w

require "sorted_array" # Daniel's PriorityQueue
require "heap" # pure Ruby Heap, from Ruby Quiz #40
require "benchmark"

DATA = Array.new(5_000) { rand(5_000) }.freeze
Benchmark.bmbm(10) do |results|
results.report("sorted_array:") do
queue = PriorityQueue.new
DATA.each { |n| queue.add(n, n) }
queue.next until queue.empty?
end
results.report("ruby_heap:") do
queue = Heap.new
DATA.each { |n| queue.insert(n) }
queue.extract until queue.empty?
end
end
# >> Rehearsal -------------------------------------------------
# >> sorted_array: 33.950000 0.020000 33.970000 ( 33.972859)
# >> ruby_heap: 0.450000 0.000000 0.450000 ( 0.449776)
# >> --------------------------------------- total: 34.420000sec
# >>
# >> user system total real
# >> sorted_array: 33.990000 0.010000 34.000000 ( 34.016562)
# >> ruby_heap: 0.440000 0.000000 0.440000 ( 0.437217)

OK, let's get back to Daniel's code:

require 'enumerator'

class Astar
def do_quiz_solution(puzzle)
@terrain = []
instr = ""
puzzle.each {|rowstr|
next if rowstr =~ /^\s*$/
rowstr.gsub!(/[^[email protected]~X*^]/,'')
instr += rowstr
instr += "\n"
row = []
rowstr.scan(/[[email protected]~X*^]/) {|terrain|
case terrain
when /[[email protected]]/; row << 1
when /[*]/; row << 2
when /\^/; row << 3
when /~/; row << nil
end
}
xind = rowstr.index('X')
aind = rowstr.index('@')
if (aind)
@start = [@terrain.length, aind]
end
if (xind)
@goal = [@terrain.length, xind]
end
@terrain << row
}
if do_find_path
outarr = instr.split(/\n/)
@path.each{|row,col| outarr[row][col]='#'}
return outarr.join("\n")
else
return nil
end
end

# ...

This method manages the quiz process. First, an Array is created to hold the
costs of each cell and a String to hold the passed map. The each() iterator
cleans and dissects the input, filling both structures as it goes. It also
locates the @start and @goal coordinates while it works.

With the setup out of the way, a hand-off is made to do_find_path() which is the
A* implementation. If a path is found, the indicated coordinates are marked on
the original map String and returned. Otherwise, nil is returned.

Here's the heart of the matter:

# ...

def do_find_path
been_there = {}
pqueue = PriorityQueue.new
pqueue << [1,[@start,[],1]]
while !pqueue.empty?
spot,path_so_far,cost_so_far = pqueue.next
next if been_there[spot]
newpath = [path_so_far, spot]
if (spot == @goal)
@path = []
newpath.flatten.each_slice(2) {|i,j| @path << [i,j]}
return @path
end
been_there[spot] = 1
spotsfrom(spot).each {|newspot|
next if been_there[newspot]
tcost = @terrain[newspot[0]][newspot[1]]
newcost = cost_so_far + tcost
pqueue << [newcost + estimate(newspot), [newspot,newpath,newcost]]
}
end
return nil
end

# ...

This is the A* algorithm in Daniel's code. It begins by establishing a Hash to
track where it has been and a PriorityQueue to manage the paths being
considered. From there the code dives into the search loop which terminates
when the the PriorityQueue is empty?(), indicating that all move options have
been exhausted.

At each step through the loop, where we are, the path so far, and the cumlative
cost are pulled out of the PriorityQueue. We skip ahead if we've been on this
spot before. (The first path would have been shorter, since the PriorityQueue
keeps them sorted.) The new path is built, including our new location, and we
check to see if the @goal has been reached. When it has, the path is converted
from to a list of coordinates and returned.

If we're not at the goal, we need to extend the path one step in every
direction. Here the helper method spotsfrom() generates the choices to iterate
over. If we've been there before we can skip it, otherwise the new cost is
calculated via the other helper method estimate() and added, with the new path,
to the PriorityQueue.

Let's have a look at those helper methods:

# ...

def estimate(spot)
# quiz statement version
# (spot[0] - @goal[0]).abs + (spot[1] - @goal[1]).abs
# my version
[(spot[0] - @goal[0]).abs, (spot[1] - @goal[1]).abs].max
end

def spotsfrom(spot)
retval = []
vertadds = [0,1]
horizadds = [0,1]
if (spot[0] > 0) then vertadds << -1; end
if (spot[1] > 0) then horizadds << -1; end
vertadds.each{|v| horizadds.each{|h|
if (v != 0 or h != 0) then
ns = [spot[0]+v,spot[1]+h]
if (@terrain[ns[0]] and @terrain[ns[0]][ns[1]]) then
retval << ns
end
end
}}
retval
end
end

The estimate() method is Daniel's improvement to distance calculation. Since
diagonal moves are allowed to shorten two distances at once, we really just need
to consider the longer distance, vertically or horizontally, to the goal.

The other helper builds a list of neighbors by adding combinations of -1, 0, and
1 as offsets to the current location coordinates. The @terrain Array is used to
reality check these coordinates as in-bounds and walkable, before they are added
to the returned Array.

The final solution is just an instance creation and interface method call away:

if __FILE__ == $0
puts Astar.new.do_quiz_solution(ARGF)
end

My thanks to all the pathfinders who fiddled around with the quiz and to Daniel
Martin especially for helping to clarify intent.

Ruby Quiz for this week: Seek James out at RubyConf and introduce yourself!
 
Ad

Advertisements


Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top