[QUIZ] Learning Tic-Tac-Toe (#11)

M

martinus

Your idea with the operators is very good. Please post your program
when you are finished!

I used a very simple approach: each individual consists of a sequential
number of rules, which are looked through one after another, and the
first rule that fires is executed. Each rule consists of 9 fields that
are used as a match-template for the world. For example a template
looks like this:

[[EMPTY, DONT_CARE, DONT_CARE, DONT_CARE, X, X, O, X, O], 3]

That means that the template matches if the first field of the world is
empty, field 2, 3, 4 are irrelevent, field 4 has to be an X, and so on.
If the template matches the world, the rule triggeres and the player
tries to put its sign to field 3.

I have implemented the standard operators mutation, crossover, roulette
wheel selection, and co-evolution, but over time the population always
gets stuck in a local maximum very early, the solutions are barely able
to play two moves.

martinus
 
J

James Edward Gray II

| This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe,
| with a catch: You're not allowed to embed any knowledge of the game
| into your creation beyond how to make legal moves and recognizing
that
| it has won or lost.
|
| Your program is expected to "learn" from the games it plays, until it
| masters the game and can play flawlessly.

So, I have also tried to program a learning AI player. However, it
still
does not do what it should and I do not know why. Maybe someone with
more brains can help???

I'm not able to see an obvious flaw in your logic, but I did want to
compliment you on your interface. When I saw it I though, "How did
Thomas know to write to my Tic-Tac-Toe library?" I'll post it below so
you can see how similar it is. :D

James Edward Gray II

#!/usr/bin/env ruby

module TicTacToe
module SquaresContainer
def []( index ) @squares[index] end

def blanks() @squares.find_all { |s| s == " " }.size end
def os() @squares.find_all { |s| s == "O" }.size end
def xs() @squares.find_all { |s| s == "X" }.size end
end

class Board
class Row
def initialize( squares, names )
@squares = squares
@names = names
end

include SquaresContainer

def to_board_name( index ) Board.index_to_name(@names[index]) end
end

def self.name_to_index( name )
x = name.gsub!(/([a-cA-C])/, "").to_i - 1
y = ($1.downcase)[0] - ?a
x + y * 3
end

def self.index_to_name( index )
if index >= 6
"c" + (index - 5).to_s
elsif index >= 3
"b" + (index - 2).to_s
else
"a" + (index + 1).to_s
end
end

def initialize( squares )
@squares = squares
end

include SquaresContainer

def []( *indices )
if indices.size == 2
super indices[0] + indices[1] * 3
elsif indices[0].is_a? Fixnum
super indices[0]
else
super Board.name_to_index(indices[0].to_s)
end
end

def each_row
rows = [ [0, 1, 2], [3, 4, 5], [6, 7, 8],
[0, 3, 6], [1, 4, 7], [2, 5, 8],
[0, 4, 8], [2, 4, 6] ]
rows.each do |e|
yield Row.new(@squares.values_at(*e), e)
end
end

def moves
moves = [ ]
@squares.each_with_index do |s, i|
moves << Board.index_to_name(i) if s == " "
end
moves
end

def won?
each_row do |row|
return "X" if row.xs == 3
return "O" if row.os == 3
end
return " " if blanks == 0

false
end

def to_s
@squares.join
end
end

class Player
def initialize( pieces )
@pieces = pieces
end

attr_reader :pieces

def move( board )
raise NotImplementedError, "Player subclasses must define move()."
end

def finish( final_board ) end
end

class HumanPlayer < Player
def move( board )
draw_board board

moves = board.moves
print "Your move? (format: b3) "
move = gets
move.chomp!
until moves.include?(move.downcase)
print "Invalid move. Try again. "
move = gets
move.chomp!
end
move
end

def finish( final_board )
draw_board final_board

if final_board.won? == @pieces
print "Congratulations, you win.\n\n"
elsif final_board.won? == " "
print "Tie game.\n\n"
else
print "You lost Tic-Tac-Toe?!\n\n"
end
end

private

def draw_board( board )
rows = [ [0, 1, 2], [3, 4, 5], [6, 7, 8] ]
names = %w{a b c}
puts
print(rows.map do |r|
names.shift + " " + r.map { |e| board[e] }.join(" | ") + "\n"
end.join(" ---+---+---\n"))
print " 1 2 3\n\n"
end
end

class DumbPlayer < Player
def move( board )
moves = board.moves
moves[rand(moves.size)]
end
end

class SmartPlayer < Player
def move( board )
moves = board.moves
board.each_row do |row|
if row.blanks == 1 and (row.xs == 2 or row.os == 2)
(0..2).each do |e|
return row.to_board_name(e) if row[e] == " "
end
end
end

if board[0] != @pieces and board[0] != " " and board[8] == " "
return "c3"
elsif board[8] != @pieces and board[8] != " " and board[0] == " "
return "a1"
elsif board[2] != @pieces and board[2] != " " and board[6] == " "
return "c1"
elsif board[6] != @pieces and board[6] != " " and board[2] == " "
return "a3"
end

return "b2" if moves.include? "b2"

return "a1" if moves.include? "a1"
return "a3" if moves.include? "a3"
return "c1" if moves.include? "c1"
return "c3" if moves.include? "c3"

moves[rand(moves.size)]
end
end

class Game
def initialize( player1, player2 )
if rand(2) == 1
@x_player = player1.new("X")
@o_player = player2.new("O")
else
@x_player = player2.new("X")
@o_player = player1.new("O")
end

@board = Board.new([" "] * 9)
end

attr_reader :x_player, :eek:_player

def play
until @board.won?
update_board @x_player.move(@board), @x_player.pieces
break if @board.won?

update_board @o_player.move(@board), @o_player.pieces
end

if @o_player.is_a? HumanPlayer
@o_player.finish @board
@x_player.finish @board
else
@x_player.finish @board
@o_player.finish @board
end
end

private

def update_board( move, piece )
m = Board.name_to_index(move)
@board = Board.new((0..8).map { |i| i == m ? piece : @board })
end
end
end

if __FILE__ == $0
if ARGV.size > 0 and ARGV[0] == "-d"
ARGV.shift
game = TicTacToe::Game.new TicTacToe::HumanPlayer,
TicTacToe::DumbPlayer
else
game = TicTacToe::Game.new TicTacToe::HumanPlayer,
TicTacToe::SmartPlayer
end
game.play
end
 
B

Brian Schröder

| This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe,
| with a catch: You're not allowed to embed any knowledge of the game
| into your creation beyond how to make legal moves and recognizing that
| it has won or lost.
|
| Your program is expected to "learn" from the games it plays, until it
| masters the game and can play flawlessly.

So, I have also tried to program a learning AI player. However, it still
does not do what it should and I do not know why. Maybe someone with
more brains can help???

Here is the code for the AI player:

class AIPlayer < Player

def initialize( game, sign )
super( game, sign )
@stats = {}
@cur_stats = []
end

def move
unless @stats.has_key? @game.board
@stats[@game.board.dup] = {}
@game.board.valid_moves.each {|m| @stats[@game.board][m] = 0}
end
moves = @stats[@game.board].sort {|a,b| a[1] <=> b[1]}
result = moves.shift
result = moves.shift while (moves.length > 0) && rand < 0.02
@cur_stats << [@game.board.dup, result[0]]
return result[0]
end

def game_finished( result )
mult = case result
when :won then 1000
when :lost then -100
else 0
end
@cur_stats.each_with_index do |o,i|
@stats[o[0]][o[1]] += mult * 2**i
end
@cur_stats = []
end

end

A short description for the code:

The @game object is the current game and knows the current board. When
it is the AI players move, the #move method is called and the AI player
has to return a number between 0 and 8. Thats because my tictactoe board
is a simple array and the array inidices correspond to these fields:

0 1 2
3 4 5
6 7 8

In the #move method, I create a new entry in the @stats Hash (key is the
board) if it does not have the current board as key and initialize its
value with a Hash (keys are the valid fields and values are set to 0).
After that the Hash with the valid moves for this board is taken from
the @stats Hash and sorted so that the moves with the highest
probability are in front. Then the first, or if rand < 0.02 the second
or if rand < 0.0.2 the third, ... value is returned.

If the game has finished, the #game_finished method is called and the
moves that were chosen in the game are assigned new values depending on
the result of the game.

If I understand your code correctly, you choose with highest probability the
worst move ;). You should either use moves.pop or sort in reverse order.

Note that I used exactly the opposite evaluation function: -1000 for loss and
100 for win, because it is impossible to win against a decent player and I
wanted to avoid losses and tend to play draws. Though I don't know if this
really makes a difference compared with -1 0 1 or other numbers. I hope my
choice does not lead to weird psychoanalysis of me ;)

Best Regards,

Brian
 
T

Thomas Leitner

On Tue, 14 Dec 2004 09:04:53 +0900

| If I understand your code correctly, you choose with highest
| probability the worst move ;). You should either use moves.pop or sort
| in reverse order.
|
| Note that I used exactly the opposite evaluation function: -1000 for
| loss and 100 for win, because it is impossible to win against a decent
| player and I wanted to avoid losses and tend to play draws. Though I
| don't know if this really makes a difference compared with -1 0 1 or
| other numbers. I hope my choice does not lead to weird psychoanalysis
| of me ;)
|
| Best Regards,
|
| Brian
|
| --
| Brian Schröder
| http://www.brian-schroeder.de/
|

Thanks for your help! However, after I changed the code to use
moves.pop, nothing changed. I.e. the learning AI player was as bad as
was before.

Thanks for helping,
Thomas
 
T

Thomas Leitner

| On Fri, 10 Dec 2004 23:29:02 +0900
|
| > This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe,
| > with a catch: You're not allowed to embed any knowledge of the game
| > into your creation beyond how to make legal moves and recognizing
| > that it has won or lost.
| >
| > Your program is expected to "learn" from the games it plays, until
| > it masters the game and can play flawlessly.
| >
| > Submissions can have any interface, but should be able to play
| > against humans interactively. However, I also suggest making it
| > easy to play against another AI, so you can "teach" the program
| > faster.
| >
| > Being able to monitor the learning progression and know when a
| > program has mastered the game would be very interesting, if you can
| > manage it.
|
|

I have attached my solution to this Ruby Quiz.

My first solution for this problem did not quite work out as I wanted it
to (I still do not know why it does not work correctly... :-( ).
Therefore I tried another thing and this one works.

If you want to try the AI, chose it as first player and let it play
against the Random Player until the Random Player does not win anymore.
Than swap positions, ie. AI as second player and Random Player as first
player, and wait again. After that the AI should be (nearly ;-) perfect!

One word to AIPlayer#game_finished: I tried different values for the
winning and losing scores. When the winning score was set to 100 and the
losing score to -100, the AI did not learn correctly. Therefore, the
losing score is now much higher than the winning score, as the AI then
tends to avoid loses (this was also pointed out by Brian Schröder in
ruby-talk#117748).

Bye,
Thomas
 
J

James Edward Gray II

I have attached my solution to this Ruby Quiz.

Just FYI. Here's my latest game with your AI. It plays well, as you
can see, but an exception is thrown:

# One game or endless (o, e)? o
# Welcome to TicTacToe
# Valid players: Random Move Player, Learning AI Player, Human Mind
Player
# Choose first player (enter shortest unambiguous text) : learn
# Choose second player (enter shortest unambiguous text) : human
---
-0-
---
Valid moves: 0,1,2,3,5,6,7,8
0
10-
-0-
---
Valid moves: 2,3,5,6,7,8
7
10-
-0-
01-
Valid moves: 2,3,5,8
2
101
-00
01-
Valid moves: 3,8
3
draw
tictactoe.rb:210:in `play_tictactoe': undefined local variable or
method `game' for #<UserInterface:0x35021c> (NameError)
from tictactoe.rb:250

James Edward Gray II
 
G

Giovanni Intini

It happens when you play a single game. I didn't try to understand
why, but if you choose endless you get no problem.
 
T

Thomas Leitner

On Thu, 16 Dec 2004 07:01:31 +0900

| On Dec 14, 2004, at 7:17 PM, Thomas Leitner wrote:
|
| > I have attached my solution to this Ruby Quiz.
|
| Just FYI. Here's my latest game with your AI. It plays well, as you
| can see, but an exception is thrown:
|
| [SNIP]
|
| 3
| draw
| tictactoe.rb:210:in `play_tictactoe': undefined local variable or
| method `game' for #<UserInterface:0x35021c> (NameError)
| from tictactoe.rb:250
|
| James Edward Gray II
|

Thanks, James!

I should have played in the single game once. Here is the correct
UserInterface#play_tictactoe method:

def play_tictactoe
game = init_game
case game.play
when 0 then puts "first player won"
when 1 then puts "second player won"
when 2 then puts "draw"
end
game.players.each {|p| p.save}
end

Btw. does anybody know a good book about test driven development?

Thomas
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,481
Members
44,900
Latest member
Nell636132

Latest Threads

Top