[QUIZ] Learning Tic-Tac-Toe (#11)

R

Ruby Quiz

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.grayproductions.net/ruby_quiz/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe, with a
catch: You're not allowed to embed any knowledge of the game into your creation
beyond how to make legal moves and recognizing that it has won or lost.

Your program is expected to "learn" from the games it plays, until it masters
the game and can play flawlessly.

Submissions can have any interface, but should be able to play against humans
interactively. However, I also suggest making it easy to play against another
AI, so you can "teach" the program faster.

Being able to monitor the learning progression and know when a program has
mastered the game would be very interesting, if you can manage it.
 
T

trans. (T. Onoma)

Oh no! I don't have time for this but you hit a weak spot! I wrote this very
program in VB over a decade ago!

I must not be tempted.... I must not be tempted....

T.


|
| 1. Please do not post any solutions or spoiler discussion for this quiz
| until 48 hours have passed from the time on this message.
|
| 2. Support Ruby Quiz by submitting ideas as often as you can:
|
| http://www.grayproductions.net/ruby_quiz/
|
| 3. Enjoy!
|
| -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|=-=-=
|
| This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe, with a
| catch: You're not allowed to embed any knowledge of the game into your
| creation beyond how to make legal moves and recognizing that it has won or
| lost.
|
| Your program is expected to "learn" from the games it plays, until it
| masters the game and can play flawlessly.
|
| Submissions can have any interface, but should be able to play against
| humans interactively. However, I also suggest making it easy to play
| against another AI, so you can "teach" the program faster.
|
| Being able to monitor the learning progression and know when a program has
| mastered the game would be very interesting, if you can manage it.

--
( o _ カラãƒ
// trans.
/ \ (e-mail address removed)

ruby -rdrb -e
'DRb.start_service;duck=DRbObject.new(nil,"druby://whytheluckystiff.net:6503");puts
duck.toms'

I don't give a damn for a man that can only spell a word one way.
-Mark Twain

[8,16,20,29,78,65,2,14,26,12,12,28,71,114,12,13,12,82,72,21,17,4,10,2,95].
each_with_index{|x,i| $><<(x^'Begin landing your troops').chr}
-Tadayoshi Funaba
 
B

Brian Schröder

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz
until 48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.grayproductions.net/ruby_quiz/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
-=-=

This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe, with a
catch: You're not allowed to embed any knowledge of the game into your
creation beyond how to make legal moves and recognizing that it has won or
lost.

Your program is expected to "learn" from the games it plays, until it masters
the game and can play flawlessly.

Submissions can have any interface, but should be able to play against humans
interactively. However, I also suggest making it easy to play against
another AI, so you can "teach" the program faster.

Being able to monitor the learning progression and know when a program has
mastered the game would be very interesting, if you can manage it.

It is not fair to set up an exciting quiz like this only three days before my
exam. How can I resist!

Cheers,

Brian
 
J

James Edward Gray II

It is not fair to set up an exciting quiz like this only three days
before my
exam. How can I resist!

Save it for the courtroom, pal! ;)

James Edward Gray II
 
B

Brian Schröder

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz
until 48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.grayproductions.net/ruby_quiz/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
-=-=

This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe, with a
catch: You're not allowed to embed any knowledge of the game into your
creation beyond how to make legal moves and recognizing that it has won or
lost.

Your program is expected to "learn" from the games it plays, until it masters
the game and can play flawlessly.

Submissions can have any interface, but should be able to play against humans
interactively. However, I also suggest making it easy to play against
another AI, so you can "teach" the program faster.

Being able to monitor the learning progression and know when a program has
mastered the game would be very interesting, if you can manage it.

Oh, I couldn't resist this cribbling in my fingers...

Can we access the state of the world or can we access the action the opponent
took. If we can't access anything, it will be quite hard to play perfect.

Regards,

Brian
 
J

James Edward Gray II

Oh, I couldn't resist this cribbling in my fingers...

But I'll be blamed, of course. :D
Can we access the state of the world

The board? I can't think of how you would "make legal moves" without
it.
or can we access the action the opponent took. If we can't access
anything, it will be quite hard to play perfect.

If you see the board at move N and move N + 1, I suspect you'll know
what your opponent did.

James Edward Gray II
 
B

Brian Schröder

But I'll be blamed, of course. :D


The board? I can't think of how you would "make legal moves" without
it.

I expect the world to tell me what the legal moves are. Otherwise my agent
would have knowledge of workings of the world that she should not have.
If you see the board at move N and move N + 1, I suspect you'll know
what your opponent did.

Well, no ;). If we don't know anything about the game played we also don't know
what moves are legal moves for the opponent, and in what the moves result. The
whole thing gets even harder if the world is probabilistic and we can't be
shure that our moves always result in the same change in the world.

But for the sake of simplicity I simply let my ai act, as if the world wasn't
probabilistic, and the good thing is - its right indeed.

Regards,

Brian
 
J

James Edward Gray II

I expect the world to tell me what the legal moves are. Otherwise my
agent
would have knowledge of workings of the world that she should not have.

I'm using a Board object, in my playing around. My goal in writing the
quiz was to outlaw you teaching your program any strategy. It must
learn that.

I personally am okay with seeing the board, but you make your personal
challenge as hard as you like it!

James Edward Gray II
 
J

Jannis Harder

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I think you know the rules and the board but you don't know which moves
are good and which are not.

ps: special sig...


Jannis Harder

"bp6siZmijp5CiZlCiW5CgAAChpbiiZYiiZZCi5aCZ2bs".unpack("m")[0].
unpack("C*").map{|x|x.chr}.join.unpack("B*")[0].scan(/.{24}/){i=7
$&.scan(/..../){print"\e[3#{i-=1};1;40m ";$&.each_byte{|z|
print" #"[z-?0,1]*2}};puts"\e[0m"}
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (Darwin)

iD8DBQFBudFo5YRWfc27RzQRAqaqAJ4wrZI5eu7zqPTwbcT5EGZsqSpl5QCfR6qq
QYtur031SZNOKomdhavbypE=
=7AOX
-----END PGP SIGNATURE-----
 
B

Brian Schröder

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I think you know the rules and the board but you don't know which moves
are good and which are not.

ps: special sig...


Jannis Harder

"bp6siZmijp5CiZlCiW5CgAAChpbiiZYiiZZCi5aCZ2bs".unpack("m")[0].
unpack("C*").map{|x|x.chr}.join.unpack("B*")[0].scan(/.{24}/){i=7
$&.scan(/..../){print"\e[3#{i-=1};1;40m ";$&.each_byte{|z|
print" #"[z-?0,1]*2}};puts"\e[0m"}

Nice and colourfull. Do you use a compact sig creator program?

Regards,

Brian
 
J

Jannis Harder

Am 10.12.2004 um 18:15 schrieb Brian Schröder:
Nice and colourfull. Do you use a compact sig creator program?

Offtopic:

Not a creator program.... an "universal" "color" "picture" sig ;)


the first string in the sig is a base64ed b/w raw image (24*n pixel)

every 4 pixel the sig inserts a blank pixel (the space between the
letters)
and every 24 a newline.

example: "jp6i..."

j>100011 p>101001 6>111010 i>100010
Cyan Mgnt Blue Yelw Grn Red
jjjj jjpp pppp 6666 66ii iiii
1000 1110 1001 1110 1010 0010
# ### # # ### # # # The first line of >RUBY<
...




"jp6iSZmkLp5ISZlEiW5C" >>

"bp6siZmijp5CiZlCiW5CgAAChpbiiZYiiZZCi5aCZ2bs"
|RUBY|
|QUIZ|

"Lp6oKZmobp5MaZlM6W5O4AAO7pjuaZgsbphMKZiIKW/o"
/RUBY\
\RULZ/




Jannis Harder

"Lp6oKZmoaZ5MaZlM7p5O6ZAO6ZjuaZgsaZhMKZiIKW/o".unpack("m")[0].
unpack("C*").map{|x|x.chr}.join.unpack("B*")[0].scan(/.{24}/){i=7
$&.scan(/..../){print"\e[3#{i-=1};1;40m ";$&.each_byte{
|z|print" #"[z-?0,1]*2}};puts"\e[0m"}
 
B

Brian Schröder

Am 10.12.2004 um 18:15 schrieb Brian Schröder:

Offtopic:

Not a creator program.... an "universal" "color" "picture" sig ;)


the first string in the sig is a base64ed b/w raw image (24*n pixel)

every 4 pixel the sig inserts a blank pixel (the space between the
letters)
and every 24 a newline.

example: "jp6i..."

j>100011 p>101001 6>111010 i>100010
Cyan Mgnt Blue Yelw Grn Red
jjjj jjpp pppp 6666 66ii iiii
1000 1110 1001 1110 1010 0010
# ### # # ### # # # The first line of >RUBY<
...

Really nicely done.
 
B

Brian Schröder

I'm using a Board object, in my playing around. My goal in writing the
quiz was to outlaw you teaching your program any strategy. It must
learn that.

I personally am okay with seeing the board, but you make your personal
challenge as hard as you like it!

James Edward Gray II


In this case, would a minimax-a/b algorithm be teaching the ai a strategie?
Tic-Tac-Toe is so shallow, it can be searched completely by minimax-a/b, so
this would be a one-page solution.

Regards,

Brian
 
J

James Edward Gray II

In this case, would a minimax-a/b algorithm be teaching the ai a
strategie?
Tic-Tac-Toe is so shallow, it can be searched completely by
minimax-a/b, so
this would be a one-page solution.

While you could argue that a minimax search has nothing to do with
Tic-Tac-Toe strategy, I doubt you would convince me on the evaluation
routine a minimax search requires. Now if your program learns how to
evaluate positions over time...

James Edward Gray II
 
M

Michael DeHaan

In this case, would a minimax-a/b algorithm be teaching the ai a strategie?
Tic-Tac-Toe is so shallow, it can be searched completely by minimax-a/b, so
this would be a one-page solution.

Regards,

Brian

Man, I really wish I had the energy after work to work on these
instead of playing Gran Turismo and the like :)

Minimax isn't a learning algorithm, and that's a bit too easy because
you'd be able to completely solve the game (full-depth). Rather than
ban such things, how about making the problem harder?

Arbitrary MxN grids and a set time limit of X seconds for each move.
(Say X = 30 seconds on a 1GHz box or something).

At some level, Minimax would break down without a good fitness
algorithm attached.
 
B

Brian Schröder

While you could argue that a minimax search has nothing to do with
Tic-Tac-Toe strategy, I doubt you would convince me on the evaluation
routine a minimax search requires. Now if your program learns how to
evaluate positions over time...

James Edward Gray II

Well, the evalutation routine -100 for lost, +100 for won, 0 otherwise is not
really a evaluation routine, and minimax with alpha beta pruning can search the
whole problem in 6-7s on my machine.

But if I had to learn the rules, that would be more difficult...

Regards,

Brian
 
J

James Edward Gray II

Well, the evalutation routine -100 for lost, +100 for won, 0 otherwise
is not
really a evaluation routine, and minimax with alpha beta pruning can
search the whole problem in 6-7s on my machine.

So what is your program learning at this point and thus how do you
justify it as a valid solution? ;)

James Edward Gray II
 
B

Brian Schröder

So what is your program learning at this point and thus how do you
justify it as a valid solution? ;)

Good Question. At least I know have a perfect opponent for my learning
algorithm. The problem is, that it is impossible to win against a perfect
opponent in TicTacToe. So I won't learn more than not to loose.

Regards,

Brian
 
T

trans. (T. Onoma)

15 PM, Brian Schröder wrote:
| > Well, the evalutation routine -100 for lost, +100 for won, 0 otherwise
| > is not
| > really a evaluation routine, and minimax with alpha beta pruning can
| > search the whole problem in 6-7s on my machine.
|
| So what is your program learning at this point and thus how do you
| justify it as a valid solution? ;)
|
| James Edward Gray II

Does it have to learn how to play tic-tac-toe? Or just how to win?

T.
 
J

James Edward Gray II

Does it have to learn how to play tic-tac-toe? Or just how to win?

In my opinion, all that matters is what the human playing against it
will see. How's that for a dodge? ;)

James Edward Gray II
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top