[QUIZ] Learning Tic-Tac-Toe (#11)

Discussion in 'Ruby' started by Ruby Quiz, Dec 10, 2004.

  1. Ruby Quiz

    Ruby Quiz Guest

    The three rules of Ruby Quiz:

    1. Please do not post any solutions or spoiler discussion for this quiz until
    48 hours have passed from the time on this message.

    2. Support Ruby Quiz by submitting ideas as often as you can:

    http://www.grayproductions.net/ruby_quiz/

    3. Enjoy!

    -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe, with a
    catch: You're not allowed to embed any knowledge of the game into your creation
    beyond how to make legal moves and recognizing that it has won or lost.

    Your program is expected to "learn" from the games it plays, until it masters
    the game and can play flawlessly.

    Submissions can have any interface, but should be able to play against humans
    interactively. However, I also suggest making it easy to play against another
    AI, so you can "teach" the program faster.

    Being able to monitor the learning progression and know when a program has
    mastered the game would be very interesting, if you can manage it.
     
    Ruby Quiz, Dec 10, 2004
    #1
    1. Advertisements

  2. Oh no! I don't have time for this but you hit a weak spot! I wrote this very
    program in VB over a decade ago!

    I must not be tempted.... I must not be tempted....

    T.


    |
    | 1. Please do not post any solutions or spoiler discussion for this quiz
    | until 48 hours have passed from the time on this message.
    |
    | 2. Support Ruby Quiz by submitting ideas as often as you can:
    |
    | http://www.grayproductions.net/ruby_quiz/
    |
    | 3. Enjoy!
    |
    | -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
    |=-=-=
    |
    | This week's Ruby Quiz is to implement an AI for playing Tic-Tac-Toe, with a
    | catch: You're not allowed to embed any knowledge of the game into your
    | creation beyond how to make legal moves and recognizing that it has won or
    | lost.
    |
    | Your program is expected to "learn" from the games it plays, until it
    | masters the game and can play flawlessly.
    |
    | Submissions can have any interface, but should be able to play against
    | humans interactively. However, I also suggest making it easy to play
    | against another AI, so you can "teach" the program faster.
    |
    | Being able to monitor the learning progression and know when a program has
    | mastered the game would be very interesting, if you can manage it.

    --
    ( o _ カラãƒ
    // trans.
    / \

    ruby -rdrb -e
    'DRb.start_service;duck=DRbObject.new(nil,"druby://whytheluckystiff.net:6503");puts
    duck.toms'

    I don't give a damn for a man that can only spell a word one way.
    -Mark Twain

    [8,16,20,29,78,65,2,14,26,12,12,28,71,114,12,13,12,82,72,21,17,4,10,2,95].
    each_with_index{|x,i| $><<(x^'Begin landing your troops').chr}
    -Tadayoshi Funaba
     
    trans. (T. Onoma), Dec 10, 2004
    #2
    1. Advertisements

  3. It is not fair to set up an exciting quiz like this only three days before my
    exam. How can I resist!

    Cheers,

    Brian
     
    Brian Schröder, Dec 10, 2004
    #3
  4. Save it for the courtroom, pal! ;)

    James Edward Gray II
     
    James Edward Gray II, Dec 10, 2004
    #4
  5. Oh, I couldn't resist this cribbling in my fingers...

    Can we access the state of the world or can we access the action the opponent
    took. If we can't access anything, it will be quite hard to play perfect.

    Regards,

    Brian
     
    Brian Schröder, Dec 10, 2004
    #5
  6. But I'll be blamed, of course. :D
    The board? I can't think of how you would "make legal moves" without
    it.
    If you see the board at move N and move N + 1, I suspect you'll know
    what your opponent did.

    James Edward Gray II
     
    James Edward Gray II, Dec 10, 2004
    #6
  7. I expect the world to tell me what the legal moves are. Otherwise my agent
    would have knowledge of workings of the world that she should not have.
    Well, no ;). If we don't know anything about the game played we also don't know
    what moves are legal moves for the opponent, and in what the moves result. The
    whole thing gets even harder if the world is probabilistic and we can't be
    shure that our moves always result in the same change in the world.

    But for the sake of simplicity I simply let my ai act, as if the world wasn't
    probabilistic, and the good thing is - its right indeed.

    Regards,

    Brian
     
    Brian Schröder, Dec 10, 2004
    #7
  8. I'm using a Board object, in my playing around. My goal in writing the
    quiz was to outlaw you teaching your program any strategy. It must
    learn that.

    I personally am okay with seeing the board, but you make your personal
    challenge as hard as you like it!

    James Edward Gray II
     
    James Edward Gray II, Dec 10, 2004
    #8
  9. -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    I think you know the rules and the board but you don't know which moves
    are good and which are not.

    ps: special sig...


    Jannis Harder

    "bp6siZmijp5CiZlCiW5CgAAChpbiiZYiiZZCi5aCZ2bs".unpack("m")[0].
    unpack("C*").map{|x|x.chr}.join.unpack("B*")[0].scan(/.{24}/){i=7
    $&.scan(/..../){print"\e[3#{i-=1};1;40m ";$&.each_byte{|z|
    print" #"[z-?0,1]*2}};puts"\e[0m"}
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.6 (Darwin)

    iD8DBQFBudFo5YRWfc27RzQRAqaqAJ4wrZI5eu7zqPTwbcT5EGZsqSpl5QCfR6qq
    QYtur031SZNOKomdhavbypE=
    =7AOX
    -----END PGP SIGNATURE-----
     
    Jannis Harder, Dec 10, 2004
    #9
  10. Nice and colourfull. Do you use a compact sig creator program?

    Regards,

    Brian
     
    Brian Schröder, Dec 10, 2004
    #10
  11. Am 10.12.2004 um 18:15 schrieb Brian Schröder:
    Offtopic:

    Not a creator program.... an "universal" "color" "picture" sig ;)


    the first string in the sig is a base64ed b/w raw image (24*n pixel)

    every 4 pixel the sig inserts a blank pixel (the space between the
    letters)
    and every 24 a newline.

    example: "jp6i..."

    j>100011 p>101001 6>111010 i>100010
    Cyan Mgnt Blue Yelw Grn Red
    jjjj jjpp pppp 6666 66ii iiii
    1000 1110 1001 1110 1010 0010
    # ### # # ### # # # The first line of >RUBY<
    ...




    "jp6iSZmkLp5ISZlEiW5C" >>
    "bp6siZmijp5CiZlCiW5CgAAChpbiiZYiiZZCi5aCZ2bs"
    |RUBY|
    |QUIZ|

    "Lp6oKZmobp5MaZlM6W5O4AAO7pjuaZgsbphMKZiIKW/o"
    /RUBY\
    \RULZ/




    Jannis Harder

    "Lp6oKZmoaZ5MaZlM7p5O6ZAO6ZjuaZgsaZhMKZiIKW/o".unpack("m")[0].
    unpack("C*").map{|x|x.chr}.join.unpack("B*")[0].scan(/.{24}/){i=7
    $&.scan(/..../){print"\e[3#{i-=1};1;40m ";$&.each_byte{
    |z|print" #"[z-?0,1]*2}};puts"\e[0m"}
     
    Jannis Harder, Dec 10, 2004
    #11
  12. Really nicely done.
     
    Brian Schröder, Dec 10, 2004
    #12

  13. In this case, would a minimax-a/b algorithm be teaching the ai a strategie?
    Tic-Tac-Toe is so shallow, it can be searched completely by minimax-a/b, so
    this would be a one-page solution.

    Regards,

    Brian
     
    Brian Schröder, Dec 10, 2004
    #13
  14. While you could argue that a minimax search has nothing to do with
    Tic-Tac-Toe strategy, I doubt you would convince me on the evaluation
    routine a minimax search requires. Now if your program learns how to
    evaluate positions over time...

    James Edward Gray II
     
    James Edward Gray II, Dec 10, 2004
    #14
  15. Man, I really wish I had the energy after work to work on these
    instead of playing Gran Turismo and the like :)

    Minimax isn't a learning algorithm, and that's a bit too easy because
    you'd be able to completely solve the game (full-depth). Rather than
    ban such things, how about making the problem harder?

    Arbitrary MxN grids and a set time limit of X seconds for each move.
    (Say X = 30 seconds on a 1GHz box or something).

    At some level, Minimax would break down without a good fitness
    algorithm attached.
     
    Michael DeHaan, Dec 10, 2004
    #15
  16. Well, the evalutation routine -100 for lost, +100 for won, 0 otherwise is not
    really a evaluation routine, and minimax with alpha beta pruning can search the
    whole problem in 6-7s on my machine.

    But if I had to learn the rules, that would be more difficult...

    Regards,

    Brian
     
    Brian Schröder, Dec 10, 2004
    #16
  17. So what is your program learning at this point and thus how do you
    justify it as a valid solution? ;)

    James Edward Gray II
     
    James Edward Gray II, Dec 10, 2004
    #17
  18. Good Question. At least I know have a perfect opponent for my learning
    algorithm. The problem is, that it is impossible to win against a perfect
    opponent in TicTacToe. So I won't learn more than not to loose.

    Regards,

    Brian
     
    Brian Schröder, Dec 10, 2004
    #18
  19. 15 PM, Brian Schröder wrote:
    | > Well, the evalutation routine -100 for lost, +100 for won, 0 otherwise
    | > is not
    | > really a evaluation routine, and minimax with alpha beta pruning can
    | > search the whole problem in 6-7s on my machine.
    |
    | So what is your program learning at this point and thus how do you
    | justify it as a valid solution? ;)
    |
    | James Edward Gray II

    Does it have to learn how to play tic-tac-toe? Or just how to win?

    T.
     
    trans. (T. Onoma), Dec 10, 2004
    #19
  20. In my opinion, all that matters is what the human playing against it
    will see. How's that for a dodge? ;)

    James Edward Gray II
     
    James Edward Gray II, Dec 10, 2004
    #20
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.