[SUMMARY] Paper Rock Scissors (#16)

Discussion in 'Ruby' started by Ruby Quiz, Jan 27, 2005.

  1. Ruby Quiz

    Ruby Quiz Guest

    This week's quiz is a classic computer science problem in disguise. It's
    generally done with the Prisoner's Dilemma:

    http://plato.stanford.edu/entries/prisoner-dilemma/

    The game chosen doesn't much matter, but the idea is that there really shouldn't
    be much strategy involved. In the Prisoner's Dilemma, it's generally agreed
    that it's hard to beat a player that confesses every time. For the game of
    Paper Rock Scissors, the winning strategy is to be purely random, as Benedikt
    Huber explained on Ruby Talk:

    You can't give any predictions on the next move of a random player.
    Therefore you have a 1/3 prop. to choose a winning, losing
    or drawing move.

    To be fair, Paper Rock Scissors does have quite a bit of strategy theory these
    days, but the conditions of that theory (mostly body language) are unavailable
    to computer players. Entire books have been written on the subject, believe it
    or not:

    http://www.worldrps.com/

    So random is the best we can do? Is that hard to build? Uh, no. Here's a
    sample by Avi Bryant:

    class AJBRandomPlayer < Player
    def choose
    [:paper, :scissors, :rock][rand(3)]
    end
    end

    If we test that, we get the expected 50/50 results:

    AJBRandomPlayer vs. JEGPaperPlayer
    AJBRandomPlayer: 511.0
    JEGPaperPlayer: 489.0
    AJBRandomPlayer Wins
    AJBRandomPlayer vs. JEGQueuePlayer
    AJBRandomPlayer: 499.5
    JEGQueuePlayer: 500.5
    JEGQueuePlayer Wins

    Of course, that's so uninteresting, we're probably beginning to wonder if
    James's quiz selecting skills are on the fritz. Possibly, but interesting
    solutions make me look good none-the-less. This week, Christian Neukirchen sent
    in more than one of those:

    CNBiasInverter: Choose so that your bias will be the inverted
    opponent's bias.

    CNIrrflug: Pick a random choice. If you win, use it again; else,
    use a random choice.

    CNStepAhead: Try to think a step ahead. If you win, use the choice
    where you'd have lost. If you lose, you the choice where you'd
    have won. Use the same on draw.

    CNBiasFlipper: Always use the choice that hits what the opponent
    said most or second-to-most often (if the most often choice is not
    absolutely prefered).

    CNBiasBreaker: Always use the choice that hits what the opponent
    said most often.

    CNMeanPlayer: Pick a random choice. If you win, use it again; else,
    use the opponent's choice.

    I really should show all of those here, but that would make for a ridiculously
    large summary. Let's go with Christian's favorite:

    class CNBiasInverter < Player
    def initialize(opponent)
    super
    @biases = {:rock => 0, :scissors => 0, :paper => 0}
    @hit = {:rock => :paper, :paper => :scissors, :scissors => :rock}
    end

    def choose
    n = ::Kernel.rand(@biases[:rock] + @biases[:scissors] +
    @biases[:paper]).to_i
    case n
    when 0..@biases[:rock]
    :paper
    when @biases[:rock]..@biases[:rock]+@biases[:scissors]
    :rock
    when @biases[:rock]+@biases[:scissors]..@biases[:rock]+
    @biases[:scissors]+@biases[:paper]
    :scissors
    else
    p @biases[:rock]+@biases[:scissors]..@biases[:paper]
    abort n.to_s
    end
    end

    def result( you, them, win_lose_or_draw )
    @biases[them] += 1
    end
    end

    initialize() sets up the a Hash for tracking the biases. (I don't believe @hit
    is needed.) result() is the compliment to that. It adjusts the proper bias
    count each time the opponent makes a selection.

    choose() does all the interesting work. A random number is chosen between 0 and
    the total of all the bias counts. That number is then associated with the
    indicated bias by some clever use of ranges and the opposite of that bias is
    returned as CNBiasInverter's choice.

    In other words, as the opponent chooses more and more of a particular item, the
    bias count for that item climbs. This will cause the semi-random choice to
    drift towards the opposite of that favored move.

    Let's compare with our baseline:

    CNBiasInverter vs. JEGPaperPlayer
    CNBiasInverter: 995.0
    JEGPaperPlayer: 5.0
    CNBiasInverter Wins
    CNBiasInverter vs. JEGQueuePlayer
    CNBiasInverter: 653.5
    JEGQueuePlayer: 346.5
    CNBiasInverter Wins

    The results are getting better. But, of course, random is still trump:

    AJBRandomPlayer vs. CNBiasInverter
    AJBRandomPlayer: 509.5
    CNBiasInverter: 490.5
    AJBRandomPlayer Wins

    There were many, many interesting strategies, like the one above. But random
    remained the great equalizer. Which leads us to the critical question: What
    exactly is the point of this exercise?

    Cheating, of course!

    With the Prisoner's Dilemma and this quiz, it's common the engineer the
    environment to be ripe for cheating. Since there's no winning strategy
    available, we'll need to bend the rules a little bit. That's because
    programmers have enormous egos and can't stand to lose at anything!

    (Note: Technically, no one even cheated. The definition of cheat that applies
    here is, "to violate rules dishonestly." Go back and reread the quiz, if you
    need to...)

    What's the ultimate cheat? Well, here's the first by Bill Atkins:

    class BACheater < Player
    def initialize opponent
    Object.const_get(opponent).send :define_method, :choose do
    :paper
    end
    end

    def choose
    :scissors
    end
    end

    It doesn't get much simpler than that! Bill's initialize() method uses the
    passed in name of the opponent to locate the correct Class object and redefine
    the choose() method of that Class to something super easy to deal with. The
    opponent is modified to always throw :paper and BACheater always throws
    :scissors.

    That's 100% successful against anything we've seen thus far. Worse, you're
    player is permanently modified when it goes up against BACheater, leaving you
    vulnerable to clever strategies like CNBiasInverter above:

    AJBRandomPlayer vs. BACheater
    AJBRandomPlayer: 0
    BACheater: 1000
    BACheater Wins
    AJBRandomPlayer vs. CNBiasInverter
    AJBRandomPlayer: 4.5
    CNBiasInverter: 995.5
    CNBiasInverter Wins
    BACheater vs. CNBiasInverter
    BACheater: 1000
    CNBiasInverter: 0
    BACheater Wins

    Ouch!

    Another cheat used by more than one player was to try and predict an opponent's
    move, then respond with a counter. Here is Benedikt Huber's version:

    KILLER = { :rock => :paper,
    :paper => :scissors,
    :scissors => :rock }

    class BHCheatPlayer < Player

    def initialize( opponent )
    super
    @opp = Object.const_get(opponent).new(self)
    end

    def choose
    KILLER[@opp.choose]
    end

    def result(you,them,result)
    @opp.result(them,you,result)
    end

    end

    Again initialize() retrieves the Class object, but instead of modifying the
    Class, it simply creates an internal copy of the opponent. result() forwards
    each pick to the copied opponent, to keep it synchronized with the real
    opponent. From there, choose() is obvious: See what the opponent is about to
    do and counter.

    It was pointed out on Ruby Talk that this doesn't demolish random players;
    however, against any random strategy, this becomes a random player. Countering
    a random choice is a still a random move, even if the choice isn't what the
    opponent is about to do.

    There were other great cheats. Jannis Harder's self-repairing program and
    FGSpyPlayer (well commented) are both worth studying. Some cheating approaches
    were also overlooked. For example, no one tried to modify the score, but it can
    be done. There were also a lot of excellent non-cheating solutions. Which
    leaves me with no choice but to say the lame but utterly true, "All the
    submissions are worth a look!"

    My thanks to all my fellow cheaters and the rule abiding players that tolerate
    our antics.

    Tomorrow's quiz was an actual job project of mine, years ago...
     
    Ruby Quiz, Jan 27, 2005
    #1
    1. Advertising

  2. Ruby Quiz

    martinus Guest

    > BTW, I'd be interested if anyone found a nicer looking solution to
    > do a weighted rand.


    What you are doing is called Roulett wheel selection, which is often
    used as a selection algorithm when using genetic algorithms. I think
    this is a bit nicer:

    def choose
    # sum up all values
    sum = @biases.values.inject(0) {|sum, val| sum + val }
    # choose something
    search_sum = rand(sum)
    # count up until found
    current_sum = 0
    found = @biases.detect do |key, value|
    current_sum += value
    current_sum > search_sum
    end
    found[0]
    end

    martinus
     
    martinus, Jan 27, 2005
    #2
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. TooNaive

    Building Paper, Rock, Scissors

    TooNaive, Jun 14, 2004, in forum: C Programming
    Replies:
    7
    Views:
    562
    T.M. Sommers
    Jun 14, 2004
  2. Replies:
    2
    Views:
    637
  3. Ruby Quiz

    [QUIZ] Paper Rock Scissors (#16)

    Ruby Quiz, Jan 21, 2005, in forum: Ruby
    Replies:
    24
    Views:
    334
    Florian Gross
    Jan 25, 2005
  4. Alex Norton
    Replies:
    1
    Views:
    133
    Denis McMahon
    May 13, 2013
  5. Benjamin Kaplan
    Replies:
    0
    Views:
    139
    Benjamin Kaplan
    May 12, 2013
Loading...

Share This Page