R
Ruby Quiz
This week's quiz is a classic computer science problem in disguise. It's
generally done with the Prisoner's Dilemma:
http://plato.stanford.edu/entries/prisoner-dilemma/
The game chosen doesn't much matter, but the idea is that there really shouldn't
be much strategy involved. In the Prisoner's Dilemma, it's generally agreed
that it's hard to beat a player that confesses every time. For the game of
Paper Rock Scissors, the winning strategy is to be purely random, as Benedikt
Huber explained on Ruby Talk:
You can't give any predictions on the next move of a random player.
Therefore you have a 1/3 prop. to choose a winning, losing
or drawing move.
To be fair, Paper Rock Scissors does have quite a bit of strategy theory these
days, but the conditions of that theory (mostly body language) are unavailable
to computer players. Entire books have been written on the subject, believe it
or not:
http://www.worldrps.com/
So random is the best we can do? Is that hard to build? Uh, no. Here's a
sample by Avi Bryant:
class AJBRandomPlayer < Player
def choose
[
aper, :scissors, :rock][rand(3)]
end
end
If we test that, we get the expected 50/50 results:
AJBRandomPlayer vs. JEGPaperPlayer
AJBRandomPlayer: 511.0
JEGPaperPlayer: 489.0
AJBRandomPlayer Wins
AJBRandomPlayer vs. JEGQueuePlayer
AJBRandomPlayer: 499.5
JEGQueuePlayer: 500.5
JEGQueuePlayer Wins
Of course, that's so uninteresting, we're probably beginning to wonder if
James's quiz selecting skills are on the fritz. Possibly, but interesting
solutions make me look good none-the-less. This week, Christian Neukirchen sent
in more than one of those:
CNBiasInverter: Choose so that your bias will be the inverted
opponent's bias.
CNIrrflug: Pick a random choice. If you win, use it again; else,
use a random choice.
CNStepAhead: Try to think a step ahead. If you win, use the choice
where you'd have lost. If you lose, you the choice where you'd
have won. Use the same on draw.
CNBiasFlipper: Always use the choice that hits what the opponent
said most or second-to-most often (if the most often choice is not
absolutely prefered).
CNBiasBreaker: Always use the choice that hits what the opponent
said most often.
CNMeanPlayer: Pick a random choice. If you win, use it again; else,
use the opponent's choice.
I really should show all of those here, but that would make for a ridiculously
large summary. Let's go with Christian's favorite:
class CNBiasInverter < Player
def initialize(opponent)
super
@biases = {:rock => 0, :scissors => 0,
aper => 0}
@hit = {:rock =>
aper,
aper => :scissors, :scissors => :rock}
end
def choose
n = ::Kernel.rand(@biases[:rock] + @biases[:scissors] +
@biases[
aper]).to_i
case n
when 0..@biases[:rock]
aper
when @biases[:rock]..@biases[:rock]+@biases[:scissors]
:rock
when @biases[:rock]+@biases[:scissors]..@biases[:rock]+
@biases[:scissors]+@biases[
aper]
:scissors
else
p @biases[:rock]+@biases[:scissors]..@biases[
aper]
abort n.to_s
end
end
def result( you, them, win_lose_or_draw )
@biases[them] += 1
end
end
initialize() sets up the a Hash for tracking the biases. (I don't believe @hit
is needed.) result() is the compliment to that. It adjusts the proper bias
count each time the opponent makes a selection.
choose() does all the interesting work. A random number is chosen between 0 and
the total of all the bias counts. That number is then associated with the
indicated bias by some clever use of ranges and the opposite of that bias is
returned as CNBiasInverter's choice.
In other words, as the opponent chooses more and more of a particular item, the
bias count for that item climbs. This will cause the semi-random choice to
drift towards the opposite of that favored move.
Let's compare with our baseline:
CNBiasInverter vs. JEGPaperPlayer
CNBiasInverter: 995.0
JEGPaperPlayer: 5.0
CNBiasInverter Wins
CNBiasInverter vs. JEGQueuePlayer
CNBiasInverter: 653.5
JEGQueuePlayer: 346.5
CNBiasInverter Wins
The results are getting better. But, of course, random is still trump:
AJBRandomPlayer vs. CNBiasInverter
AJBRandomPlayer: 509.5
CNBiasInverter: 490.5
AJBRandomPlayer Wins
There were many, many interesting strategies, like the one above. But random
remained the great equalizer. Which leads us to the critical question: What
exactly is the point of this exercise?
Cheating, of course!
With the Prisoner's Dilemma and this quiz, it's common the engineer the
environment to be ripe for cheating. Since there's no winning strategy
available, we'll need to bend the rules a little bit. That's because
programmers have enormous egos and can't stand to lose at anything!
(Note: Technically, no one even cheated. The definition of cheat that applies
here is, "to violate rules dishonestly." Go back and reread the quiz, if you
need to...)
What's the ultimate cheat? Well, here's the first by Bill Atkins:
class BACheater < Player
def initialize opponent
Object.const_get(opponent).send :define_method, :choose do
aper
end
end
def choose
:scissors
end
end
It doesn't get much simpler than that! Bill's initialize() method uses the
passed in name of the opponent to locate the correct Class object and redefine
the choose() method of that Class to something super easy to deal with. The
opponent is modified to always throw
aper and BACheater always throws
:scissors.
That's 100% successful against anything we've seen thus far. Worse, you're
player is permanently modified when it goes up against BACheater, leaving you
vulnerable to clever strategies like CNBiasInverter above:
AJBRandomPlayer vs. BACheater
AJBRandomPlayer: 0
BACheater: 1000
BACheater Wins
AJBRandomPlayer vs. CNBiasInverter
AJBRandomPlayer: 4.5
CNBiasInverter: 995.5
CNBiasInverter Wins
BACheater vs. CNBiasInverter
BACheater: 1000
CNBiasInverter: 0
BACheater Wins
Ouch!
Another cheat used by more than one player was to try and predict an opponent's
move, then respond with a counter. Here is Benedikt Huber's version:
KILLER = { :rock =>
aper,
aper => :scissors,
:scissors => :rock }
class BHCheatPlayer < Player
def initialize( opponent )
super
@opp = Object.const_get(opponent).new(self)
end
def choose
KILLER[@opp.choose]
end
def result(you,them,result)
@opp.result(them,you,result)
end
end
Again initialize() retrieves the Class object, but instead of modifying the
Class, it simply creates an internal copy of the opponent. result() forwards
each pick to the copied opponent, to keep it synchronized with the real
opponent. From there, choose() is obvious: See what the opponent is about to
do and counter.
It was pointed out on Ruby Talk that this doesn't demolish random players;
however, against any random strategy, this becomes a random player. Countering
a random choice is a still a random move, even if the choice isn't what the
opponent is about to do.
There were other great cheats. Jannis Harder's self-repairing program and
FGSpyPlayer (well commented) are both worth studying. Some cheating approaches
were also overlooked. For example, no one tried to modify the score, but it can
be done. There were also a lot of excellent non-cheating solutions. Which
leaves me with no choice but to say the lame but utterly true, "All the
submissions are worth a look!"
My thanks to all my fellow cheaters and the rule abiding players that tolerate
our antics.
Tomorrow's quiz was an actual job project of mine, years ago...
generally done with the Prisoner's Dilemma:
http://plato.stanford.edu/entries/prisoner-dilemma/
The game chosen doesn't much matter, but the idea is that there really shouldn't
be much strategy involved. In the Prisoner's Dilemma, it's generally agreed
that it's hard to beat a player that confesses every time. For the game of
Paper Rock Scissors, the winning strategy is to be purely random, as Benedikt
Huber explained on Ruby Talk:
You can't give any predictions on the next move of a random player.
Therefore you have a 1/3 prop. to choose a winning, losing
or drawing move.
To be fair, Paper Rock Scissors does have quite a bit of strategy theory these
days, but the conditions of that theory (mostly body language) are unavailable
to computer players. Entire books have been written on the subject, believe it
or not:
http://www.worldrps.com/
So random is the best we can do? Is that hard to build? Uh, no. Here's a
sample by Avi Bryant:
class AJBRandomPlayer < Player
def choose
[
end
end
If we test that, we get the expected 50/50 results:
AJBRandomPlayer vs. JEGPaperPlayer
AJBRandomPlayer: 511.0
JEGPaperPlayer: 489.0
AJBRandomPlayer Wins
AJBRandomPlayer vs. JEGQueuePlayer
AJBRandomPlayer: 499.5
JEGQueuePlayer: 500.5
JEGQueuePlayer Wins
Of course, that's so uninteresting, we're probably beginning to wonder if
James's quiz selecting skills are on the fritz. Possibly, but interesting
solutions make me look good none-the-less. This week, Christian Neukirchen sent
in more than one of those:
CNBiasInverter: Choose so that your bias will be the inverted
opponent's bias.
CNIrrflug: Pick a random choice. If you win, use it again; else,
use a random choice.
CNStepAhead: Try to think a step ahead. If you win, use the choice
where you'd have lost. If you lose, you the choice where you'd
have won. Use the same on draw.
CNBiasFlipper: Always use the choice that hits what the opponent
said most or second-to-most often (if the most often choice is not
absolutely prefered).
CNBiasBreaker: Always use the choice that hits what the opponent
said most often.
CNMeanPlayer: Pick a random choice. If you win, use it again; else,
use the opponent's choice.
I really should show all of those here, but that would make for a ridiculously
large summary. Let's go with Christian's favorite:
class CNBiasInverter < Player
def initialize(opponent)
super
@biases = {:rock => 0, :scissors => 0,
@hit = {:rock =>
end
def choose
n = ::Kernel.rand(@biases[:rock] + @biases[:scissors] +
@biases[
case n
when 0..@biases[:rock]
when @biases[:rock]..@biases[:rock]+@biases[:scissors]
:rock
when @biases[:rock]+@biases[:scissors]..@biases[:rock]+
@biases[:scissors]+@biases[
:scissors
else
p @biases[:rock]+@biases[:scissors]..@biases[
abort n.to_s
end
end
def result( you, them, win_lose_or_draw )
@biases[them] += 1
end
end
initialize() sets up the a Hash for tracking the biases. (I don't believe @hit
is needed.) result() is the compliment to that. It adjusts the proper bias
count each time the opponent makes a selection.
choose() does all the interesting work. A random number is chosen between 0 and
the total of all the bias counts. That number is then associated with the
indicated bias by some clever use of ranges and the opposite of that bias is
returned as CNBiasInverter's choice.
In other words, as the opponent chooses more and more of a particular item, the
bias count for that item climbs. This will cause the semi-random choice to
drift towards the opposite of that favored move.
Let's compare with our baseline:
CNBiasInverter vs. JEGPaperPlayer
CNBiasInverter: 995.0
JEGPaperPlayer: 5.0
CNBiasInverter Wins
CNBiasInverter vs. JEGQueuePlayer
CNBiasInverter: 653.5
JEGQueuePlayer: 346.5
CNBiasInverter Wins
The results are getting better. But, of course, random is still trump:
AJBRandomPlayer vs. CNBiasInverter
AJBRandomPlayer: 509.5
CNBiasInverter: 490.5
AJBRandomPlayer Wins
There were many, many interesting strategies, like the one above. But random
remained the great equalizer. Which leads us to the critical question: What
exactly is the point of this exercise?
Cheating, of course!
With the Prisoner's Dilemma and this quiz, it's common the engineer the
environment to be ripe for cheating. Since there's no winning strategy
available, we'll need to bend the rules a little bit. That's because
programmers have enormous egos and can't stand to lose at anything!
(Note: Technically, no one even cheated. The definition of cheat that applies
here is, "to violate rules dishonestly." Go back and reread the quiz, if you
need to...)
What's the ultimate cheat? Well, here's the first by Bill Atkins:
class BACheater < Player
def initialize opponent
Object.const_get(opponent).send :define_method, :choose do
end
end
def choose
:scissors
end
end
It doesn't get much simpler than that! Bill's initialize() method uses the
passed in name of the opponent to locate the correct Class object and redefine
the choose() method of that Class to something super easy to deal with. The
opponent is modified to always throw
:scissors.
That's 100% successful against anything we've seen thus far. Worse, you're
player is permanently modified when it goes up against BACheater, leaving you
vulnerable to clever strategies like CNBiasInverter above:
AJBRandomPlayer vs. BACheater
AJBRandomPlayer: 0
BACheater: 1000
BACheater Wins
AJBRandomPlayer vs. CNBiasInverter
AJBRandomPlayer: 4.5
CNBiasInverter: 995.5
CNBiasInverter Wins
BACheater vs. CNBiasInverter
BACheater: 1000
CNBiasInverter: 0
BACheater Wins
Ouch!
Another cheat used by more than one player was to try and predict an opponent's
move, then respond with a counter. Here is Benedikt Huber's version:
KILLER = { :rock =>
:scissors => :rock }
class BHCheatPlayer < Player
def initialize( opponent )
super
@opp = Object.const_get(opponent).new(self)
end
def choose
KILLER[@opp.choose]
end
def result(you,them,result)
@opp.result(them,you,result)
end
end
Again initialize() retrieves the Class object, but instead of modifying the
Class, it simply creates an internal copy of the opponent. result() forwards
each pick to the copied opponent, to keep it synchronized with the real
opponent. From there, choose() is obvious: See what the opponent is about to
do and counter.
It was pointed out on Ruby Talk that this doesn't demolish random players;
however, against any random strategy, this becomes a random player. Countering
a random choice is a still a random move, even if the choice isn't what the
opponent is about to do.
There were other great cheats. Jannis Harder's self-repairing program and
FGSpyPlayer (well commented) are both worth studying. Some cheating approaches
were also overlooked. For example, no one tried to modify the score, but it can
be done. There were also a lot of excellent non-cheating solutions. Which
leaves me with no choice but to say the lame but utterly true, "All the
submissions are worth a look!"
My thanks to all my fellow cheaters and the rule abiding players that tolerate
our antics.
Tomorrow's quiz was an actual job project of mine, years ago...