[QUIZ] Math Captcha (#48)

Ruby Quiz

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Gavin Kistner

Overview
--------

"What is fifty times 'a', if 'a' is three?" #=> 150

Write a Captcha system that uses English-based math questions to distinguish
humans from bots.

Background and Details
----------------------

A 'captcha' is an automated way for a computer (usually a web server) to try to
weed out bots, which don't have the 'intelligence' to answer a question that's
fairly straightforward for a human. Captcha is an acronym for "Completely
Automated Public Turing-test to tell Computers and Humans Apart".

The most common form of captcha is an image that has been munged in such a way
that it's supposed to be very hard for a computer to tell you what it says, but
easy for a human. Recent studies (or at least articles) claim that it's proven
quite possible to write OCR software that determines the right answer a large
percentage of the time. Although image-based captchas can be improved (for
example, by placing multiple colored words and asking the user to identify the
'cinnamon' word), they still have the fatal flaw of being inaccessible to the
visually impaired.

This quiz is to write a different kind of captcha system - one that asks the
user questions in plain text. The trick is to use mathematics questions with a
variety of forms and varying numbers, so that it should be difficult (though of
course not impossible) to write a bot to parse them.

For example, questions of this form might be easy for a bot to parse:

"What is five plus two?"

while this question should be substantially harder:

"How much is fifteen-hundred twenty-three, less the amount of non-thumbs on
one human hand?"

A good balance between human comprehension, variety of question form, and ease
of computation is an interesting challenge.

The Rules
---------

Write a module that has two module methods: 'create_question' and
'check_answer'.

The create_question method should return a hash with two specific keys:

p MathCaptcha.create_question
#=> { :question => "Here is the string of the question", :answer_id => 17 }

The check_answer method is passed a hash with two specific keys, and should
return true if the supplied answer is correct, false if not.

p MathCaptcha.check_answer :answer => "Answer String", :answer_id => 17
#=> false
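
To make the required interface concrete, here is a minimal sketch of a module with those two methods. It is only an illustration: the single question form, the in-memory answer table, and the one-attempt-per-question policy are assumptions, and a digit-based question like this would be trivial for a bot.

```ruby
# Hypothetical minimal module; the question form and the in-memory
# store are assumptions made for illustration only.
module MathCaptcha
  @answers = {}   # answer_id => expected answer, stored as a string
  @next_id = 0

  # Returns a hash with the :question text and an :answer_id to pass back.
  def self.create_question
    a = rand(10)
    b = rand(10)
    @next_id += 1
    @answers[@next_id] = (a + b).to_s
    { :question => "What is #{a} plus #{b}?", :answer_id => @next_id }
  end

  # Returns true only if the supplied :answer matches the stored one.
  # The stored answer is deleted, so each question allows one attempt.
  def self.check_answer(info)
    expected = @answers.delete(info[:answer_id])
    !expected.nil? && expected == info[:answer].to_s.strip
  end
end

q = MathCaptcha.create_question
puts "#{q[:answer_id]} : #{q[:question]}"
```

A real submission would replace the question generator with something much harder to parse and would persist the answers between requests.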

Extra Credit
------------

1) Ensure that your library is easily extensible, making it easy for someone
using your library to add new forms of question creation, and the answer that
goes with each form.

2) For automated testing and non-ruby usage, it would be nice to provide your
module with a command-line wrapper, with the following interface:
ruby math_captcha.rb
1424039 : What is the sum of the number of thumbs on a human and the number
of hooves on a horse?

ruby math_captcha.rb --id 1424039 --answer 7
false

ruby math_captcha.rb --id 1424039 --answer 6
true

3) Allow your 'create_question' method to take an integer difficulty argument.
Low difficulties (0 being the lowest) represent trivial questions that an
elementary school student might be able to answer, while higher difficulties
range into algebra, trigonometry, calculus, linear algebra, and beyond. (It's up
to you as to what the scale is.)

"Type the number that comes right before seventy-five."
"What is fifteen plus twelve minus six?"
"What is six x minus three i, if I said that i is two and x three?"
"What is two squared, cubed?"
"Is the cosine of zero one or zero?"
"What trigonometric function of an angle of a right triangle yields the
ratio of the adjacent side's length divided by the hypotenuse?"
"What is the derivative of 2x^2, when x is 3?"
"What is the dot product of the vectors [4 7] and [3 4]?"
"What is the cross product of the vectors [4 7] and [3 4]?"

4) Let your 'check_answer' method take a unique identifier (such as an IP
address) along with the answer, and always return false for that identifier
after a certain number of consecutive (or accumulated) failures have occurred.
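
One possible shape for the lockout in extra credit 4 is sketched below. The class name FailureTracker, the default threshold of three, and the choice to count consecutive (rather than accumulated) failures are all assumptions for illustration; nothing here comes from a submitted solution.

```ruby
# Hypothetical helper: tracks consecutive failures per identifier
# (for example an IP address) and refuses further checks once a
# threshold is reached.
class FailureTracker
  def initialize(max_failures = 3)
    @max_failures = max_failures
    @failures = Hash.new(0)   # identifier => consecutive failure count
  end

  def locked_out?(identifier)
    @failures[identifier] >= @max_failures
  end

  # Wraps a correctness check. The block is evaluated only when the
  # identifier is not locked out; it should return true for a correct answer.
  def check(identifier)
    return false if locked_out?(identifier)
    if yield
      @failures.delete(identifier)   # a success resets the streak
      true
    else
      @failures[identifier] += 1
      false
    end
  end
end

tracker = FailureTracker.new(2)
tracker.check("10.0.0.1") { false }        # first failure
tracker.check("10.0.0.1") { false }        # second failure, now locked out
puts tracker.check("10.0.0.1") { true }    # prints false
```

A check_answer method could call something like tracker.check(ip) { answer_matches } so that a locked-out identifier never reaches the comparison.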

A Tip - Generating English Numerals
-----------------------------------

Presumably parsing large numbers from English to computerese adds another
stumbling block for any bot writer. ("5 + 2" is slightly easier than "five plus
two" and a fair amount easier than "three-thousand-'n'-five twenty three plus
five-oh-five".)

Ruby Quiz #25 has some nice code that you can appropriate for turning integers
into English: http://www.rubyquiz.com/quiz25.html
 
Dave Burt

http://rubyquiz.com/quiz25.html
The quiz says that if you read a bit farther down. ;)

:) I got down to Extra Credit #3 before thinking I'm not getting that much
extra credit...

Also I don't think my submitted solution to that quiz had any code relevant
to this exercise, although I have it sitting around in sub-release
condition.

Wish me luck finding time to put something together.

Cheers,
Dave
 
Gavin Kistner

My solution follows. I didn't do the extra credit for checking to see
if the same UserID/IP was spamming the system. I also didn't do the
extra credit for passing an argument for the difficulty of the
question. Instead, I created a framework where you categorize types
of captchas in an hierarchy, and you can ask for a specific type of
captcha by using the desired subclass.

For example, in my code below, I have:
class Captcha::Zoology < Captcha ... end
class Captcha::Math < Captcha
  class Basic < Math ... end
  class Algebra < Math ... end
end

This allows you to do:
Captcha.create_question # a question from any framework, while
Captcha::Zoology.create_question # only questions in this class
Captcha::Math.create_question # any question in Math or its subclasses
Captcha::Math::Basic.create_question # only Basic math questions

I'm not wild about the fact that I re-create the Marshal file after
every question creation or remove-retrieval, but it seemed the safest
way. I have no idea how this will work (or fail) in a multi-threaded
environment. I do like that I have the marshal file yank out
questions after a certain time limit, and (optionally) after the
answer has been checked. This keeps the marshal file quite tiny. The
persistence for AnswerStore could easily be abstracted out to use a
DB instead, if available.
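
The expiry behaviour described here can be sketched on its own, leaving out the Marshal file entirely. This is a simplified stand-in rather than the AnswerStore from the post: the class name ExpiringStore, the Entry struct, and the injectable "now" argument (used so the example does not have to sleep) are assumptions.

```ruby
# Hypothetical in-memory store that forgets answers after a time limit.
class ExpiringStore
  Entry = Struct.new(:answers, :time)

  def initialize(ttl_seconds = 600)
    @ttl = ttl_seconds
    @entries = {}
    @last_id = 0
  end

  # Saves the acceptable answers and returns a new ID for them.
  def store(answers, now = Time.now)
    purge(now)
    @last_id += 1
    @entries[@last_id] = Entry.new(Array(answers), now)
    @last_id
  end

  # Returns the stored answers and removes them, or [] if the ID is
  # unknown or the entry has expired.
  def remove(id, now = Time.now)
    purge(now)
    entry = @entries.delete(id)
    entry ? entry.answers : []
  end

  private

  # Drops every entry older than the time limit.
  def purge(now)
    @entries.delete_if { |_id, entry| now - entry.time > @ttl }
  end
end

store = ExpiringStore.new(600)
t0 = Time.now
id = store.store([7, "seven"], t0)
p store.remove(id, t0 + 60)    # => [7, "seven"]
id = store.store([2, "two"], t0)
p store.remove(id, t0 + 601)   # => []
```

Swapping the hash for a file or a database table would only change store, remove, and purge; the callers would not need to know.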

I'm not wild about some of the specific captcha questions I created;
some of them seem to be annoyingly hard at times or (rarely)
confusing. But this framework makes it pretty easy to modify the
question generation, and add your own.

I'm most proud of the String#variation method (except the name).
Using regexp-like notation, it performs a sort of reverse-regexp,
building a random string based on some criteria. (I'm not a golfer,
but I also like how terse it turned out.)
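
For readers who want to see the "reverse-regexp" idea in isolation before reading the full listing, here is a standalone sketch. It is not the String#variation method itself: the function name expand_template is hypothetical, the group pattern is slightly simplified, and it uses Array#sample from newer Rubies instead of a custom Array#random.

```ruby
# A "(a|b|c)" group picks one alternative at random; a trailing "?"
# makes the whole group optional. Innermost groups are expanded first.
CHOICE_GROUP = /\(([^()]+)\)(\?)?/

def expand_template(template, values = {})
  out = template.dup
  loop do
    changed = out.gsub!(CHOICE_GROUP) do
      options, optional = $1, $2
      optional && rand(2).zero? ? "" : options.split("|").sample
    end
    break unless changed   # gsub! returns nil once no groups remain
  end
  # Replace ":name" placeholders with the supplied values.
  values.each do |key, value|
    out.gsub!(/:#{Regexp.escape(key.to_s)}\b/) { value.to_s }
  end
  out.squeeze(" ").strip   # tidy up gaps left by dropped optional groups
end

puts expand_template("(How much|What) is (the value of)? :n1 plus :n2?",
                     :n1 => "five", :n2 => "two")
```

Each call prints one of four phrasings, such as "What is five plus two?" or "How much is the value of five plus two?".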

Without further explanation, the code:


class Captcha
# Invalidate an answer as soon as it has been checked for?
REMOVE_ON_CHECK = true

# Returns a hash with two values:
# _question_:: A string with the question that the user should answer
# _answer_id_:: A unique ID for this question that should be passed to
# #check_answer or #get_answers
def self.create_question
question, answers = factories.random.call
answer_id = AnswerStore.instance.store( answers )
return { :question => question, :answer_id => answer_id }
end

# _answer_id_:: The unique ID returned by #create_question
# _answer_:: The user's string or numeric answer to the question
def self.check_answer( info )
#TODO - implement userid persistence and checks
answer_id = info[ :answer_id ]
answer = info[ :answer ].to_s.downcase

store = AnswerStore.instance
valid_answers = if REMOVE_ON_CHECK
store.remove( answer_id )
else
store.retrieve( answer_id )
end
valid_answers = valid_answers.map{ |a| a.to_s.downcase }

valid_answers.include?( answer )
end

def self.get_answers( id )
warn "Hey, that's cheating!"
AnswerStore.instance.retrieve( id )
end

# Add the block to my store of question factories
def self.add_factory( &block )
( @factories ||= [] ) << block
end

# Keep track of the classes that inherit from me
def self.inherited( subklass )
( @subclasses ||= [] ) << subklass
end

# All the question factories in myself and subclasses
def self.factories
@factories ||= []
@subclasses ||= []
@factories + @subclasses.map{ |sub| sub.factories }.flatten
end

class AnswerStore
require 'singleton'
include Singleton

FILENAME = 'captcha_answers.marshal'
MINUTES_TO_STORE = 10

def initialize
if File.exist?( FILENAME )
@all_answers = File.open( FILENAME ){ |f| Marshal.load( f ) }
else
@all_answers = { :lastid=>0 }
end

# Purge any answers that are too old, both for security and
# to keep a small log size
@all_answers.delete_if { |id,answer|
next if id == :lastid
( Time.now - answer.time ) > MINUTES_TO_STORE * 60
}

warn "#{@all_answers.length} answers previously stored" if $DEBUG
end

# Serialize the answer(s), and return a unique ID for it
def store( *answers )
idx = @all_answers[ :lastid ] += 1
@all_answers[ idx ] = Answer.new( *answers )
serialize
idx
end

# Retrieve the correct answer(s)
def retrieve( answer_id )
answers = @all_answers[ answer_id ]
( answers && answers.possibilities ) || []
end

# Manually clear out a stored answer
#
# Returns the answer if it exists in the store, an empty array otherwise
def remove( answer_id )
answers = retrieve( answer_id )
@all_answers.delete( answer_id )
serialize
answers
end

private
# Shove the current store state to disk
def serialize
File.open( FILENAME, 'wb' ){ |f| f << Marshal.dump( @all_answers ) }
end

class Answer
attr_reader :possibilities, :time
def initialize( *possibilities )
@possibilities = possibilities.flatten
@time = Time.now
end
end
end
end

class String
def variation( values={} )
out = self.dup
while out.gsub!( /\(([^())?]+)\)(\?)?/ ){
( $2 && ( rand > 0.5 ) ) ? '' : $1.split( '|' ).random
}; end
out.gsub!( /:(#{values.keys.join('|')})\b/ ){ values[$1.intern] }
out.gsub!( /\s{2,}/, ' ' )
out
end
end

class Array
def random
self[ rand( self.length ) ]
end
end

class Integer
ONES = %w[ zero one two three four five six seven eight nine ]
TEENS = %w[ ten eleven twelve thirteen fourteen fifteen
sixteen seventeen eighteen nineteen ]
TENS = %w[ zero ten twenty thirty forty fifty
sixty seventy eighty ninety ]
MEGAS = %w[ none thousand million billion ]

# code by Glenn Parker;
# see http://www.ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/135449
def to_english
places = to_s.split(//).collect {|s| s.to_i}.reverse
name = []
((places.length + 2) / 3).times do |p|
strings = Integer.trio(places[p * 3, 3])
name.push(MEGAS[p]) if strings.length > 0 and p > 0
name += strings
end
name.push(ONES[0]) unless name.length > 0
name.reverse.join(" ")
end

def to_digits
self.to_s.split('').collect{ |digit| digit.to_i.to_english }.join('-')
end

def to_rand_english
rand < 0.5 ? to_english : to_digits
end

private

# code by Glenn Parker;
# see http://www.ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/135449
def Integer.trio(places)
strings = []
if places[1] == 1
strings.push(TEENS[places[0]])
elsif places[1] and places[1] > 0
strings.push(places[0] == 0 ? TENS[places[1]] :
"#{TENS[places[1]]}-#{ONES[places[0]]}")
elsif places[0] > 0
strings.push(ONES[places[0]])
end
if places[2] and places[2] > 0
strings.push("hundred", ONES[places[2]])
end
strings
end

end


# Specific captchas follow, showing off categorization
class Captcha::Zoology < Captcha
add_factory {
q = "How many (wings|exhaust pipes|titanium teeth|TVs|wooden knobs) "
q << "does a (standard|normal|regular) "
q << "(giraffe|cat|bear|dog|frog|cow|elephant) have?"
[ q.variation, '0', 'zero', 'none' ]
}
add_factory {
q = "How many (wings|legs|eyes) does a (standard|normal|regular) "
q << "(goose|bird|chicken|rooster|duck|swan) have?"
[ q.variation, 2, 'two' ]
}
end

class Captcha::Math < Captcha
class Basic < Math
add_factory {
q = "(How (much|many)|What) is (the (value|result) of)? "
q << ":num1 :op :num2?"
num1 = rand( 90 ) + 9
num2 = rand( 30 ) + 2

plus = 'plus:added to:more than'.split(':')
minus = 'minus:less:taking away'.split(':')
times = 'times:multiplied by:x'.split(':')
op = [plus,minus,times].flatten.random
case true
when plus.include?( op )
answer = num1 + num2
when minus.include?( op )
answer = num1 - num2
when times.include?( op )
answer = num1 * num2
end
num1 = num1.to_rand_english
num2 = num2.to_rand_english
[ q.variation( :num1 => num1, :op => op, :num2 => num2 ),
answer ]
}
add_factory {
num1 = rand( 990000 ) + 1000
num2 = rand( 990000 ) + 1000
answer = num1 + num2
num1 = num1.to_rand_english
num2 = num2.to_rand_english
[ "Add #{num1} (and|to) #{num2}.".variation, answer ]
}
end
class Algebra < Math
add_factory {
q = "Calculate :n1:x :op :n2:y, (for|if (I say )?) "
q << ":x( is (set to )?|=):xV(,| and) :y( is (set to )?|=):yV."
n1 = rand( 20 ) + 9
n2 = rand( 10 ) + 2
x = %w|a x z r q t|.random
y = %w|c i y s m|.random
xV = rand( 5 )
yV = rand( 6 )

plus = 'plus:added to:more than'.split(':')
minus = 'minus:less:taking away'.split(':')
times = 'times:multiplied by:x'.split(':')
op = [plus,minus,times].flatten.random
case true
when plus.include?( op )
answer = n1*xV + n2*yV
when minus.include?( op )
answer = n1*xV - n2*yV
when times.include?( op )
answer = n1*xV * n2*yV
end
xV = xV.to_rand_english
yV = yV.to_rand_english
vars = { :n1=>n1,:op=>op,:n2=>n2,:x=>x,:y=>y,:xV=>xV,:yV=>yV }
[ q.variation( vars ), answer ]
}
end
end

if __FILE__ == $0
if ARGV.empty?
q = Captcha::Math.create_question
puts "#{q[ :answer_id ]} : #{q[ :question ]}"
else
pieces = {}
nextarg = nil
ARGV.each{ |arg|
case arg
when /-i|--id/i then nextarg = :id
when /-a|--answer/i then nextarg = :answer
else pieces[ nextarg ] = arg
end
}

# IDs are stored as integers, so convert the command-line string
pieces = { :answer_id => pieces[:id].to_i, :answer => pieces[:answer] }
puts Captcha.check_answer( pieces )
end
end
 
zimba

Hello,

It's a bit off topic, but I wanted to share an idea I had today that
I've never seen anywhere.

From what I've read on the Internet and in various newsgroups, captcha
is considered the ultimate defense against bots, like the ones that
fill your blog or wiki with unrelated links to boost their sites'
ranking on Google.

The power of a captcha is that it presents information that is easy for
a human to understand and very hard for a computer. That way, you can
differentiate between them.

Now, to get to the point. IMHO, what was not considered is a fake site
that would proxy the target's captcha. It would provide enough content
that an inexperienced user would want to answer the captcha. That
way, it uses the human's brain :p to do its "work".

What do you think ?


Cheers,
... zimba
 
Edward Faulkner

Now, to get to the point. IMHO, what was not considered is a fake site
that would proxy the target's captcha. It would provide enough content
that an inexperienced user would want to answer the captcha. That
way, it uses the human's brain :p to do its "work".

It's been proposed before. See, for example, this:

http://www.boingboing.net/2004/01/27/solving_and_creating.html

regards,
Ed

 
James Edward Gray II

My solution is below. Here are my random thoughts about it:

1. My code requires Glenn Parker's Ruby Quiz #25 solution. I did
make one minor change to it, which was to wrap the quiz specific code
in a:

if __FILE__ == $0
# ...
end

That allowed me to use it as a library.
2. I did the first two extra credits, but couldn't think of good
systems for the last two.
3. I'm on a "Write less code" kick this week, so I really tried to
pack a lot of punch into minimal code. I'm happy with the results.

Here's the code:

#!/usr/local/bin/ruby -w

require "erb"

# Glenn Parker's code from Ruby Quiz 25...
require "english_numerals"
class Integer
alias_method :to_en, :to_english
end

class Array
def insert_at_nil( obj )
if i = index(nil)
self[i] = obj
i
else
self << obj
size - 1
end
end
end

module MathCaptcha
@@captchas = Array.new
@@answers = Array.new

def self.add_captcha( template, &validator )
@@captchas << Array[template, validator]
end

def self.create_question
raise "No captchas loaded." if @@captchas.empty?

captcha = @@captchas[rand(@@captchas.size)]

args = Array.new
class << args
def arg( value )
push(value)
value
end

def resolve( template )
ERB.new(template).result(binding)
end
end
question = args.resolve(captcha.first)
index = @@answers.insert_at_nil(Array[captcha.first, *args])

Hash[:question => question, :answer_id => index]
end

def self.check_answer( answer )
raise "Answer id required." unless answer.include? :answer_id

template, *args = @@answers[answer[:answer_id]]
raise "Answer not found." if template.nil?

validator = @@captchas.assoc(template).last
raise "Unable to match captcha." if validator.nil?

if validator[answer[:answer], *args]
@@answers[answer[:answer_id]] = nil
true
else
false
end
end

def self.load_answers( file )
@@answers = File.open(file) { |answers| Marshal.load(answers) }
end

def self.load_captchas( file )
code = File.read(file)
eval(code, binding)
end

def self.save_answers( file )
File.open(file, "w") { |answers| Marshal.dump(@@answers,
answers) }
end
end

if __FILE__ == $0
captchas = File.join(ENV["HOME"], ".math_captchas")
unless File.exist? captchas
File.open(captchas, "w") { |file| file << DATA.read }
end
MathCaptcha.load_captchas(captchas)

answers = File.join(ENV["HOME"], ".math_captcha_answers")
MathCaptcha.load_answers(answers) if File.exist? answers

if ARGV.empty?
question = MathCaptcha.create_question
puts "#{question[:answer_id]} : #{question[:question]}"
else
args = Hash.new
while ARGV.size >= 2 and ARGV.first =~ /^--\w+$/
key = ARGV.shift[2..-1].to_sym
value = ARGV.first =~ /^\d+$/ ? ARGV.shift.to_i :
ARGV.shift
args[key] = value
end

answer = MathCaptcha.check_answer(args)
puts answer
end

END { MathCaptcha.save_answers(answers) }
end

__END__
add_captcha(
"<%= arg(rand(10)).to_en.capitalize %> plus <%= arg(2).to_en %>?"
) do |answer, *opers|
if answer.is_a?(String) and answer =~ /^\d+$/
answer = answer.to_i.to_en
elsif answer.is_a?(Integer)
answer = answer.to_en
end
answer == opers.inject { |sum, var| sum + var }.to_en
end

__END__

James Edward Gray II
 
James Edward Gray II

My solution is below. Here are my random thoughts about it:

[snip 1 - 3]

4. I add captchas in plain Ruby code. A method is available in the
templates called arg(), to ensure an argument is passed on to your
validation block. Here's a sample captcha:
add_captcha(
"<%= arg(rand(10)).to_en.capitalize %> plus <%= arg(2).to_en %>?"
) do |answer, *opers|
if answer.is_a?(String) and answer =~ /^\d+$/
answer = answer.to_i.to_en
elsif answer.is_a?(Integer)
answer = answer.to_en
end
answer == opers.inject { |sum, var| sum + var }.to_en
end

James Edward Gray II
 
