[QUIZ] English Numerals (#25)

R

Ruby Quiz

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Timothy Byrd

While we normally write numbers using Arabic (or since Quiz #22, Roman)
numerals, numbers can also be written out as English phrases.

For example:

7 == seven (the hard way)
42 == forty-two (a very important number)
2001 == two thousand and one (a space odyssey)
1999 == (party like it's) nineteen hundred and ninety-nine

So the quiz is a problem from a Pi Mu Epsilon (US national math club)
newsletter:

"When the integers 1 to 10_000_000_000 are written in the English language, then
sorted as strings, which odd number appears first in the list?"

Your mission, should you choose to accept it, is to:

- Create Ruby code to translate a number to it's English language form. (My
sample version works with integers up to 10**72-1.)

- Determine programmatically which odd number in 1..10_000_000_000 would sort
first if written in English. (Brute force is the obvious solution, but the
computer may have to think about it...)

- Would the answer change for a larger range of values, say 10**30?

- Do French and German Rubyists get a different answer than the Americans?

Extra credit:

-Add a Pinyin translator: http://www.mandarintools.com/numbers.html
 
P

Patrick Hurley

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Timothy Byrd

While we normally write numbers using Arabic (or since Quiz #22, Roman)
numerals, numbers can also be written out as English phrases.

For example:

7 == seven (the hard way)
42 == forty-two (a very important number)
2001 == two thousand and one (a space odyssey)
1999 == (party like it's) nineteen hundred and ninety-nine

So the quiz is a problem from a Pi Mu Epsilon (US national math club)
newsletter:

"When the integers 1 to 10_000_000_000 are written in the English language, then
sorted as strings, which odd number appears first in the list?"

Your mission, should you choose to accept it, is to:

- Create Ruby code to translate a number to it's English language form. (My
sample version works with integers up to 10**72-1.)

- Determine programmatically which odd number in 1..10_000_000_000 would sort
first if written in English. (Brute force is the obvious solution, but the
computer may have to think about it...)

- Would the answer change for a larger range of values, say 10**30?

- Do French and German Rubyists get a different answer than the Americans?

Extra credit:

-Add a Pinyin translator: http://www.mandarintools.com/numbers.html

Ok sounds interesting, is the rule, less than 2000

1000 -> ten hundred
1800 -> eighteen hundred
2000 -> two thousand

Also what are the rules for the "and"?

1050050 -> one million fifty thousand and fifty
or
1050050 -> one million and fifty thousand and fifty

Also we hyphenate when 20 < x > 100 (and not a multple of 10)?
 
T

Timothy Byrd

Ok sounds interesting, is the rule, less than 2000

Close. It's at least 1100 and less than 2000, so:

1000 -> one thousand
1100 -> eleven hundred
1900 -> nineteen hundred
2000 -> two thousand
Also what are the rules for the "and"?

I was assuming the "and" only goes in the final three digits and only
if not a multiple of 100, so your:
1050050 -> one million fifty thousand and fifty

is what I'd pick.
Also we hyphenate when 20 < x > 100 (and not a multple of 10)?

Yes, between 20 and 100 (non-inclusive). This applies to each group of
three digits, so:

57057057 -> fifty-seven million fifty-seven thousand and fifty-seven

Hope that clears things up.

BTW, here's a fun-fact on the original post:

7 == seven (the hard way)

In the dice game craps, a "hardway" bet is that a number
will come up as a pair rather than some other combination.
For example a six would have to be a pair of 3's rather than
5 & 1 or 4 & 2. Sevewn the hard way would require rolling a
pair of 3.5's, which is generally considered quite difficult.
(See http://stuffo.howstuffworks.com/craps7.htm )

-- Timothy
 
H

Hal Fulton

Hans said:
I was always taught there is no and (except at the decimal place), e.g.
one million fifty thousand fifty and five tenths. I don't remember who
taught this but I remember learning it disctinctly.

I was taught this, too. They stressed it, in fact, as though it
were something that actually mattered.
That hyphenation thing gives me a headache. What's wrong with
one hundred twenty five?

I have never seen "twenty five" in English -- AFAIK it's always
hyphenated.


Hal
 
J

Joe Van Dyk

Ruby said:
The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler
discussion for this quiz until 48 hours have passed from
the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you
can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= -=-=-=

by Timothy Byrd

While we normally write numbers using Arabic (or since
Quiz #22, Roman) numerals, numbers can also be written
out as English phrases.

For example:

7 == seven (the hard way)
42 == forty-two (a very important number)
2001 == two thousand and one (a space odyssey)
1999 == (party like it's) nineteen hundred and ninety-nine

Why is one "two thousand and one" and one "nineteen hundred and
ninety-nine"?
 
T

Timothy Byrd

I was always taught there is no and (except at the decimal
place), e.g. one million fifty thousand fifty and five tenths.
I don't remember who taught this but I remember learning it
disctinctly.

Well, numbers aren't really exact things, after all. :)

I wanted to put a little twist into the quiz, and "two thousand and
one" simply sounds right. (Though I had to break 1776 to get it.)

Here's a page with more info:

http://www.absoluteastronomy.com/encyclopedia/N/Na/Names_of_numbers_in_English.htm

-- Timothy
 
H

Hal Fulton

Timothy said:
Well, numbers aren't really exact things, after all. :)

I wanted to put a little twist into the quiz, and "two thousand and
one" simply sounds right. (Though I had to break 1776 to get it.)

Here's a page with more info:

http://www.absoluteastronomy.com/encyclopedia/N/Na/Names_of_numbers_in_English.htm

The "and" is not my only problem with that page.

For example, they think "one hundred" is singular? Please.

By the way, I hear that one hundred are expected at the
next Ruby Conf... ;)


Hal
 
T

Timothy Byrd

For example, they think "one hundred" is singular? Please.
By the way, I hear that one hundred are expected at the
next Ruby Conf... ;)

Well, "one hundred" is a good number. ;)

(Going even more off topic.) That just reminded me of something. In
the US, the media refer to corporations in the singular - "Microsoft is
a ...". But I seem to recall reading British articles that use
"Microsoft are going to ..."

-- Timothy
 
H

Hal Fulton

Timothy said:
Well, "one hundred" is a good number. ;)

(Going even more off topic.) That just reminded me of something. In
the US, the media refer to corporations in the singular - "Microsoft is
a ...". But I seem to recall reading British articles that use
"Microsoft are going to ..."

Yes. Don't get me started.

Those are called "collective nouns." They appear singular but
act as a plural and should be treated plurally. Thus "Microsoft
are" is correct, and "Microsoft is" is a failure of the US
education system.

I can see the rationale that a corporation is a single entity.
But when it's time for a pronoun, even Americans go for the
plural -- even in the same sentence:

"Microsoft are angry, and they plan to sue." (UK/correct)
"Microsoft is angry, and it plans to sue." (at least consistent)
"Microsoft is angry, and they plan to sue." (American)


Hal
 
P

Patrick Hurley

Well, numbers aren't really exact things, after all. :)

I wanted to put a little twist into the quiz, and "two thousand and
one" simply sounds right. (Though I had to break 1776 to get it.)

Here's a page with more info:

http://www.absoluteastronomy.com/encyclopedia/N/Na/Names_of_numbers_in_English.htm

-- Timothy

Ok almost done asking for information before I hide from my kids and
do the quiz :)

One last quesiton should'nt we have comma's between number clauses?

i.e.

1_111_111_111

one billion, one hundred eleven million, one hundred eleven thousand
and one hundred eleven

Patrick
 
T

Timothy Byrd

(Posting from home now)
One last quesiton should'nt we have comma's between number clauses?
i.e.
1_111_111_111
one billion, one hundred eleven million, one hundred eleven thousand
and one hundred eleven

My answer is: "if you like, go for it". Does it change the result?
(The answer to smallest odd number...)

Btw, I would have the last bit of that as "one hundred eleven thousand,
one hundred and eleven". Putting the "and" inside the last clause adds
a twist to the problem so we can see how people deal with an
inconsistency like this. Your way is like the 'comma', 'comma' & 'and'
form of English which is consistent, but 2_100 would give "two thousand
and one hundred" which sounds odd to me. (On the other hand, many
numbers don't sound right with my algorithm, either. And it's a valid
point of view to want to try to eliminate inconsistency in the first
place.)

-- Timothy
 
G

Glenn Parker

Note: I'm afraid I could not bring myself to code up some random
ill-defined method of expressing numbers in English, so I did it the way
I was taught in school, using hypens and absolutely no "ands" or commas.
I think I've got Strunk & White on my side. Regardless, I'll be
somewhat surprised if my answer comes out very different from others.

Brute force was not a very practical option when looking for the answer,
so I assumed it would be OK to do a little bit of analysis and simplify
the problem. If my analysis was wrong, then I'll probably lose badly on
this one. :) There may also be a much more elegant way to express all
of this, but I thought I would just give you a short brain dump here.

Every number from 1 to 10**10 can be expressed in English as a
concatenation of one or more independent phrases. Each phrase expresses
the value of a triplet of numerals from the original number. For
example, 56_106_021 uses three phrases: "fifty-six million" for the
millions triplet (56), "one hundred six thousand" for the thousands
triplet (106), and "twenty-one" for the ones triplet (021). I don't
allow "twelve hundred" for 1200, but I think it would only complicate
the logic slightly to handle this.

I proceed to find the lowest possible phrase for each of the four
triplets in the range of interest. It must be possible to express the
lowest number name using a subset of the lowest possible phrases for
each triplet. And since the desired number must be odd, valid subsets
will all end with the ones triplet phrase.

The ones triplet range has only 500 phrases: the odd numbers from 1 to
999. The thousands triplet has 1000 phrases: the multiples of 1000 from
1000 to 999_999. Likewise, the millions triplet has 1000 phrases. The
billions triplet has only ten phrases because we stop caring after
10_000_000_000 (instead of going to 999_999_999_999). Finding the four
lowest phrases for each triplet requires examining only 2510 numbers.
There are only eight combinations of the resulting four phrases that
represent an odd number.

The result: "eight billion eight hundred eight million eight hundred
eight thousand eight hundred eighty-five".



#!/usr/bin/ruby

class Integer

Ones = %w[ zero one two three four five six seven eight nine ]
Teen = %w[ ten eleven twelve thirteen fourteen fifteen
sixteen seventeen eighteen nineteen ]
Tens = %w[ zero ten twenty thirty forty fifty
sixty seventy eighty ninety ]
Mega = %w[ none thousand million billion ]

def to_english
places = to_s.split(//).collect {|s| s.to_i}.reverse
name = []
((places.length + 2) / 3).times do |p|
strings = Integer.trio(places[p * 3, 3])
name.push(Mega[p]) if strings.length > 0 and p > 0
name += strings
end
name.push(Ones[0]) unless name.length > 0
name.reverse.join(" ")
end

private

def Integer.trio(places)
strings = []
if places[1] == 1
strings.push(Teen[places[0]])
elsif places[1] and places[1] > 0
strings.push(places[0] == 0 ? Tens[places[1]] :
"#{Tens[places[1]]}-#{Ones[places[0]]}")
elsif places[0] > 0
strings.push(Ones[places[0]])
end
if places[2] and places[2] > 0
strings.push("hundred", Ones[places[2]])
end
strings
end

end

# If there are command-line args, just print out English names.
if ARGV.length > 0
ARGV.each {|arg| puts arg.to_i.to_english}
exit 0
end

# Return the name of the number in the specified range that is the
# lowest lexically.
def minimum_english(start, stop, incr)
min_name = start.to_english
start.step(stop, incr) do |i|
name = i.to_english
min_name = name if min_name > name
end
min_name
end

# Find the lowest phrase for each 3-digit cluster of place-values.
# The lowest overall string must be composed of elements from this list.
components =
[ minimum_english(10**9, 10**10, 10**9),
minimum_english(10**6, 10**9 - 1, 10**6),
minimum_english(10**3, 10**6 - 1, 10**3),
minimum_english(10**0, 10**3 - 1, 2) ]

$result = components[-1]
def search_combinations(list, selected = [])
if elem = (list = list.dup).shift
if list.empty?
# Always include the final element from list in the selection.
string = selected.dup.push(elem).join(" ")
$result = string if $result > string
else
search_combinations(list, selected)
search_combinations(list, selected.dup.push(elem))
end
end
$result
end

puts search_combinations(components)
 
G

Glenn Parker

Glenn said:
I'll be
somewhat surprised if my answer comes out very different from others.

OK, not _too_ surprised. Since "and" sorts earlier than any number in
English, using it can throw off the results by quite a bit.
 
M

Matthew D Moss

I decided to add a Japanese translator rather than pinyinwhatever...
since I know a smidgeon of it. Of course, my translator is probably
wrong in some cases, but what the heck. I didn't attack the
first-sorted-odd problem, but here's my number to english/japanese
translators.



class JapaneseTranslator
# My knowledge of counting Japanese is limited, so this may not
# be entirely correct; in particular, I don't know what rules
# to follow after 'hyaku man' (1,000,000).
# I also combine a digit with its group, such as 'gohyaku' rather
# than 'go hyaku'; I just like reading it better that way.

DIGITS = %w(zero ichi ni san shi go roku shichi hachi kyu)
GROUPS = %w(nothingtoseeheremovealong ju hyaku sen)
MAN = 10000

def to_spoken(val)
case val <=> 0
when -1
'- ' + to_spoken(-val)
when 0
DIGITS[0]
else
group(val, 0)
end
end

private

def group(val, level)
if val >= MAN
group(val / MAN, 0) + 'man ' + group(val % MAN, 0)
else
case val
when 0
''
when 1
level == 0 ? DIGITS[val] : GROUPS[level]
when 2...10
DIGITS[val] + (GROUPS[level] if level > 0).to_s
else
group(val / 10, level+1) + ' ' + group(val % 10, level)
end
end
end
end


class USEnglishTranslator
# Formal, US English. Optional 'and'. Will not produce things
# such as 'twelve hundred' but rather 'one thousand two hundred'.
# The use of 'and' is incomplete; it is sometimes missed.

DIGITS = %w(zero one two three four five six seven eight nine)
TEENS = %w(ten eleven twelve thirteen fourteen fifteen sixteen
seventeen eighteen nineteen)
TENS = %w(hello world twenty thirty forty fifty sixty seventy
eighty ninety)
GROUPS = %w(thousand million billion trillion quadrillion
quintillion sextillion septillion octillion nonillion
decillion)
K = 1000

def initialize(conjunction = true)
@conjunction = conjunction
end

def to_spoken(val)
case val <=> 0
when -1
'negative ' + to_spoken(-val)
when 0
DIGITS[0]
else
group(val, 0).flatten.join(' ')
end
end

private

def group(val, level)
x = group(val / K, level + 1) << GROUPS[level] if val >= K
x.to_a << under_1000(val % K, level)
end

def under_1000(val, level)
x = [DIGITS[val / 100]] << 'hundred' if val >= 100
x.to_a << under_100(val % 100, (level == 0 and not x.nil?))
end

def under_100(val, junction)
x = [('and' if @conjunction and junction)] # wyf?
case val
when 0
[]
when 1...10
x << DIGITS[val]
when 10...20
x << TEENS[val - 10]
else
d = val % 10
x << (TENS[val / 10] + ('-' + DIGITS[d] if d != 0).to_s)
end
end
end


class Integer
def to_spoken(translator = USEnglishTranslator.new)
translator.to_spoken(self).squeeze(' ').strip
end
end


SAMPLES = [ 0, 1, 2, 5, 10, 11, 14, 18, 20, 21, 29, 33, 42, 50, 87, 99,
100, 101, 110, 167, 199, 200, 201, 276, 300, 314, 500, 610,
1000, 1039, 1347, 2309, 3098, 23501, 32767, 70000, 5480283,
2435489238, 234100090000, -42, -2001 ]

TRANSLATORS = { 'US English' => USEnglishTranslator.new,
'Japanese' => JapaneseTranslator.new }


# main
TRANSLATORS.each do |lang, translator|
puts
puts lang
puts '-' * lang.length
SAMPLES.each do |val|
puts "%12d => %s" % [val, val.to_spoken(translator)]
end
end
 
D

Dave Burt

- "When the integers 1 to 10_000_000_000 are written in the English
language, then
sorted as strings, which odd number appears first in the list?"

This is a really interesting question. There's a little more creativity in
the answer than I feel comfortable delegating to my computer at this point,
so my answer is incomplete. I haven't "determined [the answer]
programmatically" as required, I've used a computer to help me find the
answer. I figure a proper brute force would be at least a couple of days in
the solving anyway.

My answer is that "a baker's dozen" comes immediately before "a billion, a
hundred and eight thousand, a hundred and eighty five". I haven't yet
enhanced my program to give me this result :)

- Would the answer change for a larger range of values, say 10**30? I don't
believe it would, as "billion" < ['trillion', 'quadrillion', 'quintillion',
'sextillion', 'septillion', 'octillion', 'nonillion', 'decillion',
'undecillion', 'duodecillion', 'tredecillion', 'quattuordecillion',
'quindecillion', 'sexdecillion', 'septendecillion', 'octodecillion',
'novemdecillion', 'vigintillion'].min

- Do French and German Rubyists get a different answer than the Americans?
I think the answer is either yes or no depending on whether your answer is
the number or the words. I believe the words for the French/German
interpretation are the same, but the meaning of the billion is different
(1_000_000_000_000 rather than 1_000_000_000).

Here is my helper program, in case anyone considers it relevant :)

Cheers,
Dave

# load my translation of the Perl modules Number::Spell and
# Lingua::EN::Numericalize (see CPAN)
require 'numeric-english'

# add to_english method to integers
class Integer; include Numeric::English end

a = []

# return the english representation of the given integer if it is less than
# the one stored in memo (otherwise return memo).
inject_proc = proc do |memo, int|
if int[0] == 0 # test low bit
memo # ignore even numbers
else
english = int.to_english
if english < memo
english
else
memo
end
end
end

# I'm not sure why I chose these partitions of the problem space. I think
it's
# something to do with independent sections of numbers. We group in threes.

# check numbers one to a million
a << (0..1_000_000).inject('zzz', &inject_proc)
# check numbers from a million to a billion
a << (0..1_000).map{|x| x * 1_000_000 + 1 }.inject('zzz', &inject_proc)
# check numbers from a billion to ten billion
a << (0..10).map{|x| x * 1_000_000_000 + 1 }.inject('zzz', &inject_proc)

# this result is a list of words which have to be combined in an interesting
# way that I haven't formalized. Sorry.
p a

__END__

Output:
["eight hundred eight thousand eight hundred eighty five", "eight hundred
eight
million five", "eight billion five"]
 
P

Patrick Hurley

I wrote a quick program to implement the to_en (to_english) version as
well. It supports using and and commas -- the and has a significant
effect on the lowest collated odd number returned. As the brute force
solution was rather brutish (I let it run for two nights without
reaching the correct solution), I came up with an "optimized"
solution. Below is the output and below that the code - the output is
too wide -- sorry, but the code is still properly formatted :)

Patrick

-------------- Output
-------------------------------------------------------------------

Using both commas and 'and'
eight billion and eighty-five

Repeat without 'and' or comma
eight billion eight hundred eight million eight hundred eight thousand
eight hundred eighty-five

-------------- Code
----------------------------------------------------------------------

class Integer
DIVISORS = [[100, ''],
[10, 'hundred'],
[1000, 'thousand'],
[1000, 'million'],
[1000, 'billion'],
[1000, 'trillion'],
[1000, 'quadrillion'],
[1000, 'quintillion'],
[1000, 'sextillion'],
[1000, 'septillion'],
[1000, 'octillion'],
[1000, 'nonillion'],
[1000, 'decillion'],
[1000, 'undecillion'],
[1000, 'duodecillion'],
[1000, 'tredecillion'],
[1000, 'quattuordecillion'],
[1000, 'quindecillion'],
[1000, 'sexdecillion'],
[1000, 'septendecillion'],
[1000, 'octodecillion'],
[1000, 'novemdecillion'],
[1000, 'vigintillion'],
[1000, 'unvigintillion'],
[1000, 'duovigintillion'],
[1000, 'trevigintillion'],
[1000, 'quattuorvigintillion'],
[1000, 'quinvigintillion'],
[1000, 'sexvigintillion'],
[1000, 'septvigintillion'],
[1000, 'octovigintillion'],
[1000, 'novemvigintillion'],
[1000, 'trigintillion'],
[1000, 'untrigintillion'],
[1000, 'duotrigintillion'],
[1000, 'tretrigintillion'],
[1000, 'quattuortrigintillion'],
[1000, 'quintrigintillion'],
[1000, 'sextrigintillion'],
[1000, 'septtrigintillion'],
[1000, 'octotrigintillion'],
[1000, 'novemtrigintillion'],
[1000, 'quadragintillion'],
[1000, 'unquadragintillion'],
[1000, 'duoquadragintillion'],
[1000, 'trequadragintillion'],
[1000, 'quattuorquadragintillion'],
[1000, 'quinquadragintillion'],
[1000, 'sexquadragintillion'],
[1000, 'septquadragintillion'],
[1000, 'octoquadragintillion'],
[1000, 'novemquadragintillion'],
[1000, 'quinquagintillion'],
[1000, 'unquinquagintillion'],
[1000, 'duoquinquagintillion'],
[1000, 'trequinquagintillion'],
[1000, 'quattuorquinquagintillion'],
[1000, 'quinquinquagintillion'],
[1000, 'sexquinquagintillion'],
[1000, 'septquinquagintillion'],
[1000, 'octoquinquagintillion'],
[1000, 'novemquinquagintillion'],
[1000, 'sexagintillion'],
[1000, 'unsexagintillion'],
[1000, 'duosexagintillion'],
[1000, 'tresexagintillion'],
[1000, 'quattuorsexagintillion'],
[1000, 'quinsexagintillion'],
[1000, 'sexsexagintillion'],
[1000, 'septsexagintillion'],
[1000, 'octosexagintillion'],
[1000, 'novemsexagintillion'],
[1000, 'septuagintillion'],
[1000, 'unseptuagintillion'],
[1000, 'duoseptuagintillion'],
[1000, 'treseptuagintillion'],
[1000, 'quattuorseptuagintillion'],
[1000, 'quinseptuagintillion'],
[1000, 'sexseptuagintillion'],
[1000, 'septseptuagintillion'],
[1000, 'octoseptuagintillion'],
[1000, 'novemseptuagintillion'],
[1000, 'octogintillion'],
[1000, 'unoctogintillion'],
[1000, 'duooctogintillion'],
[1000, 'treoctogintillion'],
[1000, 'quattuoroctogintillion'],
[1000, 'quinoctogintillion'],
[1000, 'sexoctogintillion'],
[1000, 'septoctogintillion'],
[1000, 'octooctogintillion'],
[1000, 'novemoctogintillion'],
[1000, 'nonagintillion'],
[1000, 'unnonagintillion'],
[1000, 'duononagintillion'],
[1000, 'trenonagintillion'],
[1000, 'quattuornonagintillion'],
[1000, 'quinnonagintillion'],
[1000, 'sexnonagintillion'],
[1000, 'septnonagintillion'],
[1000, 'octononagintillion'],
[1000, 'novemnonagintillion'],
[1000, 'centillion'],
]

ENGLISH_MAP = {
1 => 'one',
2 => 'two',
3 => 'three',
4 => 'four',
5 => 'five',
6 => 'six',
7 => 'seven',
8 => 'eight',
9 => 'nine',
10 => 'ten',
11 => 'eleven',
12 => 'twelve',
13 => 'thirteen',
14 => 'fourteen',
15 => 'fifteen',
16 => 'sixteen',
17 => 'seventeen',
18 => 'eighteen',
19 => 'nineteen',
20 => 'twenty',
30 => 'thirty',
40 => 'forty',
50 => 'fifty',
60 => 'sixty',
70 => 'seventy',
80 => 'eighty',
90 => 'ninety',
}

ENGLISH_TEXT = {
:and => 'and',
:comma => ',',
:zero => 'zero',
:minus => 'minus'
}

def Integer.insert_and?(number, result, recurse)
number > 0 &&
result.empty? &&
!recurse && !ENGLISH_TEXT[:and].empty?
end

def Integer.insert_comma?(index, result)
(index >= 2) &&
(
!ENGLISH_TEXT[:and].empty? &&
!result.empty? &&
!result[/^#{Regexp.escape(ENGLISH_TEXT[:and])}/]
) ||
(
ENGLISH_TEXT[:and].empty? &&
result[/\s\S+$/]
)
end

def Integer.to_en(number, recurse=false)
result = ''
DIVISORS.each_with_index do |(divisor, name), index|
break unless number > 0
remainder = number % divisor
number /= divisor
next if (remainder == 0)

# check for special case 1100...2000
# if (1 == number && (1..9) === remainder && index == 1 && !recurse)

# check for special case 1100...1999 skipping [2-9]0..
if ((1..9) === number && (1..9) === remainder && index == 1 && !recurse)
remainder += number * 10
number = 0
end

part = ''
case remainder
when 1..19
part = ENGLISH_MAP[remainder]
when 20..99
units = remainder % 10
part = ENGLISH_MAP[remainder - units]
part += '-' + ENGLISH_MAP[units] if (units != 0)
else
# Recurse
part = Integer.to_en(remainder, true)
end

# the following section is complicated by supporting
# commas and ands
full_part = ''
if insert_and?(number, result, recurse)
full_part += ENGLISH_TEXT[:and] + ' '
end

full_part += part
full_part += ' ' + name unless name.empty?

if insert_comma?(index, result)
full_part += ENGLISH_TEXT[:comma]
end

full_part += ' ' + result unless result.empty?
result = full_part
end
result
end

def to_en
Integer.to_en(self)
end
end

# assumptions based upon US number to words algorithm
# - the first digit of the number in question must begin with
# the lowest number name alphabetically (this is a pretty good
# prune all by itself
#
# - the number in question must be a member of the lowest
# divisor name (alphabetically)
#

# we know the number must be eight somethings
#number = EnglishNumber::ENGLISH_MAP.values.sort
#puts number.first

# ok so we must be in the eight billions
#names = EnglishNumber::DIVISORS.map { |d,name| name }
#names = names.reject { |n| n.size == 0 }
#names = names.sort
#puts names.first

# now we re-apply the same rules as above, but
# not forgetting to "odd" rule (meaning we cannot
# use eight on the final digit
# now we know the last digit will be a five
#number = EnglishNumber::ENGLISH_MAP.to_a.reject { |k,n| k % 2 == 0 }
#number = number.map { |k,name| name }
#number.sort
#puts number.first

# So putting it all together we check all entries in
# our heavily pruned tree of choices
# numbers ending in 5 and having a combination of
# 0 and 8's

def try_some_numbers
lowest = 'zzzz'
number = 5
prefix = 0
while number < 10_000_000_000
number = (prefix.to_s(2).gsub(/1/, '8') + '5').to_i
name = number.to_en
if name < lowest
lowest = name
end
prefix += 1
end
puts lowest
end

puts "Using both commas and 'and'"
try_some_numbers

puts "\nRepeat without 'and' or comma"
Integer::ENGLISH_TEXT[:comma] = ''
Integer::ENGLISH_TEXT[:and] = ''
try_some_numbers
 
T

Timothy Byrd

The writeup is coming later. For now, here is my solution.

I implemented both European (more logical - BetaMax?) and American
(will dominate - VHS?) numbers. Though I'd put in 'and's originally,
I also ran with them disabled by removing the comment mark in the
first line of find_for_interval(), so I could compare answers. First
I did brute force, then tried for something more elegant.

I have to admit that when I read Glenn Parker's solution, I discovered
my 'elegant' version was elegantly incorrect. It inspired to to fix
things and then refactor out a bit of the duplicated code.

My coding style here is very old-school & c-like. We'll see what I
learn of the Ruby Way during the write-up.

Answers:

Without 'and's or commas, I get:

english.rb 10_000_000_000

E: 808808885 - eight hundred eight million eight hundred eight
thousand eight hundred eighty-five

A: 8808808885 - eight billion eight hundred eight million eight
hundred eight thousand eight hundred eighty-five


english.rb 10_000_000_000_000

E: 8808808808885 - eight billion eight hundred eight milliard eight
hundred eight million eight hundred eight thousand eight hundred
eighty-five

A: 8808808885 - eight billion eight hundred eight million eight
hundred eight thousand eight hundred eighty-five


If I include the 'and's (still no commas), I get:

english.rb 10_000_000_000

E: 885 - eight hundred and eighty-five

A: 8000000885 - eight billion eight hundred and eighty-five


english.rb 10_000_000_000_000

E: 8000000000885 - eight billion eight hundred and eighty-five

A: 8000000885 - eight billion eight hundred and eighty-five


Code:

#############################################################

=begin
While we normally write numbers using Arabic (or since Quiz #22,
Roman) numerals, numbers can also be written out as English phrases.

For example:

7 == seven (the hard way)
42 == forty-two (a very important number)
2001 == two thousand and one (a space odyssey)
1999 == (party like it's) nineteen hundred and ninety-nine


So the quiz is a problem from a Pi Mu Epsilon (US national math club)
newsletter:


"When the integers 1 to 10_000_000_000 are written in the English
language, then sorted as strings, which odd number appears first in
the list?"

- Create Ruby code to translate a number to it's English language form.

- Determine programatically which odd number in 1..10_000_000_000
would sort first if written in English. (Brute force is the obvious
solution, but the computer may have to think about it...)

- Would the answer change for a larger range of values, say 10**30?

- Do French and German Rubyists get a different answer than the
Americans?

=end

module EnglishNumerals

Numbers = {
1 => 'one',
2 => 'two',
3 => 'three',
4 => 'four',
5 => 'five',
6 => 'six',
7 => 'seven',
8 => 'eight',
9 => 'nine',
10 => 'ten',
11 => 'eleven',
12 => 'twelve',
13 => 'thirteen',
14 => 'fourteen',
15 => 'fifteen',
16 => 'sixteen',
17 => 'seventeen',
18 => 'eighteen',
19 => 'nineteen',
20 => 'twenty',
30 => 'thirty',
40 => 'forty',
50 => 'fifty',
60 => 'sixty',
70 => 'seventy',
80 => 'eighty',
90 => 'ninety'
}

AmExponents = {
3 => 'thousand',
6 => 'million',
9 => 'billion',
12 => 'trillion',
15 => 'quadrillion',
18 => 'quintillion',
21 => 'sexillion',
24 => 'septillion',
27 => 'octillion',
30 => 'nonillion',
33 => 'decillion',
36 => 'undecillion',
39 => 'duodecillion',
42 => 'tredecillion',
45 => 'quattuordecillion',
48 => 'quindecillion',
51 => 'sexdecillion',
54 => 'septendecillion',
57 => 'octodecillion',
60 => 'novemdecillion',
63 => 'vigintillion',
66 => 'unvigintillion',
69 => 'duovigintillion'
}

EurExponents = {
3 => 'thousand',
6 => 'million',
9 => 'milliard',
12 => 'billion',
15 => 'billiard',
18 => 'trillion',
21 => 'trilliard',
24 => 'quadrillion',
27 => 'quadrilliard',
30 => 'quintillion',
33 => 'quintilliard',
36 => 'sextillion',
39 => 'sextilliard',
42 => 'septillion',
45 => 'septilliard',
48 => 'octillion',
51 => 'octilliard',
54 => 'noventillion',
57 => 'noventilliard',
60 => 'decillion',
63 => 'decilliard',
66 => 'undecillion',
69 => 'undecilliard'
}

Max_exponent = 69

def self.to_English_base(val, include_and = false)

result = ''

sep = include_and ? ' and ' : ' ';

if val >= 100 then
v1 = val / 100
result << ' ' << Numbers[v1] << ' hundred'
val -= v1 * 100
end

if val >= 20 then
v1 = val / 10
result << sep << Numbers[v1 * 10]
val -= v1 * 10
sep = '-'
end

if val > 0 then
result << sep << Numbers[val]
end

result
end

def self.to_English(val, eu_names = true, include_and = true)
val = val.to_i.abs

return "zero" if val == 0

include_and = false if val <= 100

exp_hash = eu_names ? EurExponents : AmExponents

exp = Max_exponent

result = ''

while val > 0
n = 10 ** exp

if exp == 3 && val >= 1100 && val < 2000 then
v1 = val / 100
val -= v1 * 100
result << to_English_base(v1, false) << ' hundred'
elsif val >= n
v1 = val / n
val -= v1 * n
result << to_English_base(v1, exp == 0 && include_and)
if exp > 0 then
result << ' ' << exp_hash[exp]
end
end

exp -= 3
end

result.strip
end

def self.to_American(val, include_and = true)
to_English(val, false, include_and)
end

def self.lowest_brute(val, eu_names)
lowStr = 'zero'
lowV = 0

(1..val).each { |v|

next if (v & 1) == 0

str = to_English(v, eu_names)
if str < lowStr then
lowV = v
lowStr = str
end
}

[ lowV, lowStr ]
end

def self.find_for_interval(val, exp, eu_names)

include_and = (0 == exp) # && false

lowStr = 'zero'
lowV = 0

if (exp <= 0) then
suffix = ''
elsif eu_names then
suffix = ' ' + EurExponents[exp]
else
suffix = ' ' + AmExponents[exp]
end

(1..val).each { |v|

# Only skip odd numbers if exp == 0
#
next if (0 == exp) && (v & 1) == 0

str = to_English(v, eu_names, include_and) + suffix
if str < lowStr then
lowV = v
lowStr = str
end
}

[ lowV, lowStr ]
end

def self.lowest_elegant(val, eu_names)

# get the first (exp == 0) interval
# smallest with 'and' in 1..999
# (cumulative answer)
#
exp = 0
interval_size = val < 1000 ? val : 999;
lowV, lowStr = find_for_interval(interval_size, exp, eu_names)

while (val = val / 1000) > 0

exp += 3

# intermediate interval to tack in front of number so far
#
interval_size = val < 1000 ? val : 999;
vInter, strInter =
find_for_interval(interval_size, exp, eu_names)

str = strInter + ' ' + lowStr
if str < lowStr then
lowStr = str
lowV = lowV + (vInter * 10 ** exp)
end
end

[ lowV, lowStr ]
end
end


# If no argument, then work in interactive mode
#
if !ARGV[0] || ARGV[0].to_i == 0 then

$<.each_line { |line|
puts EnglishNumerals.to_English(line.chomp)
puts EnglishNumerals.to_American(line.chomp)
}

else
val = ARGV[0].to_i

# Positive argument, try something clever...
#
if val > 0
vE, strE = EnglishNumerals.lowest_elegant(val, true)
vA, strA = EnglishNumerals.lowest_elegant(val, false)

# Negative argument, do a brute force solution
#
else
vE, strE = EnglishNumerals.lowest_brute(val.abs, true)
vA, strA = EnglishNumerals.lowest_brute(val.abs, false)
end

puts "E: #{vE} - #{strE}"
puts "A: #{vA} - #{strA}"
end

#############################################################
 
N

Nikolai Weibull

* Ruby Quiz (Mar 26, 2005 02:10):
- Create Ruby code to translate a number to it's English language
form. (My sample version works with integers up to 10**72-1.)

My Lisp module proves yet again to be capable of answering a Ruby Quiz:

require 'lisp/format'
ARGV.each{ |arg| puts Lisp.format("~R", arg.to_i) }

Enjoy,
nikolai
 
T

Timothy Byrd

When writing numbers using Arabic numerals, we customarily group them
into phrases or clauses of three digits each. These clauses are
identical and are separated by scaling factors such as 'thousand' and
'billion'. All the solutions took advantage of this repetition, though
I'd given irregular examples such as 'twelve hundred' or using 'and'
to separate the final tens and ones from the rest of the number.

Most of the solutions (though not mine, infamously) extended the
integer class to have a method for translating the number to English.

Eliah Hecht submitted the first solution. He extended the Integer
with a to_en method. Unlike the others, he only had an array for
the numbers 1..9 and defined methods to build items like seventeen or
sixty from their component syllables. His general method was to
convert the number to a string and split off the initial clause,
generate the English form of that with a degree suffix and concatenate
the result of a recursive call on the remaining part of the number.

He put in the 'and's, and his reasoned out answer of "eight billion
and eighty-five" was - I'm chagrined to say - more correct than my
"eight billion eight hundred and eighty-five". (Including 'and'
really screws up a recursive or iterative approach - or at least my
approach - to building the number, since you don't include the 'and'
without something in front of it that you may want to replace later.)

Glenn Parker gave the next solution, noting: "I'm afraid I could not
bring myself to code up some random ill-defined method of expressing
numbers in English, so I did it the way I was taught in school, using
hypens and absolutely no "ands" or commas. I think I've got Strunk &
White on my side. Regardless, I'll be somewhat surprised if my answer
comes out very different from others."

A spirited defense of the purity of the English language. And his
thoughts on finessing around the brute force approach are worth
quoting:

'Every number from 1 to 10**10 can be expressed in English as a
concatenation of one or more independent phrases. Each phrase
expresses the value of a triplet of numerals from the original number.
For example, 56_106_021 uses three phrases: "fifty-six million" for
the millions triplet (56), "one hundred six thousand" for the
thousands triplet (106), and "twenty-one" for the ones triplet (021).
I don't allow "twelve hundred" for 1200, but I think it would only
complicate the logic slightly to handle this.

'I proceed to find the lowest possible phrase for each of the four
triplets in the range of interest. It must be possible to express the
lowest number name using a subset of the lowest possible phrases for
each triplet. And since the desired number must be odd, valid subsets
will all end with the ones triplet phrase.

'The ones triplet range has only 500 phrases: the odd numbers from 1
to 999. The thousands triplet has 1000 phrases: the multiples of 1000
from 1000 to 999_999. Likewise, the millions triplet has 1000
phrases. The billions triplet has only ten phrases because we stop
caring after 10_000_000_000 (instead of going to 999_999_999_999).
Finding the four lowest phrases for each triplet requires examining
only 2510 numbers. There are only eight combinations of the resulting
four phrases that represent an odd number.

'The result: "eight billion eight hundred eight million eight hundred
eight thousand eight hundred eighty-five".'

His code is concise. He effectively splits up the number into clauses
or phrases in a single line by splitting it into an array of
individual digits, and then processes them by grabbing three at a time
from the bottom.

Matthew D Moss did a Japanese translator in addition to the English
one. He implemented his Integer.to_spoken to accept a translator as
an optional argument - used as a function object.

Dave Burt used code from "[his] translation of the Perl modules
Number::Spell and Lingua::EN::Numericalize (see CPAN)". He also
comments:

'My answer is that "a baker's dozen" comes immediately before "a
billion, a hundred and eight thousand, a hundred and eighty five". I
haven't yet enhanced my program to give me this result :)'

Patrick Hurley applied some logic to his solution, pruning his search
space down to: "numbers ending in 5 and having a combination of 0 and
8's".

Nikolai Weibull commented "My Lisp module proves yet again to be
capable of answering a Ruby Quiz".

As for me, I discovered how horribly slow my solution was. I was able
to improve it by an order of magnitude - from just over five seconds
to just over half a second. Most of the gain was from changing one
routine:

def self.to_English(val, eu_names = true, include_and = true)
v = val.to_i.abs
return "zero" if v == 0
include_and = false if v <= 100
exp_hash = eu_names ? EurExponents : AmExponents

a = []
(a.push v % 1000; v /= 1000) while v > 0

r = ''

# allow for numbers like 'twelve hundred'
#
if a[1] == 1 and a[0] >= 100 then
a[1] = 0
a[0] += 1000
end

a.each_with_index {|obj, i|
next if obj == 0
r = "#{to_English_base(obj, include_and && \
i == 0)} #{exp_hash}".strip + " #{r}"
}

r.strip
end

I used to counting down through the entire exponent table for each
number testing if the number were larger than 10**exponent. I now
chop the number up into three digit chunks and iterate up through the
chunks. (Note that I've also changed all my hashes to arrays, but
they only helped by a few percentage points.) At first I thought this
code wouldn't support numbers like 'twelve hundred', but it turn out
to take just a few lines. It's a textbook example that one
well-placed optimization at the end can make a world of difference.

-- Timothy

"The problem with defending the purity of the English language is that
English is about as pure as a cribhouse whore. We don't just borrow
words; on occasion, English has pursued other languages down alleyways
to beat them unconscious and rifle their pockets for new vocabulary."
-James D. Nicoll
 
J

James Edward Gray II

[snip Timothy's nice summary]

My big thanks to Timothy for running the quiz this week, start to
finish!

James Edward Gray II
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,014
Latest member
BiancaFix3

Latest Threads

Top