# Re: word to digit module

Discussion in 'Python' started by Stephen Thorne, Dec 22, 2004.

1. ### Stephen Thorne (Guest)

On Wed, 22 Dec 2004 10:27:16 +0530, Gurpreet Sachdeva wrote:
> Is there any module available that converts word like 'one', 'two',
> 'three' to corresponding digits 1, 2, 3??

This seemed like an interesting problem! So I decided to solve it.

I started with
http://www.python.org/pycon/dc2004/papers/42/ex1-C/ which allowed me
to create a nice test suite.
import num2eng
for i in range(40000):
    e = num2eng.num2eng(i)
    if toNumber(e) != i:
        print(e, i, toNumber(e))
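
The num2eng module is not reproduced in this thread, so here is a self-contained sketch of the same round-trip idea. It assumes a hypothetical `to_words` helper as a stand-in for `num2eng.num2eng` (not that module's actual code), and a placeholder `to_number` converter so the harness has something to check. The names `UNITS`, `TENS`, `WORDS` and `round_trip_failures` are likewise illustrative.

```python
UNITS = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

def to_words(n):
    """Hypothetical stand-in for num2eng.num2eng, for 0 <= n < 1000000."""
    if n < 20:
        return UNITS[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens - 2] + (" " + UNITS[rest] if rest else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " and " + to_words(rest) if rest else ""
        return UNITS[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = " " + to_words(rest) if rest else ""
    return to_words(thousands) + " thousand" + tail

# Word -> value table for the placeholder converter.
WORDS = {"and": 0, "hundred": 100, "thousand": 1000}
WORDS.update((word, value) for value, word in enumerate(UNITS))
WORDS.update((word, 10 * (i + 2)) for i, word in enumerate(TENS))

def to_number(text):
    """Placeholder converter under test (any implementation could be used)."""
    stack = [0]
    for word in text.split():
        value = WORDS[word]
        if value >= 100:
            stack[-1] *= value      # scale the current group
            if value >= 1000:
                stack.append(0)     # start a new group after 'thousand'
        else:
            stack[-1] += value
    return sum(stack)

def round_trip_failures(convert, limit):
    """Return (n, words) pairs where words -> number does not give n back."""
    return [(n, to_words(n)) for n in range(limit)
            if convert(to_words(n)) != n]

print(round_trip_failures(to_number, 40000))
```

Passing the converter in as an argument means the same harness can be pointed at any candidate implementation.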

Once this all-important test suite was created, I was able to knock up
the following script. It is tested up to 'ninety nine thousand nine
hundred and ninety nine'. It won't do 'one hundred thousand', and isn't
exceptionally agile. If I were to go any higher than 'one hundred
thousand' I would probably pull out http://dparser.sf.net/ and write a
parser.

translation = {
'and':0,
'zero':0,
'one':1,
'two':2,
'three':3,
'four':4,
'five':5,
'six':6,
'seven':7,
'eight':8,
'nine':9,
'ten':10,
'eleven':11,
'twelve':12,
'thirteen':13,
'fourteen':14,
'fifteen':15,
'sixteen':16,
'seventeen':17,
'eighteen':18,
'nineteen':19,
'twenty':20,
'thirty':30,
'forty':40,
'fifty':50,
'sixty':60,
'seventy':70,
'eighty':80,
'ninety':90,
'hundred':100,
'thousand':1000,
}

def toNumber(s):
    items = s.replace(',', '').split()
    numbers = [translation.get(item.strip(), -1) for item in items
               if item.strip()]
    if -1 in numbers:
        raise ValueError("Invalid string '%s'" % (s,))

    if 1000 in numbers:
        idx = numbers.index(1000)
        hundreds = numbers[:idx]
        numbers = numbers[idx+1:] + [1000*x for x in hundreds]

    if 100 in numbers:
        idx = numbers.index(100)
        hundreds = numbers[:idx]
        numbers = numbers[idx+1:] + [100*x for x in hundreds]

    return sum(numbers)
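
To see why 'one hundred thousand' falls outside what this function handles, here is a self-contained sketch that rebuilds the same dictionary compactly (the `small` and `tens` helper lists are only a shorthand for this example) and runs the function on both sides of the limit. Every word before 'thousand' is multiplied by 1000 on its own, so 'one hundred' contributes 1*1000 + 100*1000 rather than 100*1000.

```python
# Same word -> value mapping as the post, built compactly for this sketch.
small = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
tens = "twenty thirty forty fifty sixty seventy eighty ninety".split()
translation = {"and": 0, "hundred": 100, "thousand": 1000}
translation.update((word, value) for value, word in enumerate(small))
translation.update((word, 10 * (i + 2)) for i, word in enumerate(tens))

def toNumber(s):
    items = s.replace(',', '').split()
    numbers = [translation.get(item, -1) for item in items]
    if -1 in numbers:
        raise ValueError("Invalid string '%s'" % (s,))

    if 1000 in numbers:
        idx = numbers.index(1000)
        hundreds = numbers[:idx]          # everything before 'thousand'
        numbers = numbers[idx+1:] + [1000*x for x in hundreds]

    if 100 in numbers:
        idx = numbers.index(100)
        hundreds = numbers[:idx]          # everything before 'hundred'
        numbers = numbers[idx+1:] + [100*x for x in hundreds]

    return sum(numbers)

print(toNumber("ninety nine thousand nine hundred and ninety nine"))  # 99999
print(toNumber("one hundred thousand"))  # 101000, not 100000
```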

Stephen Thorne

Stephen Thorne, Dec 22, 2004

2. ### John Machin (Guest)

Stephen Thorne wrote:
> On Wed, 22 Dec 2004 10:27:16 +0530, Gurpreet Sachdeva wrote:
> > Is there any module available that converts word like 'one', 'two',
> > 'three' to corresponding digits 1, 2, 3??
>
> This seemed like an interesting problem! So I decided to solve it.
>
> I started with
> http://www.python.org/pycon/dc2004/papers/42/ex1-C/ which allowed me
> to create a nice test suite.
> import num2eng
> for i in range(40000):
>     e = num2eng.num2eng(i)
>     if toNumber(e) != i:
>         print(e, i, toNumber(e))
>
> Once this all-important test suite was created, I was able to knock up
> the following script. It is tested up to 'ninety nine thousand nine
> hundred and ninety nine'. It won't do 'one hundred thousand', and isn't
> exceptionally agile. If I were to go any higher than 'one hundred
> thousand' I would probably pull out http://dparser.sf.net/ and write a
> parser.
>

Parser?

The following appears to work, with appropriate dict entries for
'million', 'billion', etc.:

def toNumber2(s):
    items = s.replace(',', '').split()
    numbers = [translation.get(item.strip(), -1) for item in items
               if item.strip()]
    stack = [0]
    for num in numbers:
        if num == -1:
            raise ValueError("Invalid string '%s'" % (s,))
        if num >= 100:
            # 'hundred' and larger scale the group built so far
            stack[-1] *= num
            if num >= 1000:
                # 'thousand' and larger also close the group
                stack.append(0)
        else:
            stack[-1] += num
    return sum(stack)
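
A worked trace may make the stack idea easier to follow. The sketch below is self-contained and assumes 'million' and 'billion' entries of 10**6 and 10**9 (short-scale values, chosen here for illustration). The `to_number_traced` name and the trace list are additions for this example only.

```python
small = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
tens = "twenty thirty forty fifty sixty seventy eighty ninety".split()
# Assumed scale entries; 'million' and 'billion' are not in the original dict.
WORDS = {"and": 0, "hundred": 100, "thousand": 1000,
         "million": 10**6, "billion": 10**9}
WORDS.update((word, value) for value, word in enumerate(small))
WORDS.update((word, 10 * (i + 2)) for i, word in enumerate(tens))

def to_number_traced(text):
    """Return (total, trace), where trace records the stack after each word."""
    stack = [0]
    trace = []
    for word in text.replace(",", "").split():
        value = WORDS[word]
        if value >= 100:
            stack[-1] *= value       # scale the current group
            if value >= 1000:
                stack.append(0)      # close the group, start a new one
        else:
            stack[-1] += value       # accumulate within the group
        trace.append((word, list(stack)))
    return sum(stack), trace

total, trace = to_number_traced("two million three hundred thousand and five")
for word, snapshot in trace:
    print("%-9s %s" % (word, snapshot))
print(total)  # 2300005
```

Note that this approach does not check word order: 'one two' simply sums to 3, and a bare 'hundred' gives 0.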

John Machin, Dec 22, 2004

3. ### Scott David Daniels (Guest)

John Machin wrote:
> Stephen Thorne wrote:
> def toNumber2(s):
>     items = s.replace(',', '').split()
>     numbers = [translation.get(item.strip(), -1) for item in items
>                if item.strip()]
>     stack = [0]
>     for num in numbers:
>         if num == -1:
>             raise ValueError("Invalid string '%s'" % (s,))
>         if num >= 100:
>             stack[-1] *= num
>             if num >= 1000:
>                 stack.append(0)
>         else:
>             stack[-1] += num
>     return sum(stack)
>

Can I play too?
Let's replace the top with a little bit of error handling:

def toNumber3(text):
    # A hyphen becomes a space so that 'twenty-three' splits into two words
    s = text.replace(',', '').replace('-', ' ')
    items = s.split()
    try:
        numbers = [translation[item] for item in items]
    except KeyError as e:
        raise ValueError("Invalid element %r in string %r" % (
            e.args[0], text))
    stack = [0]
    for num in numbers:
        if num >= 100:
            stack[-1] *= num
            if num >= 1000:
                stack.append(0)
        else:
            stack[-1] += num
    return sum(stack)
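
One detail worth checking in the hyphen handling: deleting the hyphen outright fuses 'twenty-three' into 'twentythree', which is not a dictionary key, whereas substituting a space yields two known words. The sketch below uses a two-entry dictionary and a hypothetical `lookup_all` helper purely to show the difference, along with the error message the KeyError handling produces.

```python
translation = {"twenty": 20, "three": 3}  # tiny subset, for illustration only

def lookup_all(text, hyphen_replacement):
    """Translate each word, replacing hyphens with the given string first."""
    s = text.replace(",", "").replace("-", hyphen_replacement)
    try:
        return [translation[item] for item in s.split()]
    except KeyError as e:
        raise ValueError("Invalid element %r in string %r" % (
            e.args[0], text))

print(lookup_all("twenty-three", " "))   # [20, 3]

try:
    lookup_all("twenty-three", "")       # hyphen deleted: words are fused
except ValueError as err:
    print(err)  # Invalid element 'twentythree' in string 'twenty-three'
```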

--Scott David Daniels

Scott David Daniels, Dec 22, 2004

4. ### M.E.Farmer (Guest)

Cool script, just one little thing:
toNumber('One thousand') bites the dust, because 'One' is not a key in
the translation dict.
Guess you should add another test, and s.lower().
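
As a small sketch of that normalization step, the hypothetical `normalize` helper below lower-cases the input before splitting it into words, so capitalised input matches the lower-case dictionary keys. Treating commas and hyphens as separators is an assumption of this example, not something the original function does.

```python
def normalize(text):
    """Lower-case the input and split it into candidate number words."""
    return text.lower().replace(",", " ").replace("-", " ").split()

keys = {"one": 1, "thousand": 1000}      # tiny subset, for illustration only

print("One" in keys)                      # False: lookups are case-sensitive
print(normalize("One thousand"))          # ['one', 'thousand']
print(all(word in keys for word in normalize("One Thousand")))  # True
```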

Stephen Thorne wrote:
{code snip}
> def toNumber(s):
+     s = s.lower()
>     items = s.replace(',', '').split()
>     numbers = [translation.get(item.strip(), -1) for item in items
>                if item.strip()]
>     if -1 in numbers:
>         raise ValueError("Invalid string '%s'" % (s,))
>
>     if 1000 in numbers:
>         idx = numbers.index(1000)
>         hundreds = numbers[:idx]
>         numbers = numbers[idx+1:] + [1000*x for x in hundreds]
>
>     if 100 in numbers:
>         idx = numbers.index(100)
>         hundreds = numbers[:idx]
>         numbers = numbers[idx+1:] + [100*x for x in hundreds]
>
>     return sum(numbers)
>
> Stephen Thorne

M.E.Farmer

M.E.Farmer, Dec 22, 2004

5. ### Stephen Thorne (Guest)

On Wed, 22 Dec 2004 11:41:26 -0800, Scott David Daniels wrote:
> John Machin wrote:
> > Stephen Thorne wrote:
> > def toNumber2(s):
> >     items = s.replace(',', '').split()
> >     numbers = [translation.get(item.strip(), -1) for item in items
> >                if item.strip()]
> >     stack = [0]
> >     for num in numbers:
> >         if num == -1:
> >             raise ValueError("Invalid string '%s'" % (s,))
> >         if num >= 100:
> >             stack[-1] *= num
> >             if num >= 1000:
> >                 stack.append(0)
> >         else:
> >             stack[-1] += num
> >     return sum(stack)
> >
>
> Can I play too?
> Let's replace the top with a little bit of error handling:
>
> def toNumber3(text):
>     # A hyphen becomes a space so that 'twenty-three' splits into two words
>     s = text.replace(',', '').replace('-', ' ')
>     items = s.split()
>     try:
>         numbers = [translation[item] for item in items]
>     except KeyError as e:
>         raise ValueError("Invalid element %r in string %r" % (
>             e.args[0], text))
>     stack = [0]
>     for num in numbers:
>         if num >= 100:
>             stack[-1] *= num
>             if num >= 1000:
>                 stack.append(0)
>         else:
>             stack[-1] += num
>     return sum(stack)

Thank you for your feedback, both of you.