Re: word to digit module

Discussion in 'Python' started by Stephen Thorne, Dec 22, 2004.

  1. On Wed, 22 Dec 2004 10:27:16 +0530, Gurpreet Sachdeva
    <> wrote:
    > Is there any module available that converts words like 'one', 'two',
    > 'three' to corresponding digits 1, 2, 3??


    This seemed like an interesting problem! So I decided to solve it.

    I started with
    http://www.python.org/pycon/dc2004/papers/42/ex1-C/ which allowed me
    to create a nice test suite.

    import num2eng
    for i in range(40000):
        e = num2eng.num2eng(i)
        if toNumber(e) != i:
            print(e, i, toNumber(e))
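
A round-trip check like this can be tried without the num2eng module. The sketch below is an illustration only: `to_words` is a hypothetical stand-in for `num2eng.num2eng` that covers 0 to 999, and `to_number` is a deliberately tiny parser for that range, not the script from this thread.

```python
# Hypothetical stand-in tables; none of these names come from num2eng.
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def to_words(n):
    """Spell out 0 <= n < 1000 in the 'nine hundred and ninety nine' style."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens - 2] + (" " + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head + (" and " + to_words(rest) if rest else "")


# Word values, built from the same tables so both directions stay in sync.
VALUES = {word: i for i, word in enumerate(ONES)}
VALUES.update({word: 10 * (i + 2) for i, word in enumerate(TENS)})
VALUES.update({"and": 0, "hundred": 100})


def to_number(text):
    """Minimal parser for the same range: 'hundred' multiplies, the rest add."""
    total = 0
    for word in text.split():
        value = VALUES[word]
        total = total * 100 if value == 100 else total + value
    return total


# Collect every value that does not survive words -> number conversion.
failures = [(n, to_words(n)) for n in range(1000)
            if to_number(to_words(n)) != n]
print(failures)  # an empty list means every value survived the round trip
```

Generating the test strings from a number-to-words function means the expected answer is always known, so any mismatch points directly at the parser.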

    Once this all-important test suite was created, I was able to knock up
    the following script. It is tested up to 'ninety nine thousand nine
    hundred and ninety nine'. It won't do 'one hundred thousand', and isn't
    exceptionally agile. If I were to go any higher than 'one hundred
    thousand', I would probably pull out http://dparser.sf.net/ and write a
    parser.

    translation = {
        'and': 0,
        'zero': 0,
        'one': 1,
        'two': 2,
        'three': 3,
        'four': 4,
        'five': 5,
        'six': 6,
        'seven': 7,
        'eight': 8,
        'nine': 9,
        'ten': 10,
        'eleven': 11,
        'twelve': 12,
        'thirteen': 13,
        'fourteen': 14,
        'fifteen': 15,
        'sixteen': 16,
        'seventeen': 17,
        'eighteen': 18,
        'nineteen': 19,
        'twenty': 20,
        'thirty': 30,
        'forty': 40,
        'fifty': 50,
        'sixty': 60,
        'seventy': 70,
        'eighty': 80,
        'ninety': 90,
        'hundred': 100,
        'thousand': 1000,
    }

    def toNumber(s):
        items = s.replace(',', '').split()
        numbers = [translation.get(item.strip(), -1) for item in items
                   if item.strip()]
        if -1 in numbers:
            raise ValueError("Invalid string '%s'" % (s,))

        # Scale everything before 'thousand' by 1000.
        if 1000 in numbers:
            idx = numbers.index(1000)
            hundreds = numbers[:idx]
            numbers = numbers[idx+1:] + [1000*x for x in hundreds]

        # Scale everything before the first remaining 'hundred' by 100.
        if 100 in numbers:
            idx = numbers.index(100)
            hundreds = numbers[:idx]
            numbers = numbers[idx+1:] + [100*x for x in hundreds]

        return sum(numbers)
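
The 'one hundred thousand' limitation can be reproduced in isolation. The following is a sketch under stated assumptions: `to_number` restates the index-splitting logic of `toNumber` in a loop, and the word table is cut down to the few entries the demonstration needs.

```python
# Only the words needed here; the full table is the one given in the post.
translation = {'and': 0, 'one': 1, 'nine': 9, 'ninety': 90,
               'hundred': 100, 'thousand': 1000}


def to_number(s):
    """Same index-splitting logic as toNumber, restated for a standalone run."""
    numbers = [translation[word] for word in s.replace(',', '').split()]
    for unit in (1000, 100):
        if unit in numbers:
            idx = numbers.index(unit)
            # Each item before the unit word is scaled by it separately.
            numbers = numbers[idx + 1:] + [unit * x for x in numbers[:idx]]
    return sum(numbers)


print(to_number('ninety nine thousand nine hundred and ninety nine'))  # 99999
print(to_number('one hundred thousand'))  # 101000, not 100000
```

In the second call, 'one' and 'hundred' are each multiplied by 1000 and then added (1000 + 100000), where the intended reading multiplies them together first.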

    Stephen Thorne
     
    Stephen Thorne, Dec 22, 2004

  2. John Machin

    Stephen Thorne wrote:
    > On Wed, 22 Dec 2004 10:27:16 +0530, Gurpreet Sachdeva
    > <> wrote:
    > > Is there any module available that converts words like 'one', 'two',
    > > 'three' to corresponding digits 1, 2, 3??
    >
    > This seemed like an interesting problem! So I decided to solve it.
    >
    > I started with
    > http://www.python.org/pycon/dc2004/papers/42/ex1-C/ which allowed me
    > to create a nice test suite.
    >
    > import num2eng
    > for i in range(40000):
    >     e = num2eng.num2eng(i)
    >     if toNumber(e) != i:
    >         print(e, i, toNumber(e))
    >
    > Once this all-important test suite was created, I was able to knock up
    > the following script. It is tested up to 'ninety nine thousand nine
    > hundred and ninety nine'. It won't do 'one hundred thousand', and isn't
    > exceptionally agile. If I were to go any higher than 'one hundred
    > thousand', I would probably pull out http://dparser.sf.net/ and write a
    > parser.


    Parser?

    The following appears to work, with appropriate dict entries for
    'million', 'billion', etc:

    def toNumber2(s):
        items = s.replace(',', '').split()
        numbers = [translation.get(item.strip(), -1) for item in items
                   if item.strip()]
        stack = [0]
        for num in numbers:
            if num == -1:
                raise ValueError("Invalid string '%s'" % (s,))
            if num >= 100:
                # 'hundred', 'thousand', ... scale the group being built
                stack[-1] *= num
                if num >= 1000:
                    # 'thousand' and larger close the group; start a new one
                    stack.append(0)
            else:
                stack[-1] += num
        return sum(stack)
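
To see how the stack evolves, the loop can be instrumented. This is a sketch, not part of the posted code: `trace` is a hypothetical helper that follows the same rules as `toNumber2`, and the word table is a reduced one with an assumed 'million' entry of the kind mentioned above.

```python
# Reduced word table for the demo; 'million' is an assumed extra entry.
translation = {'and': 0, 'one': 1, 'two': 2, 'three': 3, 'five': 5,
               'forty': 40, 'hundred': 100, 'thousand': 1000,
               'million': 1000000}


def trace(s):
    """Return (word, stack snapshot) pairs, following toNumber2's rules."""
    stack = [0]
    steps = []
    for word in s.split():
        num = translation[word]
        if num >= 100:
            stack[-1] *= num       # scale the group being built
            if num >= 1000:
                stack.append(0)    # 'thousand'/'million' closes the group
        else:
            stack[-1] += num
        steps.append((word, list(stack)))
    return steps


for word, stack in trace('one hundred thousand'):
    print(word, stack)
# one [1]
# hundred [100]
# thousand [100000, 0]
```

Because 'hundred' multiplies the running group before 'thousand' does, the group becomes 100 * 1000 rather than two separately scaled items, which is why this version handles 'one hundred thousand'.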
     
    John Machin, Dec 22, 2004

  3. John Machin wrote:
    > Stephen Thorne wrote:
    > def toNumber2(s):
    >     items = s.replace(',', '').split()
    >     numbers = [translation.get(item.strip(), -1) for item in items
    >                if item.strip()]
    >     stack = [0]
    >     for num in numbers:
    >         if num == -1:
    >             raise ValueError("Invalid string '%s'" % (s,))
    >         if num >= 100:
    >             stack[-1] *= num
    >             if num >= 1000:
    >                 stack.append(0)
    >         else:
    >             stack[-1] += num
    >     return sum(stack)
    >


    Can I play too?
    Let's replace the top with a little bit of error handling:

    def toNumber3(text):
        # Hyphens become spaces so that 'twenty-three' splits into two words.
        s = text.replace(',', '').replace('-', ' ')  # for twenty-three
        items = s.split()
        try:
            numbers = [translation[item] for item in items]
        except KeyError as e:
            raise ValueError("Invalid element %r in string %r" % (
                e.args[0], text))
        stack = [0]
        for num in numbers:
            if num >= 100:
                stack[-1] *= num
                if num >= 1000:
                    stack.append(0)
            else:
                stack[-1] += num
        return sum(stack)
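
The error-handling part can be exercised on its own. In this sketch, `parse_words` is a hypothetical helper holding only the lookup step, with a cut-down word table. It turns hyphens into spaces, which is an assumption about the intent of the "for twenty-three" comment, since 'twentythree' would not be found in the table.

```python
# Cut-down word table, just enough for the two calls below.
translation = {'and': 0, 'one': 1, 'three': 3, 'twenty': 20, 'hundred': 100}


def parse_words(text):
    """Map each word to its value, or raise ValueError naming the bad word."""
    items = text.replace(',', '').replace('-', ' ').split()
    try:
        return [translation[item] for item in items]
    except KeyError as e:
        # e.args[0] is the missing key, i.e. the word that was not recognised.
        raise ValueError("Invalid element %r in string %r" % (e.args[0], text))


print(parse_words('twenty-three'))  # [20, 3]

try:
    parse_words('one hundred and tree')
except ValueError as err:
    print(err)  # Invalid element 'tree' in string 'one hundred and tree'
```

Catching the KeyError and re-raising a ValueError lets the message name the offending word, which the `-1` sentinel approach cannot do.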

    --Scott David Daniels
     
    Scott David Daniels, Dec 22, 2004
  4. M.E.Farmer

    Cool script, just one little thing:
    toNumber('One thousand') bites the dust.
    Guess you should add another test, and s.lower() ;)

    Stephen Thorne wrote:
    {code snip}
    > def toNumber(s):
    +     s = s.lower()
    >     items = s.replace(',', '').split()
    >     numbers = [translation.get(item.strip(), -1) for item in items
    >                if item.strip()]
    >     if -1 in numbers:
    >         raise ValueError("Invalid string '%s'" % (s,))
    >
    >     if 1000 in numbers:
    >         idx = numbers.index(1000)
    >         hundreds = numbers[:idx]
    >         numbers = numbers[idx+1:] + [1000*x for x in hundreds]
    >
    >     if 100 in numbers:
    >         idx = numbers.index(100)
    >         hundreds = numbers[:idx]
    >         numbers = numbers[idx+1:] + [100*x for x in hundreds]
    >
    >     return sum(numbers)
    >
    > Stephen Thorne


    M.E.Farmer
     
    M.E.Farmer, Dec 22, 2004
  5. On Wed, 22 Dec 2004 11:41:26 -0800, Scott David Daniels
    <> wrote:
    > John Machin wrote:
    > > Stephen Thorne wrote:
    > > def toNumber2(s):
    > >     items = s.replace(',', '').split()
    > >     numbers = [translation.get(item.strip(), -1) for item in items
    > >                if item.strip()]
    > >     stack = [0]
    > >     for num in numbers:
    > >         if num == -1:
    > >             raise ValueError("Invalid string '%s'" % (s,))
    > >         if num >= 100:
    > >             stack[-1] *= num
    > >             if num >= 1000:
    > >                 stack.append(0)
    > >         else:
    > >             stack[-1] += num
    > >     return sum(stack)
    >
    > Can I play too?
    > Let's replace the top with a little bit of error handling:
    >
    > def toNumber3(text):
    >     s = text.replace(',', '').replace('-', ' ')  # for twenty-three
    >     items = s.split()
    >     try:
    >         numbers = [translation[item] for item in items]
    >     except KeyError as e:
    >         raise ValueError("Invalid element %r in string %r" % (
    >             e.args[0], text))
    >     stack = [0]
    >     for num in numbers:
    >         if num >= 100:
    >             stack[-1] *= num
    >             if num >= 1000:
    >                 stack.append(0)
    >         else:
    >             stack[-1] += num
    >     return sum(stack)


    Thank you for your feedback, both of you.
    http://thorne.id.au/users/stephen/scripts/eng2num.py contains your suggestions.

    Stephen.
     
    Stephen Thorne, Dec 22, 2004
  6. John Machin

    John Machin, Dec 22, 2004
