Simple arithmetic parser example

Discussion in 'C Programming' started by Francis Moreau, Dec 8, 2009.

  1. Hello,

    I would be interested see some piece of C code that parses arithmetic
    expressions.

    For example, the parser could parse : "(2+3) / 10 - 4" (respecting the
    precedence of each operators) and outputs the result.

    I know I can simply use yacc/bison to achieve this goal, but I'm not
    really interested in it.

    Could anybody point out some code ?

    Thanks
    Francis Moreau, Dec 8, 2009
    #1
    1. Advertising

  2. Francis Moreau

    Eric Sosman Guest

    Francis Moreau wrote:
    > Hello,
    >
    > I would be interested see some piece of C code that parses arithmetic
    > expressions.
    >
    > For example, the parser could parse : "(2+3) / 10 - 4" (respecting the
    > precedence of each operators) and outputs the result.
    >
    > I know I can simply use yacc/bison to achieve this goal, but I'm not
    > really interested in it.
    >
    > Could anybody point out some code ?


    You can find some algorithm explanations and code samples at:

    http://www.google.com/search?q=arithmetic expression parser

    --
    Eric Sosman
    lid
    Eric Sosman, Dec 8, 2009
    #2
    1. Advertising

  3. On 8 Dec, 12:31, Francis Moreau <> wrote:

    > I would be interested see some piece of C code that parses arithmetic
    > expressions.
    >
    > For example, the parser could parse : "(2+3) / 10 - 4" (respecting the
    > precedence of each operators) and outputs the result.
    >
    > I know I can simply use yacc/bison to achieve this goal, but I'm not
    > really interested in it.
    >
    > Could anybody point out some code ?


    there was a long rambling thread recently on the conversion of infix
    to postfix some of which used full-blown parsers to solve the problem.
    Try "How to convert infix to postfix"; it *is* excessivly long but
    does contain some code.
    Nick Keighley, Dec 8, 2009
    #3
  4. Francis Moreau <> writes:

    > I would be interested see some piece of C code that parses arithmetic
    > expressions.
    >
    > For example, the parser could parse : "(2+3) / 10 - 4" (respecting the
    > precedence of each operators) and outputs the result.
    >
    > I know I can simply use yacc/bison to achieve this goal, but I'm not
    > really interested in it.
    >
    > Could anybody point out some code ?


    This came up quite recently in a huge thread called "How to convert
    Infix notation to postfix notation". I posted an example of parsing
    using operator precedence tables as did, if I recall, a poster called
    Gene. If you can't find it, I'll post again.

    [The old code get corrected by posting here. The string type should
    have been called String.]

    --
    Ben.
    Ben Bacarisse, Dec 8, 2009
    #4
  5. Ben Bacarisse <> writes:

    > This came up quite recently in a huge thread called "How to convert
    > Infix notation to postfix notation". I posted an example of parsing
    > using operator precedence tables as did, if I recall, a poster called
    > Gene. If you can't find it, I'll post again.
    >
    > [The old code get corrected by posting here. The string type should
    > have been called String.]


    Message ID:


    --
    Ben.
    Ben Bacarisse, Dec 8, 2009
    #5
  6. In article <>,
    Francis Moreau <> wrote:
    >Hello,
    >
    >I would be interested see some piece of C code that parses arithmetic
    >expressions.


    First, read up on "recursive descent parsers". They're really easy
    and fun to write, and can be used for a million things.

    Here's a quick & dirty one I wrote some years ago. It handles the
    four basic functions (plus mod). No variables, no functions. It
    should do what you want, or at least get you started.

    Since all the operators were a single character, it was trivial to
    break the input into tokens with a simple regex.

    It doesn't do error handling (e.g. divide by zero) and it stops
    parsing on syntax error rather than reporting it. Adding these
    features is left as an exercise for the reader.


    #!/usr/bin/python

    '''Evaluate expressions.'''
    # Written by Ed Falk

    import re

    parse_re = re.compile('([-+*/%()])')

    def expr(str):
    '''Evaluate simple expression using the operators ()+-*/%'''
    tokens = [x.strip() for x in parse_re.split(str) if x]
    return __sum(tokens)

    def __sum(tokens):
    rval = __prod(tokens)
    while tokens:
    if tokens[0] == '+':
    tokens.pop(0)
    rval += __prod(tokens)
    elif tokens[0] == '-':
    tokens.pop(0)
    rval -= __prod(tokens)
    else: break;
    return rval

    def __prod(tokens):
    rval = __val(tokens)
    while tokens:
    if tokens[0] == '*':
    tokens.pop(0)
    rval *= __val(tokens)
    elif tokens[0] == '/':
    tokens.pop(0)
    rval /= __val(tokens)
    elif tokens[0] == '%':
    tokens.pop(0)
    rval %= __val(tokens)
    else: break;
    return rval

    def __val(tokens):
    if tokens[0] == '(':
    tokens.pop(0)
    rval = __sum(tokens)
    if tokens[0] != ')': raise ArithmeticError
    tokens.pop(0)
    return rval
    elif tokens[0] == '-':
    tokens.pop(0)
    return -__val(tokens)
    else:
    return float(tokens.pop(0))

    if __name__ == "__main__":
    print expr("1")
    print expr("2+2")
    print expr("1+7/8")
    print expr(".875*64")
    print expr("1 + 2 * 3/1.5")
    print expr("(1+2)*(3+4)")
    print expr("-(1+2)*(3+4)")
    print expr("(1+2)*-(3+4)")
    print expr("22/7")
    print expr("355/113")
    --
    -Ed Falk,
    http://thespamdiaries.blogspot.com/
    Edward A. Falk, Dec 8, 2009
    #6
  7. Francis Moreau

    Phil Carmody Guest

    (Edward A. Falk) writes:
    > In article <>,
    > Francis Moreau <> wrote:
    >>Hello,
    >>
    >>I would be interested see some piece of C code that parses arithmetic
    >>expressions.

    >
    > First, read up on "recursive descent parsers". They're really easy
    > and fun to write, and can be used for a million things.
    >
    > Here's a quick & dirty one I wrote some years ago. It handles the
    > four basic functions (plus mod). No variables, no functions. It
    > should do what you want, or at least get you started.
    >
    > Since all the operators were a single character, it was trivial to
    > break the input into tokens with a simple regex.


    Regular expressions aren't C. You'd need a 3rd-party library to do
    that. (Or roll your own.)

    > It doesn't do error handling (e.g. divide by zero) and it stops
    > parsing on syntax error rather than reporting it. Adding these
    > features is left as an exercise for the reader.
    >
    > #!/usr/bin/python


    Well, I think we agreed that was legal C

    > '''Evaluate expressions.'''


    But that isn't.

    > rval = __prod(tokens)
    > if tokens[0] == '+':


    I don't have many positive things to say about the white-space-
    incontinent python, but am glad to see that other languages have
    made the same 'mistake' as C when it comes to the length of the
    assignment and equality-test operator tokens. I have never quite
    understood the vehemence of those who take the contrary PoV.

    Phil
    --
    Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1
    Phil Carmody, Dec 8, 2009
    #7
  8. On Dec 8, 3:00 pm, Ben Bacarisse <> wrote:
    > Ben Bacarisse <> writes:
    > > This came up quite recently in a huge thread called "How to convert
    > > Infix notation to postfix notation".  I posted an example of parsing
    > > using operator precedence tables as did, if I recall, a poster called
    > > Gene.  If you can't find it, I'll post again.

    >
    > > [The old code get corrected by posting here.  The string type should
    > > have been called String.]

    >
    > Message ID:
    >
    >


    Thank you Ben, I'll look into this code and the loooong thread you
    pointed out.
    Francis Moreau, Dec 9, 2009
    #8
  9. On Dec 8, 10:02 pm, (Edward A. Falk) wrote:
    > In article <..com>,
    > Francis Moreau  <> wrote:
    >
    > >Hello,

    >
    > >I would be interested see some piece of C code that parses arithmetic
    > >expressions.

    >
    > First, read up on "recursive descent parsers".  They're really easy
    > and fun to write, and can be used for a million things.
    >


    Indeed, looks quite simple although the recursivity may be a bad idea
    for long expressions.

    > Here's a quick & dirty one I wrote some years ago.  It handles the
    > four basic functions (plus mod).  No variables, no functions.  It
    > should do what you want, or at least get you started.
    >


    Thanks
    Francis Moreau, Dec 9, 2009
    #9
  10. On 8 Dec, 23:59, Phil Carmody <> wrote:
    > (Edward A. Falk) writes:
    >
    >
    >
    >
    >
    > > In article <>,
    > > Francis Moreau  <> wrote:
    > >>Hello,

    >
    > >>I would be interested see some piece of C code that parses arithmetic
    > >>expressions.

    >
    > > First, read up on "recursive descent parsers".  They're really easy
    > > and fun to write, and can be used for a million things.

    >
    > > Here's a quick & dirty one I wrote some years ago.  It handles the
    > > four basic functions (plus mod).  No variables, no functions.  It
    > > should do what you want, or at least get you started.

    >
    > > Since all the operators were a single character, it was trivial to
    > > break the input into tokens with a simple regex.

    >
    > Regular expressions aren't C. You'd need a 3rd-party library to do
    > that. (Or roll your own.)
    >
    > > It doesn't do error handling (e.g. divide by zero) and it stops
    > > parsing on syntax error rather than reporting it.  Adding these
    > > features is left as an exercise for the reader.

    >
    > > #!/usr/bin/python

    >
    > Well, I think we agreed that was legal C
    >
    > > '''Evaluate expressions.'''

    >
    > But that isn't.
    >
    > >   rval = __prod(tokens)
    > >     if tokens[0] == '+':

    >
    > I don't have many positive things to say about the white-space-
    > incontinent python,


    since it looks like my pseudo-code I quite like it. Though I'm pretty
    sure they read my mind...

    > but am glad to see that other languages have
    > made the same 'mistake' as C when it comes to the length of the
    > assignment and equality-test operator tokens.


    'cos python lifted all its operators from C. I've also thought making
    the decision based on length was bizzare.

    > I have never quite
    > understood the vehemence of those who take the contrary PoV.


    because the mathematics people have been using = as an equality
    operator since Al-Khwarizmi was a lad (I exagerate to make a point).
    Everyone knows assignment should be indicated <- (or, at a push, :=).
    It took me ages to train my brain to the C-way of doing things.
    Nick Keighley, Dec 9, 2009
    #10
  11. Francis Moreau <> writes:

    > On Dec 8, 10:02 pm, (Edward A. Falk) wrote:
    >> In article <>,
    >> Francis Moreau  <> wrote:

    <snip>
    >> >I would be interested see some piece of C code that parses arithmetic
    >> >expressions.

    >>
    >> First, read up on "recursive descent parsers".  They're really easy
    >> and fun to write, and can be used for a million things.

    >
    > Indeed, looks quite simple although the recursivity may be a bad idea
    > for long expressions.


    Expressions are inherently recursive so trying to avoid a recursive
    parser complicates the work. If the code is not recursive, it will
    have to push and pop state using an explicit stack anyway.

    The recursion is not (usually) linked to the length of the expression
    but to its depth so there is little danger of running out of stack
    unless (a) you have virtually no stack (as in some embedded systems)
    or (b) you have to parse mechanically generated expressions with
    unbounded depth.

    If (a) is your problem, look up LL(1) parsers.

    <snip>
    --
    Ben.
    Ben Bacarisse, Dec 9, 2009
    #11
  12. Francis Moreau

    Nick Guest

    Francis Moreau <> writes:

    > On Dec 8, 10:02 pm, (Edward A. Falk) wrote:
    >> In article <>,
    >> Francis Moreau  <> wrote:
    >>
    >> >Hello,

    >>
    >> >I would be interested see some piece of C code that parses arithmetic
    >> >expressions.

    >>
    >> First, read up on "recursive descent parsers".  They're really easy
    >> and fun to write, and can be used for a million things.
    >>

    >
    > Indeed, looks quite simple although the recursivity may be a bad idea
    > for long expressions.


    Don't forget that you've got two, related, tasks. One is to parse the
    input into a series of "tokens" (numbers, operators etc) and the second
    is to do things with the tokens.

    I tend to use a pair of stacks myself. It's a technique I learnt from
    reading the dissembly of the Sinclair Spectrum ROM.

    You put numbers on one stack and operators on another. When you come to
    push an operator you first see if the top operator on the stack has
    higher precedence than the one you are about to push. If so, you pop
    and execute it first (popping values from the number stack and putting the
    answer there), repeating if necessary.

    When finished, start popping and operating on the operator stack. If
    either stack runs out unexpectedly, or something is left at the end, the
    expresion was malformed.
    --
    Online waterways route planner: http://canalplan.org.uk
    development version: http://canalplan.eu
    Nick, Dec 9, 2009
    #12
  13. On Dec 9, 2:55 pm, Ben Bacarisse <> wrote:
    >
    > Expressions are inherently recursive so trying to avoid a recursive
    > parser complicates the work.  If the code is not recursive, it will
    > have to push and pop state using an explicit stack anyway.


    I agree but recursive descent parser looks doing more function calls
    than the one you implemented at a first glance.

    Actually I was wondering which parser would be the most efficient (in
    term of speed an memory consumption) according to the simple grammar
    (basic arithmetic expressions) I want to parse.

    >
    > The recursion is not (usually) linked to the length of the expression
    > but to its depth so there is little danger of running out of stack
    > unless (a) you have virtually no stack (as in some embedded systems)
    > or (b) you have to parse mechanically generated expressions with
    > unbounded depth.
    >
    > If (a) is your problem, look up LL(1) parsers.


    I have no problem at all, I'm just wondering which parser would be the
    best candidate for this purpose.

    In fact a good exercice would be to implement the different possible
    parsers and see how they are in practice...
    Francis Moreau, Dec 10, 2009
    #13
  14. On Dec 9, 7:54 pm, Nick <> wrote:
    > Francis Moreau <> writes:
    > > On Dec 8, 10:02 pm, (Edward A. Falk) wrote:
    > >> In article <>,
    > >> Francis Moreau  <> wrote:

    >
    > >> >Hello,

    >
    > >> >I would be interested see some piece of C code that parses arithmetic
    > >> >expressions.

    >
    > >> First, read up on "recursive descent parsers".  They're really easy
    > >> and fun to write, and can be used for a million things.

    >
    > > Indeed, looks quite simple although the recursivity may be a bad idea
    > > for long expressions.

    >
    > Don't forget that you've got two, related, tasks.  One is to parse the
    > input into a series of "tokens" (numbers, operators etc) and the second
    > is to do things with the tokens.
    >
    > I tend to use a pair of stacks myself.  It's a technique I learnt from
    > reading the dissembly of the Sinclair Spectrum ROM.  
    >
    > You put numbers on one stack and operators on another.  When you come to
    > push an operator you first see if the top operator on the stack has
    > higher precedence than the one you are about to push.  If so, you pop
    > and execute it first (popping values from the number stack and putting the
    > answer there), repeating if necessary.
    >
    > When finished, start popping and operating on the operator stack.  If
    > either stack runs out unexpectedly, or something is left at the end, the
    > expresion was malformed.


    It sounds that you implement the Shunting-yard algorithm, don't you ?
    Francis Moreau, Dec 10, 2009
    #14
  15. Francis Moreau

    Nick Guest

    Francis Moreau <> writes:

    > On Dec 9, 7:54 pm, Nick <> wrote:
    >> Francis Moreau <> writes:
    >> > On Dec 8, 10:02 pm, (Edward A. Falk) wrote:
    >> >> In article <>,
    >> >> Francis Moreau  <> wrote:

    >>
    >> >> >Hello,

    >>
    >> >> >I would be interested see some piece of C code that parses arithmetic
    >> >> >expressions.

    >>
    >> >> First, read up on "recursive descent parsers".  They're really easy
    >> >> and fun to write, and can be used for a million things.

    >>
    >> > Indeed, looks quite simple although the recursivity may be a bad idea
    >> > for long expressions.

    >>
    >> Don't forget that you've got two, related, tasks.  One is to parse the
    >> input into a series of "tokens" (numbers, operators etc) and the second
    >> is to do things with the tokens.
    >>
    >> I tend to use a pair of stacks myself.  It's a technique I learnt from
    >> reading the dissembly of the Sinclair Spectrum ROM.  
    >>
    >> You put numbers on one stack and operators on another.  When you come to
    >> push an operator you first see if the top operator on the stack has
    >> higher precedence than the one you are about to push.  If so, you pop
    >> and execute it first (popping values from the number stack and putting the
    >> answer there), repeating if necessary.
    >>
    >> When finished, start popping and operating on the operator stack.  If
    >> either stack runs out unexpectedly, or something is left at the end, the
    >> expresion was malformed.

    >
    > It sounds that you implement the Shunting-yard algorithm, don't you ?


    Having looked it up (I'd not heard of the name) it's effectively that,
    but with the subsequent evaluation of the RPN version folded into the
    processing. So in 3*5+2, the shunting yard would do:
    3 -> output [now 3]
    * -> stack [now *]
    5 -> output [now 3 5]
    + pop operator stack, push +. Output now [3 5 *], operator stack [+]
    2 -> output [now 3 5 * 2]
    pop operator stack. output 3 5 * 2 +

    Mine (well, the Spectrum's) algorithm does:
    3 -> value stack [now 3]
    * -> operator stack [now *]
    5 -> value stack [now 3 5]
    + pop operator stack, execute '*' (pop value stack twice, do *, push
    result), + -> operator stack. Value stack now [8], operator stack [+]
    2 -> value stack [now 8 2]
    pop operator stack. Execute +, result 10.

    You lose the benefit of having a "compiled" version of the expression
    for later use, but save having two separate phases.
    --
    Online waterways route planner: http://canalplan.org.uk
    development version: http://canalplan.eu
    Nick, Dec 10, 2009
    #15
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. John Doe
    Replies:
    0
    Views:
    524
    John Doe
    Apr 17, 2005
  2. Jona
    Replies:
    0
    Views:
    260
  3. joshc
    Replies:
    5
    Views:
    540
    Keith Thompson
    Mar 31, 2005
  4. Replies:
    6
    Views:
    951
    user923005
    Nov 20, 2007
  5. Cyde Weys
    Replies:
    7
    Views:
    112
    Clayton L. Scott
    Feb 24, 2004
Loading...

Share This Page