Equation parsing

Discussion in 'C Programming' started by gamehack, Feb 3, 2006.

  1. gamehack

    gamehack Guest

    Hi all,

    I was thinking about parsing equations but I can't think of any generic
    approach. Basically I have a struct called math_term which is something
    like:
    struct math_term {
    char sign;
    int constant;
    int x;
    int y;
    int xpower;
    int ypower;
    }

    For example say the user inputs this:
    6x^2-8x^3+5
    Then this would be transformed to 3 structs
    6x^2: sign = '+'; constant = 6; x = 1; y = 0; xpower = 2; ypower = 0;
    -8x^3: sign = '-'; constant = 8; x = 1; y = 0; xpower = 3; ypower = 0;
    +5: sign = '+'; constant = 5; x = 0; y = 0; xpower = 0; ypower = 0;

    My problem is getting the input into my structured form. Also I was
    thinking of implementing brackets but I wasn't sure what datastructure
    to use to hold all the terms. A linked list probably? Or a bin tree? I
    wasn't able to figure out how to handle the case
    (6x^2 + 3x)/(9x^3)

    Any pointers on how to proceed from here are greatly appreciated. I
    just need the general idea how to go about the problem and I'll figure
    out how to implement it. Thanks
     
    gamehack, Feb 3, 2006
    #1
    1. Advertising

  2. gamehack

    Jordan Abel Guest

    On 2006-02-03, gamehack <> wrote:
    > Hi all,
    >
    > I was thinking about parsing equations but I can't think of any generic
    > approach. Basically I have a struct called math_term which is something
    > like:
    > struct math_term {
    > char sign;
    > int constant;
    > int x;
    > int y;
    > int xpower;
    > int ypower;
    >}
    >
    > For example say the user inputs this:
    > 6x^2-8x^3+5
    > Then this would be transformed to 3 structs
    > 6x^2: sign = '+'; constant = 6; x = 1; y = 0; xpower = 2; ypower = 0;
    > -8x^3: sign = '-'; constant = 8; x = 1; y = 0; xpower = 3; ypower = 0;
    > +5: sign = '+'; constant = 5; x = 0; y = 0; xpower = 0; ypower = 0;
    >
    > My problem is getting the input into my structured form. Also I was
    > thinking of implementing brackets but I wasn't sure what datastructure
    > to use to hold all the terms. A linked list probably? Or a bin tree? I
    > wasn't able to figure out how to handle the case
    > (6x^2 + 3x)/(9x^3)
    >
    > Any pointers on how to proceed from here are greatly appreciated. I
    > just need the general idea how to go about the problem and I'll figure
    > out how to implement it. Thanks


    There's a specific kind of tree that's useful for this, and you'd
    basically have a node that is either a number or an operator with
    further nodes below it

    (warning, beware the fixed-font ascii diagrams)

    for 6x^2-8x^3+5 you'd have

    +
    .'.
    - 5
    .' .
    * *
    ..'. .'.
    6 ^ 8 ^
    .'. .'.
    x 2 x 3



    For (6x^2+3x)/(9x^3), you'd get this

    /
    . ' .
    + *
    . ' . .'.
    * * 9 ^
    ..'. .'. .'.
    6 ^ 3 x x 3
    .'.
    x 2
     
    Jordan Abel, Feb 3, 2006
    #2
    1. Advertising

  3. gamehack wrote:
    >
    > Hi all,
    >
    > I was thinking about parsing equations but I can't think of any generic
    > approach. Basically I have a struct called math_term which is something
    > like:
    > struct math_term {
    > char sign;
    > int constant;
    > int x;
    > int y;
    > int xpower;
    > int ypower;
    > }
    >
    > For example say the user inputs this:
    > 6x^2-8x^3+5
    > Then this would be transformed to 3 structs
    > 6x^2: sign = '+'; constant = 6; x = 1; y = 0; xpower = 2; ypower = 0;
    > -8x^3: sign = '-'; constant = 8; x = 1; y = 0; xpower = 3; ypower = 0;
    > +5: sign = '+'; constant = 5; x = 0; y = 0; xpower = 0; ypower = 0;
    >
    > My problem is getting the input into my structured form. Also I was
    > thinking of implementing brackets but I wasn't sure what datastructure
    > to use to hold all the terms. A linked list probably? Or a bin tree? I
    > wasn't able to figure out how to handle the case
    > (6x^2 + 3x)/(9x^3)
    >
    > Any pointers on how to proceed from here are greatly appreciated. I
    > just need the general idea how to go about the problem and I'll figure
    > out how to implement it. Thanks


    Have a look at the "dragon book":

    "Compilers, principles, techniques, and tools"
    by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman
    (Addison-Wesley Pub. Co., 1986)

    They discuss all the methods.

    I personally employ recursive descent, using a stack that holds pointers
    to the sub-expressions and operators. Others prefer a tree structure. Yet
    others like the "operator precedence grammar". There is a notation called
    Backus-Naur Format (BNF) for expressing how parsers work. It is described
    in the dragon book. A mini-Fortran parser I wrote uses these rules:

    ----------------------------- Backus-Naur Rules for mini FORTRAN
    NOTATION:
    | -> "or",
    + -> "unlimited repetitions"
    Q -> "empty set"
    & -> + | -
    % -> * | /

    NUMBERS:
    fp# -> {-|Q}{digit.digit+ | .digit digit+} exponent
    exponent -> {dDeE {&|Q} digit {digit|Q} {digit|Q} | Q}

    FORMULAS:
    assignment -> id = expression
    id -> name | name {+ Forth }+ --curly braces balance!
    name -> letter {letter|digit}+
    arglist -> ( expression {,expression}+ )
    function -> id arglist
    expression -> term | expression & term
    term -> factor | term % factor
    factor -> id | fp# | ( expr ) | f^f | function
    ------------------------------------------ end Backus-Naur Rules

    Consider an "expression": the rules say it can be either a "term" or
    a "term" joined to an "expression" (that is, a sub-expression) by a +
    or - . This is a natural for recursion, since after finding a + or -
    and peeling off the leading term the expression-parsing function can
    then call itself using the pointers to the sub-expression (everything
    to the right of the + or - ) as input.

    Doing things in this order is "left-to-right" or LR parsing. There is
    no rule that you couldn't do it from right to left.

    But as I note, this is not the only way to proceed, and lots of people
    prefer to do it other ways.

    --
    Julian V. Noble
    Professor Emeritus of Physics

    http://galileo.phys.virginia.edu/~jvn/

    "As democracy is perfected, the office of president represents, more and
    more closely, the inner soul of the people. On some great and glorious
    day the plain folks of the land will reach their heart's desire at last
    and the White House will be adorned by a downright moron."

    --- H. L. Mencken (1880 - 1956)
     
    Julian V. Noble, Feb 3, 2006
    #3
  4. /*
    The most flexible approach I can think of is the language Prolog, where new
    operators can be defined by programs - e.g. in a Prolog program you can
    extend the language by defining new operators e.g. 'blah' or '=>>>", while
    in say C++ you have to stick to re-defining the existing operators.

    In any language expressions can be though of as trees
    infix op prefix op postfix op
    / \ \ /
    exp1 exp2 exp exp
    or at the lowest level, very simple terminal one-node trees e.g.
    identifier (e.g. x, y, i, val), or literal ( 1, 1.0, 1E-4, "Fred"), or
    function call ( func( exp1,..., expn) )

    First define fixity or associativity of your ops... */
    enum fixity {
    fx, /* prefix operator f where (operated on) where x stands for an
    expression of lower precedence*/
    fy, /* prefix operator f where y stands for an expression of equal or
    lower precedence
    e.g. f=unary - f=or unary + parsed as +(+(-exp)) */
    xfx, /*infix operator f where precedence of left & right expression is
    lower precedence
    e.g. >,>,>=,<= i.e. exp > exp > exp is illegal
    xfy, /* right associative infix op f, a f (b f c) */
    yfx, /* left associative infix ops e.g f=- and (a-b)-c */
    /*there are no yfy operators because these would be ambiguous*/
    xf, /*postfix op f on argument of lower precedence*/
    yf /*postfix op f on argument of equal or lower precedence*/

    }
    /*
    The pecedence of a terminal symbol (literal constant or an identifier or a
    function call) is zero.
    The precedence of an expression is the precedence of the operator "at the
    top of the expession tree"
    */

    struct op_def {
    char *op; /* e.g. "+", "-", */
    int precedence;
    enum fixity; /* e.g. fx, fy, xfx, xfy, yfx, xf, yf*/
    };
    /*
    then get your recursive greedy parser to generate expression trees and carry
    around the highest precedence of an expression it is willing to accept in
    the current context - if it has finished parsing an expression and it sees
    the
    next symbol is an op then look at the op table for the fixity and precedence
    of the op at the top of the tree you have just parsed and the op that is
    next in the input stream and decide whether to return of continue, and if
    you continue decide which op goes at the top of the new tree
    */


    "gamehack" <> wrote in message
    news:...
    > Hi all,
    >
    > I was thinking about parsing equations but I can't think of any generic
    > approach. Basically I have a struct called math_term which is something
    > like:
    > struct math_term {
    > char sign;
    > int constant;
    > int x;
    > int y;
    > int xpower;
    > int ypower;
    > }
    >
    > For example say the user inputs this:
    > 6x^2-8x^3+5
    > Then this would be transformed to 3 structs
    > 6x^2: sign = '+'; constant = 6; x = 1; y = 0; xpower = 2; ypower = 0;
    > -8x^3: sign = '-'; constant = 8; x = 1; y = 0; xpower = 3; ypower = 0;
    > +5: sign = '+'; constant = 5; x = 0; y = 0; xpower = 0; ypower = 0;
    >
    > My problem is getting the input into my structured form. Also I was
    > thinking of implementing brackets but I wasn't sure what datastructure
    > to use to hold all the terms. A linked list probably? Or a bin tree? I
    > wasn't able to figure out how to handle the case
    > (6x^2 + 3x)/(9x^3)
    >
    > Any pointers on how to proceed from here are greatly appreciated. I
    > just need the general idea how to go about the problem and I'll figure
    > out how to implement it. Thanks
    >
     
    Paul Connolly, Feb 3, 2006
    #4
  5. gamehack

    gamehack Guest

    Julian V. Noble wrote:
    > gamehack wrote:
    > >
    > > Hi all,
    > >
    > > I was thinking about parsing equations but I can't think of any generic
    > > approach. Basically I have a struct called math_term which is something
    > > like:
    > > struct math_term {
    > > char sign;
    > > int constant;
    > > int x;
    > > int y;
    > > int xpower;
    > > int ypower;
    > > }
    > >
    > > For example say the user inputs this:
    > > 6x^2-8x^3+5
    > > Then this would be transformed to 3 structs
    > > 6x^2: sign = '+'; constant = 6; x = 1; y = 0; xpower = 2; ypower = 0;
    > > -8x^3: sign = '-'; constant = 8; x = 1; y = 0; xpower = 3; ypower = 0;
    > > +5: sign = '+'; constant = 5; x = 0; y = 0; xpower = 0; ypower = 0;
    > >
    > > My problem is getting the input into my structured form. Also I was
    > > thinking of implementing brackets but I wasn't sure what datastructure
    > > to use to hold all the terms. A linked list probably? Or a bin tree? I
    > > wasn't able to figure out how to handle the case
    > > (6x^2 + 3x)/(9x^3)
    > >
    > > Any pointers on how to proceed from here are greatly appreciated. I
    > > just need the general idea how to go about the problem and I'll figure
    > > out how to implement it. Thanks

    >
    > Have a look at the "dragon book":
    >
    > "Compilers, principles, techniques, and tools"
    > by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman
    > (Addison-Wesley Pub. Co., 1986)
    >
    > They discuss all the methods.
    >
    > I personally employ recursive descent, using a stack that holds pointers
    > to the sub-expressions and operators. Others prefer a tree structure. Yet
    > others like the "operator precedence grammar". There is a notation called
    > Backus-Naur Format (BNF) for expressing how parsers work. It is described
    > in the dragon book. A mini-Fortran parser I wrote uses these rules:
    >
    > ----------------------------- Backus-Naur Rules for mini FORTRAN
    > NOTATION:
    > | -> "or",
    > + -> "unlimited repetitions"
    > Q -> "empty set"
    > & -> + | -
    > % -> * | /
    >
    > NUMBERS:
    > fp# -> {-|Q}{digit.digit+ | .digit digit+} exponent
    > exponent -> {dDeE {&|Q} digit {digit|Q} {digit|Q} | Q}
    >
    > FORMULAS:
    > assignment -> id = expression
    > id -> name | name {+ Forth }+ --curly braces balance!
    > name -> letter {letter|digit}+
    > arglist -> ( expression {,expression}+ )
    > function -> id arglist
    > expression -> term | expression & term
    > term -> factor | term % factor
    > factor -> id | fp# | ( expr ) | f^f | function
    > ------------------------------------------ end Backus-Naur Rules
    >
    > Consider an "expression": the rules say it can be either a "term" or
    > a "term" joined to an "expression" (that is, a sub-expression) by a +
    > or - . This is a natural for recursion, since after finding a + or -
    > and peeling off the leading term the expression-parsing function can
    > then call itself using the pointers to the sub-expression (everything
    > to the right of the + or - ) as input.
    >
    > Doing things in this order is "left-to-right" or LR parsing. There is
    > no rule that you couldn't do it from right to left.
    >
    > But as I note, this is not the only way to proceed, and lots of people
    > prefer to do it other ways.
    >
    > --
    > Julian V. Noble
    > Professor Emeritus of Physics
    >
    > http://galileo.phys.virginia.edu/~jvn/
    >
    > "As democracy is perfected, the office of president represents, more and
    > more closely, the inner soul of the people. On some great and glorious
    > day the plain folks of the land will reach their heart's desire at last
    > and the White House will be adorned by a downright moron."
    >
    > --- H. L. Mencken (1880 - 1956)


    Do you have any online tutorials that explain how to do the parsing
    with a stack and/or a tree? Thanks a lot

    PS. I did google but couldn't find anything relevant
     
    gamehack, Feb 3, 2006
    #5
  6. enum fixity; /* e.g. fx, fy, xfx, xfy, yfx, xf, yf*/

    should have been
    enum fixity fixity_of_op; /* e.g. fx, fy, xfx, xfy, yfx, xf, yf*/

    "Paul Connolly" <> wrote in message
    news:dEQEf.248068$...
    > /*
    > The most flexible approach I can think of is the language Prolog, where
    > new
    > operators can be defined by programs - e.g. in a Prolog program you can
    > extend the language by defining new operators e.g. 'blah' or '=>>>", while
    > in say C++ you have to stick to re-defining the existing operators.
    >
    > In any language expressions can be though of as trees
    > infix op prefix op postfix op
    > / \ \ /
    > exp1 exp2 exp exp
    > or at the lowest level, very simple terminal one-node trees e.g.
    > identifier (e.g. x, y, i, val), or literal ( 1, 1.0, 1E-4, "Fred"), or
    > function call ( func( exp1,..., expn) )
    >
    > First define fixity or associativity of your ops... */
    > enum fixity {
    > fx, /* prefix operator f where (operated on) where x stands for an
    > expression of lower precedence*/
    > fy, /* prefix operator f where y stands for an expression of equal or
    > lower precedence
    > e.g. f=unary - f=or unary + parsed as +(+(-exp)) */
    > xfx, /*infix operator f where precedence of left & right expression is
    > lower precedence
    > e.g. >,>,>=,<= i.e. exp > exp > exp is illegal
    > xfy, /* right associative infix op f, a f (b f c) */
    > yfx, /* left associative infix ops e.g f=- and (a-b)-c */
    > /*there are no yfy operators because these would be ambiguous*/
    > xf, /*postfix op f on argument of lower precedence*/
    > yf /*postfix op f on argument of equal or lower precedence*/
    >
    > }
    > /*
    > The pecedence of a terminal symbol (literal constant or an identifier or a
    > function call) is zero.
    > The precedence of an expression is the precedence of the operator "at the
    > top of the expession tree"
    > */
    >
    > struct op_def {
    > char *op; /* e.g. "+", "-", */
    > int precedence;
    > enum fixity; /* e.g. fx, fy, xfx, xfy, yfx, xf, yf*/
    > };
    > /*
    > then get your recursive greedy parser to generate expression trees and
    > carry
    > around the highest precedence of an expression it is willing to accept in
    > the current context - if it has finished parsing an expression and it sees
    > the
    > next symbol is an op then look at the op table for the fixity and
    > precedence of the op at the top of the tree you have just parsed and the
    > op that is next in the input stream and decide whether to return of
    > continue, and if you continue decide which op goes at the top of the new
    > tree
    > */
    >
    >
    > "gamehack" <> wrote in message
    > news:...
    >> Hi all,
    >>
    >> I was thinking about parsing equations but I can't think of any generic
    >> approach. Basically I have a struct called math_term which is something
    >> like:
    >> struct math_term {
    >> char sign;
    >> int constant;
    >> int x;
    >> int y;
    >> int xpower;
    >> int ypower;
    >> }
    >>
    >> For example say the user inputs this:
    >> 6x^2-8x^3+5
    >> Then this would be transformed to 3 structs
    >> 6x^2: sign = '+'; constant = 6; x = 1; y = 0; xpower = 2; ypower = 0;
    >> -8x^3: sign = '-'; constant = 8; x = 1; y = 0; xpower = 3; ypower = 0;
    >> +5: sign = '+'; constant = 5; x = 0; y = 0; xpower = 0; ypower = 0;
    >>
    >> My problem is getting the input into my structured form. Also I was
    >> thinking of implementing brackets but I wasn't sure what datastructure
    >> to use to hold all the terms. A linked list probably? Or a bin tree? I
    >> wasn't able to figure out how to handle the case
    >> (6x^2 + 3x)/(9x^3)
    >>
    >> Any pointers on how to proceed from here are greatly appreciated. I
    >> just need the general idea how to go about the problem and I'll figure
    >> out how to implement it. Thanks
    >>

    >
    >
    >
     
    Paul Connolly, Feb 3, 2006
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Shukla, Sunil Kumar

    parallel CRC equation generator

    Shukla, Sunil Kumar, May 31, 2005, in forum: VHDL
    Replies:
    2
    Views:
    1,164
    Pete Fraser
    May 31, 2005
  2. Weng Tianxiang
    Replies:
    12
    Views:
    1,406
  3. Marty Underwood

    String to equation

    Marty Underwood, Feb 18, 2004, in forum: ASP .Net
    Replies:
    3
    Views:
    448
    Marty Underwood
    Feb 18, 2004
  4. David Zimmerman
    Replies:
    0
    Views:
    927
    David Zimmerman
    Jun 27, 2003
  5. pete kirkham
    Replies:
    0
    Views:
    2,015
    pete kirkham
    Jun 27, 2003
Loading...

Share This Page