[Snippet] a Recursive Descent Parser via TDD - recursiveDescentParser.h

Discussion in 'C++' started by Phlip, Aug 2, 2004.

  1. Phlip

    Phlip Guest

    C++ newsgroupies:

    I wrote a parser to solve math expressions like "3.0 ^(4 - 5)", or "3 / 8".

    It's bad luck to name an interface after its implementation. This file could
    have been "expressionSolver.h". But naturally, if I set out to implement
    recursive descent parsing, that tainted my filename selection.

    Below my sig is recursiveDescentParser.h, a math expression solver. Using
    the test framework, and test suite appearing in parallel posts, and
    Test-Driven Development, I produced this little parser. Extending it to use
    variables, or your favorite math notation, should be relatively easy.

    The primary interface is parse("1 + 3"). This returns a type that can
    contain either a double or a beef, which is an error message string.

    The code is 'inline' due to laziness; no speed gain is implied.

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


    #ifndef RECURSIVE_DESCENT_PARSER_H_
    # define RECURSIVE_DESCENT_PARSER_H_

    # include <ctype.h>
    # include <stdexcept>
    # include <string>
    # include <assert.h>
    # include <math.h>

    namespace parser {

    using std::string;

    class
    Fallible_double_t
    {

    // contains either a double or a beef

    double m_value;
    public:
    const string m_beef;

    Fallible_double_t(string beef): m_value(0.0), m_beef(beef) {}
    Fallible_double_t(double value): m_value(value) {}

    bool getValid() { return m_beef.size() == 0; }

    operator double()
    {
    assert(getValid());
    return m_value;
    }
    }; // named after Barton & Nackman's Fallible type


    inline void
    eatBlanks(char const * &z)
    {
    while (isspace(*z))
    ++z;
    }


    inline Fallible_double_t
    parseTerminal(char const * &z)
    {
    eatBlanks(z);
    char * end = NULL;
    double result = strtod(z, &end);

    if (z == end)
    return Fallible_double_t("syntax error");

    z = end;
    eatBlanks(z);
    return result;
    }

    inline Fallible_double_t parseTerms(char const * &z);

    # define bounce_(x_) if (!x_.getValid()) return x_;


    inline Fallible_double_t
    parseParentheses(char const * &z)
    {
    eatBlanks(z);

    if (*z == '(')
    {
    ++z;
    Fallible_double_t contents = parseTerms(z);
    bounce_(contents);

    if (*z != ')')
    return Fallible_double_t("");

    ++z;
    eatBlanks(z);
    return contents;
    }
    return parseTerminal(z);
    }



    inline Fallible_double_t
    parseExponentiation(char const * &z)
    {
    Fallible_double_t arg_ = parseParentheses(z);
    bounce_(arg_);
    double arg (arg_);

    while (*z == '^')
    {
    char op (*z++);
    Fallible_double_t factor = parseParentheses(z);
    bounce_(factor);
    return pow(arg, factor);
    }

    return arg;
    }


    inline Fallible_double_t
    parseFactors(char const * &z)
    {
    Fallible_double_t arg_ = parseExponentiation(z);
    bounce_(arg_);
    double arg (arg_);

    while (*z == '*' || *z == '/')
    {
    char op (*z++);
    Fallible_double_t factor = parseExponentiation(z);
    bounce_(factor);

    switch(op)
    {
    case '*': arg *= factor; break;
    case '/':
    if (factor == 0.0)
    return Fallible_double_t("division by zero error");

    arg /= factor;
    break;
    }
    }

    return arg;
    }


    inline Fallible_double_t
    parseTerms(char const * &z)
    {
    Fallible_double_t arg_ = parseFactors(z);
    bounce_(arg_);
    double arg (arg_);

    while (*z == '+' || *z == '-')
    {
    char op (*z);
    Fallible_double_t factor = parseFactors(++z);
    bounce_(factor);

    switch(op)
    {
    case '+': arg += factor; break;
    case '-': arg -= factor; break;
    }
    }
    return arg;
    }


    inline Fallible_double_t
    parse(char const * input)
    {
    char const * end = input;
    Fallible_double_t got = parseTerms(end);

    if (got.getValid() && input + strlen(input) != end)
    return Fallible_double_t("incomplete expression"); // oh, you were
    so close!

    return got;
    }

    } // namespace parser

    #endif // ndef RECURSIVE_DESCENT_PARSER_H_
    Phlip, Aug 2, 2004
    #1
    1. Advertising

  2. Phlip

    David Hilsee Guest

    "Phlip" <> wrote in message
    news:d2xPc.1931$...

    > class
    > Fallible_double_t
    > {
    >
    > // contains either a double or a beef
    >
    > double m_value;
    > public:
    > const string m_beef;
    >
    > Fallible_double_t(string beef): m_value(0.0), m_beef(beef) {}
    > Fallible_double_t(double value): m_value(value) {}
    >
    > bool getValid() { return m_beef.size() == 0; }
    >
    > operator double()
    > {
    > assert(getValid());
    > return m_value;
    > }
    > }; // named after Barton & Nackman's Fallible type


    I think I have some beef (har har) with this approach. I did some quick
    research on Fallible, and it doesn't look like it was ever meant to replace
    exceptions. In fact, none of the examples I saw ever contained a string for
    information, just an extra boolean to indicate validity. That seems to be
    because the fallible type is only supposed to replace cases where the caller
    would return a "special" sentinel value to indicate failure. This class,
    however, suspiciously resembles a cross between a return value and an
    exception, and it is never thrown by the code. Instead of using a
    "fallible" class, I would prefer a separate exception class (or multiple
    exception classes) to indicate failure.

    --
    David Hilsee
    David Hilsee, Aug 2, 2004
    #2
    1. Advertising

  3. Phlip

    Phlip Guest

    David Hilsee wrote:

    > Phlip wrote:
    >
    > > class
    > > Fallible_double_t
    > > {
    > >
    > > // contains either a double or a beef
    > >
    > > double m_value;
    > > public:
    > > const string m_beef;
    > >
    > > Fallible_double_t(string beef): m_value(0.0), m_beef(beef) {}
    > > Fallible_double_t(double value): m_value(value) {}
    > >
    > > bool getValid() { return m_beef.size() == 0; }
    > >
    > > operator double()
    > > {
    > > assert(getValid());
    > > return m_value;
    > > }
    > > }; // named after Barton & Nackman's Fallible type

    >
    > I think I have some beef (har har) with this approach. I did some quick
    > research on Fallible, and it doesn't look like it was ever meant to

    replace
    > exceptions. In fact, none of the examples I saw ever contained a string

    for
    > information, just an extra boolean to indicate validity. That seems to be
    > because the fallible type is only supposed to replace cases where the

    caller
    > would return a "special" sentinel value to indicate failure. This class,
    > however, suspiciously resembles a cross between a return value and an
    > exception, and it is never thrown by the code. Instead of using a
    > "fallible" class, I would prefer a separate exception class (or multiple
    > exception classes) to indicate failure.


    I started with exceptions, and did not like their effect on my VC++ Output
    panel.

    My tests use OutputDebugString to write "All tests passed" or "tests failed"
    into the VC++ output panel, each time I ran the test suite (which was >50
    times). However, when I wrote the equivalent assertion to ASSERT_BEEF to
    catch a thrown exception, to prove it was thrown, the output panel insisted
    on reporting the "first chance exception" squeaking that VC++ always reports
    when a C++ exception occurs. These appeared >after< "All tests passed",
    which pushed it up to where I couldn't see it.

    If I'm going to run the tests >50 times, then navigating into the output
    panel and scrolling to see the "All tests passed" would slow me down. (Not
    much, because my ASSERTs cause a breakpoint when they fail.) Because the
    point of TDD is rapidly predicting each test result, I switched from
    exceptions to fallibility only to see the "All tests passed" easier.

    However, in an evaluator, the return value is always either legit or wrong,
    so exceptions are not necessarily the perfect way to do this. I totally
    agree they would have simplified the code.

    One could instrument the Fallible class to throw an exception from its own
    faulting constructor, to retrofit exceptions, or toggle them.

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
    Phlip, Aug 3, 2004
    #3
  4. Phlip

    David Hilsee Guest

    "Phlip" <> wrote in message
    news:mlCPc.48$...
    <snip>
    > I started with exceptions, and did not like their effect on my VC++ Output
    > panel.
    >
    > My tests use OutputDebugString to write "All tests passed" or "tests

    failed"
    > into the VC++ output panel, each time I ran the test suite (which was >50
    > times). However, when I wrote the equivalent assertion to ASSERT_BEEF to
    > catch a thrown exception, to prove it was thrown, the output panel

    insisted
    > on reporting the "first chance exception" squeaking that VC++ always

    reports
    > when a C++ exception occurs. These appeared >after< "All tests passed",
    > which pushed it up to where I couldn't see it.
    >
    > If I'm going to run the tests >50 times, then navigating into the output
    > panel and scrolling to see the "All tests passed" would slow me down. (Not
    > much, because my ASSERTs cause a breakpoint when they fail.) Because the
    > point of TDD is rapidly predicting each test result, I switched from
    > exceptions to fallibility only to see the "All tests passed" easier.


    In that case, I probably would have changed the code that was used for
    testing. It sounds like it has some issues that need to be addressed.

    > However, in an evaluator, the return value is always either legit or

    wrong,
    > so exceptions are not necessarily the perfect way to do this. I totally
    > agree they would have simplified the code.


    Yes, that's true, but in the battle concerning error codes versus
    exceptions, exceptions seem to have won, especially when extra information
    about the error is required. At the very least, they are the traditional
    way errors are reported in C++.

    > One could instrument the Fallible class to throw an exception from its own
    > faulting constructor, to retrofit exceptions, or toggle them.


    True, but the code might still need to be revised if it were required to
    throw exceptions of multiple types.

    Anyway, I just wanted to understand why you chose an exception-like return
    value over a thrown exception. If it's because the test code couldn't
    handle exceptions very well, that's fine. I wanted to make sure I wasn't
    missing something that was deeper than that. Thanks for the snippets.

    --
    David Hilsee
    David Hilsee, Aug 4, 2004
    #4
  5. Phlip

    Phlip Guest

    David Hilsee wrote:

    > Phlip wrote:


    > > If I'm going to run the tests >50 times, then navigating into the output
    > > panel and scrolling to see the "All tests passed" would slow me down.

    (Not
    > > much, because my ASSERTs cause a breakpoint when they fail.) Because the
    > > point of TDD is rapidly predicting each test result, I switched from
    > > exceptions to fallibility only to see the "All tests passed" easier.

    >
    > In that case, I probably would have changed the code that was used for
    > testing. It sounds like it has some issues that need to be addressed.


    Install and run it and see if you can.

    I want "All test passed" to appear in the output window. This conflicts with
    how VC++ reports irrelevant exceptions. The tests themselves are not
    implicated.

    To see what I mean, add a 'throw' statement to the constructor of
    Fallible_double_t. Add a catch inside parse(), and make it return an
    (alternately constructed) Fallible_double_t, to satisfy the interface the
    tests expect. (The alternative is to change all the tests to catch - they
    still are not themselves implicated.)

    You will see that the exceptions scroll the "All tests passed" up in the
    output window.

    > > However, in an evaluator, the return value is always either legit or
    > > wrong,
    > > so exceptions are not necessarily the perfect way to do this. I totally
    > > agree they would have simplified the code.

    >
    > Yes, that's true, but in the battle concerning error codes versus
    > exceptions, exceptions seem to have won, especially when extra information
    > about the error is required. At the very least, they are the traditional
    > way errors are reported in C++.


    Tough. I already noted one can add them as an option. I'm hardly setting a
    "bad example" here.

    > Anyway, I just wanted to understand why you chose an exception-like return
    > value over a thrown exception. If it's because the test code couldn't
    > handle exceptions very well, that's fine. I wanted to make sure I wasn't
    > missing something that was deeper than that. Thanks for the snippets.


    The test code could handle them fine. The ASSERT_BEEF macro formerly
    expressed a catch statement, and caught what it expected. The IDE could not
    handle exceptions cleanly. One important goal of automated unit tests is
    silence - they report pass or fail, without reporting any extraneous crud.

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
    Phlip, Aug 4, 2004
    #5
  6. Phlip

    David Hilsee Guest

    "Phlip" <> wrote in message
    news:s0_Pc.252$...
    > David Hilsee wrote:
    >
    > > Phlip wrote:

    >
    > > > If I'm going to run the tests >50 times, then navigating into the

    output
    > > > panel and scrolling to see the "All tests passed" would slow me down.

    > (Not
    > > > much, because my ASSERTs cause a breakpoint when they fail.) Because

    the
    > > > point of TDD is rapidly predicting each test result, I switched from
    > > > exceptions to fallibility only to see the "All tests passed" easier.

    > >
    > > In that case, I probably would have changed the code that was used for
    > > testing. It sounds like it has some issues that need to be addressed.

    >
    > Install and run it and see if you can.
    >
    > I want "All test passed" to appear in the output window. This conflicts

    with
    > how VC++ reports irrelevant exceptions. The tests themselves are not
    > implicated.
    >
    > To see what I mean, add a 'throw' statement to the constructor of
    > Fallible_double_t. Add a catch inside parse(), and make it return an
    > (alternately constructed) Fallible_double_t, to satisfy the interface the
    > tests expect. (The alternative is to change all the tests to catch - they
    > still are not themselves implicated.)
    >
    > You will see that the exceptions scroll the "All tests passed" up in the
    > output window.


    We must be using different versions of VS.Net, or have different settings
    set, or something. I just created a VS.Net 2003 console project, modified
    the fallible's constructor to optionally throw based on a new bool
    parameter, and modified the code so it passes true for that parameter
    everywhere except when the beef is actually returned from the parser. I do
    see what you're saying about the "first-chance exception" output, but it
    only appeared when I was running it in the debugger, and it didn't really
    affect the visibility of "All tests passed!" because the exception messages
    were displayed before the "All tests passed!" message was displayed. Is it
    necessary to run the test cases using the debugger?

    None of this is all that important. I just thought it might be interesting
    to see what the same code would look like with exceptions and TDD combined,
    because exceptions are the traditional way errors are reported.

    --
    David Hilsee
    David Hilsee, Aug 4, 2004
    #6
  7. Phlip

    Phlip Guest

    David Hilsee wrote:

    > We must be using different versions of VS.Net, or have different settings
    > set, or something. I just created a VS.Net 2003 console project, modified
    > the fallible's constructor to optionally throw based on a new bool
    > parameter, and modified the code so it passes true for that parameter
    > everywhere except when the beef is actually returned from the parser. I

    do
    > see what you're saying about the "first-chance exception" output, but it
    > only appeared when I was running it in the debugger, and it didn't really
    > affect the visibility of "All tests passed!" because the exception

    messages
    > were displayed before the "All tests passed!" message was displayed.


    Outrageous! VS.NET changed the order that OutputDebugString's internal
    buffer flushes! In VS6, "All tests passed!" appears, then the silly
    exception messages appear _after_ it, and push it up.

    (None of this is an argument for or against exceptions themselves...)

    > Is it
    > necessary to run the test cases using the debugger?


    The example shows Test Driven Development in action. I wrote the parser by
    (essentially) writing every assertion you see, from top to bottom, in order,
    and getting them to pass. That required me to add an assertion, run, predict
    failure, get failure, upgrade the code just a little, run, predict success,
    get success (essentially), refactor the code, run, predict success, get
    success, and proceed to the next assertion. As the design emerged, some
    assertions started passing the first time.

    This requires me to do anything, including adjust the style of a "perfect"
    program, to get a rapid response on each run.

    > None of this is all that important. I just thought it might be

    interesting
    > to see what the same code would look like with exceptions and TDD

    combined,
    > because exceptions are the traditional way errors are reported.


    Good show. However, exceptions and TDD are orthogonal. Plenty of TDD
    projects use exceptions, and use tests to force them to exist and behave.

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
    Phlip, Aug 5, 2004
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Phlip
    Replies:
    0
    Views:
    462
    Phlip
    Aug 2, 2004
  2. Phlip
    Replies:
    0
    Views:
    423
    Phlip
    Aug 2, 2004
  3. Replies:
    4
    Views:
    454
    Ben Sizer
    Oct 2, 2006
  4. Just Another Victim of the Ambient Morality

    Is pyparsing really a recursive descent parser?

    Just Another Victim of the Ambient Morality, Nov 2, 2007, in forum: Python
    Replies:
    39
    Views:
    1,392
    Kay Schluehr
    Nov 9, 2007
  5. Robert
    Replies:
    1
    Views:
    580
    Puppet_Sock
    Apr 14, 2008
Loading...

Share This Page