[Snippet] a Recursive Descent Parser via TDD - recursiveDescentParser.h

P

Phlip

C++ newsgroupies:

I wrote a parser to solve math expressions like "3.0 ^(4 - 5)", or "3 / 8".

It's bad luck to name an interface after its implementation. This file could
have been "expressionSolver.h". But naturally, if I set out to implement
recursive descent parsing, that tainted my filename selection.

Below my sig is recursiveDescentParser.h, a math expression solver. Using
the test framework, and test suite appearing in parallel posts, and
Test-Driven Development, I produced this little parser. Extending it to use
variables, or your favorite math notation, should be relatively easy.

The primary interface is parse("1 + 3"). This returns a type that can
contain either a double or a beef, which is an error message string.

The code is 'inline' due to laziness; no speed gain is implied.

--
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces


#ifndef RECURSIVE_DESCENT_PARSER_H_
# define RECURSIVE_DESCENT_PARSER_H_

# include <ctype.h>
# include <stdexcept>
# include <string>
# include <assert.h>
# include <math.h>

namespace parser {

using std::string;

class
Fallible_double_t
{

// contains either a double or a beef

double m_value;
public:
const string m_beef;

Fallible_double_t(string beef): m_value(0.0), m_beef(beef) {}
Fallible_double_t(double value): m_value(value) {}

bool getValid() { return m_beef.size() == 0; }

operator double()
{
assert(getValid());
return m_value;
}
}; // named after Barton & Nackman's Fallible type


inline void
eatBlanks(char const * &z)
{
while (isspace(*z))
++z;
}


inline Fallible_double_t
parseTerminal(char const * &z)
{
eatBlanks(z);
char * end = NULL;
double result = strtod(z, &end);

if (z == end)
return Fallible_double_t("syntax error");

z = end;
eatBlanks(z);
return result;
}

inline Fallible_double_t parseTerms(char const * &z);

# define bounce_(x_) if (!x_.getValid()) return x_;


inline Fallible_double_t
parseParentheses(char const * &z)
{
eatBlanks(z);

if (*z == '(')
{
++z;
Fallible_double_t contents = parseTerms(z);
bounce_(contents);

if (*z != ')')
return Fallible_double_t("");

++z;
eatBlanks(z);
return contents;
}
return parseTerminal(z);
}



inline Fallible_double_t
parseExponentiation(char const * &z)
{
Fallible_double_t arg_ = parseParentheses(z);
bounce_(arg_);
double arg (arg_);

while (*z == '^')
{
char op (*z++);
Fallible_double_t factor = parseParentheses(z);
bounce_(factor);
return pow(arg, factor);
}

return arg;
}


inline Fallible_double_t
parseFactors(char const * &z)
{
Fallible_double_t arg_ = parseExponentiation(z);
bounce_(arg_);
double arg (arg_);

while (*z == '*' || *z == '/')
{
char op (*z++);
Fallible_double_t factor = parseExponentiation(z);
bounce_(factor);

switch(op)
{
case '*': arg *= factor; break;
case '/':
if (factor == 0.0)
return Fallible_double_t("division by zero error");

arg /= factor;
break;
}
}

return arg;
}


inline Fallible_double_t
parseTerms(char const * &z)
{
Fallible_double_t arg_ = parseFactors(z);
bounce_(arg_);
double arg (arg_);

while (*z == '+' || *z == '-')
{
char op (*z);
Fallible_double_t factor = parseFactors(++z);
bounce_(factor);

switch(op)
{
case '+': arg += factor; break;
case '-': arg -= factor; break;
}
}
return arg;
}


inline Fallible_double_t
parse(char const * input)
{
char const * end = input;
Fallible_double_t got = parseTerms(end);

if (got.getValid() && input + strlen(input) != end)
return Fallible_double_t("incomplete expression"); // oh, you were
so close!

return got;
}

} // namespace parser

#endif // ndef RECURSIVE_DESCENT_PARSER_H_
 
D

David Hilsee

class
Fallible_double_t
{

// contains either a double or a beef

double m_value;
public:
const string m_beef;

Fallible_double_t(string beef): m_value(0.0), m_beef(beef) {}
Fallible_double_t(double value): m_value(value) {}

bool getValid() { return m_beef.size() == 0; }

operator double()
{
assert(getValid());
return m_value;
}
}; // named after Barton & Nackman's Fallible type

I think I have some beef (har har) with this approach. I did some quick
research on Fallible, and it doesn't look like it was ever meant to replace
exceptions. In fact, none of the examples I saw ever contained a string for
information, just an extra boolean to indicate validity. That seems to be
because the fallible type is only supposed to replace cases where the caller
would return a "special" sentinel value to indicate failure. This class,
however, suspiciously resembles a cross between a return value and an
exception, and it is never thrown by the code. Instead of using a
"fallible" class, I would prefer a separate exception class (or multiple
exception classes) to indicate failure.
 
P

Phlip

David said:
I think I have some beef (har har) with this approach. I did some quick
research on Fallible, and it doesn't look like it was ever meant to replace
exceptions. In fact, none of the examples I saw ever contained a string for
information, just an extra boolean to indicate validity. That seems to be
because the fallible type is only supposed to replace cases where the caller
would return a "special" sentinel value to indicate failure. This class,
however, suspiciously resembles a cross between a return value and an
exception, and it is never thrown by the code. Instead of using a
"fallible" class, I would prefer a separate exception class (or multiple
exception classes) to indicate failure.

I started with exceptions, and did not like their effect on my VC++ Output
panel.

My tests use OutputDebugString to write "All tests passed" or "tests failed"
into the VC++ output panel, each time I ran the test suite (which was >50
times). However, when I wrote the equivalent assertion to ASSERT_BEEF to
catch a thrown exception, to prove it was thrown, the output panel insisted
on reporting the "first chance exception" squeaking that VC++ always reports
when a C++ exception occurs. These appeared >after< "All tests passed",
which pushed it up to where I couldn't see it.

If I'm going to run the tests >50 times, then navigating into the output
panel and scrolling to see the "All tests passed" would slow me down. (Not
much, because my ASSERTs cause a breakpoint when they fail.) Because the
point of TDD is rapidly predicting each test result, I switched from
exceptions to fallibility only to see the "All tests passed" easier.

However, in an evaluator, the return value is always either legit or wrong,
so exceptions are not necessarily the perfect way to do this. I totally
agree they would have simplified the code.

One could instrument the Fallible class to throw an exception from its own
faulting constructor, to retrofit exceptions, or toggle them.
 
D

David Hilsee

I started with exceptions, and did not like their effect on my VC++ Output
panel.

My tests use OutputDebugString to write "All tests passed" or "tests failed"
into the VC++ output panel, each time I ran the test suite (which was >50
times). However, when I wrote the equivalent assertion to ASSERT_BEEF to
catch a thrown exception, to prove it was thrown, the output panel insisted
on reporting the "first chance exception" squeaking that VC++ always reports
when a C++ exception occurs. These appeared >after< "All tests passed",
which pushed it up to where I couldn't see it.

If I'm going to run the tests >50 times, then navigating into the output
panel and scrolling to see the "All tests passed" would slow me down. (Not
much, because my ASSERTs cause a breakpoint when they fail.) Because the
point of TDD is rapidly predicting each test result, I switched from
exceptions to fallibility only to see the "All tests passed" easier.

In that case, I probably would have changed the code that was used for
testing. It sounds like it has some issues that need to be addressed.
However, in an evaluator, the return value is always either legit or wrong,
so exceptions are not necessarily the perfect way to do this. I totally
agree they would have simplified the code.

Yes, that's true, but in the battle concerning error codes versus
exceptions, exceptions seem to have won, especially when extra information
about the error is required. At the very least, they are the traditional
way errors are reported in C++.
One could instrument the Fallible class to throw an exception from its own
faulting constructor, to retrofit exceptions, or toggle them.

True, but the code might still need to be revised if it were required to
throw exceptions of multiple types.

Anyway, I just wanted to understand why you chose an exception-like return
value over a thrown exception. If it's because the test code couldn't
handle exceptions very well, that's fine. I wanted to make sure I wasn't
missing something that was deeper than that. Thanks for the snippets.
 
P

Phlip

David said:
Phlip wrote:

In that case, I probably would have changed the code that was used for
testing. It sounds like it has some issues that need to be addressed.

Install and run it and see if you can.

I want "All test passed" to appear in the output window. This conflicts with
how VC++ reports irrelevant exceptions. The tests themselves are not
implicated.

To see what I mean, add a 'throw' statement to the constructor of
Fallible_double_t. Add a catch inside parse(), and make it return an
(alternately constructed) Fallible_double_t, to satisfy the interface the
tests expect. (The alternative is to change all the tests to catch - they
still are not themselves implicated.)

You will see that the exceptions scroll the "All tests passed" up in the
output window.
Yes, that's true, but in the battle concerning error codes versus
exceptions, exceptions seem to have won, especially when extra information
about the error is required. At the very least, they are the traditional
way errors are reported in C++.

Tough. I already noted one can add them as an option. I'm hardly setting a
"bad example" here.
Anyway, I just wanted to understand why you chose an exception-like return
value over a thrown exception. If it's because the test code couldn't
handle exceptions very well, that's fine. I wanted to make sure I wasn't
missing something that was deeper than that. Thanks for the snippets.

The test code could handle them fine. The ASSERT_BEEF macro formerly
expressed a catch statement, and caught what it expected. The IDE could not
handle exceptions cleanly. One important goal of automated unit tests is
silence - they report pass or fail, without reporting any extraneous crud.
 
D

David Hilsee

Phlip said:
Install and run it and see if you can.

I want "All test passed" to appear in the output window. This conflicts with
how VC++ reports irrelevant exceptions. The tests themselves are not
implicated.

To see what I mean, add a 'throw' statement to the constructor of
Fallible_double_t. Add a catch inside parse(), and make it return an
(alternately constructed) Fallible_double_t, to satisfy the interface the
tests expect. (The alternative is to change all the tests to catch - they
still are not themselves implicated.)

You will see that the exceptions scroll the "All tests passed" up in the
output window.

We must be using different versions of VS.Net, or have different settings
set, or something. I just created a VS.Net 2003 console project, modified
the fallible's constructor to optionally throw based on a new bool
parameter, and modified the code so it passes true for that parameter
everywhere except when the beef is actually returned from the parser. I do
see what you're saying about the "first-chance exception" output, but it
only appeared when I was running it in the debugger, and it didn't really
affect the visibility of "All tests passed!" because the exception messages
were displayed before the "All tests passed!" message was displayed. Is it
necessary to run the test cases using the debugger?

None of this is all that important. I just thought it might be interesting
to see what the same code would look like with exceptions and TDD combined,
because exceptions are the traditional way errors are reported.
 
P

Phlip

David said:
We must be using different versions of VS.Net, or have different settings
set, or something. I just created a VS.Net 2003 console project, modified
the fallible's constructor to optionally throw based on a new bool
parameter, and modified the code so it passes true for that parameter
everywhere except when the beef is actually returned from the parser. I do
see what you're saying about the "first-chance exception" output, but it
only appeared when I was running it in the debugger, and it didn't really
affect the visibility of "All tests passed!" because the exception messages
were displayed before the "All tests passed!" message was displayed.

Outrageous! VS.NET changed the order that OutputDebugString's internal
buffer flushes! In VS6, "All tests passed!" appears, then the silly
exception messages appear _after_ it, and push it up.

(None of this is an argument for or against exceptions themselves...)
Is it
necessary to run the test cases using the debugger?

The example shows Test Driven Development in action. I wrote the parser by
(essentially) writing every assertion you see, from top to bottom, in order,
and getting them to pass. That required me to add an assertion, run, predict
failure, get failure, upgrade the code just a little, run, predict success,
get success (essentially), refactor the code, run, predict success, get
success, and proceed to the next assertion. As the design emerged, some
assertions started passing the first time.

This requires me to do anything, including adjust the style of a "perfect"
program, to get a rapid response on each run.
None of this is all that important. I just thought it might be interesting
to see what the same code would look like with exceptions and TDD combined,
because exceptions are the traditional way errors are reported.

Good show. However, exceptions and TDD are orthogonal. Plenty of TDD
projects use exceptions, and use tests to force them to exist and behave.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,534
Members
45,008
Latest member
Rahul737

Latest Threads

Top