[ANN] RedParse 0.8.4 Released

Caleb Clausen

RedParse version 0.8.4 has been released!

* http://github.com/coatl/redparse

RedParse is a ruby parser written in pure ruby. Instead of YACC or
ANTLR, its parse tool is a home-brewed "compiler-interpreter". (The
tool is LALR(1)-equivalent and the 'parse language' is pretty nice,
even in its current crude form.)

My intent is to have a completely correct parser for ruby, in 100%
ruby. Currently, RedParse can parse all known ruby 1.8 constructions
correctly. There might be some problems with unparsing or otherwise
working with texts in a character set other than ascii. Some of the
new ruby 1.9 constructions are supported in 1.9 mode. For more
details on known problems, see below.

== Installation:
Type this to install the gem:
gem install redparse
Or, download the tarball from rubyforge:
http://rubyforge.org/frs/download.php/68399/redparse-0.8.4.tar.gz

== Benefits:

* Pure ruby, through and through. No part is written in C, YACC,
ANTLR, lisp, assembly, intercal, befunge or any other language
except ruby.
* Pretty AST trees (at least, I think so).
* AST trees closely mirror the actual structure of source code.
* unparser is built in
* ParseTree format output too, if you want that.
* Did I mention that there's no YACC at all? YACC grammars are
notoriously difficult to modify (I've never successfully done it),
but I've found it easy, at times even pleasant, to modify the parse
rules of this grammar as necessary.
* Relatively small parser: 70 rules in 240 lines
(vs, by my count, 320 rules in 2200 lines for MRI 1.8.7. This is
by no means a fair comparison, tho, since RubyLexer does a lot
more than MRI's lexer, and MRI's 2200 lines include its
actions (which occupy somewhere under 3100 lines in RedParse).
Also, what is a rule? I counted most things which required a
separate action in MRI's parser; I'm not sure if that's fair.
On the other hand, RedParse rules require no separate actions
anywhere. In the end, I still think RedParse is much easier to
understand than MRI's parse.y.)
* "loosey-goosey" parser happily parses many expressions which normal
ruby considers errors.

== Drawbacks:

* Pathetically, ridiculously slow (ok, compiler-compilers are hard...)
* Error handling is very minimal right now.
* No warnings at all.
* Unit test takes a fairly long time.
* Lots of warnings printed during unit test.
* Debugging parse rules is not straightforward.
* Incomplete support for ruby 1.9.
* "loosey-goosey" parser happily parses many expressions which normal
ruby considers errors.

== Known problems with the parser:
* Encoding of the input is not stored anywhere in resulting parse tree.
* Ascii, binary, utf-8, and euc encodings are supported, but sjis is not.

== Known problems with the unparser:
* On unparse, here documents are converted into regular strings. For the most
part, these are exactly equivalent to the original. However, whatever tokens
appeared between the here document header and body will now show up on a
different line. If one of those tokens was __LINE__, it will have a
different value in the unparsed code than it had originally.
* some floating-point literals don't survive parse/unparse roundtrip intact,
due to bugs in MRI 1.8's Float#to_s/String#to_f.
* unparsing of trees whose input was in a character set other than ascii may
not work.
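
The here document and floating-point problems above can be reproduced in plain ruby, without RedParse itself. The sketch below is only an illustration under stated assumptions: the sample source, the file name sample.rb, and the exact text of the rewritten string are made up for the example and are not actual unparser output, and the "%.15g" format is assumed as a stand-in for the limited precision of MRI 1.8's Float#to_s.

```ruby
# Hypothetical input: __LINE__ sits between the here document header and body.
original = "[<<EOS, __LINE__]\ntext\nEOS\n"

# Assumed shape of the unparsed code: the here document has become a regular
# string containing a literal newline, so __LINE__ now falls on line 2.
rewritten = "[\"text\n\", __LINE__]\n"

# Evaluate both versions as if they were the first line of sample.rb.
heredoc_result = eval(original, binding, "sample.rb", 1)
string_result  = eval(rewritten, binding, "sample.rb", 1)

puts heredoc_result.inspect
puts string_result.inspect

# Float round trip: printing too few digits loses information.
value      = 0.1 + 0.2
short_text = format("%.15g", value)  # assumed stand-in for 1.8's Float#to_s
long_text  = format("%.17g", value)  # 17 significant digits identify a double

puts short_text.to_f == value
puts long_text.to_f == value
```

The string values of the two evaluated arrays are identical; only the __LINE__ element differs. Likewise, the float only survives the text round trip when enough digits are printed.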

== Known problems with ParseTree creator:
* Major:
  * converting non-ascii encoded parsetrees to ParseTree format doesn't work
* Minor:
  * :begin is not always emitted in the same places as ParseTree does:
    * return begin; f; end
  * string nodes don't always come out the same way as in ParseTree...
    but what I emit is equivalent.
    * %W"is #{"Slim #{2?"W":"S"}"}#{xx}."
  * silly empty case nodes aren't always optimized to nop like in ParseTree.
== Changes:

### 0.8.4 / 21dec2009
* 5 Major Enhancements:
* OpNode and related modules are now classes
* parse results are now cached -> substantial speedup on reparse
* moderate performance improvements for regular parser too
* inspect now dumps node trees in more readable tree-like output
* tests now ignore (with a warning) differences in just a :begin node

* 18 Minor Enhancements:
* single code path utility now converts bare => in calls and between [ and ]
* reworked the way ternary rescue is parsed
* new build script & gemspec
* better way to deal with default of :rubyversion parser option
* various fixes to xform_tree! rewriter utility (still doesn't work, tho)
* improvements to constructors to make creating nodes by hand more pleasant
* parser now creates nodes via Node.create
* use AssignmentRhsListStart/EndToken to delimit right hand sides
* lhs* and rhs* should be considered unary ops again
* when parens in assign lhs, treat unary* and single arg like no parens
* VarNode#ident is now kept in a slot, not an ivar
* force body of a block to always be a SequenceNode
* added RedParse::Nodes; include it to get all the redparse node classes
* have each node class remember a list of its slot names
* added aliases and accessors in various nodes to make the api nicer
* moved some utilities into the support libraries where they belong
* slight improvements to parser compiler
* added a version of depthwalk which just visits the Nodes of the tree

* 18 Bugfixes:
* parser now runs under MRI 1.9
* (more?) accurate version of Float#to_s, for 1.8
* minor tweaks to #unparse
* value of () is nil, not false
* get redparse/version.rb relative to current directory from gemspec
* when comparing trees, more insignificant differences are ignored
* Node#deep_copy makes more faithful copies now
* node marshalling should be more reliable
* tweaks to parse_tree support to improve conformance
* support automagicness of integer&regexp in flipflop (in parse_tree output)
* parse_tree's placement of :begin nodes is somewhat better emulated
* always put parse inputs into binary mode
* changed some operators (lhs, rescue3 unary* rhs*) to proper precedence
* numeric literals inserted directly in parsetrees should be autoquoted
* ensure @lhs_parens set in AssignNode when it should be
* make sure ListInNode is extended into arrays added to Nodes via writers
* permit empty symbol LiteralNode to be made
* fixed bad permissions in gem file

* 9 Changes To Tests:
* test Node trees surviving Marshal/Ron round-trip and deep_copy unscathed
* tests for many of the new 1.9 syntax constructions
* parse_tree server process now started in a more portable way
* lots of new test cases
* rp-locatetest now has docs on how to use it
* keep track of problematic files if even the slightest problem occurs
* enable/disable fuzzing with ENV var rather than comments
* make sure inputs are unchanged by parse
* better organized some of the known failing testcases
 
