vikkous
I would like to announce the first version, 0.4.0, of Reg, the Ruby
Extended Grammar. Reg is a library for pattern matching in ruby data
structures. Reg provides Regexp-like match and match-and-replace for
all data structures (particularly Arrays, Objects, and Hashes), not
just Strings.
The Reg RubyForge project: http://rubyforge.org/projects/reg/
The Reg Tarball:
http://rubyforge.org/frs/download.php/4199/reg-0.4.0.tar.bz2
Reg is best thought of in analogy to regular expressions; Regexps are
special data structures for matching Strings; Regs are special data
structures for matching ANY type of ruby data (Strings included, using
Regexps).
This table compares the syntax of Reg and Regexp for various constructs.
Keep in mind that all Regs are ordinary Ruby expressions. The special
syntax is achieved by overriding Ruby operators.
These abbreviations are used:
re,re1,re2 represent arbitrary regexp subexpressions,
r,r1,r2 represent arbitrary reg subexpressions
s,t represent any single character (perhaps appropriately escaped, if
the char is magical)
reg               regexp                #description
+[r1,r2,r3]       /re1re2re3/           #sequence
-[r1,r2]          (re1re2)              #subsequence
r.lit             \re                   #escaping a magical
regproc{r}        #{re}                 #dynamic inclusion
r1|r2 or :OR      (re1|re2) or [st]     #alternation
~r                [^s]                  #negation (for scalar r and s)
r+0               re*                   #zero or more matches
r+1               re+                   #one or more matches
r-1               re?                   #zero or one matches
r*n               re{n}                 #exactly n matches
r*(n..m)          re{n,m}               #at least n, at most m matches
r+n               re{n,}                #at least n matches
r-m               re{,m}                #at most m matches
OB                .                     #a single item
OBS               .*                    #zero or more items
BR[1,2]           \1,\2                 #backreference ***
r>>x or sub       sub,gsub              #search and replace ***
Here are features of Reg that don't have an equivalent in Regexp:
r.la                  #lookahead ***
~-[]                  #subsequence negation w/lookahead ***
& or :AND             #all alternatives match
^ or :XOR             #exactly one of alternatives matches
+{r1=>r2}             #hash matcher
-{name=>r}            #object matcher
obj.reg               #turn any ruby object into a reg that matches if
                      #  obj.=== succeeds
/re/.sym              #a symbol regex
proceq(klass){rcode}  #a proc{} that responds to === by invoking the
                      #  proc's call
OBS as un-anchor      #opposite of ^ and $ when placed at edges of a
                      #  reg array (kinda cheesy)
name=r                #named subexpressions
recursive matches via regvariables&regconstants ***

*** = not implemented yet.
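
Several of the rows above (obj.reg, proceq, and Regexps used directly
inside a Reg) lean on Ruby's case-equality operator ===. As a quick
reminder of what === does for a few built-in classes, here is a
plain-Ruby sketch that does not use Reg at all. It assumes a current
Ruby interpreter, where Proc#=== is built in; that is an assumption
about the interpreter, not something Reg provides.

```ruby
# Plain Ruby, no Reg required: === is the "does this match?" operator
# that case/when uses, and each class defines it differently.
checks = {
  "Class#=== tests membership"   => (Integer === 42),
  "Regexp#=== tests for a match" => (/baz/ === "foobazbar"),
  "Range#=== tests inclusion"    => ((5..9) === 7),
  "Proc#=== calls the proc"      => (proc { |x| x.even? } === 4),
  "Integer#=== compares values"  => (6 === 6),
}

# Print each label with the result of its check.
checks.each { |label, result| puts "#{label}: #{result}" }
```

Every check above evaluates to true, which is why a class, a Regexp, or
a literal value can each serve as a scalar pattern.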
Reg is kind of hard to wrap your mind around, so here are some
examples:
Matches array containing exactly 2 elements; 1st is another array, 2nd
is integer:
+[Array,Integer]
Like above, but 1st is an array of arrays of symbols:
+[+[+[Symbol.reg+0]+0],Integer]
Matches array of at least 3 consecutive symbols and nothing else:
+[Symbol.reg+3]
Matches array with at least 3 symbols in it somewhere:
+[OBS, Symbol.reg+3, OBS]
Matches array of at most 6 strings starting with 'g':
+[/^g/-6] #no .reg necessary for regexp
Matches array of between 5 and 9 hashes containing a key :k pointing to
something non-nil:
+[ +{:k=>~nil.reg}*(5..9) ]
Matches an object with Integer instance variable @k and property (ie
method) foobar that returns a string with 'baz' somewhere in it:
-{:@k=>Integer, :foobar=>/baz/}
Matches array of 6 hashes with 6 as a value of at least one key,
followed by 18 objects with an attribute @s which is a String:
+[ +{OB=>6}*6, -{:@s=>String}*18 ]
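
To make the "ordinary ruby expressions" point concrete, here is a toy
sketch of how a unary + on an array literal can produce a matcher
object. This is not Reg's implementation: TinySeq and the Array#+@
definition below are made up for illustration, and the sketch handles
only fixed-length sequences of scalar patterns compared with ===.

```ruby
# Toy illustration only: TinySeq and Array#+@ are hypothetical names
# for this sketch and are not Reg's classes or implementation.
class TinySeq
  def initialize(patterns)
    @patterns = patterns
  end

  # An array matches when it has the same length as the pattern list
  # and every element satisfies pattern === element at its position.
  def ===(other)
    other.is_a?(Array) &&
      other.size == @patterns.size &&
      @patterns.zip(other).all? { |pat, item| pat === item }
  end
end

class Array
  # Unary plus on an array, so +[a, b] is an ordinary Ruby expression
  # that returns a matcher object.
  def +@
    TinySeq.new(self)
  end
end

pair = +[Array, Integer]
puts(pair === [[:a, :b], 3])    # true
puts(pair === [[:a, :b], "3"])  # false, 2nd element is not an Integer
puts(pair === [[], 3, 4])       # false, wrong length

# Matchers nest, because TinySeq itself responds to ===.
nested = +[+[Symbol, Symbol], Integer]
puts(nested === [[:a, :b], 3])  # true
```

Repetition, hashes, and object matchers would need considerably more
machinery than this; the sketch shows only the operator-overriding
trick.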
Status:
Some highly nested vector reg constructions still don't work quite
right. (For examples, search on eat_unworking in regtest.rb.) A number
of features are unimplemented at this point, most notably
backreferences and substitutions.