ANN: pyparsing 1.4.11 released

Paul McGuire

I have just uploaded version 1.4.11 of pyparsing to SourceForge. It
has been a pretty full 2 months since the last release, with
contributions from new users, old users, and also some help from the
Google Highly-Open Participation contest. I think there are some very
interesting new features in this release. Please check it out!

(Please note - if you download and install the Windows binary, this
will NOT include the HTML docs or examples directory. To get these,
you will need to download the docs or one of the source
distributions.)

The pyparsing wiki is at http://pyparsing.wikispaces.com.

Here are the notes for 1.4.11:

Version 1.4.11 - February 10, 2008
----------------------------------
- With help from Robert A. Clark, this version of pyparsing
is compatible with Python 3.0a3. Thanks for the help,
Robert!

- Added WordStart and WordEnd positional classes, to support
expressions that must occur at the start or end of a word.
Proposed by piranha on the pyparsing wiki, good idea!

- Added matchOnlyAtCol helper parser action, to simplify
parsing log or data files that have optional fields that are
column dependent. Inspired by a discussion thread with
hubritic on comp.lang.python.

- Added withAttribute.ANY_VALUE as a match-all value when using
withAttribute. Used to ensure that an attribute is present,
without having to match on the actual attribute value.

- Added get() method to ParseResults, similar to dict.get().
Suggested by new pyparsing user, Alejandro Dubrovksy, thanks!

- Added '==' short-cut to see if a given string matches a
pyparsing expression. For instance, you can now write:

integer = Word(nums)
if "123" == integer:
    # do something

print [ x for x in "123 234 asld".split() if x==integer ]
# prints ['123', '234']

- Simplified the use of nestedExpr when using an expression for
the opening or closing delimiters. Now the content expression
will not have to explicitly negate closing delimiters. Found
while working with dfinnie on GHOP Task #277, thanks!

- Fixed bug when defining ignorable expressions that are
later enclosed in a wrapper expression (such as ZeroOrMore,
OneOrMore, etc.) - found while working with Prabhu
Gurumurthy, thanks Prabhu!

- Fixed bug in withAttribute in which keys were automatically
converted to lowercase, making it impossible to match XML
attributes with uppercase characters in them. Using
withAttribute requires that you reference attributes in all
lowercase if parsing HTML, and in correct case when parsing
XML.

- Changed '<<' operator on Forward to return None, since this
is really used as a pseudo-assignment operator, not as a
left-shift operator. By returning None, it is easier to
catch faulty statements such as a << b | c, where precedence
of operations causes the '|' operation to be performed
*after* inserting b into a, so no alternation is actually
implemented. The correct form is a << (b | c). With this
change, an error will be reported instead of silently
clipping the alternative term. (Note: this may break some
existing code, but if it does, the code had a silent bug in
it anyway.) Proposed by wcbarksdale on the pyparsing wiki,
thanks!

- Several unit tests were added to pyparsing's regression
suite, courtesy of the Google Highly-Open Participation
Contest. Thanks to all who administered and took part in
this event!
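
To illustrate the '<<' note above, here is a minimal sketch of why
returning None exposes the precedence mistake. Expr and ToyForward are
hypothetical stand-ins written for this illustration, not pyparsing
classes; the only point is that Python evaluates a << b | c as
(a << b) | c.

```python
class Expr(object):
    """Toy stand-in for a parser expression (hypothetical, not pyparsing)."""

    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        # '|' builds an alternation of two expressions
        return Expr("(%s | %s)" % (self.name, other.name))


class ToyForward(Expr):
    """Toy placeholder whose contents are filled in later with '<<'."""

    def __init__(self):
        Expr.__init__(self, "<unset>")
        self.expr = None

    def __lshift__(self, other):
        self.expr = other
        # Returning None mirrors the behaviour described in the note.
        return None


b, c = Expr("b"), Expr("c")

a = ToyForward()
a << (b | c)                 # correct form: the whole alternation is stored
print(a.expr.name)

a = ToyForward()
try:
    a << b | c               # evaluated as (a << b) | c, i.e. None | c
except TypeError as err:
    print("caught: %s" % err)
print(a.expr.name)           # only b was stored before the error
```

In this sketch the faulty statement fails loudly with a TypeError,
instead of quietly storing b alone and discarding c.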


========================================
Pyparsing is a pure-Python class library for quickly developing
recursive-descent parsers. Parser grammars are assembled directly in
the calling Python code, using classes such as Literal, Word,
OneOrMore, Optional, etc., combined with operators '+', '|', and '^'
for And, MatchFirst, and Or. No separate code-generation or external
files are required. Pyparsing can be used in many cases in place of
regular expressions, with a shorter learning curve and greater
readability and maintainability. Pyparsing comes with a number of
parsing examples, including:
- "Hello, World!" (English, Korean, Greek, and Spanish)
- chemical formulas
- configuration file parser
- web page URL extractor
- 5-function arithmetic expression parser
- subset of CORBA IDL
- chess portable game notation
- simple SQL parser
- search query parser
- EBNF parser/compiler
- Python value string parser (lists, dicts, tuples, with nesting)
(safe alternative to eval)
- HTML tag stripper
- S-expression parser
- macro substitution preprocessor
 

bearophileHUGS

Paul McGuire:
- Added '==' short-cut to see if a given string matches a
pyparsing expression. For instance, you can now write:

integer = Word(nums)
if "123" == integer:
    # do something

print [ x for x in "123 234 asld".split() if x==integer ]
# prints ['123', '234']

Maybe you could use "in" instead of "==". A pyparsing expression
defines an implicit class of acceptable strings, so "in" would test
whether the string belongs to that class, rather than whether it is
equal to the class.

integers = Word(nums)
if "123" in integers:
    # do something

print [x for x in "123 234 asld".split() if x in integers]
# prints ['123', '234']

Bye,
bearophile
 

Paul McGuire

Maybe you could use "in" instead of "==". A pyparsing expression
defines an implicit class of acceptable strings, so "in" would test
whether the string belongs to that class, rather than whether it is
equal to the class.

integers = Word(nums)
if "123" in integers:
    # do something

I understand your interpretation, but in the pyparsing world thus far,
something named 'integers' would be written as

integers = OneOrMore( Word(nums) )

I think your counterpoint is that, once an operator is used to perform
a matching operation, the pyparsing expression is no longer an item
being used in a parser, but an object representing the class of all
possible matching strings, and so 'in' would be more suitable here.

I considered whether overloading the '==' operator might be overdoing
things in the first place. The alternative is to do something like
adding a match method as re's do, as in:

integer.match("123")

and this could return a ParseResults object or None, which would be
suitable for boolean testing.
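
The three spellings under discussion can be compared side by side
with a toy class built on the standard re module. This is only a
sketch: ToyPattern and its methods are hypothetical stand-ins, not
pyparsing code, match() returns an re match object rather than a
ParseResults, and requiring the whole string to match is an assumption
made for the example.

```python
import re


class ToyPattern(object):
    """Hypothetical stand-in for a parser expression; not pyparsing code."""

    def __init__(self, regex):
        # Anchor at the end so the whole string must match (an assumption).
        self._re = re.compile(r"(?:%s)\Z" % regex)

    def match(self, text):
        # Returns a match object or None, so it works in boolean tests.
        return self._re.match(text)

    def __eq__(self, text):
        # The '==' spelling: does this string match the expression?
        return self.match(text) is not None

    def __ne__(self, text):
        return not self == text

    def __contains__(self, text):
        # The 'in' spelling: is this string in the set of matching strings?
        return self.match(text) is not None


integer = ToyPattern(r"[0-9]+")
words = "123 234 asld".split()

by_match = [w for w in words if integer.match(w)]
by_eq = [w for w in words if w == integer]
by_in = [w for w in words if w in integer]
print(by_match == by_eq == by_in)
```

All three forms select the same strings here; the difference is only
in how the test reads at the call site.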

Perhaps I was seduced by too much cleverness to add another operator
for this concept. Perhaps I was taken by a fit of Perlishness. I
really tried to consider what other uses there might be for '==' with
respect to ParserElement objects, and could only think of improbable
contrivances. Ultimately, I took the leap and went with '==' - we'll
see how this plays out among the pyparsers out there.

-- Paul
 
