ANN: pyparsing 1.4.11 released

Paul McGuire · Feb 11, 2008

I have just uploaded version 1.4.11 of pyparsing to SourceForge. It
has been a pretty full 2 months since the last release, with
contributions from new users, old users, and also some help from the
Google Highly-Open Participation contest. I think there are some very
interesting new features in this release. Please check it out!

(Please note - if you download and install the Windows binary, this
will NOT include the HTML doc or examples directory. To get these, you
will need to download the docs or one of the source distributions.)

The pyparsing wiki is at http://pyparsing.wikispaces.com.

Here are the notes for 1.4.11:

Version 1.4.11 - February 10, 2008
----------------------------------
- With help from Robert A. Clark, this version of pyparsing
is compatible with Python 3.0a3. Thanks for the help,
Robert!

- Added WordStart and WordEnd positional classes, to support
expressions that must occur at the start or end of a word.
Proposed by piranha on the pyparsing wiki, good idea!

- Added matchOnlyAtCol helper parser action, to simplify
parsing log or data files that have optional fields that are
column dependent. Inspired by a discussion thread with
hubritic on comp.lang.python.

- Added withAttribute.ANY_VALUE as a match-all value when using
withAttribute. Used to ensure that an attribute is present,
without having to match on the actual attribute value.

- Added get() method to ParseResults, similar to dict.get().
Suggested by new pyparsing user, Alejandro Dubrovksy, thanks!

- Added '==' short-cut to see if a given string matches a
pyparsing expression. For instance, you can now write:

    from pyparsing import Word, nums

    integer = Word(nums)
    if "123" == integer:
        pass  # do something

    print([x for x in "123 234 asld".split() if x == integer])
    # prints ['123', '234']

- Simplified the use of nestedExpr when using an expression for
the opening or closing delimiters. Now the content expression
will not have to explicitly negate closing delimiters. Found
while working with dfinnie on GHOP Task #277, thanks!

- Fixed bug when defining ignorable expressions that are
later enclosed in a wrapper expression (such as ZeroOrMore,
OneOrMore, etc.) - found while working with Prabhu
Gurumurthy, thanks Prabhu!

- Fixed bug in withAttribute in which keys were automatically
converted to lowercase, making it impossible to match XML
attributes with uppercase characters in them. Using withAttribute
requires that you reference attributes in all
lowercase if parsing HTML, and in correct case when parsing
XML.

- Changed '<<' operator on Forward to return None, since this
is really used as a pseudo-assignment operator, not as a
left-shift operator. By returning None, it is easier to
catch faulty statements such as a << b | c, where precedence
of operations causes the '|' operation to be performed
*after* inserting b into a, so no alternation is actually
implemented. The correct form is a << (b | c). With this
change, an error will be reported instead of silently
clipping the alternative term. (Note: this may break some
existing code, but if it does, the code had a silent bug in
it anyway.) Proposed by wcbarksdale on the pyparsing wiki,
thanks!

- Several unit tests were added to pyparsing's regression
suite, courtesy of the Google Highly-Open Participation
Contest. Thanks to all who administered and took part in
this event!
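
The precedence pitfall behind the '<<' change can be sketched without pyparsing at all. The classes below (Expr, ForwardSketch) are hypothetical stand-ins written only for this illustration, not pyparsing's own code; the sketch assumes nothing beyond a '<<' that stores its operand and returns None, as the note describes.

```python
class Expr:
    """Hypothetical stand-in for a parser expression (not a pyparsing class)."""

    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        # '|' builds an alternation of the two operands.
        return Expr("(%s | %s)" % (self.name, other.name))


class ForwardSketch(Expr):
    """Hypothetical placeholder whose '<<' returns None, as in the note above."""

    def __init__(self):
        Expr.__init__(self, "<unset>")
        self.expr = None

    def __lshift__(self, other):
        self.expr = other   # pseudo-assignment
        return None         # nothing useful to chain on


b, c = Expr("b"), Expr("c")

# '<<' binds more tightly than '|', so this is evaluated as (a << b) | c.
a = ForwardSketch()
try:
    a << b | c
    faulty_error = None
except TypeError as exc:
    faulty_error = exc      # None | c is not a valid operation
print(type(faulty_error).__name__, "- a.expr is", a.expr.name)

# The correct form parenthesizes the alternation first.
a = ForwardSketch()
a << (b | c)
print("a.expr is", a.expr.name)
```

In the faulty statement only b ever reaches the placeholder; the TypeError from the leftover None | c is what turns the formerly silent bug into a visible one.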

========================================
Pyparsing is a pure-Python class library for quickly developing
recursive-descent parsers. Parser grammars are assembled directly in
the calling Python code, using classes such as Literal, Word,
OneOrMore, Optional, etc., combined with operators '+', '|', and '^'
for And, MatchFirst, and Or. No separate code-generation or external
files are required. Pyparsing can be used in many cases in place of
regular expressions, with a shorter learning curve and greater
readability and maintainability. Pyparsing comes with a number of
parsing examples, including:
- "Hello, World!" (English, Korean, Greek, and Spanish)
- chemical formulas
- configuration file parser
- web page URL extractor
- 5-function arithmetic expression parser
- subset of CORBA IDL
- chess portable game notation
- simple SQL parser
- search query parser
- EBNF parser/compiler
- Python value string parser (lists, dicts, tuples, with nesting)
(safe alternative to eval)
- HTML tag stripper
- S-expression parser
- macro substitution preprocessor
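
As a rough illustration of assembling a grammar directly in Python code, here is a small sketch in the spirit of the "Hello, World!" example. It assumes pyparsing is installed; the names salutation and greeting are made up for this sketch, and the method names follow the camelCase API of this release.

```python
from pyparsing import Literal, Optional, Word, alphas

# '|' combines alternatives into a MatchFirst.
salutation = Literal("Hello") | Literal("Hi")

# '+' chains expressions into an And; whitespace between tokens is skipped.
greeting = salutation + Optional(Literal(",")) + Word(alphas) + Literal("!")

tokens = greeting.parseString("Hello, World!").asList()
print(tokens)
```

The grammar is ordinary Python objects, so it can be built, combined, and tested like any other code, with no separate generation step.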

bearophileHUGS · Feb 11, 2008

Paul McGuire:

- Added '==' short-cut to see if a given string matches a
pyparsing expression. For instance, you can now write:

    integer = Word(nums)
    if "123" == integer:
        pass  # do something

    print([x for x in "123 234 asld".split() if x == integer])
    # prints ['123', '234']

Maybe you could use "in" instead of "==". A pattern defines an implicit
class of acceptable strings, so "in" asks whether the string belongs to
that class, rather than whether it is equal to the class.

    integers = Word(nums)
    if "123" in integers:
        pass  # do something

    print([x for x in "123 234 asld".split() if x in integers])
    # prints ['123', '234']

Bye,
bearophile

Paul McGuire · Feb 11, 2008

bearophileHUGS:

Maybe you could use "in" instead of "==". A pattern defines an implicit
class of acceptable strings, so "in" asks whether the string belongs to
that class, rather than whether it is equal to the class.

    integers = Word(nums)
    if "123" in integers:
        pass  # do something

I understand your interpretation, but in the pyparsing world thus far,
something named 'integers' would be written as

    integers = OneOrMore( Word(nums) )

I think your counterpoint is that, once an operator is used to perform
a matching operation, the pyparsing expression is no longer an item
being used in a parser, but an object representing the class of all
possible matching strings, and so 'in' would be more suitable here.

I considered whether overloading the '==' operator might be overdoing
things in the first place. The alternative is to do something like
adding a match method as re's do, as in:

    integer.match("123")

and this could return a ParseResults object or None, which would be
suitable for boolean testing.
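
To compare the three spellings side by side, here is a minimal sketch. PatternSketch is a hypothetical class built on the standard re module purely for illustration; it is not how pyparsing implements any of this, and the regex is an assumed equivalent of Word(nums).

```python
import re


class PatternSketch:
    """Hypothetical stand-in for a pyparsing expression, built on re."""

    def __init__(self, regex):
        self._regex = re.compile(regex)

    def match(self, text):
        # Explicit method, in the style of re: a match object or None.
        return self._regex.fullmatch(text)

    def __eq__(self, other):
        # '==' spelling: only strings are compared against the pattern.
        if isinstance(other, str):
            return self.match(other) is not None
        return NotImplemented

    def __contains__(self, text):
        # 'in' spelling: the pattern as the set of all strings it accepts.
        return self.match(text) is not None

    # Defining __eq__ disables the default hash, so restore identity hashing.
    __hash__ = object.__hash__


integer = PatternSketch(r"[0-9]+")
words = "123 234 asld".split()

by_eq = [w for w in words if w == integer]
by_in = [w for w in words if w in integer]
by_match = [w for w in words if integer.match(w)]
print(by_eq, by_in, by_match)
```

All three select the same words; the difference is only in how the test reads at the call site, and in the side effects of overloading '==' (such as having to restore hashing by hand).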

Perhaps I was seduced by too much cleverness to add another operator
for this concept. Perhaps I was taken by a fit of Perlishness. I
really tried to consider what other uses there might be for '==' with
respect to ParserElement objects, and could only think of improbable
contrivances. Ultimately, I took the leap and went with '==' - we'll
see how this plays out among the pyparsers out there.

-- Paul
