pyparsing

Boštjan Jerko · May 13, 2004

Hello !

I am trying to understand pyparsing. Here is a little test program
to check the Optional class:

from pyparsing import Word,nums,Literal,Optional

lbrack=Literal("[").suppress()
rbrack=Literal("]").suppress()
ddot=Literal(":").suppress()
start = Word(nums+".")
step = Word(nums+".")
end = Word(nums+".")

sequence=lbrack+start+Optional(ddot+step)+ddot+end+rbrack

tokens = sequence.parseString("[0:0.1:1]")
print tokens

tokens1 = sequence.parseString("[1:2]")
print tokens1

It works for the first string, but an error is raised for
the second string ("[1:2]"). I don't get it. I wrapped ddot and step
in Optional, so I expected them to be optional.

Any hints as to what I am doing wrong?

The versions are pyparsing 1.1.2 and Python 2.3.3.

Thanks,

B.

Daniel 'Dang' Griffith · May 13, 2004

Nothing looks "obviously" wrong to me, but changing it as follows
seems to resolve the problem (I added a few intermediate rules to
make it more obvious):

pref = lbrack + start
midf = ddot + step
suff = ddot + end + rbrack
sequence = pref + midf + suff | pref + suff

I've run into "this kind of thing" now and again, and have always
been able to resolve it by reorganizing my rules.

--dang
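
For reference, the reorganized rules above can be put together into a complete script along the following lines. This is only a sketch: it reuses the definitions from the original post, assumes a pyparsing release that still accepts the camelCase parseString name used in this thread, and the variable names with_step and without_step are made up for illustration.

```python
from pyparsing import Word, nums, Literal

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()
ddot = Literal(":").suppress()
start = Word(nums + ".")
step = Word(nums + ".")
end = Word(nums + ".")

pref = lbrack + start
midf = ddot + step
suff = ddot + end + rbrack

# '+' binds tighter than '|', so this reads (pref + midf + suff) | (pref + suff).
# If the three-part form fails, the whole two-part form is retried from the start.
sequence = pref + midf + suff | pref + suff

with_step = sequence.parseString("[0:0.1:1]").asList()
without_step = sequence.parseString("[1:2]").asList()
print(with_step)
print(without_step)
```

The difference from the Optional version is that each alternative covers the whole bracketed expression, so a failure late in the first alternative sends the parser back to the opening bracket rather than leaving it stuck after the optional part.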

Paul McGuire · May 13, 2004

Bostjan -

Here's how pyparsing is processing your input strings:

[0:0.1:1]
[ = lbrack
0 = start
:0.1 = ddot + step (Optional match)
: = ddot
1 = end
] = rbrack

[1:2]
[ = lbrack
1 = start
:2 = ddot + step (Optional match)
] = oops! expected ddot -> failure
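
The two traces above can be reproduced with a toy model written in plain Python. This is not pyparsing's implementation; lit, number, seq, optional, NoMatch and try_parse are hypothetical helpers invented here, and the only behavior they are meant to illustrate is that an optional element which matches is never revisited when a later element fails.

```python
class NoMatch(Exception):
    """Raised when a toy parser cannot match at the given position."""


def lit(ch):
    # Match a literal string and suppress it from the results.
    def parse(text, pos):
        if text.startswith(ch, pos):
            return pos + len(ch), []
        raise NoMatch("expected %r at position %d" % (ch, pos))
    return parse


def number():
    # Match a run of digits and dots, similar in spirit to Word(nums + ".").
    def parse(text, pos):
        stop = pos
        while stop < len(text) and text[stop] in "0123456789.":
            stop += 1
        if stop == pos:
            raise NoMatch("expected number at position %d" % pos)
        return stop, [text[pos:stop]]
    return parse


def seq(*parsers):
    # Match each parser in turn; any failure fails the whole sequence.
    def parse(text, pos):
        tokens = []
        for p in parsers:
            pos, found = p(text, pos)
            tokens.extend(found)
        return pos, tokens
    return parse


def optional(parser):
    # Take the match if there is one; the decision is never reconsidered.
    def parse(text, pos):
        try:
            return parser(text, pos)
        except NoMatch:
            return pos, []
    return parse


ddot = lit(":")
sequence = seq(lit("["), number(), optional(seq(ddot, number())),
               ddot, number(), lit("]"))


def try_parse(text):
    # Return the token list, or a short failure message.
    try:
        return sequence(text, 0)[1]
    except NoMatch as err:
        return "failed: %s" % err


print(try_parse("[0:0.1:1]"))
print(try_parse("[1:2]"))
```

For "[1:2]" the optional part consumes ":2", so the mandatory ddot that follows finds "]" and the sequence fails, which matches the second trace.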

Dang Griffith proposed one alternative construct, here's another, perhaps
more explicit:
lbrack + ( ( ddot + step + ddot + end ) | (ddot + end) ) + rbrack

Note that the order of the inner construct is important, so as to not match
ddot+end before trying ddot+step+ddot+end; '|' is a greedy matching
operator, creating a MatchFirst object from pyparsing's class library. You
could avoid this confusion by using '^', which generates an Or object:
lbrack + ( (ddot + end) ^ ( ddot + step + ddot + end ) ) + rbrack
This will evaluate both subconstructs, and choose the longer of the two.

Or you can use another pyparsing helper, the delimited list
lbrack + delimitedlist( Word(nums+"."), delim=":") + rbrack
This implicitly suppresses delimiters, so that all you will get back are
["0","0.1","1"] in the first case and ["1","2"] in the second.

Happy pyparsing!
-- Paul

Paul McGuire · May 14, 2004

Dang Griffith proposed one alternative construct, here's another, perhaps
more explicit:
lbrack + ( ( ddot + step + ddot + end ) | (ddot + end) ) + rbrack

should be:
lbrack + start + ( ( ddot + step + ddot + end ) | (ddot + end) ) + rbrack

Note that the order of the inner construct is important, so as to not match
ddot+end before trying ddot+step+ddot+end; '|' is a greedy matching
operator, creating a MatchFirst object from pyparsing's class library. You
could avoid this confusion by using '^', which generates an Or object:
lbrack + ( (ddot + end) ^ ( ddot + step + ddot + end ) ) + rbrack

should be:
lbrack + start + ( (ddot + end) ^ ( ddot + step + ddot + end ) ) + rbrack

This will evaluate both subconstructs, and choose the longer of the two.

Or you can use another pyparsing helper, the delimited list
lbrack + delimitedlist( Word(nums+"."), delim=":") + rbrack

at least this one is correct! No, wait, I mis-cased delimitedList!
should be:
lbrack + delimitedList( Word(nums+"."), delim=":") + rbrack

This implicitly suppresses delimiters, so that all you will get back are
["0","0.1","1"] in the first case and ["1","2"] in the second.

Happy pyparsing!
-- Paul

Sorry for the sloppiness,
-- Paul
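
The three corrected constructs can be compared side by side with a short script like the one below. It is a sketch under a few assumptions: it uses the camelCase names from this thread (parseString, delimitedList), which later pyparsing releases are understood to keep as compatibility aliases, and the names first_match, longest, as_list and results are invented for illustration.

```python
from pyparsing import Word, nums, Literal, delimitedList

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()
ddot = Literal(":").suppress()
start = Word(nums + ".")
step = Word(nums + ".")
end = Word(nums + ".")

# '|' builds a MatchFirst: alternatives are tried in order, so the longer
# form has to be listed first.
first_match = lbrack + start + ((ddot + step + ddot + end) | (ddot + end)) + rbrack

# '^' builds an Or: every alternative is tried and the longest match wins,
# so the order does not matter.
longest = lbrack + start + ((ddot + end) ^ (ddot + step + ddot + end)) + rbrack

# delimitedList suppresses the ":" delimiters by default.
as_list = lbrack + delimitedList(Word(nums + "."), delim=":") + rbrack

results = {}
for name, grammar in [("first_match", first_match),
                      ("longest", longest),
                      ("as_list", as_list)]:
    results[name] = [grammar.parseString(text).asList()
                     for text in ("[0:0.1:1]", "[1:2]")]
    print(name, results[name])
```

One design difference worth keeping in mind: the delimitedList version accepts any number of colon-separated values (one, two, three or more), while the other two accept exactly two or three.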

Boštjan Jerko · May 14, 2004

Paul,

thanks for the explanation.

Boštjan

