mark.green
Hi folks,
I've been trying to write a PLY parser and have run into a bit of
bother.
At the moment, I have a RESERVEDWORD token which matches all reserved
words and then alters the token type to match the reserved word that
was detected. I also have an IDENTIFIER token which matches
identifiers that are not reserved words.
The problem is, if I put RESERVEDWORD before IDENTIFIER, then
identifiers that happen to begin with reserved words are wrongly lexed
as the reserved word followed by an identifier. For example, because
"if" is a RESERVEDWORD, the string "ifollowyou" is wrongly lexed as the
RESERVEDWORD "if" followed by IDENTIFIER "ollowyou", rather than just
as the IDENTIFIER "ifollowyou".
If I put IDENTIFIER first, though, every single reserved word in the
input is lexed as an IDENTIFIER.
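
To make the two orderings concrete, here is a minimal sketch that imitates the behaviour using Python's re module rather than PLY itself. It assumes the rules are joined into one alternation and tried in the order given, so the first alternative that matches wins. The rule patterns, the reserved word list and the lex helper are all invented for illustration and are not the real grammar; the step that rewrites the token type to the specific reserved word is left out.

```python
import re

# Hypothetical rules, for illustration only.
RESERVEDWORD = r"(?P<RESERVEDWORD>if|else|while)"
IDENTIFIER = r"(?P<IDENTIFIER>[A-Za-z_][A-Za-z0-9_]*)"

def lex(text, rules):
    """Tokenize text, trying the rules in the order given (first match wins)."""
    master = re.compile("|".join(rules))
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between tokens
            continue
        m = master.match(text, pos)
        if m is None:
            raise ValueError(f"illegal character {text[pos]!r} at position {pos}")
        # lastgroup is the name of the rule that matched
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

# RESERVEDWORD first: "ifollowyou" is split into "if" + "ollowyou".
reserved_first = lex("if ifollowyou", [RESERVEDWORD, IDENTIFIER])

# IDENTIFIER first: the genuine "if" is swallowed as an IDENTIFIER.
identifier_first = lex("if ifollowyou", [IDENTIFIER, RESERVEDWORD])

print(reserved_first)
print(identifier_first)
```

Neither ordering gives the wanted result, which would be RESERVEDWORD "if" followed by IDENTIFIER "ifollowyou".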
Is there any way I can tell PLY that it should only return a
RESERVEDWORD in the correct circumstances? If PLY can't do this, can
any of the other Python parser generators? (It seems that Lex can.)
Thanks!