Treetop problem

Marco Guiseppe

Hello everyone,

I've run into a problem with Treetop:

http://pastie.caboo.se/147490

It seems to appear when the first character of the string parsed is a
valid match for both rules, and Treetop is unable to resolve the
ambiguity. One of the tests will always fail, depending on which comes
first in the 'test' rule.

Any ideas about what's going on here?

Thanks,

MG
 

Clifford Heath

Marco said:
It seems to appear when the first character of the string parsed is a
valid match for both rules, and Treetop is unable to resolve the
ambiguity.

Packrat parsers are greedy. When a rule succeeds, or when a repeated
element (*, +) repeats too many times, or when an optional rule is
taken, that decision will never be reversed - it's been memoized.
You must prevent it happening (perhaps by using lookahead) in the
first place.
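As a rough illustration of that "never reversed" behaviour, here is a small sketch in plain Ruby regexps rather than Treetop. It assumes that possessive quantifiers (`*+`) and atomic groups (`(?>...)`) are a fair stand-in for PEG repetition and ordered choice; the constant names and the `matches?` helper are made up for this example.

```ruby
# Ordinary regex star: gives back one digit so the trailing [0-9] can match.
BACKTRACKING_STAR = /\A[0-9]*[0-9]\z/
# Possessive star: keeps every digit it consumed, like a PEG repetition.
POSSESSIVE_STAR = /\A[0-9]*+[0-9]\z/

# Ordinary alternation: falls back to "ab" when "a" leads to a dead end.
BACKTRACKING_CHOICE = /\A(?:a|ab)c\z/
# Atomic group: commits to the first alternative that succeeds,
# like a PEG ordered choice.
ATOMIC_CHOICE = /\A(?>a|ab)c\z/

# Hypothetical helper: true if the pattern matches the input.
def matches?(pattern, input)
  !pattern.match(input).nil?
end

puts matches?(BACKTRACKING_STAR, "123")    # true
puts matches?(POSSESSIVE_STAR, "123")      # false
puts matches?(BACKTRACKING_CHOICE, "abc")  # true
puts matches?(ATOMIC_CHOICE, "abc")        # false
```

The two committed patterns fail on input that the backtracking ones accept, which is the same kind of surprise a packrat grammar can produce.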

In this case, you can do that by writing:

rule string
  [A-z] [A-z0-9]*
end

BTW, did you realize that there are non-alphabetic characters in the
range A-z? Did you mean to say [A-Za-z]?
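A quick way to see which characters those are, sketched in plain Ruby (the constant name is just for illustration):

```ruby
# Every ASCII character from "A" to "z", by character code.
A_TO_Z = ("A".ord.."z".ord).map(&:chr)

# The ones that are not letters: these sit between "Z" and "a" in ASCII.
NON_LETTERS_IN_A_TO_Z = A_TO_Z.reject { |ch| ch =~ /[A-Za-z]/ }

puts NON_LETTERS_IN_A_TO_Z.join(" ")  # prints: [ \ ] ^ _ `
```

So a class written as [A-z] also accepts brackets, backslash, caret, underscore and backtick.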

Clifford Heath.
 

Marco Guiseppe

Clifford said:
You must prevent it happening (perhaps by using lookahead) in the
first place.

In this case, you can do that by writing:

Thanks for your reply - I had a sneaking suspicion this was a problem
with the type of parser, not Treetop specifically. Your suggestion won't
solve my problem though, as I need to match strings where the first
character is a digit.

Thanks for pointing out the non-alphabetical chars in A-z, didn't know
I'd missed that.

MG
 

Day

[Note: parts of this message were removed to make it a legal post.]

Marco said:
Your suggestion won't solve my problem though, as I need to match strings
where the first character is a digit.


You'll need to define your String rule, then, to fail on things that are
100% digits. I've never worked with Treetop, but you want something that
matches /^\D$/ as well as [A-Za-z0-9]+. Someone with more regexp experience,
check me, but is my thinking clear, anyway? Headed in the right direction?


Ben
 

Clifford Heath

Day said:
you want something that matches /^\D$/ as well as [A-Za-z0-9]+.

There's no \D in Treetop character classes, unlike Ruby regexps.

Day said:
but is my thinking clear, anyway?

Your thinking is clear, but Marco's isn't.

Marco, think about what you're asking. You want to match a
string of digits as a string, but also as a number. How is
Treetop supposed to know which? Is it that with a number,
the following character is not alphanumeric? If so, you must
tell Treetop that:

rule number
  [0-9.]+ ![A-Za-z0-9.]
end
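
To see the effect of that lookahead without Treetop, here is a sketch in plain Ruby regexps. It assumes the number rule is tried before the string rule, and that the string rule is Day's [A-Za-z0-9]+; the original grammar isn't shown in this thread, so both are guesses. NUMBER, STRING and classify are names invented for the example.

```ruby
# Digits and dots, kept possessively (++), and only accepted when the next
# character is not alphanumeric or a dot. (?!...) is negative lookahead,
# the regexp counterpart of the grammar's "!".
NUMBER = /\A[0-9.]++(?![A-Za-z0-9.])/

# Assumed string rule: one or more alphanumerics.
STRING = /\A[A-Za-z0-9]+/

# Try the alternatives in order, the way an ordered choice would.
def classify(input)
  return :number if NUMBER.match(input)
  return :string if STRING.match(input)
  nil
end

puts classify("123").inspect     # :number
puts classify("1.5").inspect     # :number
puts classify("123abc").inspect  # :string
puts classify("abc").inspect     # :string
```

Because the number alternative now fails on "123abc" instead of succeeding on the leading digits, the string alternative gets its chance.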

Clifford Heath.
 
