Roedy Green
Consider a simple finite state automaton to parse property files.
They look like this:
# a comment
keyword=value
I want to categorise each fragment of text as either comment, keyword
or value. Now throw in a complication. Inside any of those three
things might be literals of the form \uffff.
I find myself creating all kinds of rinky-dink mechanisms to handle
the literals. I wondered if there is a clean way to do it.
There are two problems.
1) It is clumsy to invent three literal states, one for in comment, one
for in keyword and one for in value, just so the automaton can remember
what it was doing. Yet the whole idea of a finite state automaton is that
the memory of the system is supposed to be encapsulated in the state.
2) You leave the literal state based on a count, not the presence of
some delimiter. I could create 5 states to mark progress down the
literal, but this seems a bit nuts.
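
To make the two problems concrete, here is a minimal sketch of one possible
compromise, not a definitive answer. It assumes it is acceptable to keep two
small pieces of memory outside the state proper: the category to resume after
the literal, and a count of characters still to consume. The names
PropertiesScanner, Category, categorise, resume and remaining are all made up
for illustration. The sketch also assumes that '#' starts a comment only at
the start of a line, that the first '=' ends the keyword, and that the four
characters after the u are not validated as hex digits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: tags every character of a properties file with a category. */
class PropertiesScanner {

    /** The three categories from the post, plus LITERAL (backslash-u escape) and SEPARATOR. */
    public enum Category { COMMENT, KEYWORD, VALUE, LITERAL, SEPARATOR }

    /** Returns one Category per input character, in order. */
    public static List<Category> categorise(String text) {
        List<Category> out = new ArrayList<>();
        Category state = Category.KEYWORD;   // each line is a keyword until proven otherwise
        Category resume = Category.KEYWORD;  // extra memory: what we were doing before the literal
        int remaining = 0;                   // extra memory: literal characters still to consume
        boolean lineStart = true;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);

            if (state == Category.LITERAL) {
                // Leave the literal on a count, not on a delimiter.
                out.add(Category.LITERAL);
                if (--remaining == 0) {
                    state = resume;
                }
                continue;
            }
            if (c == '\n') {
                // End of line: the next line starts fresh as a keyword.
                out.add(Category.SEPARATOR);
                state = Category.KEYWORD;
                lineStart = true;
                continue;
            }
            if (lineStart && c == '#') {
                state = Category.COMMENT;
            }
            lineStart = false;

            if (c == '\\' && i + 1 < text.length() && text.charAt(i + 1) == 'u') {
                // Enter the single shared literal state, remembering where to return.
                resume = state;
                state = Category.LITERAL;
                remaining = 5;               // the 'u' plus four digits after the backslash
                out.add(Category.LITERAL);
                continue;
            }
            if (state == Category.KEYWORD && c == '=') {
                out.add(Category.SEPARATOR);
                state = Category.VALUE;
                continue;
            }
            out.add(state);
        }
        return out;
    }
}
```

Strictly speaking this is no longer a pure finite state automaton, since
resume and remaining are memory held outside the state. Folding them back in
gives the 3 x 5 = 15 literal states complained about above, so the sketch only
trades purity for brevity.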