Mode Oriented Lexical Analyser Generator

  • Thread starter Frank-René Schäfer

Frank-René Schäfer

OpenSource Project 'Quex': http://quex.sf.net

Last weekend, the lexical analyser generator 'Quex' was released
on SourceForge. Quex provides advanced features for mode definitions
and event handling. Among its features are:

-- Creation of a complete C++ environment for lexical analyser
   engines.

-- Mode management:
   -- mode transitions can be restricted (e.g. "disallow mode A
      to enter any mode but F and G")
   -- mode transitions can be equipped with event handlers
   -- modes can inherit from each other
      (thus taking over pattern-action pairs and event handlers)

-- The indentation event, which facilitates lexical analysis for
   indentation-based languages.
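
To make the mode management features concrete, here is a minimal sketch in Python of restricted mode transitions with entry and exit handlers. It is not Quex syntax and not how Quex is implemented; the class `ModeMachine`, its parameters, and the handler signatures are hypothetical names chosen for illustration.

```python
class ModeTransitionError(Exception):
    """Raised when a mode tries to enter a mode it is not allowed to enter."""


class ModeMachine:
    # Hypothetical helper, not part of Quex.
    def __init__(self, start, allowed, on_entry=None, on_exit=None):
        self.current = start
        self.allowed = allowed          # mode name -> set of modes it may enter
        self.on_entry = on_entry or {}  # mode name -> handler(from_mode)
        self.on_exit = on_exit or {}    # mode name -> handler(to_mode)

    def enter(self, target):
        # Reject transitions that the current mode does not permit.
        if target not in self.allowed.get(self.current, set()):
            raise ModeTransitionError(f"{self.current} may not enter {target}")
        source = self.current
        # Fire the exit handler of the old mode, then the entry handler
        # of the new mode.
        if source in self.on_exit:
            self.on_exit[source](target)
        self.current = target
        if target in self.on_entry:
            self.on_entry[target](source)
```

With `allowed = {"A": {"F", "G"}}`, a call to `enter("B")` from mode A raises `ModeTransitionError`, which corresponds to "disallow mode A to enter any mode but F and G".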

For simple lexical analysers, Quex provides convenient shorthands. The
mode definition of a lexical analyser for a language consisting of
'print', an identifier, a number, and an assignment can be written in a
few lines as follows:

mode SOME_MODE {
    "print" => TKN_PRINT;
    [_a-z]+ => TKN_IDENTIFIER(Lexeme);
    [0-9]+  => TKN_NUMBER(atoi(Lexeme));
    "="     => TKN_ASSIGNMENT;
}
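
For readers unfamiliar with pattern-action pairs, the following Python sketch shows roughly what such a mode does at run time. It assumes the usual flex-style convention (longest match wins, the earlier rule wins on a tie) and it skips whitespace, which the mode above does not mention; both are assumptions, and the code is not what Quex generates.

```python
import re

# Pattern-action pairs in the order of the mode definition above.
# Token names mirror the post; the Python structure is illustrative only.
RULES = [
    (re.compile(r"print"), lambda lexeme: ("TKN_PRINT", None)),
    (re.compile(r"[_a-z]+"), lambda lexeme: ("TKN_IDENTIFIER", lexeme)),
    (re.compile(r"[0-9]+"), lambda lexeme: ("TKN_NUMBER", int(lexeme))),
    (re.compile(r"="), lambda lexeme: ("TKN_ASSIGNMENT", None)),
]


def tokenize(text):
    """Return (token_id, value) pairs; longest match, earliest rule on ties."""
    pos = 0
    tokens = []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # assumption: whitespace separates tokens and is dropped
            continue
        best_len, best_action = 0, None
        for pattern, action in RULES:
            m = pattern.match(text, pos)
            # Strictly longer matches replace the candidate, so an earlier
            # rule keeps priority when two rules match the same length.
            if m and len(m.group()) > best_len:
                best_len, best_action = len(m.group()), action
        if best_action is None:
            raise ValueError(f"no pattern matches at position {pos}")
        tokens.append(best_action(text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

Under these assumptions, `print` on its own becomes `TKN_PRINT`, while `printer` becomes an identifier because the identifier pattern matches more characters.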
 

Ivan Vecerina

: OpenSource Project 'Quex': http://quex.sf.net
:
: Last weekend, the lexical analyser generator 'Quex' was released
: on SourceForge. Quex provides advanced features for mode definitions
: and event handling.

How does Quex compare to existing and widely used
solutions such as flex or boost::spirit?
[ I use flex, and it does support modes.
Indentation is no big deal to support with a single pattern
like <line_start>[\t ]* --> check the length of the match ]
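
A minimal sketch of that single-pattern approach, written in Python rather than flex: match the leading whitespace of each line, compare its length against a stack of open indentation levels, and emit indent or dedent events. The function name `indentation_events` and the event tuples are hypothetical, and a tab is counted as one column, which is a simplification.

```python
import re

LINE_START = re.compile(r"[\t ]*")  # the pattern from the post


def indentation_events(source):
    """Return a list of ("INDENT" | "DEDENT" | "LINE", payload) tuples."""
    stack = [0]  # currently open indentation widths
    events = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines do not affect indentation
        width = len(LINE_START.match(line).group())
        if width > stack[-1]:
            stack.append(width)
            events.append(("INDENT", None))
        else:
            # Close every block that is indented deeper than this line.
            while width < stack[-1]:
                stack.pop()
                events.append(("DEDENT", None))
            if width != stack[-1]:
                raise ValueError(f"inconsistent dedent: {line!r}")
        events.append(("LINE", line.strip()))
    # Close blocks that are still open at end of input.
    while len(stack) > 1:
        stack.pop()
        events.append(("DEDENT", None))
    return events
```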



Regards,
Ivan
 

Frank-René Schäfer

Quex provides nice handling of lexer modes. In flex, modes can only be
inclusive or exclusive. Quex provides 'real' inheritance
relationships. For example, you can have a mode COMMON that detects
comments and EndOfFile and does what always has to be done in such
cases. Different modes such as 'ALGORITHM' or 'FORMAT_STRING' can then
derive from this mode, thus inheriting this behavior and *ensuring*
that these modes also behave according to COMMON. This reduces code
duplication and makes the functionality very transparent.
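
The idea of mode inheritance can be sketched in Python as follows. This only illustrates the concept; it is not Quex syntax or its implementation. The `Mode` class, the patterns, and action names such as `skip_comment` are hypothetical, and giving a derived mode's own rules priority over inherited ones is an assumption.

```python
class Mode:
    """A lexer mode: its own pattern-action pairs plus those of its bases."""

    def __init__(self, name, rules, bases=()):
        self.name = name
        self.own_rules = list(rules)
        self.bases = tuple(bases)

    def rules(self):
        # Own rules come first, inherited rules follow (assumed priority).
        collected = list(self.own_rules)
        for base in self.bases:
            for rule in base.rules():
                if rule not in collected:
                    collected.append(rule)
        return collected


# COMMON handles what every mode has to handle: comments and end of file.
COMMON = Mode("COMMON", [(r"//[^\n]*", "skip_comment"),
                         (r"\Z", "TKN_END_OF_FILE")])

# Derived modes add their own patterns and inherit the rest from COMMON.
ALGORITHM = Mode("ALGORITHM", [(r"[0-9]+", "TKN_NUMBER")], bases=[COMMON])
FORMAT_STRING = Mode("FORMAT_STRING", [(r"%[a-z]", "TKN_FORMAT")],
                     bases=[COMMON])
```

Because the comment and end-of-file rules live only in COMMON, a change there reaches both derived modes without touching their definitions.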

Also, Quex provides the ability to define event handlers for mode
transitions and the indentation event. For more features, see the
documentation, which can be downloaded from http://quex.sf.net.

Quex still uses flex to produce the 'core' engine. This has the
advantage that intermediate files can be used as a basis to work with
flex, if your boss wants to rely on traditional tools. There is some
work in progress, though, to develop a new engine generator which is
better suited for Unicode handling.

Best Regards

Frank.
 
