rex: how to use the lexer class?

fdelente

Hello.

I'm trying to get rex to parse my inputs. After reading some of the sample
files provided with rex, I created this simple(?) file:

file: test.rex------------------------------------------------------------

# -*- ruby -*-
##########################################################################

class Lexer
macro
BLANKS \s+
DIGITS \d+
LETTERS [a-zA-Z]+
rule
{BLANKS}
{LETTERS} { puts "ID: '@{text}'"; [ :ID, text ] }
{DIGITS} { puts "NUMBER: '@{text}'"; [ :NUMBER, text.to_f ] }
.|\n { puts "text: '@{text}'"; [ text, text ] }
inner
end

##########################################################################
lexer=Lexer.new
while 1
str=$stdin.gets.strip
puts "str=@{str}"
lexer.scan_str(str)
puts "--------------------------------------------------------------------------"
end

end of file: test.rex-----------------------------------------------------

After 'rex test.rex', and 'ruby -Ku test.rex.rb', I always get errors like

test.rex.rb:60:in `scan_evaluate': can not match: '2' (Lexer::ScanError)

when I type input.

Can anybody tell me why? Thanks.
 
nicholasmabry

Hey Fabrice,

The problem you're seeing is due to rex's assumption that you are
generating a parser in tandem with your lexer. The generated method
Lexer::scan_str looks like this:

def scan_str( str )
scan_evaluate str
do_parse
end

While scan_evaluate(str) is the method generated by your token
definitions, do_parse() depends on a racc grammar having been defined
and initialized. The bad news is that the default scan_str() won't
work for your purposes. The good news is that scan_evaluate() will. If
you examine your generated test.rex.rb file, you'll see that
scan_evaluate() identifies your tokens and pushes them one by one into
a queue named @rex_tokens. To pull them out of the queue, simply call
next_token(). Here's a quick replacement for the bottom of your token
definition file:

lexer=Lexer.new
while 1
str=$stdin.gets.strip
puts "str=#{str}"

# Here we're scanning the string for tokens
lexer.scan_evaluate(str)

# And then printing each one out to stdout
while token = lexer.next_token
p token
end
puts "--------------------------------------------------------------------------"
end
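
For anyone without rex at hand, the scan-then-drain pattern described above can be imitated with Ruby's standard StringScanner. This is only a rough sketch, not the code rex generates: the class name SketchLexer and its internals are made up for illustration, and only the method names scan_evaluate and next_token and the token rules are borrowed from the thread.

```ruby
require 'strscan'

# Hypothetical stand-in for a rex-generated lexer (NOT rex's real output).
class SketchLexer
  def initialize
    @tokens = []   # plays the role of the token queue
  end

  # Tokenize the whole string up front and queue every token found.
  def scan_evaluate(str)
    ss = StringScanner.new(str)
    until ss.eos?
      if ss.skip(/\s+/)
        # blanks produce no token
      elsif (text = ss.scan(/[a-zA-Z]+/))
        @tokens.push [:ID, text]
      elsif (text = ss.scan(/\d+/))
        @tokens.push [:NUMBER, text.to_f]
      else
        # catch-all, mirrors the ".|\n" rule: any single character
        text = ss.scan(/.|\n/)
        @tokens.push [text, text]
      end
    end
    self
  end

  # Hand back queued tokens one at a time; nil once the queue is empty.
  def next_token
    @tokens.shift
  end
end

lexer = SketchLexer.new
lexer.scan_evaluate("abc 42 +")
while (token = lexer.next_token)
  p token
end
```

Because next_token returns nil when nothing is left, the while loop ends on its own, which is the same reason the loop in the replacement code above terminates.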

The only other minor change I made was to "@{str}". Ruby's string
interpolation syntax is actually "#{ }"; "@{ }" is printed literally.
Let us know if you have more questions. Happy lexing!
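
A quick way to see the difference between the two spellings, as a small sketch (the variable names here are only for illustration):

```ruby
str = "2 apples"

interpolated = "str=#{str}"   # #{ } evaluates the expression inside the string
literal      = "str=@{str}"   # @{ } has no special meaning in a double-quoted string

puts interpolated   # str=2 apples
puts literal        # str=@{str}
```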

-Nick
 
