Grammars (mini-scripting languages)


Michael Judge

I'm working on implementing mini-scripting languages for two different
projects, so I'm building a framework that could handle the task
generically.

Does this seem like a good way to approach it?
1. Store each command's matching regular expression and Ruby code
   in the database (sample fixture below).
2. For each line in the script:
     Test the line against each command's regular expression.
     If it matches, execute the command's Ruby code using
     instance_eval.

My thoughts are:
1. Storing executable code in the database is a security problem.
2. instance_eval is slow.
3. The alternative (a big if/elsif tree) would span many pages and
   be unwieldy.

Have a better suggestion?


Sample code:
def compile
  syntax.each do |line|
    command = commands.find { |c| c.match? line }
    raise "Command not found that can process '#{line}'" if command.nil?
    instance_eval command.ruby
  end
end


Sample commands fixture:

label:
  id: 1
  name: label
  regexp: ^Q (.*)$
  ruby: puts "#{$1}\n"

single-punch:
  id: 2
  name: single-punch
  regexp: ^X-(\d+) (.*)$
  ruby: puts " o #{$2}\n"

multiple-punch:
  id: 3
  name: multiple-punch
  regexp: ^M-(\d+) (.*)$
  ruby: puts " [ ] #{$2}\n"

blank-line:
  id: 4
  name: blank-line
  regexp: ^\s*$
  # Quoted so that YAML does not read the value as a comment (and load nil).
  ruby: "# Do nothing"
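
One way to soften the cost of evaluating a string for every line is to compile each stored snippet once into a lambda and call that lambda per line. The following is only a sketch under assumptions: `CompiledCommand`, `build_command`, `COMMANDS` and `run_script` are hypothetical names, an in-memory array stands in for the database, and the snippets receive the MatchData as `m` and return a string instead of relying on `$1` and `puts`. It does nothing for the security concern, since the stored source is still evaluated.

```ruby
# Hypothetical in-memory stand-in for a row of the commands table.
CompiledCommand = Struct.new(:name, :regexp, :handler)

# Parse the stored Ruby source once, wrapping it in a lambda that receives
# the MatchData as `m`. The source is still trusted code and is evaluated.
def build_command(name, pattern, source)
  handler = eval("lambda { |m| #{source} }")
  CompiledCommand.new(name, Regexp.new(pattern), handler)
end

COMMANDS = [
  build_command("label",          '^Q (.*)$',       '"#{m[1]}"'),
  build_command("single-punch",   '^X-(\d+) (.*)$', '" o #{m[2]}"'),
  build_command("multiple-punch", '^M-(\d+) (.*)$', '" [ ] #{m[2]}"'),
  build_command("blank-line",     '^\s*$',          'nil')
]

# Run every line through the first command whose pattern matches and
# collect the non-nil results.
def run_script(script, commands = COMMANDS)
  output = []
  script.each_line do |raw|
    line = raw.chomp
    match = nil
    command = commands.find { |c| match = c.regexp.match(line) }
    raise "Command not found that can process '#{line}'" if command.nil?
    result = command.handler.call(match)
    output << result unless result.nil?
  end
  output
end
```

Passing the MatchData explicitly avoids depending on where `$1` happens to be visible, because the match variables are local to the method that performed the match.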
 

Ross Bamford

Michael said:
Does this seem like a good way to approach it?
[snip]
Have a better suggestion?

Not necessarily better, but how about something like:

class Commands
  def Q(args)
    puts args
  end

  def X(args)
    if args =~ /^(\d+) (.*)$/
      puts " o #{$2}"
    end
  end

  def M(args)
    if args =~ /^(\d+) (.*)$/
      puts " [ ] #{$2}"
    end
  end

  # Pick the handler from the leading letter and pass it the rest of the line.
  def dispatch(line)
    if line =~ /([QXM])-?(.*)/
      send($1.intern, $2)
    else
      raise "Invalid input: #{line}"
    end
  end
end

s = <<EOS
Q Just a label
X-23 Single punched
M-11 Multi punched
J-12 Bad input
EOS

cmds = Commands.new
# String#each was removed in Ruby 1.9; each_line works on old and new versions.
s.each_line { |c| cmds.dispatch(c) }

(Obviously I guessed a bit with the input format).
Output:

Just a label
o Single punched
[ ] Multi punched
-:22:in `dispatch': Invalid input: J-12 Bad input (RuntimeError)
from -:35
from -:35
 

Michael Judge

Ross said:
Not necessarily better, but how about something like:

class Commands
  [snip]
  def dispatch(line)
    if line =~ /([QXM])-?(.*)/
      send($1.intern, $2)
    else
      raise "Invalid input: #{line}"
    end
  end
end

That's neat, Ross. I wasn't familiar with the send command. Looks like
the consequence is that the grammar has to fit an easy regular
expression or you'd be duplicating it in the match and again in the
method definition... Well, that's not necessarily a bad thing.
Consistency is good too.

I need to think about this.

What do programmers normally do when they have a case statement that's
30 or more items long? Previously I've just left it as a case statement
and spent the life of the project ticked at it.
 

Ross Bamford

Michael said:
That's neat, Ross. I wasn't familiar with the send command. Looks like
the consequence is that the grammar has to fit an easy regular
expression or you'd be duplicating it in the match and again in the
method definition... Well, that's not necessarily a bad thing.
Consistency is good too.

Agreed, but it did bug me a bit, too :) Depending on the actual input
format you could optimise that away though I think, e.g.

class Commands
  def M(args)
    # Notice that the regexp here is now responsible
    # for handling the '-' after the initial letter.
    if args =~ /^-(\d+) (.*)$/
      puts " [ ] #{$2}"
    end
  end

  def dispatch(line)
    begin
      # Work on a copy so slice! does not mutate the caller's string
      # and the error message can still show the whole line.
      rest = line.dup
      send(rest.slice!(0, 1), rest)
    rescue NoMethodError
      raise "Invalid input: #{line}"
    end
  end
end

That way you're doing only one match per dispatch, and validating
implicitly (Ruby will raise NoMethodError if the command is bad). Since
we're forcing only a single letter, it shouldn't be possible for people
to input e.g. 'exit-666 1' or something to breach security.
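
The single-letter restriction narrows what send can reach, but send also calls private methods, so any one-character method the object happens to have (Kernel#p, for instance) can still be dispatched. A hedged sketch of an explicit allow-list follows; `SafeCommands` and `ALLOWED` are hypothetical names, and the command bodies return strings rather than printing so the result is easy to check.

```ruby
class SafeCommands
  # Only these one-letter commands may be dispatched (assumed command set).
  ALLOWED = %w[Q X M].freeze

  def dispatch(line)
    name = line[0, 1]
    raise ArgumentError, "Invalid input: #{line}" unless ALLOWED.include?(name)
    # Pass everything after the command letter, minus the trailing newline.
    send(name, line[1..-1].to_s.chomp)
  end

  private

  def Q(args)
    args.strip
  end

  def X(args)
    " o #{$2}" if args =~ /^-(\d+) (.*)$/
  end

  def M(args)
    " [ ] #{$2}" if args =~ /^-(\d+) (.*)$/
  end
end

cmds = SafeCommands.new
puts cmds.dispatch("X-23 Single punched")
```

With the check in place, input such as "p anything" is rejected before send is ever called.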

A win with this approach I think is that it keeps everything where it
should be, i.e. the commands themselves are responsible for processing
their arguments, however they see fit. Also, you can easily add new
commands at runtime, simply by defining a new method. There's no
'command registry' anywhere.
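
Adding a command at runtime could look like the sketch below. It uses define_method, which is only one of several ways to define a method on the fly; `RuntimeCommands` and the `B` command are hypothetical, invented for the illustration. The command methods are private, so they are reachable only through dispatch.

```ruby
class RuntimeCommands
  def dispatch(line)
    send(line[0, 1], line[1..-1].to_s.chomp)
  rescue NoMethodError
    # Note: a NoMethodError raised inside a command body lands here too.
    raise "Invalid input: #{line}"
  end

  private

  def Q(args)
    args.strip
  end
end

cmds = RuntimeCommands.new

# Later on, define a new one-letter command; there is no registry to update.
RuntimeCommands.class_eval do
  define_method(:B) { |args| "* #{args.strip}" }
  private :B
end

puts cmds.dispatch("B A bullet point")
```

The instance created before the new method was defined picks it up as well, because method lookup happens at call time.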

One other change I'd make to my previous post would be to make the
command methods private.

Michael said:
What do programmers normally do when they have a case statement that's
30 or more items long? Previously I've just left it as a case statement
and spent the life of the project ticked at it.

Ordinarily I think I'd consider it a code smell (or maybe a "design
smell"?). Maybe I'd fix it, maybe not, but like you I'd at least _want_
to :).
 

Alec Ross

Michael said:
What do programmers normally do when they have a case statement that's
30 or more items long? Previously I've just left it as a case statement
and spent the life of the project ticked at it.

My inclination is generally to drive the execution by table lookup.
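
A table-driven version of the punch-card commands from earlier in the thread might look like this sketch. `COMMAND_TABLE` and `process` are hypothetical names, and the handlers return strings rather than printing; the patterns are taken from the sample fixture.

```ruby
# Each entry pairs a pattern with the code that handles a matching line.
COMMAND_TABLE = [
  [/\AQ (.*)\z/,       ->(m) { m[1] }],
  [/\AX-(\d+) (.*)\z/, ->(m) { " o #{m[2]}" }],
  [/\AM-(\d+) (.*)\z/, ->(m) { " [ ] #{m[2]}" }],
  [/\A\s*\z/,          ->(_m) { nil }]
].freeze

# Look the line up in the table and run the first handler that matches.
def process(line)
  text = line.chomp
  COMMAND_TABLE.each do |pattern, handler|
    match = pattern.match(text)
    return handler.call(match) if match
  end
  raise ArgumentError, "Invalid input: #{text}"
end

puts process("M-11 Multi punched")
```

Adding a command then means adding one row, and the lookup loop never grows, which is the main appeal over a 30-branch case statement.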
 

Harley Pebley

Michael said:
What do programmers normally do when they have a case statement that's
30 or more items long? Previously I've just left it as a case statement
and spent the life of the project ticked at it.

I don't let them into the code in the first place. A situation that would
need a case/switch more than about 5 +/- 2 items long gets redesigned
during initial implementation.

When I take over a code-base that has something like that in it, it gets
redesigned the first time I have to touch that case statement.

Don't live with broken windows.

Regards,
Harley Pebley
 