state machine in ruby

S

snacktime

So I'm refactoring a very ugly piece of client code that needs to
implement some fairly complicated error correction over a line based
tcp protocol. There are about 10 different error scenarios I have to
detect and respond differently to. Think error correction over serial
lines, that's actually where this was derived from and now it's
layered over a tcp connection. Most of the scenarios involve several
back and forth messages between client and host. Most of the
scenarios are very similar, so if you think of it as a tree I might
not know what scenario I am dealing with until I've gotten 2-3 levels
deep, and at any point in the tree I need to know where I am and what
the possible branches are I can take based on the next response from
the host.

Any suggestions on how to implement this cleanly?

Chris
 
R

Robert Klemme

snacktime said:
So I'm refactoring a very ugly piece of client code that needs to
implement some fairly complicated error correction over a line based
tcp protocol. There are about 10 different error scenarios I have to
detect and respond differently to. Think error correction over serial
lines, that's actually where this was derived from and now it's
layered over a tcp connection. Most of the scenarios involve several
back and forth messages between client and host. Most of the
scenarios are very similar, so if you think of it as a tree I might
not know what scenario I am dealing with until I've gotten 2-3 levels
deep, and at any point in the tree I need to know where I am and what
the possible branches are I can take based on the next response from
the host.

Any suggestions on how to implement this cleanly?

It seems you have answered it in the subject already, don't you? One other
solution came to mind, something like a decision tree when you mentioned
that you need several steps to determine what's going on. A bit like a mini
expert system. Only 0.01 EUR today...

Kind regards

robert
 
F

Francis Cianfrocca

snacktime said:
So I'm refactoring a very ugly piece of client code that needs to
implement some fairly complicated error correction over a line based
tcp protocol. There are about 10 different error scenarios I have to
detect and respond differently to. Think error correction over serial
lines, that's actually where this was derived from and now it's
layered over a tcp connection. Most of the scenarios involve several
back and forth messages between client and host. Most of the
scenarios are very similar, so if you think of it as a tree I might
not know what scenario I am dealing with until I've gotten 2-3 levels
deep, and at any point in the tree I need to know where I am and what
the possible branches are I can take based on the next response from
the host.

Any suggestions on how to implement this cleanly?

Chris

For clues to a possible approach, look at the HTTP client implementation
in EventMachine. Sync to SCM, and it's in version_0/lib/protocols.

You can get as crazy with this kind of thing as you want, but if your
requirements are simple you can get away with a variable that indicates
the current protocol state. Then a simple case statement will constrain
your possible paths. It's quite rare for a network protocol to be more
complicated than that. But if yours is, consider using a LALR(1)
grammar.
 
V

Vidar Hokstad

snacktime said:
So I'm refactoring a very ugly piece of client code that needs to
implement some fairly complicated error correction over a line based
tcp protocol. There are about 10 different error scenarios I have to
detect and respond differently to. Think error correction over serial
lines, that's actually where this was derived from and now it's
layered over a tcp connection. Most of the scenarios involve several
back and forth messages between client and host. Most of the
scenarios are very similar, so if you think of it as a tree I might
not know what scenario I am dealing with until I've gotten 2-3 levels
deep, and at any point in the tree I need to know where I am and what
the possible branches are I can take based on the next response from
the host.

Any suggestions on how to implement this cleanly?

A state machine is fundamentally a set of states with associated
actions (if any) and transitions. A transition consists of a criteria
and a state to transition to. How you model that depends on how much
flexibility you need. If the state machine is fairly fixed you could
just as well implement it as a class. Something like this for example:

class StateMachine
def initialize
@state = :some_state # Starting state of the machine


end

def some_state
# Carry out actions for this state here.


result = :foo # Just a dummy result

# Transitions:


case result
when :foo : @state = :some_other_state
# more transitions ...


else @state = :stop
end
end

def some_other_state
# Action goes here
# Transition
@state = :stop
end

# Doesn't have to do this, but it might be
# nice to have a simple way of passing data
# between the state machine and a block.
# Perhaps use the block to do any IO etc.
# so the state machine doesn't need to know
# anything about it's environment
def each
while @state != :stop
send @state
yield self,@state
end
end
end

StateMachine.new.each { |sm,state| p state }

If you need something more dynamic, it would be easy enough to store
the transitions in a Hash of hash'es { :state_1 => {
:transition_criteria_1 => :state_2, :transition_criteria_2 => :state_3
} }.

Vidar
 
A

ara.t.howard

sorry not to include orig message - the text was totally fubar and hosed my
mailer...

anyone.

this is incomplete, but i made a little finite state machine lib and dsl. it
even includes tools to vizualize if you have dot installed. it's here

http://rubyforge.org/projects/codeforpeople/
http://codeforpeople.org/lib/ruby/

an example


harp:~ > cat a.rb

require 'fsm'

FSM.system do

fsm{
states %w( empty full )
transition 'filling', 'empty' => 'full'
transition 'emptying', 'full' => 'empty'
}

observer{
on_entry 'empty' do
puts 'became empty.'

transition 'empty', 'filling' do
puts 'filling...'
sleep 1
end
end

on_entry 'full' do
puts 'became full.'

transition 'full', 'emptying' do
puts 'emptying...'
sleep 1
end
end
}

start
end

STDIN.gets


harp:~ > ruby a.rb
became empty.
filling...
became full.
emptying...
became empty.
filling...
became full.
emptying...
became empty.
filling...


sorry, no docs yet ;-)


-a
 
F

Francis Cianfrocca

harp:~ > cat a.rb

require 'fsm'

FSM.system do

fsm{
states %w( empty full )
transition 'filling', 'empty' => 'full'
transition 'emptying', 'full' => 'empty'
}

observer{
on_entry 'empty' do
puts 'became empty.'

transition 'empty', 'filling' do
puts 'filling...'
sleep 1
end
end

on_entry 'full' do
puts 'became full.'

transition 'full', 'emptying' do
puts 'emptying...'
sleep 1
end
end
}

start
end

STDIN.gets


Reading this, I kept trying to figure out how this is different from
plain old yacc, then I realized you have no lookahead capability. What
kind of situations have you used this in? I don't understand Chris'
example enough yet to decide if it constitutes a regular language or
something more.

Mechanically dealing with comms protocols is a pretty interesting
subject, although in general the more recent ones aren't terribly
complicated. (The practical subset of HTTP isn't bad; SIP isn't bad;
SMTP is a bear; XML is ambiguous in at two places I know of.) I often
end up just writing recursive-descent parsers that expose their
internal state and use a pushback buffer so they can be restarted.
 
A

ara.t.howard

Reading this, I kept trying to figure out how this is different from plain
old yacc, then I realized you have no lookahead capability. What kind of
situations have you used this in?

reprocessing 30 satellite years of data from a tape archive... there are
several steps that are async and, though it may not be obvious from the above
examples, the fsm class puts every thing in Threads and communcates via
Queues...

so, basically, it makes it safe for multiple threads to be doing work and
directiing their actions via a shared state machine....


-a
 
S

snacktime

Ya I hesitated to post that as it was pretty hard to read. You can
actually read the spec at http://www.fdms.com/specs. It's the Visa
protocol. Not that you would want to, but that's an easier way to see
it if you do.

What's really scary though is my original implementation of this from
9 years ago in perl that uses modems.
 
M

Martin DeMello

So I'm refactoring a very ugly piece of client code that needs to
implement some fairly complicated error correction over a line based
tcp protocol.

Give smc.sourceforge.net a look

martin
 
A

ara.t.howard

Ugh. This is what sucks about hanging out with all you smart people, you
guys have got me thinking about this now. Ara, you're talking about a FSM
processor here,

yes. it's state machine __system__: the machine and the system to process it
in an async way...
but have you thought about augmenting it to handle context-free grammars?

nope. but i'd take any and all patches! ;-)
Ideally I'd like to be able to define protocols textually, in BNF-like or
yacc-like form.

seen this

http://raa.ruby-lang.org/project/fsmgen/

i haven't used it - looked interesting...
And given that text, mechanically generate a state machine that would be
restartable so it could be event-driven. And the scanner would be pretty
funky too, it would have to be able to generate a token that means "not
enough input to scan a complete token, go to sleep now."

naw, that's easy. just do

consumer.q.push token_bits
consumer.q.push token_bits
consumer.q.push token_bits
consumer.q.push token_done

and the consumer is automatically sleeping as tokens arrive partially.
I've just been in the middle of implementing AMQP the old-fashioned way (by
hand) so this is timely.

what's AMQP ??

-a
 
A

ara.t.howard

You'd have to add a parse stack and a lookahead buffer. Might not be that
easy, but I'll take a look if I get a chance.

k. go easy on me. it's version 0.0.0 ;-)
You solved that?? Nice.

well - the event part:

harp:~ > cat a.rb
require "time"
require "fsm"

STDOUT.sync = true
Thread.abort_on_exception = true

fsys = FSM.system{
fsm{
states %w( a b c )
transition "a2b", "a" => "b"
transition "b2c", "b" => "c"
}
#start # leave starting state and enter state "a"
}

Thread.new{
sleep 1
fsys.start # enters state 'a'
sleep 1
fsys.fsm.transition "a", "a2b"
sleep 1
fsys.fsm.transition "b", "b2c"
}

Thread.new{
puts "waiting for state a @ #{ Time.now.iso8601(6) }..."
fsys.observer.wait_for "a"
puts "state a waited for."

puts "waiting for state b @ #{ Time.now.iso8601(6) }..."
fsys.observer.wait_for "b"
puts "state b waited for."

puts "waiting for state c @ #{ Time.now.iso8601(6) }..."
fsys.observer.wait_for "c"
puts "state c waited for."
}.join


harp:~ > ruby a.rb
waiting for state a @ 2006-09-09T14:03:42.527586-06:00...
state a waited for.
waiting for state b @ 2006-09-09T14:03:43.537855-06:00...
state b waited for.
waiting for state c @ 2006-09-09T14:03:44.547454-06:00...
state c waited for.


there a lots of other event hooks too. wait_for_once. etc. regarding the
special token, that's easy - there are also hooks for input into the system:

harp:~ > cat a.rb
require "time"
require "fsm"

STDOUT.sync = true
Thread.abort_on_exception = true
MSGS = Queue.new and Thread.new{ loop{ puts MSGS.pop } }
def t() Time.now.iso8601(6) end
def p(msg) MSGS.push msg end


fsys = FSM.system{
fsm{
states %w( a b c )
transition "a2b", "a" => "b"
transition "b2c", "b" => "c"
transition "c2a", "c" => "a"
}

observer{
on_entry do |info|
p "entry <#{ info.inspect }> @ #{ t }"
end
on_input do |*argv|
p "input <#{ argv.inspect }> @ #{ t }"
end
}
start
}


Thread.new{
1.times do
sleep 1
fsys.input 'A'
fsys.transition "a", "a2b"
sleep 1

sleep 1
fsys.input 'B'
fsys.transition "b", "b2c"
sleep 1

sleep 1
fsys.input 'C'
fsys.transition "c", "c2a"
sleep 1
end
}.join



jib:~ > ruby a.rb
entry <[FSM::Event::Entry, "a", nil]> @ 2006-09-09T14:23:43.335322-06:00
input <[[FSM::Event::Input, "a", "A"]]> @ 2006-09-09T14:23:44.338837-06:00
entry <[FSM::Event::Entry, "b", nil]> @ 2006-09-09T14:23:44.340079-06:00
input <[[FSM::Event::Input, "b", "B"]]> @ 2006-09-09T14:23:46.358668-06:00
entry <[FSM::Event::Entry, "c", nil]> @ 2006-09-09T14:23:46.359932-06:00
input <[[FSM::Event::Input, "c", "C"]]> @ 2006-09-09T14:23:48.378684-06:00
entry <[FSM::Event::Entry, "a", nil]> @ 2006-09-09T14:23:48.380190-06:00


so, for the special token, you simply remain in the current state - the
observer is already sleeping ;-)

is this what you are thinking of?

Advanced Message Queueing Protocol. Very interesting if you're into
message-queueing, it's the evolution of the "amq" that J.P.Morgan has been
working on for a couple of years and that you've heard periodic rumors about
during that time. It's different because it specifies not only the wire
protocol (which itself is state-of-the-art and worth a look) but also the
broker architecture. Needless to say, all the current work has been done in
Java, so they need an agile implementation. :)

learn something every day!

cheers.

-a
 
S

snacktime

Chris, unless I'm missing something, this seems like a very simple protocol.
I'm sure you left out some stuff like the end-state signalling, the goodbye
kisses, etc. But there aren't very many state transitions here. At first
glance it looks like a nonpipelined request/response protocol like SMTP (but
far simpler than SMTP). The only confusing thing here is the 60-second
timeout. If the server doesn't respond to a request for 60 seconds, is the
client supposed to assume that the server will *never* send a response? Or
is there some correlation built into the requests and responses so the
client can later recognize a delayed or out-of-order response?

The difficult part is the error correction, at least for me not having
worked with state machines that much. There are several points in the
protocol where an error can occur, and you have to switch states,
handle the error, and then possibly jump back to where you left off if
the error is resolved. Some of the trees go 5-8 levels deep, and most
of them have their own timers which can also decide which state you
transition to next. For the timers, some of them you just disconnect
after a certain time, and some of them are the time to wait before
changing state. For instance if you get an ACK within 3 seconds go to
state A, otherwise state B.
 
J

John Carter

First off I would just like to note a small mathematical congruence....

Every state machine can be replaced by a Thread. ie. Instead of a tangly
state machine, you have a thread sucking on an event queue.

Sometimes I find that really cleans things up.

Now add Ruby Exception classes and rescues into the mix and the picture
can start to get quite rosy.

Another wild off the wall trick I have used with great success several
times in my life....

Translate the events into char's.

Then an event stream is a string.

Then a Regexp is a very very powerful, terse and expressive way of
recognizing exceedingly complex event streams.


John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

"We have more to fear from
The Bungling of the Incompetent
Than from the Machinations of the Wicked." (source unknown)
 
F

Francis Cianfrocca

John said:
Every state machine can be replaced by a Thread.

Eliminating threads is part of the point here. They make certain
operations easier to model but they have all their own baggage,
including scalability problems. I'm not planning to participate if my
comment ignites the one-millionth flame war on this subject, but I will
note that there is a balance to be struck. And threads generally lose
attractiveness as the performance requirements of an application go up.
Then a Regexp is a very very powerful, terse and expressive way of
recognizing exceedingly complex event streams.

A Regexp is only appropriate for an event stream that can be modeled
with a DFA. That does happen to be true of most comm protocols, but
there are exceptions. Some streams require CFGs.
 
J

John Carter

Let me add my 2c. Every program is a state machine. The difference between
an ordinary program and what we might call a "state machine" is one of
explicit expression.
Correct.

Also, the use of threads make the state of the state machine harder to
determine. By using threads, the thread scheduler's quirks, and every other
thread being run, becomes part of the behavior of a particular thread's
state machine. One might go so far as to say that threads that show any
interaction at all prevent a state machine from being deterministic.

Yes and no.

If you only ever communicate event's with between thread's via a SizedQueue's of
length 1, you have the same "interlocking cog" determinancy as an
unthreaded state machine only model.

The difference is you may ...
* Program each Thread/state machine in a "simple forward manner" same as any program
* You can, in appropriate cases, get substantial performance gains by increasing
the Queue sizes.

Djikstra's "Goto Consider Harmful" is even more applicable to state
machines where _every_ transition is a "goto"!

Where there is more than one state machine, your main program becomes a
poorly designed inefficient home grown scheduler.

The fact is state machines suffer the same defects as Threads, ie. races
and deadlocks, it's just the "interlocking gears" nature tends to make
them trigger or not trigger the defects the same way on every run.
(However, change the program and that is not true.)

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

"We have more to fear from
The Bungling of the Incompetent
Than from the Machinations of the Wicked." (source unknown)
 
F

Francis Cianfrocca

If you only ever communicate event's with between thread's via a SizedQueue's of
length 1, you have the same "interlocking cog" determinancy as an
unthreaded state machine only model.

Been there, done that. It's a well-studied (and often very effective)
programming model to pipe different functional "stacks" inside a
process together with intra-process event queues. Unfortunately it
still leaves you with performance problems unless you really keep the
number of threads down. Such programs tend to scale only moderately
well, in my experience (which is extensive). Especially in the case of
Ruby (which has significant problems handling large per-process
working sets), I think it's more potentially interesting to build
large systems by piping *processes* (not threads) together with
queues.

Where there is more than one state machine, your main program becomes a
poorly designed inefficient home grown scheduler.
Maybe *your* main program does :). There is a technique and a body of
experience to this kind of programming, just as there is with threads.
Threads are superficially more attractive to most programmers, perhaps
for the reason you give (that you can program each thread in a "simple
forward manner"). This "simplicity" is often deceptive. At the end of
the day, I think we should be able to agree that high-performance
concurrent programming is just hard to do well, regardless of the
chosen technique.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,430
Messages
2,571,676
Members
48,796
Latest member
Greg L.

Latest Threads

Top