learning vhdl - state machines

Al Muliman · Feb 12, 2009

Trying to find an accepted way to construct state machines in vhdl, I
came upon this:

http://web.engr.oregonstate.edu/~traylor/ece474/new_vhdl_pdfs/state_machines_in_vhdl.pdf

This is just explanation that I found - I picked this one because it
seems rather nice.

One key point seems to be to avoid latches.

The accepted way to construct state machines would be to have a
synchronous part that turns the next state into the current and a
combinatorial part that determines the next state.

My worry is that the combinatorial part selects a new state with an
encoding that is "more than one bit away" from that of the current state
(or any other state that the combinatorial part might otherwise have
selected) just as the synchronous part flipsflops. In my mind the
current state might become a mix of the two different states that the
combinatorial part had to choose from. Is this worry unfounded?

I am a total fpga noob with a software background mind you.

eliascm · Feb 12, 2009

State Machines

What should happen in your state machine, as in any synchronous design, is the the combinational changes that start on a clock edge will be stable by the time the next clock edge occurs. The only way this wouldn't happen is that the clock period is so short that the combinational logic changes don't become stable between clock edges. If you set your timing contraints correctly, the FPGA synthesis software will indicate whether or not you will have a timing problem.

Al Muliman · Feb 12, 2009

Jonathan Bromley skrev:

I don't have time to get involved in this, but I'm sure you will
find people telling you that the style recommended in that
document is not in wide use - rolling a combinational and
a clocked process into one VHDL (lexical) process is not
generally accepted style, although the majority of synthesis
tools now accept it.

So... you would separate the clocked and combinational parts of the
process into a process each and have the two processes share the state
variables like I see in some of the other examples? I thought it seemed
handy to just put the two parts in the same process, but I am definately
trying to learn what is the most accepted practice.

Yes. In VHDL, FFs clocked by the same clock are guaranteed to
update in a coherent manner. Furthermore, timing analysis
within the synthesis and implementation tools will ensure
that this guarantee holds good in the real hardware too.
If, on reading the post-implementation timing report, you
see warnings about "hold time violations" then you know
that these synchronous assumptions are in jeopardy and
you should start worrying all over again

But... as I understand, the combinational process which selects the next
state is not clocked at all? So any asynchronous input to the
combinational process might potentially result in an undesired state
being clocked past the state FFs due to different delays through the
various part of the combinational process?

Forgive me, but I don't have much (you might say "any") understanding of
what circuitry is generated by the vhdl constructs. I am working on an
example so I can see for myself but it will be a little while till I get
there.

If you write your design in Verilog, then your worry
most certainly DOES have foundation and you need to
take great care to code in a way that keeps things OK.

Ok, thanks for the heads up. I will stick with VHDL for now to avoid
more confusion than what already clouds my mind, but I'm sure I'll do
good to keep your comment in mind.

Thank you for your input.

Al Muliman · Feb 12, 2009

There are single-process styles that do not have any
combinational outputs, and that's the technique I prefer
myself.

Okay that was also the solution I had reached on my own before I started
looking for the True Path. But that means states and outputs are
registered, right? Inferred... learning new words here. And more logic,
right? So I might try the combinational approach to see what that could
do for me.

There is widespread disagreement on this matter,
and the "correct" answer is strongly affected by what
you're trying to do with the state machine. I am trying
very hard here to avoid fomenting religious warfare

I sensed as much, but I am agnostic (and still a bit confused).

It is
*essential* that all inputs to such a thing be synchronous, so that
they exhibit sufficient setup time relative to the clock. Of course,
it is always possible to meet this requirement by individually
registering (resynchronizing) each potentially asynchronous input
into its own register.

Ah good, I am still on track. Thank you.

As you correctly said, a transition whose Hamming distance is
one (only one bit changes) is immune to problems of this kind,
apart from the old spectre of metastability. If you can
code your state machine so that *all* its transitions are
one-bit only, then you have what's known as a "Gray code".

Yes, I know of gray code and learned today that the enumeration types
typically used for state machines can be "custom encoded":

attribute enum_encoding: string;
attribute enum_encoding of state_type:type is "001 010 011 100 101 110 111";

I suppose the default encoding is 0,1,2,3 etc...

In the old days when flip-flops were a precious resource,
writing state machines so that their asynchronous inputs
affected only Gray transitions was a standard trick.
Nowadays it's rarely worth the (considerable) bother;
simply register all the asynchronous inputs.

I recognise the reasoning from software development. It is inevitable
and possibly necessary. But I should learn.

I think you are amply aware of what is going on.

It's more like a sense of what's going on than actual awareness. A
mixture of what little I recall from the lectures I have attended about
digital design (reduction of karnaugh maps, race conditions and that
sort), the parallels I can draw from software development and the
realization that vhdl is nothing like software at all. What happens when
i write "if rising_edge(clk)" - the inferring that is going an - all
that I have to learn. I have yet to wrap my mind around functions and
procedures in vhdl - that I am definately saving for later. Right now I
just need a lean, mean state machine...

The topic seems a nice angle to get started on vhdl - I recognise state
machines from software development, and they are used all the time... in
vhdl as well as in software I expect.

I do belive you have answered the questions that bothered me the most.
Thank you.

Andy · Feb 12, 2009

Okay that was also the solution I had reached on my own before I started
looking for the True Path. But that means states and outputs are
registered, right? Inferred... learning new words here. And more logic,
right? So I might try the combinational approach to see what that could
do for me.

Registered outputs do not always need more logic than combinatorial
ones (except the register itself, which is almost free in most FPGA
architectures). In the simplest case, with one-hot state encoding, the
output is optimized to be just the corresponding bit in the state
register (whether you are using combinatorial or registered outputs)

How you code things depends on how you think. If you think of outputs
being (combinatorially) decoded from states, then to register the
outputs with the same clock-cycle-based timing (on and off at the same
time as the states are), you need to assert the output with every
transition into the state(s) that you would have decoded. That may
mean more coding, but not usually more resulting logic. I've gotten so
used to it, it is second nature to me, and I prefer registered outputs
from state machines (avoids output glitches).

I use a single, clocked process for the state machine and its outputs.
It is immune from latches, requires less coding that two processes,
and simulates more quickly due to modern simulator optimizations
(mostly noticeable in large designs).

I also happen to use VHDL variables inside clocked processes, but
depending on how you think about design, they may help or hinder you.
Variables behave more like SW (updates are not postponed), and thus
the clock-cyclic behavior is easier to understand, but mentally
synthesizing a variable-based description into gates and flops is not
as easy. It suffices to say that a good synthesis tool will reliably
synthesize a clocked process into a circuit that behaves, on a clock-
cyclic basis, like the description simulates, so long as the inputs
are synchronous to the clock. If you prefer to design with gates and
flops, then use signals, and maybe even separate combinatorial and
clocked processes. If you prefer to design with a behavioral
description and let the synthesis tool do the lifting, use variables
in clocked processes.

I sensed as much, but I am agnostic (and still a bit confused).

Ah good, I am still on track. Thank you.

Not necessarily true. First off, it depends on whether you want grey
coding to avoid decoding glitches, or you want grey coding to avoid
erroneous transitions based on asynchronous inputs.

In a gray coded state machine for asynchronous inputs, the most
adjacent destinations, including the current state if applicable,
reachable from the current state is two. If a state machine needs to
transition from one state to one of two different desitination states
(without the possibility of waiting in the current state), those two
destinations must be "adjacent" (hamming distance=1) to each other,
not to the original state. Thus the transitions are not always
adjacent, but the destinations are. This is (was, before registers
became cheap) the typical approach to gray coding state machines:
State driven transitions are always synchronous, and do not require
adjacent transitions. Input-driven transitions may not be synchronous,
and thus need adjacent transitions.

These were reliable techniques, and still are, from a hardware point
of view. But from a design review/audit point of view, separately
synchronizing all state machine inputs is much easier to verify that
it is done correctly. It also renders worrying about adjacent
transitions moot, which then makes other, potentially more efficient
state encodings (e.g. one-hot) usable.

Andy

Mike Treseler · Feb 12, 2009

Al said:
One key point seems to be to avoid latches.

I avoid latches and fussing with Gray codes
by following a single process template
like the one below.

main : process (clock, reset) is
begin -- for details see http://mysite.verizon.net/miketreseler/

if reset = '1' then -- Assumes synched trailing edge reset pulse
init_regs; -- reg_v := init_c;
elsif rising_edge(clock) then
update_regs; -- reg_v := f(reg_v);
end if;
update_ports;
end process main;

In my code, counters, state machines, shifters, arbiters etc
are all described the same way:
[] a register declaration
[] a procedure to update the register on every rising clock edge.

Each process can cover one or more registers.

-- Mike Treseler

Al Muliman · Feb 12, 2009

Registered outputs do not always need more logic than combinatorial
ones (except the register itself, which is almost free in most FPGA
architectures). In the simplest case, with one-hot state encoding, the
output is optimized to be just the corresponding bit in the state
register (whether you are using combinatorial or registered outputs)

How you code things depends on how you think. If you think of outputs
being (combinatorially) decoded from states, then to register the
outputs with the same clock-cycle-based timing (on and off at the same
time as the states are), you need to assert the output with every
transition into the state(s) that you would have decoded. That may
mean more coding, but not usually more resulting logic. I've gotten so
used to it, it is second nature to me, and I prefer registered outputs
from state machines (avoids output glitches).

I use a single, clocked process for the state machine and its outputs.
It is immune from latches, requires less coding that two processes,
and simulates more quickly due to modern simulator optimizations
(mostly noticeable in large designs).

Ok I think I need an example now...

Is this the newfangled pattern for a fsm:

entity fsm is
port ( clock : in std_logic;
reset : in std_logic;
input : in std_logic);
end fsm;
architecture behavioral of fsm is
type state_type is (THIS, THAT);
begin
process (clock, reset)
variable state : state_type;
begin
if (reset = '1') then
state := THIS;
elsif rising_edge(clock) then
-- register inferred on any variables or signals
-- assigned to inside this elsif section, right?
-- due to the rising_edge...
case state is
-- Can you assign directly to a state variable when it is
-- registered? Or does it mess the case up?
-- Those are things I worry about...
when THIS => if (input = '1') then state := THAT; end if;
when THAT => if (input = '0') then state := THIS; end if;
when others => state := THIS;
end case;
end if;
end process;
end behavioral;

I like it - one process, a private variable; no polluting the namespace.
Sorry, software speak. But is this it then?

I also happen to use VHDL variables inside clocked processes, but
depending on how you think about design, they may help or hinder you.

That I shall worry about when I start thinking consciously about design.
Currently I am fumbling in the dark.

Not necessarily true. First off, it depends on whether you want grey
coding to avoid decoding glitches, or you want grey coding to avoid
erroneous transitions based on asynchronous inputs.

In a gray coded state machine for asynchronous inputs, the most
adjacent destinations, including the current state if applicable,
reachable from the current state is two. If a state machine needs to
transition from one state to one of two different desitination states
(without the possibility of waiting in the current state), those two
destinations must be "adjacent" (hamming distance=1) to each other,
not to the original state. Thus the transitions are not always
adjacent, but the destinations are. This is (was, before registers
became cheap) the typical approach to gray coding state machines:
State driven transitions are always synchronous, and do not require
adjacent transitions. Input-driven transitions may not be synchronous,
and thus need adjacent transitions.

These were reliable techniques, and still are, from a hardware point
of view. But from a design review/audit point of view, separately
synchronizing all state machine inputs is much easier to verify that
it is done correctly. It also renders worrying about adjacent
transitions moot, which then makes other, potentially more efficient
state encodings (e.g. one-hot) usable.

Thank you for clarifying further.

Andy · Feb 12, 2009

Ok I think I need an example now...

Is this the newfangled pattern for a fsm:

entity fsm is
port ( clock : in std_logic;
reset : in std_logic;
input : in std_logic);
end fsm;
architecture behavioral of fsm is
type state_type is (THIS, THAT);
begin
process (clock, reset)
variable state : state_type;
begin
if (reset = '1') then
state := THIS;
elsif rising_edge(clock) then
-- register inferred on any variables or signals
-- assigned to inside this elsif section, right?
-- due to the rising_edge...
case state is
-- Can you assign directly to a state variable when it is
-- registered? Or does it mess the case up?
-- Those are things I worry about...
when THIS => if (input = '1') then state := THAT; end if;
when THAT => if (input = '0') then state := THIS; end if;
when others => state := THIS;
end case;
end if;
end process;
end behavioral;

I like it - one process, a private variable; no polluting the namespace.
Sorry, software speak. But is this it then?

With signals in a clocked process, any signals assigned represent
registers.

With variables, it is different. A reference to a variable can be a
registered or combinatorial reference, depending on the relative order
of reading and writing the variable within a given clock cycle. If a
variable was previously written in the current clock cycle, before it
is accessed, then THAT REFERENCE to that variable is combinatorial. If
the variable was not written prior to being accessed, within that
clock cycle, then that implies that a previous clock cycle's value
has to be remembered, from which synthesis will infer a register FOR
THAT REFERENCE. The key is that the reference to the variable
determines register/combinatorial, not the variable itself. Multiple
references to the same variable can be combinatorial, registered, or
even both, in the case of a conditional prior write. In this case the
same condition that determines the prior write will drive a
multiplexer to select the registered or combinatorial value to be used
for that reference.

Just remember this: the description will behave like the SW reads:
variable values are updated as soon as the assignment is executed. If
the description recalls a value stored in a previous clock cycle, then
a register will be inferred. If the description recalls a value stored
previously in the current clock cycle, then a combinatorial value is
inferred.

In terms of your template, the reference to state occurs right up
front in the case statement, before state has been written within the
current clock cycle. So that reference to state is to the output of
the register that stores state.

Hope this helps.

Andy

Al Muliman · Feb 13, 2009

With signals in a clocked process, any signals assigned represent
registers.

With variables, it is different. A reference to a variable can be a
registered or combinatorial reference, depending on the relative order
of reading and writing the variable within a given clock cycle. If a
variable was previously written in the current clock cycle, before it
is accessed, then THAT REFERENCE to that variable is combinatorial. If
the variable was not written prior to being accessed, within that
clock cycle, then that implies that a previous clock cycle's value
has to be remembered, from which synthesis will infer a register FOR
THAT REFERENCE. The key is that the reference to the variable
determines register/combinatorial, not the variable itself. Multiple
references to the same variable can be combinatorial, registered, or
even both, in the case of a conditional prior write. In this case the
same condition that determines the prior write will drive a
multiplexer to select the registered or combinatorial value to be used
for that reference.

Just remember this: the description will behave like the SW reads:
variable values are updated as soon as the assignment is executed. If
the description recalls a value stored in a previous clock cycle, then
a register will be inferred. If the description recalls a value stored
previously in the current clock cycle, then a combinatorial value is
inferred.

In terms of your template, the reference to state occurs right up
front in the case statement, before state has been written within the
current clock cycle. So that reference to state is to the output of
the register that stores state.

Hope this helps.

It does. Thank you. It also raises new questions, but I have to keep my
focus.

Al Muliman · Feb 13, 2009

For software folk, it may be useful to gloss this in
a slightly different way:

The linear, procedural idiom beloved of beginner programmers:

do something
wait for input
do something else
wait for more input
do yet other things
...

does not work in a typical modern operating-system
environment; instead, your program must look like
an event loop:

while (1) {
wait for anything interesting to happen
deal with the consequences of that event
update variables (state) to reflect the change
}

Ah the message pump

So the main code body looks like a state machine;
each time it is woken up by an external event, it
must look at the external event AND at its own internal
state ("where have I got to so far?") and use that to
decide what to do, and how to update the internal state.

In a software event loop, each trip round the loop must
execute very quickly so that the system remains responsive
to other events (oh, how I wish certain well-known programs
would keep that in mind... why does my entire PC freeze
just because a certain well-known word-processor is
trying to find a file on the network?). Indeed, if
you have lots of work to do, you must break it up into
small pieces, and after doing each little bit you must
set a wakeup timeout and go back to waiting; you can
do the next bit of work when next awoken by the timer.

A hardware "event loop", in typical modern synchronous
designs, has only one input event to wait for: the clock.
On each and every clock edge, it examines all the inputs
that it cares about given its current state. Just as
with a software event loop, the loop body MUST execute
very quickly. In fact, the body of your VHDL clocked
process conceptually executes in zero time at the
moment of the clcok edge. It is an algorithm for
computing, in zero simulated time, the next state
from the current state and inputs; EVERY synchronous
design is nothing more than a big state machine.

However, the current-state/next-state algorithm
is in reality "executed" by a piece of combinational
logic. Only in simulation is the next-state algorithm
executed on a computer. The job of a synthesis tool
is to infer from your algorithm what the logic function
needs to be. Your algorithm can be arbitrarily
complicated, with arbitrary numbers of internal variables,
some of which hold state information from one clock to
the next, and it all still works.

Aaaargh.... this was all going to be so beautifully
elegant and clear, and it's come out all rambly and
long-winded as usual. Anyway, it may be useful to
someone, I suppose

Yes I think I get what you are saying. But... I still have questions. I
am going to think about what has already been explained to me here, read
a little xilinx documentation and work on my own example (state machine
for writing to the LCD on a Spartan-3E kit). After that I shall return
with more questions ;-). You have all been most helpfull. Usenet is
still alive. If you have any pointers on what the internal building
blocks of an FPGA looks like, I stand a better chance of learning what
is meant by a register and when the value that a register holds is
clocked through, and why they come almost free in an FPGA but not so in
an Asic.

Al Muliman · Feb 13, 2009

If you have any pointers on what the internal building
blocks of an FPGA looks like,

The Xilinx documentation of the Spartan-3E probably does just fine.

Martin Thompson · Feb 13, 2009

Al Muliman said:
Ok I think I need an example now...

Is this the newfangled pattern for a fsm:

entity fsm is
port ( clock : in std_logic;
reset : in std_logic;
input : in std_logic);
end fsm;
architecture behavioral of fsm is
type state_type is (THIS, THAT);

Sorry for jumping into a conversation, but I thought I'd point out that
you can move the above line... (the type def)

begin
process (clock, reset)

to here...

type state_type is (THIS, THAT);

variable state : state_type;
begin
etc..

I like it - one process, a private variable; no polluting the
namespace. Sorry, software speak. But is this it then?

That's how I do it

And you don't pollute the "type" namespace with state_type either. So
if you have more than one process in your entity each with a state
machine, they can have independent 'state_type's.

If you only have one process per entity (search out Mike Treseler - a
big one-process-per-entity advocate) it makes no odds as the namespace
is private to the entity. I should point out, I'm also usually a
one-process man, but occasionally have two interacting state machines
in an entity.

That I shall worry about when I start thinking consciously about
design. Currently I am fumbling in the dark.

I'd add a vote for variables too

Cheers,
Martin

Martin Thompson · Feb 13, 2009

Al Muliman said:
Usenet is still alive. If you have any pointers on
what the internal building blocks of an FPGA looks like, I stand a
better chance of learning what is meant by a register and when the
value that a register holds is clocked through

I see you found the userguide - that's what I would've suggested.

and why they come almost free in an FPGA but not so in an Asic.

Each look-up table (LUT) in the FPGA has a register associated with it
(sometimes called a flipflop - FF). Except in extremely highly
pipelined design you usually have more LUTs used than FFs (my current
design uses a few hundred more LUTs than FFs), so there's FF aroudn to
be used without pushing the device size up. They're not entirely free
in that although they are there, sometimes routing the signal to a
free flipflop can't be done as the free one is "miles" away from where
the logic that generates it is. However, usually, you are putting a
register on some intermediate signal which also goes off to other
logic, and in those cases, it's easy for the tools to manage.

HTH!

Cheers,
Martin

Al Muliman · Feb 13, 2009

Sorry for jumping into a conversation, but I thought I'd point out that
you can move the above line... (the type def)

to here...

type state_type is (THIS, THAT);

etc..

Jump all you like. I thought I had tried to define the types for the
process only and found it did not work, but now I see that it does.
Thank you for pointing that out.

That's how I do it

And you don't pollute the "type" namespace with state_type either. So
if you have more than one process in your entity each with a state
machine, they can have independent 'state_type's.

Right. Things I recognise in some form from software.

If you only have one process per entity (search out Mike Treseler - a
big one-process-per-entity advocate) it makes no odds as the namespace
is private to the entity. I should point out, I'm also usually a
one-process man, but occasionally have two interacting state machines
in an entity.

I'd add a vote for variables too

That's what I hear from other people as well. So what I have now is a
single process containing two state machines using only variables for
maintaning state. Types and variables defined only for the process. All
variables except one and all signals are registered. I never finished my
attempt to create a combinatorial process to select next state. This is
much more concise and I have no worries. First simulation looks correct.
I am starting to get a feel for the difference between signals and
variables, but I am still not quite there.

Al Muliman · Feb 13, 2009

Each look-up table (LUT) in the FPGA has a register associated with it
(sometimes called a flipflop - FF). Except in extremely highly
pipelined design you usually have more LUTs used than FFs (my current
design uses a few hundred more LUTs than FFs), so there's FF aroudn to
be used without pushing the device size up. They're not entirely free
in that although they are there, sometimes routing the signal to a
free flipflop can't be done as the free one is "miles" away from where
the logic that generates it is. However, usually, you are putting a
register on some intermediate signal which also goes off to other
logic, and in those cases, it's easy for the tools to manage.

HTH!

It does. Thank you.

Al Muliman · Feb 13, 2009

It does. Thank you.

And MIA such an FPGA is a complex thing (Looking at a schematic in the
user guide and pulling my hair).

beky4kr · Feb 13, 2009

And MIA such an FPGA is a complex thing (Looking at a schematic in the
user guide and pulling my hair).

VHDL supports enumerating (enumerating example) and type variables.
This allow better readability and allow the simulator to show the
values with their ASCII meaningfully names on wave....
h--p://bknpk.no-ip.biz/SDIO/enumerating_1.html

Mike Treseler · Feb 14, 2009

Jonathan said:
In a software event loop, each trip round the loop must
execute very quickly so that the system remains responsive
to other events (oh, how I wish certain well-known programs
would keep that in mind... why does my entire PC freeze
just because a certain well-known word-processor is
trying to find a file on the network?).

Windows 1.0 was in fact a single loop
with a cycle time of "whatever".
I had to be careful not to move the
mouse during boot, or the cycle time became "forever".
Fortunately for the engineers, only our
customers had to deal with windows.
We stayed in emacs running on the vaxen.

It is an algorithm for
computing, in zero simulated time, the next state
from the current state and inputs; EVERY synchronous
design is nothing more than a big state machine.

Now you've let the cat out of the bag.
Luckily my boss doesn't read usenet

However, the current-state/next-state algorithm
is in reality "executed" by a piece of combinational
logic. Only in simulation is the next-state algorithm
executed on a computer. The job of a synthesis tool
is to infer from your algorithm what the logic function
needs to be. Your algorithm can be arbitrarily
complicated, with arbitrary numbers of internal variables,
some of which hold state information from one clock to
the next, and it all still works.

Yes, synthesis works better than most hope to expect.

Aaaargh.... this was all going to be so beautifully
elegant and clear, and it's come out all rambly and
long-winded as usual.

Well, it was more concise and useful than
the average engineering text on the subject.

Anyway, it may be useful to
someone, I suppose

Most of the tips my ski instructor gives me
seem counterintuitive and dangerous.
Then, much later, something clicks
and suddenly I get it.

-- Mike Treseler

MAX6675 VHDL	0	Sep 25, 2023
Coding State Machines	6	Oct 17, 2008
State machines	5	Oct 9, 2007
Style of coding complex logic (particularly state machines)	53	Aug 24, 2006
State machine definitions	12	Jul 26, 2012
Delay counters in three process state machines	1	Jun 26, 2009
Moore State Change	3	Nov 10, 2008
Synthesis of multiple wait statements per VHDL-200X	8	Jul 13, 2011

learning vhdl - state machines

Al Muliman

eliascm

Al Muliman

Al Muliman

Andy

Mike Treseler

Al Muliman

Andy

Al Muliman

Al Muliman

Al Muliman

Martin Thompson

Martin Thompson

Al Muliman

Al Muliman

Al Muliman

beky4kr

Mike Treseler

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads