Style of coding complex logic (particularly state machines)

Eli Bendersky · Aug 24, 2006

Hello all,

In a recent thread (where the O.P. looked for an HDL "Code Complete"
substitute) an interesting discussion arose regarding the style of
coding state machines. Unfortunately, the discussion was mostly
academic without many real examples, so I think there's room to open
another discussion on this style, this time with real examples
displaying the various coding styles. I have also cross-posted this to
c.l.vhdl since my examples are in VHDL.

I have written quite a lot of VHDL (both for synthesis and simulation
TBs) in the past few years, and have adopted a fairly consistent coding
style (so consistent, in fact, that I use Perl scripts to generate some
of my code). My own style for writing complex logic, and state
machines in particular, is to use separate clocked processes, like the
following:

type my_state_type is
(
    wait_st,  -- "wait" itself is a reserved word in VHDL
    act,
    test
);

signal my_state: my_state_type;
signal my_output: std_logic;

....
....

-- State register and next-state logic
my_state_proc: process(clk, reset_n)
begin
    if (reset_n = '0') then
        my_state <= wait_st;
    elsif (rising_edge(clk)) then
        case my_state is
            when wait_st =>
                if (some_input = some_value) then
                    my_state <= act;
                end if;
                ...
                ...
            when act =>
                ...
            when test =>
                ...
            when others =>
                my_state <= wait_st;
        end case;
    end if;
end process;

-- Registered output, decoded from the current state
my_output_proc: process(clk, reset_n)
begin
    if (reset_n = '0') then
        my_output <= '0';
    elsif (rising_edge(clk)) then
        if (my_state = act and some_input = some_other_val) then
            ...
        else
            ...
        end if;
    end if;
end process;

Now, people were referring mainly to two styles. One is variables used
in a single big process, with the help of procedures (the style Mike
Treseler always points to in c.l.vhdl), and the other is two
processes, one of them combinatorial.

It would be nice if the proponents of the other styles presented their
ideas with regards to the state machine design and we can discuss the
merits of the approaches, based on real code and examples.

Thanks
Eli

backhus · Aug 24, 2006

Hi Eli,
discussion about styles is not really satisfying. You find it in this
newsgroup again and again, but in the end most people stick to the style
they know best. Style is more a personal question than a technical one.

Just to give you an example:
The 2-process FSM you gave as an example always creates the registered
outputs one clock after the state changes. That would drive me crazy
when checking the simulation.

Why are you using if-(elsif?) in the second process? If you have an
enumerated state type you could use a case there as well. Would look
much nicer in the source, too.

Now... Will you change your style to overcome these "flaws" or are you
still satisfied with it, because you are used to it?

Both are OK.

Anyway, each style has its pros and cons and it always depends on what
you want to do.
-- does the synthesis result have to be very fast or very small?
-- do you need to speed up your simulation?
-- do you want easily readable source code? (that also is very personal,
what one considers "readable" may just look like Greek to someone else)
-- etc. etc.

So, there will be no common consensus.

Best regards
Eilert

Eli Bendersky · Aug 24, 2006

backhus said:
Hi Eli,
discussion about styles is not really satisfying. You find it in this
newsgroup again and again, but in the end most people stick to the style
they know best. Style is more a personal question than a technical one.

Just to give you an example:
The 2-process FSM you gave as an example always creates the registered
outputs one clock after the state changes. That would drive me crazy
when checking the simulation.

I guess this indeed is a matter of style. It doesn't drive me crazy
mostly because I'm used to it. Except in rare cases, this single clock
cycle doesn't change anything. However, the benefit IMHO is that the
separation is cleaner, especially when a lot of signals depend on the
state.

Why are you using if-(elsif?) in the second process? If you have an
enumerated state type you could use a case there as well. Would look
much nicer in the source, too.

I prefer to use if..else if there is only one "if". When there are
"elsif"s, case is preferable.

Now... Will you change your style to overcome these "flaws" or are you
still satisfied with it, because you are used to it?

Both are OK.

Anyway, each style has its pros and cons and it always depends on what
you want to do.
-- does the synthesis result have to be very fast or very small?
-- do you need to speed up your simulation?
-- do you want easily readable source code? (that also is very personal,
what one considers "readable" may just look like Greek to someone else)
-- etc. etc.

So, there will be no common consensus.

In my original post I had no intention to reach a common consensus. I
wanted to see practical code examples which demonstrate the various
techniques and discuss their relative merits and disadvantages.

Kind regards,
Eli

Andy · Aug 24, 2006

Very interesting coding style. I'm curious why there are separate
clocked processes. You could just tack on the output code to the bottom
of the state transition process, but that is only a nit.

As long as I'm using registered outputs, I would personally prefer a
combined process, but that's just how I approach the problem. I want to
know everything that happens in conjunction with a state by looking in
one place, not by looking here to see where/when the next state goes,
and then looking there to see what outputs are generated.

To illustrate, by modifying the original example:

my_state_proc: process(clk, reset_n)
    -- "wait" is a reserved word in VHDL, hence wait_st
    type my_state_type is (wait_st, act, test);
    variable my_state: my_state_type;
begin
    if (reset_n = '0') then
        my_state := wait_st;
        my_output <= '0';
    elsif (rising_edge(clk)) then
        case my_state is
            when wait_st =>
                if (some_input = some_value) then
                    my_state := act;
                end if;
                ...
                ...
            when act =>
                if some_input = some_other_val then
                    my_output <= yet_another_value;
                else
                    ...
                end if; ...
            when test =>
                ...
            when others =>
                my_state := wait_st;
        end case;
    end if;
end process;

The only time I would use separate logic code for outputs is if I
wanted to have combinatorial outputs (from registered variables, not
from inputs). Then I would put the output logic code after the clocked
clause, inside the process. I try to avoid combinatorial
input-to-output paths if at all possible.

Then it would look like this:

my_state_proc: process(clk, reset_n)
    -- "wait" is a reserved word in VHDL, hence wait_st
    type my_state_type is (wait_st, act, test);
    variable my_state: my_state_type;
begin
    if (reset_n = '0') then
        my_state := wait_st;
        my_output <= '0';
    elsif (rising_edge(clk)) then
        case my_state is
            when wait_st =>
                if (some_input = some_value) then
                    my_state := act;
                end if;
                ...
                ...
            when act =>
                if some_input = some_other_val then
                    my_output <= yet_another_value;
                else
                    ...
                end if; ...
            when test =>
                ...
            when others =>
                my_state := wait_st;
        end case;
    end if;
    -- output logic after the clocked clause
    if my_state = act then -- cannot use process inputs here
        my_output <= yet_another_value; -- or here
    end if;
end process;

Interestingly, the clock cycle behavior of the above would be identical
if I changed the end of the process to:

            ...
        end case;
        -- you CAN use process inputs here:
        if (my_state = act) then
            my_output <= yet_another_value; -- or here
        end if;
    end if;
end process;

Note that my_output is now a registered output from combinatorial
inputs, whereas before it was a combinatorial output from registered
values. Previously you could not use process inputs, now you can.

Andy

mikegurche · Aug 24, 2006

I usually separate the state register and combinational logic for the
following reason.

First, I think that the term "coding style" is very misleading. It
is more like "design style". My approach for designing a system
(not just FSM) is
- Study the specification and think about the hardware architecture
- Draw a sketch of top-level block diagram and determine the
functionalities of the blocks.
- Repeat this process recursively if a block is too complex
- Derive HDL code according to the block diagram and perform synthesis.
This approach is based on the observation that synthesis software is
weak on architecture-level manipulation but good at gate-level logic
minimization. It allows me to have full control of the system
architecture (e.g., I can easily identify the key components, optimize
critical path etc.).

The basic block diagram of FSM (and most sequential circuits) consists
of a register, next-state logic and output logic. Based on my design
style, it is natural to describe each block in a process or a
concurrent signal assignment. The number of segments (process and
concurrent signal assignments etc.) is really not an issue. It is just
a by-product of this design style.

The advantage of this approach is that I have better control on final
hardware implementation. Instead of blindly relying on synthesis
software and testing code in a trial-and-error basis, I can
consistently get what I want, regardless which synthesis software is
used. On the downside, this approach requires more time in initial
design phase and the code is less compact. The VHDL code itself
sometimes can be cumbersome. But it is clear and easy to comprehend
when presented with the block diagram.

One interesting example in FSM design is the look-ahead output buffer
discussed in section 10.7.2 of "RTL Hardware Design Using VHDL"
(http://academic.csuohio.edu/chu_p/), the book mentioned in the
previous thread. It is a clever scheme to obtain a buffered Moore
output without the one-clock delay penalty. The code follows the block
diagram and uses four processes, one for state register, one for output
buffer, one for next-state logic and one for look-ahead output logic.
Although it is somewhat lengthy, it is easy to understand. I believe
the circuit can be described by using one clocked process with proper
mix of signals and variables and reduce the code length by 3 quarters,
but I feel it will be difficult to relate the code with the actual
circuit diagram and vice versa.
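
For comparison, here is a rough sketch of what such a one-process
version might look like. It is not the book's code: the entity, the
three states (idle, run, done) and the ports (go, busy) are all
hypothetical, chosen only to show the idea. The state is a variable, so
after the next-state case statement it already holds the next state,
and decoding it there gives a registered Moore output with no extra
clock of delay.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are assumptions for illustration.
entity lookahead_one_proc is
  port (clk   : in  std_logic;
        reset : in  std_logic;
        go    : in  std_logic;
        busy  : out std_logic);
end entity lookahead_one_proc;

architecture rtl of lookahead_one_proc is
  type state_type is (idle, run, done);
begin
  fsm : process(clk, reset)
    variable state_v : state_type;
  begin
    if reset = '1' then
      state_v := idle;
      busy    <= '0';
    elsif rising_edge(clk) then
      -- Next-state logic: on entry state_v holds the current state.
      case state_v is
        when idle =>
          if go = '1' then
            state_v := run;
          end if;
        when run  => state_v := done;
        when done => state_v := idle;
      end case;
      -- state_v now holds the next state, so this registered output
      -- changes on the same clock edge as the state itself.
      if state_v = run then
        busy <= '1';
      else
        busy <= '0';
      end if;
    end if;
  end process fsm;
end architecture rtl;
```

Whether this is easier or harder to map back to a block diagram than
the four-process version is exactly the matter of taste being discussed.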

My 2 cents.

Mike G.

David Ashley · Aug 24, 2006

One interesting example in FSM design is the look-ahead output buffer
discussed in section 10.7.2 of "RTL Hardware Design Using VHDL"
(http://academic.csuohio.edu/chu_p/), the book mentioned in the
previous thread. It is a clever scheme to obtain a buffered Moore
output without the one-clock delay penalty. The code follows the block
diagram and uses four processes, one for state register, one for output
buffer, one for next-state logic and one for look-ahead output logic.
Although it is somewhat lengthy, it is easy to understand. I believe
the circuit can be described by using one clocked process with proper
mix of signals and variables and reduce the code length by 3 quarters,
but I feel it will be difficult to relate the code with the actual
circuit diagram and vice versa.

Combining the input and registered state this way allows for
a non registered path from input to output. Is this ok? Or is
there an assumption that the device connected to the output
is itself latching on the clock edge?

-Dave

Mike Treseler · Aug 24, 2006

This approach is based on the observation that synthesis software is
weak on architecture-level manipulation but good at gate-level logic
minimization.

I have observed that synthesis software does what it is told.
If I describe two gates and a flop, that is what I get.
If I describe a fifo or an array of counters, that
is what I get.

The advantage of this approach is that I have better control on final
hardware implementation. Instead of blindly relying on synthesis
software and testing code in a trial-and-error basis, I can
consistently get what I want, regardless which synthesis software is
used.

What I want is a netlist that sims the same
as my code and makes reasonable use of the
device resources. Synthesis does a good
job of this with the right design rules.
Trial and error would only come into play
if I were to run synthesis without simulation.

On the downside, this approach requires more time in initial
design phase and the code is less compact. The VHDL code itself
sometimes can be cumbersome. But it is clear and easy to comprehend
when presented with the block diagram.

I prefer clean, readable code,
verified by simulation and static timing.
I use the rtl viewer to convert
my logical description to a structural
one for review.

-- Mike Treseler

mikegurche · Aug 24, 2006

Mike T.,

This issue has been debated in many threads and I don't want to do it
again. The original poster, Eli, stated:

". . . I had no intention to reach a common consensus. I wanted to
see practical code examples which demonstrate the various techniques
and discuss their relative merits and disadvantages"

I expressed my opinion and gave an example from a book. You can do
the same. Whatever method you choose is fine with me, but I am
irritated that you always think your way is THE WAY.

Mike G.

Mike Treseler · Aug 24, 2006

The original poster, Eli, stated:

I've already shared my examples.

My posting was intended as part of the
"discussion of their relative merits and disadvantages"

I expressed my opinion and gave an example from a book. You can do
the same. Whatever method you choose is fine with me, but I am
irritated that you always think your way is THE WAY.

The vast majority of designers use your style, not mine.

Backhus said it best:

"discussion about styles is not really satisfying. You find it in this
newsgroup again and again, but in the end most people stick to the style
they know best. Style is a personal question than a technical one."

-- Mike Treseler

Eli Bendersky · Aug 25, 2006

Andy wrote:

[...]

Thanks for this example. I have always tried to avoid variables
for things like this and it's interesting to see them used correctly.

The problem I see with the approach comes in complicated code where
several signals depend on my_state (say 3-4 is enough). Then, the
single-process-handling-everything becomes rather convoluted. Besides,
since my_state is a variable local to the process, you can't see it
outside so you can't use it to drive other signals. So basically you
force all code dealing with my_state to be in one process.
Another thing is that I prefer out-of-process statements for
combinatorial logic, because IMHO it makes a cleaner separation (I
immediately see it's combinatorial, without the need to see if it has
some extra "end if"s below it that signify it's clocked).

comb_out <= '1' when my_state = act else '0';

Somehow I immediately see the combinatorial logic here (if it gets too
hairy a function can be coded).
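
As a minimal sketch of the "code a function" idea: the package name,
the function is_active and its enable argument are all hypothetical,
and the first state is called wait_st because "wait" is a reserved
word in VHDL.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package; every name here is an assumption for illustration.
package fsm_sketch_pkg is
  type my_state_type is (wait_st, act, test);

  -- Pure combinatorial decode of the state plus one qualifier input.
  function is_active(state  : my_state_type;
                     enable : std_logic) return std_logic;
end package fsm_sketch_pkg;

package body fsm_sketch_pkg is
  function is_active(state  : my_state_type;
                     enable : std_logic) return std_logic is
  begin
    if (state = act or state = test) and enable = '1' then
      return '1';
    else
      return '0';
    end if;
  end function is_active;
end package body fsm_sketch_pkg;
```

With my_state declared as a signal of that type, the concurrent
statement stays a one-liner, e.g.
comb_out <= is_active(my_state, some_enable);
where some_enable is again just a placeholder name.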

Eli

backhus · Aug 25, 2006

In my original post I had no intention to reach a common consensus. I
wanted to see practical code examples which demonstrate the various
techniques and discuss their relative merits and disadvantages.

Kind regards,
Eli

Hi Eli,
OK, that's something different.
That earns a contribution from my side.

My example uses 3 processes.
The first one is the simple state register,
the second is the combinatorial branch selection,
and the third creates the registered outputs.

Note that the third process uses NextState for the case selection.
Advantage: Outputs change at exactly the same time as the states do.
Disadvantage: The branch logic is connected to the output logic, causing
longer delays.
Workaround: If a one clock delay of the outputs doesn't matter,
CurrentState can be used instead.

The only critical part I see is the second process. Because it's
combinatorial, some synthesis tools might generate latches here if
the designer doesn't write proper code. But we all should know how to
write latch-free code, shouldn't we? ;-)
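
For anyone unsure what "latch-free" means in practice, a minimal sketch
of the usual pattern: give every output of the combinatorial process a
default value at the top, so it is assigned on every path through the
process. The entity and its ports (sel, a, b, y) are made up for
illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are assumptions for illustration.
entity latch_free_sketch is
  port (sel : in  std_logic_vector(1 downto 0);
        a   : in  std_logic;
        b   : in  std_logic;
        y   : out std_logic);
end entity latch_free_sketch;

architecture rtl of latch_free_sketch is
begin
  comb : process(sel, a, b)
  begin
    y <= '0';               -- default assignment: y is driven on every path
    case sel is
      when "01"   => y <= a;
      when "10"   => y <= b;
      when others => null;  -- the default above applies, so nothing is held
    end case;
  end process comb;
end architecture rtl;
```

Without the default line, y would have to keep its old value for the
"others" choices, and that memory is what gets synthesized as a latch.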

The structure is very regular, which makes it a useful template for
autogenerated code.

Have a nice synthesis
Eilert

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY Example_Regout_FSM IS
  PORT (Clock : IN  STD_LOGIC;
        Reset : IN  STD_LOGIC;
        A     : IN  STD_LOGIC;
        B     : IN  STD_LOGIC;
        Y     : OUT STD_LOGIC;
        Z     : OUT STD_LOGIC);
END Example_Regout_FSM;

ARCHITECTURE RTL_3_Process_Model_undelayed OF Example_Regout_FSM IS
  TYPE State_type IS (Start, Middle, Stop);
  SIGNAL CurrentState : State_Type;
  SIGNAL NextState    : State_Type;

BEGIN

  FSM_sync : PROCESS(Clock, Reset)
  BEGIN -- CurrentState register
    IF Reset = '1' THEN
      CurrentState <= Start;
    ELSIF Clock'EVENT AND Clock = '1' THEN
      CurrentState <= NextState;
    END IF;
  END PROCESS FSM_sync;

  FSM_comb : PROCESS(A, B, CurrentState)
  BEGIN -- CurrentState Logic
    NextState <= CurrentState; -- default: stay in state, avoids latches
    CASE CurrentState IS
      WHEN Start =>
        IF (A NOR B) = '1' THEN
          NextState <= Middle;
        END IF;
      WHEN Middle =>
        IF (A AND B) = '1' THEN
          NextState <= Stop;
        END IF;
      WHEN Stop =>
        IF (A XOR B) = '1' THEN
          NextState <= Start;
        END IF;
      WHEN OTHERS => NextState <= Start;
    END CASE;
  END PROCESS FSM_comb;

  FSM_regout : PROCESS(Clock, Reset)
  BEGIN -- Output Logic
    IF Reset = '1' THEN
      Y <= '0';
      Z <= '0';
    ELSIF Clock'EVENT AND Clock = '1' THEN
      Y <= '0'; -- Default Value assignments
      Z <= '0';
      CASE NextState IS
        WHEN Start  => NULL;
        WHEN Middle => Y <= '1';
                       Z <= '1';
        WHEN Stop   => Z <= '1';
        WHEN OTHERS => NULL;
      END CASE;
    END IF;
  END PROCESS FSM_regout;
END RTL_3_Process_Model_undelayed;

mikegurche · Aug 25, 2006

Hi, Eilert,

I generally use this style but with a different output segment. I have
three output logic templates:

Template 1: vanilla, unbuffered output
-- FSM with unbuffered output
-- Can be used for Mealy/Moore output
-- (include input in sensitivity list for Mealy)
FSM_unbuf_out : PROCESS(CurrentState)
BEGIN
  Y <= '0'; -- Default Value assignments
  Z <= '0';
  CASE CurrentState IS
    WHEN Start  => NULL;
    WHEN Middle => Y <= '1';
                   Z <= '1';
    WHEN Stop   => Z <= '1';
    WHEN OTHERS => NULL;
  END CASE;
END PROCESS FSM_unbuf_out;

Template 2: add buffer for output (There are 4 processes now ;-)
-- FSM with buffered output
-- there is a 1-clock delay
-- can be used for Mealy/Moore output
FSM_unbuf_out : PROCESS(CurrentState)
BEGIN
  Y_tmp <= '0'; -- Default Value assignments
  Z_tmp <= '0';
  CASE CurrentState IS
    WHEN Start  => NULL;
    WHEN Middle => Y_tmp <= '1';
                   Z_tmp <= '1';
    WHEN Stop   => Z_tmp <= '1';
    WHEN OTHERS => NULL;
  END CASE;
END PROCESS FSM_unbuf_out;

-- buffer for output signal
FSM_out_buf : PROCESS(Clock, Reset)
BEGIN -- Output Logic
  IF Reset = '1' THEN
    Y <= '0'; -- Default Value assignments
    Z <= '0';
  ELSIF Clock'EVENT AND Clock = '1' THEN
    Y <= Y_tmp; -- register the unbuffered outputs
    Z <= Z_tmp;
  END IF;
END PROCESS FSM_out_buf;

Template 3: buffer with "look-ahead" output logic
-- FSM with look-ahead buffered output
-- no 1-clock delay
-- can be used for Moore output only
FSM_unbuf_out : PROCESS(NextState)
BEGIN
  Y_tmp <= '0'; -- Default Value assignments
  Z_tmp <= '0';
  CASE NextState IS
    WHEN Start  => NULL;
    WHEN Middle => Y_tmp <= '1';
                   Z_tmp <= '1';
    WHEN Stop   => Z_tmp <= '1';
    WHEN OTHERS => NULL;
  END CASE;
END PROCESS FSM_unbuf_out;

-- buffer for output signal
-- same as template 2
FSM_out_buf : PROCESS(Clock, Reset)
. . .

The code is really lengthy. However, as you indicated earlier, its
structure is regular, and it can serve as a template or even be
autogenerated. I developed the templates based on
"http://academic.csuohio.edu/chu_p/rtl/chu_rtL_book/rtl_chap10_fsm.pdf"
It is a very good article on FSM (or very bad, if this is not your
coding style).

Mike G.

Martin Gagnon · Aug 25, 2006

["Followup-To:" header set to comp.lang.vhdl.]

[snip]

The code is really lengthy. However, as you indicated earlier, its
structure is regular, and it can serve as a template or even be
autogenerated. I developed the templates based on
"http://academic.csuohio.edu/chu_p/rtl/chu_rtL_book/rtl_chap10_fsm.pdf"
It is a very good article on FSM (or very bad, if this is not your
coding style).

Hi. I've read this PDF and it looks very interesting: it shows many
different types of state machine implementation. But the way I code
my state machines differs from all of them. I don't know whether it's
good, and I'm not sure which one mine is equivalent to. My state machine
is in one single process, but it differs from the way shown in the
rtl_chap10_fsm.pdf file (the one that is done in a single process
and is supposed to be bad). Here's an example of one of my state machines.

====================================================================

type txgen_states_t is (
    st_idle,
    st_gotsync,
    st_tx_delay,
    st_tx_startcharge,
    st_tx_stopcharge,
    st_tx_fire,
    st_wait_min_period
);

....

constant zero32: std_logic_vector(31 downto 0) := (others=>'0');
signal prev_state_buf, cur_state_buf : txgen_states_t ;

....

txgen_state_machine_proc:
process(clk, reset_n)
begin
    if reset_n = '0' then
        prev_state_buf <= st_idle ;
        cur_state_buf <= st_idle ;

    elsif rising_edge(clk) then
        prev_state_buf <= cur_state_buf ;

        case cur_state_buf is
            when st_idle =>
                if sync = '1' then
                    cur_state_buf <= st_gotsync ;
                else
                    cur_state_buf <= cur_state_buf;
                end if;

            when st_gotsync =>
                cur_state_buf <= st_tx_delay ;

            when st_tx_delay =>
                if tx_delay_done = '1' then
                    cur_state_buf <= st_tx_startcharge;
                else
                    cur_state_buf <= cur_state_buf ;
                end if;

            when st_tx_startcharge =>
                if tx_charge_done = '1' then
                    cur_state_buf <= st_tx_stopcharge;
                else
                    cur_state_buf <= cur_state_buf ;
                end if;

            when st_tx_stopcharge =>
                if tx_transfer_done = '1' then
                    cur_state_buf <= st_tx_fire;
                else
                    cur_state_buf <= cur_state_buf ;
                end if;

            when st_tx_fire =>
                cur_state_buf <= st_wait_min_period;

            when st_wait_min_period =>
                if tx_min_period_done = '1' then
                    cur_state_buf <= st_idle;
                else
                    cur_state_buf <= cur_state_buf ;
                end if;

            when OTHERS =>
                cur_state_buf <= st_idle ;

        end case;
    end if;
end process;

---
--- One of the input_signal process
---
tx_charge_done_proc:
process(clkin, reset_n)
begin
    if reset_n = '0' then
        tx_charge_cnt <= (others=>'0') ;
        tx_charge_done <= '0';

    elsif rising_edge(clkin) then
        case cur_txgen_st_buf is
            when st_tx_startcharge =>
                if tx_charge_cnt > zero32(tx_charge_cnt'range) then
                    tx_charge_cnt <= tx_charge_cnt - '1';
                    tx_charge_done <= '0';
                else
                    tx_charge_cnt <= tx_charge_cnt ;
                    tx_charge_done <= '1';
                end if;

            when OTHERS =>
                tx_charge_cnt <= tx_charge_time ;
                tx_charge_done <= '0';
        end case;
    end if;
end process;

=================================================================

A lot of the code is missing. I show only the state machine process and
one of the inputs used by the state machine.

I've never had a problem with this way of implementing my state machines.
I do a lot of RTL design on a Xilinx Virtex-II with a 200 MHz clock, and
everything works as I expect.

What do you think about the way I do my state machines?

Thanks..

KJ · Aug 25, 2006

Rather than posting code, I'll refer to yours since it is roughly along
the lines of what I do. Instead I'll hope that my explanation is clear
enough that one can follow my reasoning (whether you agree or disagree
with it) without any more than occasional snippets of code.

First off, I don't make any religious distinction between 'state
machine' signals and 'output' signals so I wouldn't feel compelled to
have a separate process for outputs, so I might choose to simply
combine them into a single process. The advantage (IMO): generally
less code, somewhat more readable and maintainable since in many cases,
it is much easier to follow the logic that says "if x then goto this
state and set this output to this value", end of story.

Having said that though, I do tend to have multiple clocked processes.
I base what goes into each process on the somewhat fuzzy definition of
what things are in some sense 'related'. Things to me are 'related' if
I'm replicating code to implement them in separate processes. An
example would be if I have three signals A, B, C that all are of the
form "if (x = '1') then.... else.... end if;" then I would most likely
have A, B, C in a single process. Of course A, B, C being different
would have some additional logic associated uniquely with them so
within the overall "if (x = '1')...else...end if;" statement there
would be additional logic 'if', 'case', whatever that go into defining
them.
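
A minimal sketch of that A, B, C case, to make the 'related' idea
concrete. The entity name and the inputs p and q are invented for
illustration; only the shared "if (x = '1') ... else ... end if;"
structure comes from the description above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; p and q are assumed extra inputs.
entity related_regs is
  port (clk : in  std_logic;
        x   : in  std_logic;
        p   : in  std_logic;
        q   : in  std_logic;
        a   : out std_logic;
        b   : out std_logic;
        c   : out std_logic);
end entity related_regs;

architecture rtl of related_regs is
begin
  related_proc : process(clk)
  begin
    if rising_edge(clk) then
      if x = '1' then
        -- Shared condition, written once for all three signals.
        a <= p;
        b <= p and q;
        if q = '1' then     -- logic unique to c
          c <= '1';
        end if;
      else
        a <= '0';
        b <= '0';
        c <= '0';
      end if;
    end if;
  end process related_proc;
end architecture rtl;
```

Split into three processes, the "if x = '1'" test would be written
three times, which is the replicated code I'm trying to avoid.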

Outputs that depend on 'next' state will tend to get implemented in
with the state machine for the simple reason that they meet the
'related' criteria. Outputs that depend on current state will tend to
get implemented elsewhere because they are not 'related'. Again, no
heartburn here because I'm being pragmatic rather than dogmatic about
source code positioning, I let the relationships drive how it appears.
This tends to produce more robust code (IMO) since there tends to be
less duplicated logic that will over time start to diverge because
something changed 'up there' but forgot to be changed 'down there'. By
physically grouping related things, it is easier to see implications of
the change I'm contemplating on other related signals and whether there
is a relationship that should be maintained or severed somewhat.

I then try to balance that out with the again somewhat fuzzy term of
'readability'. A single process of 1000 lines of anything to me is too
long, I aim for it to fit on a screen....maybe one with somewhat high
resolution but that's the basic idea. Scrolling back and forth while
you're trying to understand code is not productive and is disruptive I
think.

Another criterion I use for whether things should be together in a
single process is the number of signals going in and out of that
process. I happen to really like the Modelsim 'Dataflow' window and
how it integrates with the source and wave windows so that as I'm
debugging I can immediately see the inputs that go into producing the
one signal that I'm moseying through in order to find the root cause of
whatever it is I'm debugging. The single monolithic process that has
100 inputs and 100 outputs will show up as just a large block with all
those I/O when I click on it. But if the equation is simply A <= B and
C and is implemented in a 'screen sized' process then the dataflow
window shows me that A depends on B and C and possibly a few other
inputs that might happen to coincidentally be in that process because
other signals that use them were deemed 'related' and it will jump me
right to the correct lines of code that implement the logic (because
that process fits on a screen) where I immediately see that A depends
only on B and C. You lose all of that as you put more and more things
into a process.

If either the 'lines that fit on the screen test' or the 'number of
signals in and out test' seem to be getting out of hand (again, the
fuzzy definition) I'll revisit just how 'related' these things really
are. Signals are 'more' related if separating them meant they would
share more replicated code. An example here would be simply a process
with a bunch of signals that are all clocked, but they all share a
common clock enable so they are of the form "if (Clock_Enable = '1')
then....end if;" so those signals are 'related' by my definition, they
do share the "if (Clock_Enable = '1') then....end if;" construct. But
if that's about it then I would have no heartburn about making two (or
more) processes replicating the "if (Clock_Enable = '1') then....end
if;" construct to appease the 'fitting on a screen' and 'number of I/O'
tests.

I feel free to violate the 'screen size' rule in favor of the 'related'
rule if the situation dictates and have the multi-screen process.

In spite of the multiple clocked processes I consider myself one of the brethren of
the 'one process' state machine group because my multiple clocked
processes are just one virtual clocked process, they are not the
combinatorial 'next state' process feeding into the clocked process.

Combinatorial logic is implemented using concurrent statements outside
the process. When implemented inside the process I have to think too
hard and scroll back to remember if the usage of variable 'x' in this
particular case is the input or the output of the flop. Call me lazy
on this one if you want.
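For illustration, a small sketch of that split with hypothetical names: the combinatorial part is a concurrent assignment outside the process, and the process holds nothing but the flop.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity concurrent_comb is
  port (
    clk, b, c, sel : in  std_logic;
    a_reg          : out std_logic
  );
end entity concurrent_comb;

architecture rtl of concurrent_comb is
  signal a_next : std_logic;  -- always the INPUT of the flop
begin
  -- Combinatorial logic as a concurrent statement: no sensitivity
  -- list to maintain, and the final 'else' leaves no path for a latch.
  a_next <= (b and c) when sel = '1' else (b or c);

  -- The clocked process only registers the result.
  reg_proc : process (clk)
  begin
    if rising_edge(clk) then
      a_reg <= a_next;
    end if;
  end process reg_proc;
end architecture rtl;
```

With this layout there is never a question of which side of the flop a name refers to: a_next is before it, a_reg is after it.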

I only use variables like C language macros. In other words if I want
a shorthand way of referring to a hunk of logic, I'll define the
equation for the variable and it will almost always be right at the
very top of the process, then I'll use it wherever....and that would
only be because for some reason it wasn't looking right as a
combinatorial function concurrent statement. Variables
when added to the Modelsim wave window do not show any of the signal
history, signals do not have that limitation. If it's a 'simple'
function that is being implemented by the variable then this is not a
big deal, if the function being implemented is rather tricky then being
able to display the history can be very important. If I don't, then I
have to restart the simulation adding the variables to the wave window
before I say 'run'....wasted time...and if those variables then lead me
back to another entity with signals and different variables, the
signals I can wave, the variables...well, restart the sim again.
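A sketch of the 'variable as macro' usage described above, assuming made-up port names (rdy_a, rdy_b, d, q, q_valid):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity var_shorthand is
  port (
    clk          : in  std_logic;
    rdy_a, rdy_b : in  std_logic;
    d            : in  std_logic;
    q            : out std_logic;
    q_valid      : out std_logic
  );
end entity var_shorthand;

architecture rtl of var_shorthand is
begin
  shorthand_proc : process (clk)
    variable both_ready : boolean;
  begin
    if rising_edge(clk) then
      -- Shorthand defined once, at the very top, before any use.
      -- It is always written before it is read, so it is just a
      -- name for a hunk of logic and never infers storage itself.
      both_ready := (rdy_a = '1') and (rdy_b = '1');

      if both_ready then
        q <= d;
      end if;

      if both_ready then
        q_valid <= '1';
      else
        q_valid <= '0';
      end if;
    end if;
  end process shorthand_proc;
end architecture rtl;
```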

The drawback of signals is that they take longer simulation time...wasted
time too. I'm trying to resurrect the test code that I had comparing
use of variables versus signals but I seem to remember about a 10% hit
for signals. But I still use signals because just one blown simulation
that needs to be restarted just to get the variable's history can more
than compensate for that 10%...which for someone picking up somebody
else's code can easily happen since they are not familiar with the code
to begin with to 'know' which variables to wave....in other words
'supportability'. I try to give the poor shmuck who has to pick up my
code all the help I can...even if it means they're sitting waiting for
an extra 10%.

The variable people have a definite point about simulation time, but
there is really no good data to support the overall debug cycle time
being in any way better using variables. They seem to imply that they
can run 10% more test cases, but it is less than that if they were to
consider the down sides and the probabilities of them occurring (see
above about having to restart...or extra time pondering what they think
the value of the variable is in their head since they can't wave it
without restarting). They still might come out ahead using variables
(and I might too if I did that, one day I might, they do have a point).

I rarely (veeeeeeeery rarely) use combinatorial processes. In fact, I
can't remember the last time I did but I'm pretty sure at some point I
did but even there I'm pretty sure that the sensitivity list consisted
of only one or two signals.

I never have sensitivity list issues (see above paragraph).

I never have combinatorial latches (ditto).

I never use async resets with the exception of the flip flop that
receives the external reset signal that is the start of a shift chain
for developing my internal design reset.

I never have issues with some clocked signals getting cleared and
others not or going to unexpected states (see above paragraph). I have
however fixed several designs that did use asynchronous resets
inappropriately both on a board and within programmable logic.

I don't recall ever having to fix reset issues on others' designs when
synchronous resets were used...hmm, well maybe I've just lived in a
narrow design world.

Even in a gated clock design I have not run across the need for the
async reset anywhere other than that first flip flop previously
mentioned. Go figure.

I prefer executable code over comments (but I certainly do appreciate
the comments).

I use the 'time' data type in synthesizable code. No seriously, I do
and for very good reason....you know, the specification we all run into
at some point that says that signal 'x' must be asserted for 2 us...and
let's see my clock is 20 ns, no problem, figure out the proper count
values and go on....then two years down the road version 2.0 with the
speedup, now we can run with a 15 ns clock....and now you have a 1.5 us
pulse....DOH!!...I don't have that problem (anymore) because I use type
'time'....shamelessly leaving a cliff hanger on this one for those that
haven't figured out how I use 'time' types in synthesizable code.
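The method is left unstated here, so the following is only a guess at one way it could be done: keep both the clock period and the required pulse width as 'time' values and let the tool compute the cycle count at elaboration. The entity, generic and port names are assumptions, and synthesis tool support for 'time' generics and constants varies, so this would need checking against the tool in use.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pulse_gen is
  generic (
    CLK_PERIOD  : time := 20 ns;  -- change to 15 ns for the faster clock
    PULSE_WIDTH : time := 2 us    -- requirement straight from the spec
  );
  port (
    clk   : in  std_logic;
    start : in  std_logic;
    pulse : out std_logic
  );
end entity pulse_gen;

architecture rtl of pulse_gen is
  -- time / time yields an integer. Adding (CLK_PERIOD - 1 ns) first
  -- rounds the cycle count up, so the pulse is never shorter than
  -- PULSE_WIDTH (assumes both values are whole numbers of ns).
  constant PULSE_CYCLES : positive :=
    (PULSE_WIDTH + CLK_PERIOD - 1 ns) / CLK_PERIOD;

  signal count : natural range 0 to PULSE_CYCLES := 0;
begin
  count_proc : process (clk)
  begin
    if rising_edge(clk) then
      if start = '1' then
        count <= PULSE_CYCLES;   -- (re)start the pulse
      elsif count /= 0 then
        count <= count - 1;
      end if;
    end if;
  end process count_proc;

  -- High for PULSE_CYCLES clock periods after a start.
  pulse <= '1' when count /= 0 else '0';
end architecture rtl;
```

With the defaults this works out to 100 cycles; at CLK_PERIOD = 15 ns it becomes 134 cycles (2.01 us) without anyone having to remember to touch a count value.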

Later posts on this topic talked about automatically generated code
from using a particular form of a template. I couldn't care less what
format the auto code generator uses since that will not be the 'source',
the inputs to that code generator are the source and are where I'll go
for more information. If I have to dig into auto generated code to
find a problem, I will, but somebody is going to have a newly opened
service request to answer for my troubles if I find a problem.
Templates that are intended for people to use should be done with
people in mind, not a code generator.

I love to write non-synthesizable testbench code as well...where I
shamelessly break just about every rule I mentioned above if needed.

I have a tendency to ramble on at times.

KJ

rickman · Aug 25, 2006

David said:
Combining the input and registered state this way allows for
a non registered path from input to output. Is this ok? Or is
there an assumption that the device connected to the output
is itself latching on the clock edge?

I have not seen the reference, but I do FSM one of two ways. If I need
to truly optimize things for speed or size or both, I separate my
logic from the register; otherwise I use a single clocked process for
both. I always register my outputs just like the state and in essence
use lookahead for that. But this happens in the same logic so it is
very easy to see.

I define the state diagram as a pseudo Mealy machine. By pseudo Mealy
machine I mean that you define your outputs on the transitions rather
than the states with the realization that the output is only reflected
when the state changes. Given a cur_state value, the transitions in
the diagram and the code both indicate the next_state and the
next_output. The coding matches the diagram so coding is easier.
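A minimal sketch of that pseudo Mealy style in a single clocked process. The two-state machine, the go/done inputs and the busy output are invented for illustration; a synchronous reset is assumed only to keep the example short.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pseudo_mealy is
  port (
    clk, reset : in  std_logic;
    go, done   : in  std_logic;
    busy       : out std_logic
  );
end entity pseudo_mealy;

architecture rtl of pseudo_mealy is
  type state_type is (idle, run);
  signal cur_state : state_type;
begin
  fsm : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        cur_state <= idle;
        busy      <= '0';
      else
        case cur_state is
          when idle =>
            if go = '1' then        -- transition idle -> run
              cur_state <= run;     -- next state ...
              busy      <= '1';     -- ... and next output, side by side
            end if;
          when run =>
            if done = '1' then      -- transition run -> idle
              cur_state <= idle;
              busy      <= '0';
            end if;
        end case;
      end if;
    end if;
  end process fsm;
end architecture rtl;
```

Each arc in the state diagram maps to one "if" branch that assigns both the state and the output, so busy is registered and changes on the same clock edge as cur_state.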

mikegurche · Aug 25, 2006

backhus said:
Hi Eli,
Ok, that's something different.
Earns some contribution from my side

My example uses 3 Processes.
The first one is the simple state register.
The second is the combinatorial branch selection.
The third creates the registered outputs.

Recognize that the third process uses NextState for the case selection.
Advantage: Outputs change exactly at the same time as the states do.
Disadvantage: The branch logic is connected to the output logic, causing
longer delays.
Workaround: If a one clock delay of the outputs doesn't matter, Current
State can be used instead.

The only critical part I see is the second process. Because it's
combinatorial, some synthesis tools might generate latches here if
the designer doesn't write proper code. But we all should know how to
write latch-free code, shouldn't we? ;-)

The structure is very regular, which makes it a useful template for
autogenerated code.

Have a nice synthesis
Eilert

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY Example_Regout_FSM IS
  PORT (Clock : IN  STD_LOGIC;
        Reset : IN  STD_LOGIC;
        A     : IN  STD_LOGIC;
        B     : IN  STD_LOGIC;
        Y     : OUT STD_LOGIC;
        Z     : OUT STD_LOGIC);
END Example_Regout_FSM;

ARCHITECTURE RTL_3_Process_Model_undelayed OF Example_Regout_FSM IS
  TYPE State_type IS (Start, Middle, Stop);
  SIGNAL CurrentState : State_Type;
  SIGNAL NextState    : State_Type;

BEGIN

  FSM_sync : PROCESS(Clock, Reset)
  BEGIN -- CurrentState register
    IF Reset = '1' THEN
      CurrentState <= Start;
    ELSIF Clock'EVENT AND Clock = '1' THEN
      CurrentState <= NextState;
    END IF;
  END PROCESS FSM_sync;

  FSM_comb : PROCESS(A, B, CurrentState)
  BEGIN -- NextState logic
    -- Default: stay in the current state. Without this, NextState is
    -- not assigned on every path and a latch is inferred.
    NextState <= CurrentState;
    CASE CurrentState IS
      WHEN Start =>
        IF (A NOR B) = '1' THEN
          NextState <= Middle;
        END IF;
      WHEN Middle =>
        IF (A AND B) = '1' THEN
          NextState <= Stop;
        END IF;
      WHEN Stop =>
        IF (A XOR B) = '1' THEN
          NextState <= Start;
        END IF;
      WHEN OTHERS => NextState <= Start;
    END CASE;
  END PROCESS FSM_comb;

  FSM_regout : PROCESS(Clock, Reset)
  BEGIN -- Output Logic
    IF Reset = '1' THEN
      Y <= '0';
      Z <= '0';
    ELSIF Clock'EVENT AND Clock = '1' THEN
      Y <= '0'; -- Default Value assignments
      Z <= '0';
      CASE NextState IS
        WHEN Start  => NULL;
        WHEN Middle => Y <= '1';
                       Z <= '1';
        WHEN Stop   => Z <= '1';
        WHEN OTHERS => NULL;
      END CASE;
    END IF;
  END PROCESS FSM_regout;
END RTL_3_Process_Model_undelayed;

Love the enemy

(I hope the code is right)

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY Example_Regout_FSM IS
  PORT (Clock : IN  STD_LOGIC;
        Reset : IN  STD_LOGIC;
        A     : IN  STD_LOGIC;
        B     : IN  STD_LOGIC;
        Y     : OUT STD_LOGIC;
        Z     : OUT STD_LOGIC);
END Example_Regout_FSM;

ARCHITECTURE RTL_1_Process_Model_undelayed OF Example_Regout_FSM IS
  TYPE State_type IS (Start, Middle, Stop);
  SIGNAL CurrentState : State_Type;
BEGIN
  FSM_one_for_all : PROCESS(Clock, Reset)
    VARIABLE NextState : State_Type;
  BEGIN
    IF Reset = '1' THEN
      CurrentState <= Start;
      Y <= '0';
      Z <= '0';
    ELSIF Clock'EVENT AND Clock = '1' THEN
      -- variable used to represent the o/p of next-state logic
      -- Default: stay in the current state. Without this the variable
      -- keeps a stale value from before a reset.
      NextState := CurrentState;
      CASE CurrentState IS
        WHEN Start =>
          IF (A NOR B) = '1' THEN
            NextState := Middle;
          END IF;
        WHEN Middle =>
          IF (A AND B) = '1' THEN
            NextState := Stop;
          END IF;
        WHEN Stop =>
          IF (A XOR B) = '1' THEN
            NextState := Start;
          END IF;
        WHEN OTHERS => NextState := Start;
      END CASE;

      -- to register
      CurrentState <= NextState;

      -- to the buffered output logic
      Y <= '0'; -- Default Value assignments
      Z <= '0';
      CASE NextState IS
        WHEN Start  => NULL;
        WHEN Middle => Y <= '1';
                       Z <= '1';
        WHEN Stop   => Z <= '1';
        WHEN OTHERS => NULL;
      END CASE;
    END IF;
  END PROCESS FSM_one_for_all;
END RTL_1_Process_Model_undelayed;

Peace

Mike G.

Eli Bendersky · Aug 28, 2006

KJ said:
Rather than posting code, I'll refer to yours since it is roughly along
the lines of what I do. Instead I'll hope that my explanation is clear
enough that one can follow my reasoning (whether you agree or disagree
with it) without any more than occasional snippets of code.

[snip]

Combinatorial logic is implemented using concurrent statements outside
the process. When implemented inside the process I have to think too
hard and scroll back to remember if the usage of variable 'x' in this
particular case is the input or the output of the flop. Call me lazy
on this one if you want.

This is what I use as well, also avoiding combinatorial processes.
Their merit is probably faster simulation time, but it comes at the
price of inferior readability and those "latch avoidance" side effects.

I only use variables like C language macros. In other words if I want
a shorthand way of referring to a hunk of logic, I'll define the
equation for the variable and it will almost always be right at the
very top of the process, then I'll use it wherever....and that would
only be because for some reason it wasn't looking right as a
combinatorial function concurrent statement. Variables
when added to the Modelsim wave window do not show any of the signal
history, signals do not have that limitation. If it's a 'simple'
function that is being implemented by the variable then this is not a
big deal, if the function being implemented is rather tricky then being
able to display the history can be very important. If I don't, then I
have to restart the simulation adding the variables to the wave window
before I say 'run'....wasted time...and if those variables then lead me
back to another entity with signals and different variables, the
signals I can wave, the variables...well, restart the sim again.

The drawback of signals is that they take longer simulation time...wasted
time too. I'm trying to resurrect the test code that I had comparing
use of variables versus signals but I seem to remember about a 10% hit
for signals. But I still use signals because just one blown simulation
that needs to be restarted just to get the variable's history can more
than compensate for that 10%...which for someone picking up somebody
else's code can easily happen since they are not familiar with the code
to begin with to 'know' which variables to wave....in other words
'supportability'. I try to give the poor shmuck who has to pick up my
code all the help I can...even if it means they're sitting waiting for
an extra 10%.

The variable people have a definite point about simulation time, but
there is really no good data to support the overall debug cycle time
being in any way better using variables. They seem to imply that they
can run 10% more test cases, but it is less than that if they were to
consider the down sides and the probabilities of them occurring (see
above about having to restart...or extra time pondering what they think
the value of the variable is in their head since they can't wave it
without restarting). They still might come out ahead using variables
(and I might too if I did that, one day I might, they do have a point).

I also try to avoid variables for another reason (in addition to the
ones you stated). Somehow, when variables are used I can't be 100% sure
if the resulting code is synthesizable, because it can turn out not to
be. Additionally, since I do use signals, variables create the mixup of
"update now" and "update later" statements which make the process more
difficult to understand. With signals only it's all "update later".

I never use async resets with the exception of the flip flop that
receives the external reset signal that is the start of a shift chain
for developing my internal design reset.

Don't you run into fanout problems for that single flip-flop that
pushes the sync reset signal to all other FFs in the design, or does
the synthesis tool take care of this ? I tend to use async resets, but
my whole design is usually synchronized to the same clock so there are
no reset problems.

I never have issues with some clocked signals getting cleared and
others not or going to unexpected states (see above paragraph). I have
however fixed several designs that did use asynchronous resets
inappropriately both on a board and within programmable logic.

I don't recall ever having to fix reset issues on others' designs when
synchronous resets were used...hmm, well maybe I've just lived in a
narrow design world.

Can you point out a few common problems with async resets ? In
particular, what is using them "appropriately" and what isn't ?

Eli

KJ · Aug 28, 2006

Eli Bendersky said:
Don't you run into fanout problems for that single flip-flop that
pushes the sync reset signal to all other FFs in the design, or does
the synthesis tool take care of this ? I tend to use async resets, but
my whole design is usually synchronized to the same clock so there are
no reset problems.

The fanout of the reset signal is the same regardless of whether you use
synchronous or asynchronous resets. In either case, the reset signal still
needs to be synchronized to the clock (see further down for more info) and
in both cases the reset signal itself must meet timing constraints. If the
reset signal doesn't meet timing constraints due to fanout (and the
synthesis tool didn't pick up on this and add the needed buffers
automatically) then most fitters for FPGAs give some method for limiting
fanout with some vendor specific attribute that can be added to the signal.

Can you point out a few common problems with async resets ? In
particular, what is using them "appropriately" and what isn't ?

1. Forgetting (or not realizing) that the reset signal does in fact need to
be synchronized to the clock(s). Whether using async or sync resets in the
design, the timing of the trailing edge of reset must be synchronized to the
appropriate clock. Simply ask yourself, what happens when the reset signal
goes away just prior to the rising edge of the clock and violates the setup
time of a particular flip flop? The answer is that well...you can get
anything....and that each flip flop that gets this signal can respond
differently.....and then what state do you think your 7 state,
one hot, state machine will be in after this clock? Quite possibly you
might find two hot states instead of just one.
2. Somewhat related to #1...Forgetting that your 'synchronized to the clock'
reset signal is only synchronized within the one clock domain....and using
it in some other clock domain which puts you right back into the situation
of #1, that the reset signal can violate timing. This is really a clock
domain crossing problem and would occur whether async or sync resets were
used though but thought I'd toss it in. It does mean though that you need
separate shift chains (one for each clock domain that needs a reset) but
again, you need this regardless of if the rest of the design uses reset
synchronously or asynchronously.
3. On a board that distributes the reset signal to whoever needs it, having
that reset signal pick up noise that gets coupled over from some other
signal on the board. By using the reset signal synchronously internal to
the device, you can minimize (and often eliminate) what otherwise would have
been 'inadvertent' resets caused by noise coupling. If the board design
happens to be a single clock design, then this 'noise' would most likely be
occurring just after the clock when all the outputs are switching, but if
you use the reset signal in a synchronous manner then it is just like any
other signal and doesn't need any special care when routing the board....you
can't say the same for an async reset signal on the board, routing of the
'reset' signal can be an issue...and one that you won't be able to give any
real good guidance about to the PCB designer that is trying to route this
signal.
4. Overuse of just which signals really need to be 'reset'. This is
somewhat related to #3 and is also a function of the designer. Some feel
that every blasted flip flop needs to be reset...with no reason that can be
traced back to the specification for what the board is supposed to do, it's
just something 'they always do'. Inside an FPGA this may not matter much
since we're implicitly trusting the FPGA vendors to distribute a noise free
signal that we can use for the async reset, but on a board this can lead to
distributing 'reset' to a whole bunch of devices...which just gives that
signal much more opportunity to pick up the noise mentioned in #3. If
you're lucky, the part that gets the real crappy, noisy reset signal is the
one where you look at the function and realize that no, nothing in here
'really' needs to get reset when the 'reset' signal comes in. At worst
though, you see that yes the reset is needed, and you may start band-aiding
stuff on to the board to get rid of the noise or filter it digitally inside
the device if you can, etc. Bottom line though is that if more (some?)
thought had been put in up front, the reset signal wouldn't have been
distributed with such wild abandon in the first place.
5. There was also a post either here or in comp.lang.vhdl in the past couple
months that talked about how using the generally listed template can result
in gated clocks getting synthesized when you have some signals that you want
reset, and other signals that you don't. Being in the same process and all,
the original poster found that gated clocks were being synthesized in order
to implement this logic. The correct form of the template (that rarely gets
used by anyone posting to either this group or the vhdl group) is of the
form
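process(clk, reset)
begin
  if rising_edge(clk) then
    s1 <= Something;
    s2 <= Something_else;
  end if;
  -- The reset test comes last so it overrides the clocked assignment
  -- for s1 only; s2 is untouched by reset.
  if (reset = '1') then
    s1 <= '0';
    -- s2 does not need to be reset
  end if;
end process;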

Again, the scenario here is that you have
- More than one signal being assigned in this process
- At least one of those signals is not supposed to change as a result of
reset (either this is by intent, or by unintentionally forgetting to put the
reset equation)

Depending on the synthesis tool, this could result in a gated clock getting
generated as the clock to signal 's2' in the above example.

KJ

KJ · Aug 28, 2006

KJ said:
I never use async resets with the exception of the flip flop that
receives the external reset signal that is the start of a shift chain
for developing my internal design reset.

I overstated somewhat. There are times when external interfaces
require asynch reset behaviour. Generally though the behaviour that is
required is for the outputs to 'shut off', 'tri-state' or something of
that flavor. In those situations, you are of course then required
to async reset those outputs....but that in no way implies that
the async reset needs to go anywhere else (like into the state machines
that have the logic that drives those outputs).

So 'use async reset flip flops where actually required per
specification and nowhere else' is probably closer to the truth about my
actual usage.

KJ

rickman · Aug 28, 2006

KJ said:
The fanout of the reset signal is the same regardless of whether you use
synchronous or asynchronous resets. In either case, the reset signal still
needs to be synchronized to the clock (see further down for more info) and
in both cases the reset signal itself must meet timing constraints. If the
reset signal doesn't meet timing constraints due to fanout (and the
synthesis tool didn't pick up on this and add the needed buffers
automatically) then most fitters for FPGAs give some method for limiting
fanout with some vendor specific attribute that can be added to the signal.

The fanout of an async reset in an FPGA is not an issue because the
signal is a dedicated net. The timing is an issue as all the FFs have
to be released in a way that does not allow counters and state machines
to run part of their FFs before the others. But this can be handled by
ways other than controlling the release of the reset. Typically these
circuits only require local synchronization which can be handled easily
by the normal enable in the circuit. For example most state machines
do nothing until an input arrives. So synchronization of the release
of the reset is not important if the inputs are not asserted. Of
course this is design dependent and you must be careful to analyze your
design in regards to the release of the reset.

1. Forgetting (or not realizing) that the reset signal does in fact need to
be synchronized to the clock(s). Whether using async or sync resets in the
design, the timing of the trailing edge of reset must be synchronized to the
appropriate clock. Simply ask yourself, what happens when the reset signal
goes away just prior to the rising edge of the clock and violates the setup
time of a particular flip flop? The answer is that well...you can get
anything....and that each flip flop that gets this signal can respond
differently.....and then what state do you think your 7 state,
one hot, state machine will be in after this clock? Quite possibly you
might find two hot states instead of just one.

That is what I addressed above. Whether the circuit will malfunction
depends on the circuit as well as the inputs present. It is often not
hard to assure that one or the other prevents the circuit from changing
any state while the reset is released.

Since the dedicated global reset can not be synchronized to a clock of
even moderately high speed, you can provide local synchronous resets to
any logic that actually must be brought out of reset cleanly. I
typically use three FFs in a chain that are reset to zero and require
three clock cycles to clock a one through to the last FF.
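A sketch of such a chain, under the assumption of an active-low external reset. The entity and port names (reset_sync, ext_rst_n, local_rst) are made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reset_sync is
  port (
    clk       : in  std_logic;
    ext_rst_n : in  std_logic;  -- asynchronous, active-low (assumed)
    local_rst : out std_logic   -- synchronous to clk, active-high
  );
end entity reset_sync;

architecture rtl of reset_sync is
  signal chain : std_logic_vector(2 downto 0);
begin
  sync_proc : process (clk, ext_rst_n)
  begin
    if ext_rst_n = '0' then
      chain <= (others => '0');          -- all three FFs reset to zero
    elsif rising_edge(clk) then
      chain <= chain(1 downto 0) & '1';  -- a one is clocked through
    end if;
  end process sync_proc;

  -- Released only after the one reaches the last FF, three clocks
  -- after the external reset goes away.
  local_rst <= not chain(2);
end architecture rtl;
```

Logic that must come out of reset cleanly would then test local_rst inside its clocked branch like any other synchronous input, and one such chain is needed per clock domain.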

4. Overuse of just which signals really need to be 'reset'. This is
somewhat related to #3 and is also a function of the designer. Some feel
that every blasted flip flop needs to be reset...with no reason that can be
traced back to the specification for what the board is supposed to do, it's
just something 'they always do'. Inside an FPGA this may not matter much
since we're implicitly trusting the FPGA vendors to distribute a noise free
signal that we can use for the async reset, but on a board this can lead to
distributing 'reset' to a whole bunch of devices...which just gives that
signal much more opportunity to pick up the noise mentioned in #3. If
you're lucky, the part that gets the real crappy, noisy reset signal is the
one where you look at the function and realize that no, nothing in here
'really' needs to get reset when the 'reset' signal comes in. At worst
though, you see that yes the reset is needed, and you may start band-aiding
stuff on to the board to get rid of the noise or filter it digitally inside
the device if you can, etc. Bottom line though is that if more (some?)
thought had been put in up front, the reset signal wouldn't have been
distributed with such wild abandon in the first place.

This is not a problem when you use the dedicated reset net. Even
though there are FFs that do not need a reset, it does not hurt to put
the entire device in a known state every time. It is not hard to miss
a FF that needs to be reset otherwise.

Personally I think the noise issue is a red herring. If you have noise
problems on the board, changing your reset to sync will not help in
general. You would be much better off designing a board so it does not
have noise problems.
