Why is this code persnickety?

KK6GM · Oct 10, 2010

I'm a software guy with a decent hardware understanding, just
beginning to teach myself VHDL and FPGAs. I've got a Nexys2 board and
in playing with the multiplexed 7-seg displays I've run across a
situation where my code to delay and switch between digits runs for
some count values and doesn't run for others. Here's the code section
in question:

prescale_to_100us: process (clk_50M)
variable counter : natural range 0 to 2500 := 1;
begin
if (rising_edge(clk_50M)) then
counter := counter - 1;
if (counter = 0) then
counter := 2500;
clk_100us <= not clk_100us;
end if;
end if;
end process;

drive_digits: process (clk_100us, sw)
constant Scan_Count : natural := 300;
variable counter : natural range 0 to Scan_Count := 1;
variable digs7_var : std_logic_vector (3 downto 0) := "1110";
begin
if (rising_edge(clk_100us)) then
counter := counter - 1;
if (counter = 0) then
case sw is
when "001" => counter := Scan_Count / 8;
when "010" => counter := Scan_Count / 4;
when "100" => counter := Scan_Count / 2;
when others => counter := Scan_Count;
end case;

case digs7_var is
when "1110" => digs7_var := "1101";
when "1101" => digs7_var := "1011";
when "1011" => digs7_var := "0111";
when "0111" => digs7_var := "1110";
when others => digs7_var := "1110";
end case;

digs7 <= digs7_var;
end if;
end if;
end process;

This code consistently works for e.g. Scan_Count = 300, 302 or 305,
but fails (no multiplexing, just a solid 0 in the LSD position) for
301, 303 and 304. Nothing else is changed, just the value of
Scan_Count. Naturally I assume that the problem lies in my code, but
I have no idea what it might be. Any ideas? Thanks.

BTW, do I need "sw" in the sensitivity list for drive_digits?

Mike

Jonathan Bromley · Oct 10, 2010

I'm a software guy with a decent hardware understanding, just
beginning to teach myself VHDL and FPGAs. I've got a Nexys2 board and
in playing with the multiplexed 7-seg displays I've run across a
situation where my code to delay and switch between digits runs for
some count values and doesn't run for others. Here's the code section
in question:

prescale_to_100us: process (clk_50M)
variable counter : natural range 0 to 2500 := 1;
begin
if (rising_edge(clk_50M)) then
counter := counter - 1;
if (counter = 0) then
counter := 2500;
clk_100us <= not clk_100us;
end if;
end if;
end process;

drive_digits: process (clk_100us, sw)
constant Scan_Count : natural := 300;
variable counter : natural range 0 to Scan_Count := 1;
variable digs7_var : std_logic_vector (3 downto 0) := "1110";
begin
if (rising_edge(clk_100us)) then
counter := counter - 1;
if (counter = 0) then

......... reload scan counter, update multiplex output

end if;
end if;
end process;

This code consistently works for e.g. Scan_Count = 300, 302 or 305,
but fails (no multiplexing, just a solid 0 in the LSD position) for
301, 303 and 304. Nothing else is changed, just the value of
Scan_Count. Naturally I assume that the problem lies in my code, but
I have no idea what it might be. Any ideas? Thanks.

Your code looks broadly OK to me (have you simulated it?).
It's good that you hide locals inside the processes as variables.
Symbolic names for the various constants would help readability
and maintainability, I'd suggest, and (see your latter question)
no, you don't want 'sw' in the clocked process's sensitivity
list - but it shouldn't do any harm, because the whole body
of the process runs only on rising_edge.

Carefully check out the synthesis report. My first suspicion
would be that the clk_100us signal is not being routed on a
clock net. Possibly it hasn't been recognized as a clock
because of that bad sensitivity list - I don't know - but
if the tools didn't recognise the clock then the results
you see are not too surprising because you'll get clock
skew between the various FFs of the second process's
counter and this can give rise to hold time violations.

As a sanity check on this, consider reworking the clock
divider to generate a clock enable so that you can clock
both processes from the common 50MHz.

Also, note that your "if counter=0" tests (in both
processes) are testing NEXT-STATE values of the counter,
and therefore will give slower logic than if you had
tested "counter" before decrementing it. Try this:

prescale_to_100us: process (clk_50M)
variable counter: natural range 0 to 2500 := 1;
begin
if counter=1 then -- test first, tests current state
counter := 2500;
enable_100us <= '1';
else
counter := counter-1;
enable_100us <= '0';
end if;
end process;

drive_digits: process (clk_50M)
constant Scan_Count : natural := 300;
variable counter : natural range 0 to Scan_Count := 1;
variable digs7_var : std_logic_vector (3 downto 0) := "1110";
begin
if (rising_edge(clk_50M)) then
if (enable_100us) then
-- same logic as before.

Don't try to roll the two "if" tests into a single test;
some synthesis tools are rather picky about keeping the
rising_edge test in a separate statement.

BTW, do I need "sw" in the sensitivity list for drive_digits?

No, definitely not - see above.

Hope this helps, but I fear it may not...

KK6GM · Oct 10, 2010

........ reload scan counter, update multiplex output

Your code looks broadly OK to me (have you simulated it?).
It's good that you hide locals inside the processes as variables.
Symbolic names for the various constants would help readability
and maintainability, I'd suggest, and (see your latter question)
no, you don't want 'sw' in the clocked process's sensitivity
list - but it shouldn't do any harm, because the whole body
of the process runs only on rising_edge.

Carefully check out the synthesis report. My first suspicion
would be that the clk_100us signal is not being routed on a
clock net. Possibly it hasn't been recognized as a clock
because of that bad sensitivity list - I don't know - but
if the tools didn't recognise the clock then the results
you see are not too surprising because you'll get clock
skew between the various FFs of the second process's
counter and this can give rise to hold time violations.

As a sanity check on this, consider reworking the clock
divider to generate a clock enable so that you can clock
both processes from the common 50MHz.

Also, note that your "if counter=0" tests (in both
processes) are testing NEXT-STATE values of the counter,
and therefore will give slower logic than if you had
tested "counter" before decrementing it. Try this:

prescale_to_100us: process (clk_50M)
variable counter: natural range 0 to 2500 := 1;
begin
if counter=1 then -- test first, tests current state
counter := 2500;
enable_100us <= '1';
else
counter := counter-1;
enable_100us <= '0';
end if;
end process;

drive_digits: process (clk_50M)
constant Scan_Count : natural := 300;
variable counter : natural range 0 to Scan_Count := 1;
variable digs7_var : std_logic_vector (3 downto 0) := "1110";
begin
if (rising_edge(clk_50M)) then
if (enable_100us) then
-- same logic as before.

Don't try to roll the two "if" tests into a single test;
some synthesis tools are rather picky about keeping the
rising_edge test in a separate statement.

No, definitely not - see above.

Hope this helps, but I fear it may not...

That did indeed solve the problem - many thanks! Clearly I need to
get a better understanding of the proper idioms for good, reliable
clocking. Guess I'm lucky to have stumbled across this whole issue
early on.

Mike

jacko · Oct 11, 2010

In some of my code I had a divided clock. I found that it was easier
to detect and latch the end count 1 cycle before into a single bit and
then use that single bit on the next cycle to do the clock inversion.
This speeded up the design by over 4 times. I think this is because
the clock enable of the divided clock FF was the output of an FF
instead of 2 layers of 4LUTs.

Andy · Oct 11, 2010

As has been mentioned, your code compares count to zero after it has
been decremented, meaning both combinatorial circuits must settle in
series before you get a valid answer, and that takes more time than
would otherwise be necessary.

There are a couple of ways around this. The first is to check and then
only decrement if safe to do so:

if counter = 0 then -- parallel comparison of each bit
counter := half_100us_period; -- use a constant (hint: 2500 - 1)!
clk_100us <= not clk_100us;
else
counter := counter - 1;
end if;

This method compares the output of the count register to zero, before
it is decremented, thus avoiding the serial delays of decrement and
compare within one clock cycle.

The second is related to this, but uses a trait of integer arithmetic
to avoid the parallel comparison of the count to zero:

if count - 1 < 0 then -- checked the carry bit
count := half_100us_period;
clk_100us <= not clk_100us;
else
count := count - 1; -- share decrementer with the comparison
end if;

The above ONLY WORKS WITH INTEGER SUBTYPE COUNTERS! No matter what the
integer subtype range of the operands, the - operator (or any other
integer operator) returns a full integer range result. That is not an
error. It is only an error if you try to store the result in a
subtype, and it is outside of that subtype's range. So, even though
count cannot be less than zero, count - 1 can be less than zero, and
can be used to check the carry bit.

What's more, the synthesis tool will share the two decrementers (the
one for the comparison, and the one for the counting action),
resulting in only one decrementer being produced.

In real life, if the counter is plenty fast (and/or the clock is
plenty slow), use the method that is most easily understood by the
reader, that also still meets timing. That reader could be a reviewer,
a verification engineer, a test engineer, or another developer that
needs to update the design, and could even be you in another 6 weeks
after you have forgotten why you did it the way you did.

Andy

KK6GM · Oct 12, 2010

As has been mentioned, your code compares count to zero after it has
been decremented, meaning both combinatorial circuits must settle in
series before you get a valid answer, and that takes more time than
would otherwise be necessary.

Right, this is obvious now that it has been pointed out to me.

To ask a question related to my OP, what is the preferred method (or
methods) of driving FFs at a rate derived from a master clock? For
example, suppose I want to clock some FFs at 1kHz from a 50MHz clock.
Is it to use the 1-clock-out-of-N enable as was discussed above? Are
there other "correct" ways as well? I guess the question is, do FFs
being updated at rates of MasterClock/N normally have their clock
inputs driven by MasterClock with suitable enables, or is it common to
actually derive other clocks from MasterClock?

KJ · Oct 12, 2010

To ask a question related to my OP, what is the preferred method (or
methods) of driving FFs at a rate derived from a master clock?

In FPGAs or CPLDs, generate a clock enable. In ASICs, an actual gated
clock might be considered appropriate. The reason gated clocks are
generally not acceptable in FPGAs or CPLDs is because then the design
becomes sensitive to *minimum* propogation delays through logic due to
the inevitable need to reliably transfer signals from one clock domain
to the other, but there are no resources to control this delay. This
will create hold time issues where the signals from one clock domain
can switch prior to and be accepted by the gated clock one clock cycle
'early' or can cause incorrect logic because hold times were
violated. While a timing analysis will reveal the problem, the only
recourse is to change a random number seed and try to re-route and get
something different and hope. Every iteration of the design will
require this same sort of hand holding...most people have better
things to do with their time.

For
example, suppose I want to clock some FFs at 1kHz from a 50MHz clock.
Is it to use the 1-clock-out-of-N enable as was discussed above?
Yes

Are
there other "correct" ways as well?

If the division is fixed and known at design time, then there are
better forms of counters. As an example, an LFSR counter would
minimize the amount of logic needed to create the counter.

Another technique is to break the counter up into smaller pieces to
improve the clock cycle performance. Consider a 24 bit counter.
Rather than having to decode all 24 bits to decide that the count has
rolled over, another technique would be to break the counter up into
two (or three) sections of 8-12 bits. When the 'more significant'
bits sections counts down it sets a flop. Then the final rollover
occurs when all higher order sections have their bit set, and the
'least significant' bit section also counts down to 0.

I guess the question is, do FFs
being updated at rates of MasterClock/N normally have their clock
inputs driven by MasterClock with suitable enables,

In CPLDs and FPGAs...yes

or is it common to
actually derive other clocks from MasterClock?

In ASICs...yes (clock enables also work here also)

Kevin Jennings

KK6GM · Oct 13, 2010

OK, things are much clearer now. Thanks for all the good info.

Mike

Implementation a SVM ( SVPWM) in FPGA with VHDL	0	Oct 31, 2015
Why is Python telling me variable is local not global?	3	Sep 2, 2023
Trouble with prediction code, for the life of me I can't figure out why it isnt running properly. Help would be appreciated.	0	Jul 8, 2023
Compiling error... not sure how to address the errors	5	Apr 22, 2013
CNC Axis VHDL	0	Jun 20, 2016
Counter with initial value	3	Nov 21, 2011
Idk need help in editing this source code	0	Nov 5, 2022
Qsort() is messing with my entire Code!!!	0	Apr 25, 2022

Why is this code persnickety?

KK6GM

Jonathan Bromley

KK6GM

jacko

Andy

KK6GM

KJ

KK6GM

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads