Non power of 2 natural counter - neat alternatives to mod operator?

Andrew FPGA

Hi,
I like using a restricted range natural type for counters. e.g.
variable TimeslotCount : natural range 0 to TIMESLOTS_PER_FRAME-1;
...
TimeslotCount := (TimeslotCount+1) mod TIMESLOTS_PER_FRAME;
...

When the count variable is a power of 2 range, e.g. TIMESLOTS_PER_FRAME
= 16 then Modelsim 6.0d is happy and ISE 8.1isp2 is happy.

But when the count variable is not a power of 2 range, e.g.
TIMESLOTS_PER_FRAME = 20, then Modelsim is happy but XST (the Xilinx
synthesizer) can't handle this:
"ERROR:Xst:1775 - Unsupported modulo value 10 found in expression at
line 277. The modulo should be a power of 2."

An alternative would be
if TimeslotCount < (TIMESLOTS_PER_FRAME-1) then
TimeslotCount := TimeslotCount +1;
else
TimeslotCount := 0;
end if;

but this doesn't seem as nice as when using the mod operator?

Can anyone suggest a more elegant solution for a Synthesizable non
power of 2 natural type counter?

Regards
Andrew
 
KJ

Andrew FPGA wrote:
> An alternative would be
> if TimeslotCount < (TIMESLOTS_PER_FRAME-1) then
> TimeslotCount := TimeslotCount +1;
> else
> TimeslotCount := 0;
> end if;
>
> but this doesn't seem as nice as when using the mod operator?

Like it or not, the alternative you don't like is the only way for those
synthesis tools that don't support the 'mod' operator for anything other
than a power of 2. About the only other thing you could do is package it up
into a function so that when you need to increment the code looks a bit
cleaner...

TimeslotCount := my_mod(TimeslotCount +1, TIMESLOTS_PER_FRAME-1);

Something to that effect to replace the if statement. It implements exactly
the same thing but keeps from cluttering up the code with if statements.
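
As a rough sketch of what such a helper might look like: the package name counter_util and the function wrap_inc below are hypothetical stand-ins for "my_mod", and here the second argument is assumed to be the modulus itself rather than the maximum count. It is only the if statement from above moved into a function, so it should synthesize to the same logic.

```vhdl
-- Hypothetical helper package; names and signature are assumptions,
-- not something posted in this thread.
package counter_util is
  -- Returns count + 1, wrapping to 0 once count reaches modulus - 1.
  function wrap_inc(count : natural; modulus : positive) return natural;
end package counter_util;

package body counter_util is
  function wrap_inc(count : natural; modulus : positive) return natural is
  begin
    if count < modulus - 1 then
      return count + 1;
    else
      return 0;  -- wrap around without using the mod operator
    end if;
  end function wrap_inc;
end package body counter_util;
```

With that package visible, the increment would read something like TimeslotCount := wrap_inc(TimeslotCount, TIMESLOTS_PER_FRAME);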

KJ
 
Andy

A non-equal comparison (such as "<") usually involves some sort of subtraction
hardware (assuming your target has fast, built-in arithmetic carry
logic), whereas your counter uses an adder (incrementor).

If you can live with a decrementing timeslotcount, it is more efficient
to do the following (this works with integers/naturals, NOT
vector-based arithmetic, e.g. slv, unsigned, etc.) where the comparison
and decrementer hardware are shared.

if timeslotcount - 1 < 0 then -- extract the carry bit
timeslotcount <= timeslots_per_frame - 1;
else
timeslotcount <= timeslotcount - 1; -- shared with comparison op above
end if;

Integer/natural operations are evaluated as signed integers (32 bits in
most tools), so even though timeslotcount cannot be < 0, the expression
timeslotcount - 1 can be. Range limits are checked on assignment, not
during evaluation.

Synthesis optimizes out the unused bits to yield a minimal size
decrementor with one extra bit for the carry/borrow. No storage is used
for that bit.
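
For anyone who wants to try this, here is a minimal self-contained sketch of the decrementing counter described above. The entity name, the generic default and the port names are assumptions made for illustration; only the if/else in the clocked process comes from the post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative wrapper; entity and port names are assumptions.
entity timeslot_down_counter is
  generic (TIMESLOTS_PER_FRAME : positive := 20);
  port (
    clk   : in  std_logic;
    count : out natural range 0 to TIMESLOTS_PER_FRAME - 1
  );
end entity timeslot_down_counter;

architecture rtl of timeslot_down_counter is
  signal timeslotcount : natural range 0 to TIMESLOTS_PER_FRAME - 1 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- The expression is of type integer, so it may go negative here
      -- even though the signal itself is constrained to be >= 0.
      if timeslotcount - 1 < 0 then
        timeslotcount <= TIMESLOTS_PER_FRAME - 1;  -- reload
      else
        timeslotcount <= timeslotcount - 1;        -- same decrement as the test
      end if;
    end if;
  end process;

  count <= timeslotcount;
end architecture rtl;
```

Whether the tool really shares the decrementer between the test and the assignment depends on the synthesizer, so it is worth checking the resource report.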

There are ways to code this using vector types/arithmetic, but they are
not very pretty.

Integers simulate much faster too.


Andy
 
jens

Andy wrote:
> A non-equal comparison usually involves some sort of subtraction
> hardware (assuming your target has fast, built-in arithmetic carry
> logic), whereas your counter uses an adder (incrementor).

Just out of curiosity, why not use an equality comparison instead?
i.e.

if TimeslotCount = (TIMESLOTS_PER_FRAME-1) then
TimeslotCount := 0;
else
TimeslotCount := TimeslotCount + 1;
end if;

or

if TimeslotCount = 0 then
TimeslotCount := (TIMESLOTS_PER_FRAME-1);
else
TimeslotCount := TimeslotCount - 1;
end if;

The intention is very clear, it works with most types, and it
synthesizes nicely.
 
Andy

For a constant timeslots_per_frame, an equal comparison would work as
well functionally, and may be faster than a subtraction, but will
require more logic and routing (when the subtractor for the comparison
is shared with that for the decrement).

One should also beware that an up counter that tests to the limit
reacts faster to changes in the limit than a down counter that reloads
the new limit only after it has finished counting down from the old
limit.

For non-constant limits, non-equal comparisons are usually best if
there is any chance of the count skipping over the limit (or the limit
skipping under the count).
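
To illustrate the non-constant case, here is a small sketch of an up counter whose limit is an input. The entity name, MAX_LIMIT generic and port names are all assumptions for the example. With ">=" the counter still wraps if limit is lowered below the current count, whereas an "=" test would miss the terminal count and keep counting.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Illustrative only; names and the MAX_LIMIT generic are assumptions.
entity limit_counter is
  generic (MAX_LIMIT : positive := 255);
  port (
    clk   : in  std_logic;
    limit : in  natural range 0 to MAX_LIMIT;  -- terminal count, may change at run time
    count : out natural range 0 to MAX_LIMIT
  );
end entity limit_counter;

architecture rtl of limit_counter is
  signal cnt : natural range 0 to MAX_LIMIT := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if cnt >= limit then  -- also true when limit has dropped below cnt
        cnt <= 0;
      else
        cnt <= cnt + 1;     -- cnt < limit <= MAX_LIMIT, so this stays in range
      end if;
    end if;
  end process;

  count <= cnt;
end architecture rtl;
```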

Andy
 
jens

Andy wrote:
> For a constant timeslots_per_frame, an equal comparison would work as
> well functionally, and may be faster than a subtraction, but will
> require more logic and routing (when the subtractor for the comparison
> is shared with that for the decrement).

Please pardon my confusion, but I don't understand why a subtraction
with an equality comparison will require more resources than a
subtraction with a less-than comparison that shares its subtraction
with the count function. So I ran the different techniques through
the Xilinx tools:

I used the following constant/signal declarations (not a power of 2):

constant TIMESLOTS_PER_FRAME: natural := 123;
signal TimeslotCount: natural range 0 to TIMESLOTS_PER_FRAME-1;

option 1:
if rising_edge(clk) then
if TimeslotCount = 0 then
TimeslotCount <= (TIMESLOTS_PER_FRAME-1);
else
TimeslotCount <= TimeslotCount - 1;
end if;
end if;

option 2:
if rising_edge(clk) then
if TimeslotCount = (TIMESLOTS_PER_FRAME-1) then
TimeslotCount <= 0;
else
TimeslotCount <= TimeslotCount + 1;
end if;
end if;

option 3:
if rising_edge(clk) then
if TimeslotCount - 1 < 0 then -- extract the carry bit
TimeslotCount <= (TIMESLOTS_PER_FRAME-1);
else
TimeslotCount <= TimeslotCount - 1; -- shared with comparison op above
end if;
end if;


Every one of those techniques resulted in the following utilization:

Number of Slices:             6 out of  768   0%
Number of Slice Flip Flops:   7 out of 1536   0%
Number of 4 input LUTs:      11 out of 1536   0%


For counters that require multiple preload inputs (like FSK encoders,
for example), a mux-friendly architecture (like Actel) should work
great with option #1 (though experience has shown that the Actel VHDL
compiler is too stupid to actually do that and fit something in a
reasonably sized part, so reverting back to schematics was necessary).
 
Andy

Jens,

Interesting results. What were your constraints? As fast as possible,
or a slow enough clock that it could seek a more space-efficient
implementation? As I implied above, options 1 and 2 have the
potential to run faster but be larger (due to the parallel decoding of
the terminal count), whereas option 3 requires only one LUT beyond
what is required to count, assuming they implement it intelligently
(and share the decrementors). Width of the counter may make a
difference here too, as small counters are often implemented without
the built-in carry logic, actually making them faster. 7 bits would be
on the edge of where I would suppose a parallel vs ripple carry
implementation would win.

I have talked to Synplicity about automatically optimizing your option
#1 into #3, so long as performance allows. In the past, I have run all
three through Synplicity and FCII, and option 3 was smallest in both.
But that was a while back. I'll have to run it again to see how they
handle it now.
From memory, option 3 (and option 1, if the synthesis tool is smart
enough to see it) should take approx 8 LUTs and 7 flops each. Option 2
should take 11 LUTs (8+3) and 7 flops, which is what it did.

Note that Synplify, and maybe others, are smart enough on all options
to only implement reloading logic on those bits that are set
differently from the next count.

Thanks for your inputs,

Andy
 
Andrew FPGA

Thanks for the comments guys. Wrapping into a function seems quite
nice. I wonder about writing a function to overload the mod operation
too.

Actually I looked up the mod operation in numeric_std and it surprised
me there was no "mod" function with both arguments as natural. I guess
that means there is an implicit type conversion from natural to
signed/unsigned happening for one of my mod arguments.

Here are the mod operators declared in mti_numeric_std.vhd (i.e.
ModelSim's implementation of numeric_std):
function "mod" (L, R: UNSIGNED) return UNSIGNED;
function "mod" (L, R: SIGNED) return SIGNED;
function "mod" (L: UNSIGNED; R: NATURAL) return UNSIGNED;
function "mod" (L: NATURAL; R: UNSIGNED) return UNSIGNED;
function "mod" (L: SIGNED; R: INTEGER) return SIGNED;
function "mod" (L: INTEGER; R: SIGNED) return SIGNED;

Regards
Andrew
 
Andy

Natural is a subtype of integer, and mod is defined in the language for
integer. Notice that the package does not define mod for integer &
integer either. Any operator defined for a type will work on any
subtype of that type.
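
A quick simulation-only sketch to show this (the entity name mod_demo and the values are made up for the example). No numeric_std is needed, since the mod used here is the one predefined for integer:

```vhdl
-- Simulation-only demo; not intended for synthesis.
entity mod_demo is
end entity mod_demo;

architecture sim of mod_demo is
begin
  process
    constant TIMESLOTS_PER_FRAME : natural := 20;
    variable count : natural range 0 to TIMESLOTS_PER_FRAME - 1 := 19;
  begin
    -- Predefined integer "mod" applied to natural operands;
    -- (19 + 1) mod 20 wraps back to 0.
    count := (count + 1) mod TIMESLOTS_PER_FRAME;
    assert count = 0 report "unexpected wrap result" severity failure;
    wait;  -- stop the process
  end process;
end architecture sim;
```

This only shows that the language accepts it; whether a given synthesizer implements mod for a non power of 2 is a separate question, as the XST error in the first post shows.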

Note also that, at one time (I don't know if it is still true),
Synopsys did not support synthesis of mod & / operators on
numeric_std.unsigned (even with integral powers of two). It was about
that time that I quit using their tools for the last time, in favor of
Synplicity. I also started using naturals/integers for numeric values,
and have never looked back.

I would not overload the mod operator directly, since many synthesis
tools recognize it, assume it is the standard modulo operation
(regardless of type), and implement it (or not) accordingly.

Andy
 
