combinatorial feedback loop

Ken Cecka · Mar 11, 2009

I'm playing around with different ways to count the number of '1' bits in a vector. I'll only be dealing with 6 bit vectors, so I could just do this with a lookup table, but I started with the most readable implementation to see how it would synthesize (copied below).

The Xilinx tools are synthesizing this into a single counter with a feedback loop, and then telling me me that the design can run at 700MHz. I'm a little skeptical - anyone know if it's reasonable to expect timing analysis to unroll a feedback loop like this?

Ken

ENTITY test IS
GENERIC
(
bits : INTEGER := 6
);
PORT
(
reset : IN STD_LOGIC;
clk : IN STD_LOGIC;
input : IN STD_LOGIC_VECTOR((bits - 1) DOWNTO 0);
ones : OUT STD_LOGIC_VECTOR(log2(bits) DOWNTO 0)
);
END test;

ARCHITECTURE model OF test IS

SIGNAL input_l : STD_LOGIC_VECTOR(input'RANGE);
SIGNAL count : STD_LOGIC_VECTOR(ones'RANGE);

BEGIN

-- latch inputs and outputs for timing analysis
PROCESS (reset, clk)
BEGIN
IF (reset = '1') THEN
input_l <= (OTHERS => '0');
ones <= (OTHERS => '0');
ELSIF (clk'EVENT) AND (clk = '1') THEN
input_l <= input;
ones <= count;
END IF;
END PROCESS;

-- count ones
PROCESS (input_l, count)
BEGIN
FOR i IN input_l'RANGE LOOP
IF (input_l(i) = '1') THEN
count <= count + 1;
END IF;
END LOOP;
END PROCESS;

END;

Ken Cecka · Mar 11, 2009

Jonathan said:
I'm playing around with different ways to count the number of '1'
bits in a vector. I'll only be dealing with 6 bit vectors, so
I could just do this with a lookup table, but I started with
the most readable implementation to see how it would
synthesize (copied below).

The Xilinx tools are synthesizing this into a single
counter with a feedback loop, and then telling me
that the design can run at 700MHz. I'm a little
skeptical - anyone know if it's reasonable to expect
timing analysis to unroll a feedback loop like this?

Ken [...]
-- count ones
PROCESS (input_l, count)
BEGIN
FOR i IN input_l'RANGE LOOP
IF (input_l(i) = '1') THEN
count <= count + 1;
END IF;
END LOOP;
END PROCESS;

Click to expand...

OUCH! You really, really don't want to do this!

I suggest that your s[ck]epticism is entirely
justified - there's no way that makes any sense.
Indeed, a counter with combinational feedback
makes no sense as a design - it isn't coherent.

Of course you can re-code the loop with a variable:

PROCESS (input_l)
variable count: unsigned...
BEGIN
FOR i IN input_l'RANGE LOOP
IF (input_l(i) = '1') THEN
count := count + 1;
END IF;
END LOOP;
count_signal <= count;
END PROCESS;

and now synthesis will properly unroll the for-loop
(creating a chain of adders) and there is no
combinational feedback. I'm guessing you know
a load of other ways of doing it, too.

I was expecting it to unroll into a chain of adders in the first place and was surprised when it used feedback, but didn't stop think about the underlying adder implementation and whether this could even work. Thanks for the reality check!

Ken

Andy · Mar 11, 2009

Of course you can re-code the loop with a variable:

PROCESS (input_l)
variable count: unsigned...
BEGIN
FOR i IN input_l'RANGE LOOP
IF (input_l(i) = '1') THEN
count := count + 1;
END IF;
END LOOP;
count_signal <= count;
END PROCESS;

Don't forget to initialize count to zero before the loop...

Andy

kennheinrich · Mar 11, 2009

I'm playing around with different ways to count the number of '1' bits in a vector. I'll only be dealing with 6 bit vectors, so I could just do this with a lookup table, but I started with the most readable implementation to see how it would synthesize (copied below).

The Xilinx tools are synthesizing this into a single counter with a feedback loop, and then telling me me that the design can run at 700MHz. I'm a little skeptical - anyone know if it's reasonable to expect timing analysis to unroll a feedback loop like this?

Ken

ENTITY test IS
GENERIC
(
bits : INTEGER := 6
);
PORT
(
reset : IN STD_LOGIC;
clk : IN STD_LOGIC;
input : IN STD_LOGIC_VECTOR((bits - 1) DOWNTO 0);
ones : OUT STD_LOGIC_VECTOR(log2(bits) DOWNTO 0)
);
END test;

ARCHITECTURE model OF test IS

SIGNAL input_l : STD_LOGIC_VECTOR(input'RANGE);
SIGNAL count : STD_LOGIC_VECTOR(ones'RANGE);

BEGIN

-- latch inputs and outputs for timing analysis
PROCESS (reset, clk)
BEGIN
IF (reset = '1') THEN
input_l <= (OTHERS => '0');
ones <= (OTHERS => '0');
ELSIF (clk'EVENT) AND (clk = '1') THEN
input_l <= input;
ones <= count;
END IF;
END PROCESS;

-- count ones
PROCESS (input_l, count)
BEGIN
FOR i IN input_l'RANGE LOOP
IF (input_l(i) = '1') THEN
count <= count + 1;
END IF;
END LOOP;
END PROCESS;

END;

There are also lots of nifty ways to count ones that the software guys
like to use. I haven't though through them enough to say whether they
merely represent a way to reclaim the inherent inefficiency of
launching a big ole 32-bit ALU op (the minimum unit of work in the
software pipeline) for each measly one-bit, or whether you could
exploit some of the insights to make faster gate-level hardware. If
you want insane speed, I'm not sure how you could beat a (possibly
pipelined) log adder tree from an algorithmic viewpoint.

Random google hit from "count the number of ones":

http://gurmeetsingh.wordpress.com/2008/08/05/fast-bit-counting-routines/

There is also some interesting work on fast decoding of VLC codes
(MPEG video for example) using combinatorial decoding of parts of the
input word to switch muxes to re-align the remainder of the input word
to a successive decode stage. If you had any indication of the
statistics (and were willing to play the statistics game, which is not
always an option), you might be able to exploit the same idea, say for
fast lookahead carry generation of halves or quarters of your input
width.

I realize this is a slight tangent to your original question,
though... sorry

- Kenn

Ken Cecka · Mar 11, 2009

There are also lots of nifty ways to count ones that the software guys
like to use. I haven't though through them enough to say whether they
merely represent a way to reclaim the inherent inefficiency of
launching a big ole 32-bit ALU op (the minimum unit of work in the
software pipeline) for each measly one-bit, or whether you could
exploit some of the insights to make faster gate-level hardware. If
you want insane speed, I'm not sure how you could beat a (possibly
pipelined) log adder tree from an algorithmic viewpoint.

Random google hit from "count the number of ones":

http://gurmeetsingh.wordpress.com/2008/08/05/fast-bit-counting-routines/

There is also some interesting work on fast decoding of VLC codes
(MPEG video for example) using combinatorial decoding of parts of the
input word to switch muxes to re-align the remainder of the input word
to a successive decode stage. If you had any indication of the
statistics (and were willing to play the statistics game, which is not
always an option), you might be able to exploit the same idea, say for
fast lookahead carry generation of halves or quarters of your input
width.

I realize this is a slight tangent to your original question,
though... sorry

- Kenn

I spent a little time looking over some of the software tricks earlier today. They seem to be mostly geared around minimizing serial instructions.

The specific case I'm addressing is counting ones in a relatively small vector where I want to do it in a single clock cycle. Interestingly, I tried coding it with a chain of adders and as a lookup table, and the Xilinx compiler faithfully implemented this in the RTL schematic (adder chain and a ROM respectively), but they both reduced to the exact same logic in the technology schematic.

Ken

LFSR doesn't generate random values during simulation	1	Aug 4, 2017
Query related to division of 1.122 in mentioned code.	0	Mar 21, 2017
pls help me ; vhdl;	0	Jun 21, 2016
Illegal sequential statement (VHDL)	0	Feb 17, 2017
Binary to BCD code understanding	0	Dec 27, 2021
erreur VHDL	3	May 6, 2014
Is my port map syntax right???? i am not able to find the proper guidance for the error in any book	0	Mar 19, 2016
What does 'genbin' and 'addbins' in OSVVM mean?	0	Feb 20, 2022

combinatorial feedback loop

Ken Cecka

Ken Cecka

Andy

kennheinrich

Ken Cecka

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads