Steve said:
I am trying to implement a simple circuit that counts the number of 1's in a
16 or 32 bit register.
I am fairly new to VHDL and have implemented it the following way
for i in 15 downto 0 loop
if (Reg(i) = '1') then
Bitcount := BitCount + 1;
end if;
end loop;
in synthesizing this code, i found that this generated numerous 16 bit
adders. is there a better way to implement consider the solution can be
stored in only 4 bits?
Steve
A few years ago I needed a circuit to do the same thing and looked at a
number of alternatives. Basically, what a circuit like this needs to do
is to add n 1-bit numbers. However, if you just start cascading a bunch
of 1-bit full adders you will end up with some redundant logic and a
number of untestable faults. The reason is that not all bit
combinations are possible in the final result. For example, for 16-bit
input vector you need a 5-bit vector to hold the result. But, only the
combinations of 00000 to 10000 (0-16) are possible. The remaining
combinations, 10001 to 11111 (17-31) can never occur. The same thing
applies to the intermediate stages leading to the final result. I spent
some time studying how to optimize this type of circuit and came up with
2, 4, 8, and 16-bit solutions that use optimized adders at each stage
that handle only the bit combinations that can possibly occur and none
others.
Here are my 8-bit and 16-bit summing networks. It may look like a lot
of equations, but the results are optimal with no redundancies and are
100% testable.
-- This macro outputs a binary count of the number of input bits that
-- are active at one time. It effectively sums 8 1-bit numbers.
LIBRARY IEEE;
USE IEEE.Std_Logic_1164.all;
ENTITY sum8 IS
PORT(a : IN Std_uLogic_vector(1 TO 8);
sum : OUT Std_uLogic_vector(3 downto 0) );
END ENTITY;
ARCHITECTURE beh OF sum8 IS
BEGIN
PROCESS(a)
VARIABLE s2aa, s2ab, s2ba, s2bb, s2ca, s2cb, s2da, s2db,
s4aa, s4ab, s4ac, s4ba, s4bb, s4bc, s8aaa
: std_logic;
VARIABLE s8 : Std_uLogic_vector(3 downto 0);
BEGIN
-- first level summing networks
s2aa := a(1) XOR a(2);
s2ab := a(1) AND a(2);
s2ba := a(3) XOR a(4);
s2bb := a(3) AND a(4);
s2ca := a(5) XOR a(6);
s2cb := a(5) AND a(6);
s2da := a(7) XOR a(8);
s2db := a(7) AND a(8);
-- second level summing networks
s4aa := s2aa XOR s2ba;
s4ab := (s2aa AND s2ba) OR (s2ab XOR s2bb);
s4ac := s2ab AND s2bb;
s4ba := s2ca XOR s2da;
s4bb := (s2ca AND s2da) OR (s2cb XOR s2db);
s4bc := s2cb AND s2db;
-- third level summing networks
s8(0) := s4aa XOR s4ba;
s8aaa := s4aa AND s4ba;
s8(1) := s8aaa XOR (s4ab XOR s4bb);
s8(2) := (s8aaa AND (s4ab OR s4bb)) OR (s4ab AND s4bb) OR (s4ac XOR
s4bc);
s8(3) := s4ac AND s4bc;
sum <= s8;
END PROCESS;
END ARCHITECTURE beh;
-- This macro outputs a binary count of the number of input bits that
-- are active at one time. It effectively sums 16 1-bit numbers.
LIBRARY IEEE;
USE IEEE.Std_Logic_1164.all;
ENTITY sum16 IS
PORT(a : IN Std_uLogic_vector(1 TO 16);
sum : OUT Std_uLogic_vector(4 downto 0) );
END ENTITY;
ARCHITECTURE beh OF sum16 IS
BEGIN
PROCESS(a)
VARIABLE s2aa, s2ab, s2ba, s2bb, s2ca, s2cb, s2da, s2db, s2ea, s2eb,
s2fa, s2fb, s2ga, s2gb, s2ha, s2hb,
s4aa, s4ab, s4ac, s4ba, s4bb, s4bc, s4ca, s4cb, s4cc, s4da, s4db,
s4dc,
s8aa, s8aaa, s8ab, s8ac, s8ad, s8ba, s8baa, s8bb, s8bc, s8bd,
s16aa, s16bb
: Std_uLogic;
VARIABLE s16 : Std_uLogic_vector(4 downto 0);
BEGIN
-- first level summing networks
s2aa := a(1) XOR a(2);
s2ab := a(1) AND a(2);
s2ba := a(3) XOR a(4);
s2bb := a(3) AND a(4);
s2ca := a(5) XOR a(6);
s2cb := a(5) AND a(6);
s2da := a(7) XOR a(8);
s2db := a(7) AND a(8);
s2ea := a(9) XOR a(10);
s2eb := a(9) AND a(10);
s2fa := a(11) XOR a(12);
s2fb := a(11) AND a(12);
s2ga := a(13) XOR a(14);
s2gb := a(13) AND a(14);
s2ha := a(15) XOR a(16);
s2hb := a(15) AND a(16);
-- second level summing networks
s4aa := s2aa XOR s2ba;
s4ab := (s2aa AND s2ba) OR (s2ab XOR s2bb);
s4ac := s2ab AND s2bb;
s4ba := s2ca XOR s2da;
s4bb := (s2ca AND s2da) OR (s2cb XOR s2db);
s4bc := s2cb AND s2db;
s4ca := s2ea XOR s2fa;
s4cb := (s2ea AND s2fa) OR (s2eb XOR s2fb);
s4cc := s2eb AND s2fb;
s4da := s2ga XOR s2ha;
s4db := (s2ga AND s2ha) OR (s2gb XOR s2hb);
s4dc := s2gb AND s2hb;
-- third level summing networks
s8aa := s4aa XOR s4ba;
s8aaa := s4aa AND s4ba;
s8ab := s8aaa XOR (s4ab XOR s4bb);
s8ac := (s8aaa AND (s4ab OR s4bb)) OR (s4ab AND s4bb) OR (s4ac XOR
s4bc);
s8ad := s4ac AND s4bc;
s8ba := s4ca XOR s4da;
s8baa := s4ca AND s4da;
s8bb := s8baa XOR (s4cb XOR s4db);
s8bc := (s8baa AND (s4cb OR s4db)) OR (s4cb AND s4db) OR (s4cc XOR
s4dc);
s8bd := s4cc AND s4dc;
-- fourth level summing network
s16(0) := s8aa XOR s8ba;
s16aa := s8aa AND s8ba;
s16(1) := s16aa XOR (s8ab XOR s8bb);
s16bb := (s16aa AND (s8ab OR s8bb)) OR (s8ab AND s8bb);
s16(2) := s16bb XOR (s8ac XOR s8bc);
s16(3) := (s16bb AND (s8ac OR s8bc)) OR (s8ac AND s8bc) OR (s8ad XOR
s8bd);
s16(4) := s8ad AND s8bd;
sum <= s16;
END PROCESS;
END ARCHITECTURE beh;
(I originally came up with these before I ever started using VHDL. With
some clever application of some nested VHDL Procedures the coding could
obviously be simplified. This was just a brute-force conversion.)
Charles