Problem optimizing VHDL code

ashu

Hi,

My code below uses far too many logic cells (about 600), and the maximum
operating frequency is only 7 MHz, which I think is very low.
Can you please suggest a remedy?


library ieee ;
use ieee.std_logic_1164.all ;
use ieee.std_logic_arith.all ;


entity test is

port ( clk,sta_in : in std_logic ;
sta_out : out std_logic ;
sel : in bit_vector ( 2 downto 0 ) ;
data_in : in integer range -2047 to 2047 ;
data_out : out integer range -127 to 127 );

end test ;

architecture a of test is


begin

process( clk,sta_in )

variable mul : integer range -65535 to 65535 ;
variable mul1 : integer range -127 to 127 ;
variable s : std_logic_vector(0 to 16 ) ;
variable b,b1 : bit_vector ( 0 to 16 ) ;
variable b2 : bit_vector ( 0 to 7 ) ;
variable s2 : std_logic_vector ( 0 to 7 ) ;

begin


if ( clk 'event and clk = '1' ) then

if ( sta_in = '1' ) then

mul := 32 * data_in ;

case sel is

when "000" =>

mul := mul ;

when "001" =>
mul := mul / 5 ;

mul := mul * 4 ;
when "010" =>

mul := mul / 5 ;
mul := mul * 3 ;

when "011" =>

mul := mul / 2 ;

when "100" =>

mul := mul / 5 ;
mul := mul * 2 ;

when "101" =>

mul := mul / 5 ;

when "110" =>

mul := mul / 10 ;

when "111" =>

mul := mul / 20 ;

end case ;

s := conv_std_logic_vector( mul ,17 ) ;

b := to_bitvector(s) ;

b1 := b srl 9 ;
s := to_stdlogicvector( b1 ) ;

mul1 := conv_integer ( signed(s) ) ;

if ( b (8)= '1' ) then


mul1 := mul1 + 1 ;

else

mul1 := mul1 ;

end if ;

data_out <= mul1 ;

sta_out <= sta_in ;

else


sta_out <= '0' ;


end if ;
end if ;
end process ;
end a ;



The code works logically and produces the required outputs, but the
timing analyzer reports a maximum delay of 130 ns, which limits the
maximum frequency to 7 MHz.
Should I try some other conversion functions? Please let me know.


ashwani anand
 
KJ

Probably has to do with your dividers. I see the following at least

mul := mul / 5 ;
mul := mul / 10 ;
mul := mul / 20 ;

These are 'expensive' to implement in terms of logic resources and
performance. It can be done, obviously, but you pay a price for it.
Consider looking into the lpm_divide component and see if it will work
for your application. It implements a divider; the tradeoff is that it
takes several clock cycles for the output to become valid, but what you
get is something that will work at a much higher clock frequency.
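
A minimal sketch of what such an instantiation might look like, assuming an Altera/Intel tool flow where the lpm library and its lpm_components package are available. The wrapper entity div_by_5, the 17-bit width, and the pipeline depth of 4 are illustrative assumptions, not taken from the original design; check the generics against your tool's LPM documentation before relying on them.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library lpm;
use lpm.lpm_components.all;

-- Hypothetical wrapper: a pipelined signed divide by the constant 5.
entity div_by_5 is
  port (
    clk      : in  std_logic;
    numer    : in  std_logic_vector(16 downto 0);  -- e.g. 32 * data_in
    quotient : out std_logic_vector(16 downto 0)
  );
end entity div_by_5;

architecture rtl of div_by_5 is
  -- +5 as a 4-bit signed value
  constant FIVE : std_logic_vector(3 downto 0) := "0101";
begin
  u_div : lpm_divide
    generic map (
      lpm_widthn          => 17,
      lpm_widthd          => 4,
      lpm_nrepresentation => "SIGNED",
      lpm_drepresentation => "SIGNED",
      lpm_pipeline        => 4          -- assumed latency in clock cycles
    )
    port map (
      clock    => clk,
      numer    => numer,
      denom    => FIVE,
      quotient => quotient,
      remain   => open
    );
end architecture rtl;
```

With a pipeline depth set this way, the quotient lags the input by that many clock cycles, so sta_out would need to be delayed by the same amount.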

KJ
 
jens

In addition to KJ's suggestions...

Examine the tradeoffs between various architectures and verify what the
tools are doing. For example, with the dividers you can use one divider
with a selectable divisor or multiple dividers with fixed divisors; try
both to see which one works better.

You can replace the division with a multiplication and a shift, e.g. to
divide by 5, multiply by 2^N/5, then shift right by N bits (larger
values of N will give more accurate results).
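
As a rough sketch of the multiply-and-shift idea for the divide-by-5 case, assuming numeric_std signed vectors, N = 16, and a 17-bit input wide enough for 32 * data_in. The entity name div5_approx, the port names, and the two register stages are illustrative choices, not part of the original code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical block: q is approximately x / 5, two clock cycles later.
entity div5_approx is
  port (
    clk : in  std_logic;
    x   : in  signed(16 downto 0);
    q   : out signed(16 downto 0)
  );
end entity div5_approx;

architecture rtl of div5_approx is
  constant N     : natural := 16;
  -- round(2**16 / 5) = 13107, which fits in 15 signed bits
  constant RECIP : signed(14 downto 0) := to_signed(13107, 15);
  signal prod : signed(31 downto 0);  -- 17 + 15 bits
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: multiply by the scaled reciprocal
      prod <= x * RECIP;
      -- Stage 2: arithmetic shift right by N, then drop the unused top bits
      q <= resize(shift_right(prod, N), 17);
    end if;
  end process;
end architecture rtl;
```

This is an approximation, not an exact divider: with N = 16, x = 5 yields 0 rather than 1. The arithmetic shift also rounds negative values toward minus infinity, whereas integer "/" in VHDL truncates toward zero, so results for negative inputs can differ from the original code.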

Most of the multiplications are by powers of 2; try shifting instead.

The multiplication by 3 could be replaced by two additions, which could
either be two adders or one pipelined adder.
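
One way to picture the multiply-by-3 replacement is as a shift plus a single addition (2*x + x), a close relative of the two-addition form. This is a sketch assuming numeric_std signed vectors; the entity name times3 and the widths are illustrative assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical combinational block computing 3 * x without a multiplier.
entity times3 is
  port (
    x : in  signed(16 downto 0);
    y : out signed(18 downto 0)  -- two extra bits so 3 * x cannot overflow
  );
end entity times3;

architecture rtl of times3 is
begin
  -- 3 * x = 2 * x + x; the multiply by 2 is a left shift by one bit
  y <= shift_left(resize(x, 19), 1) + resize(x, 19);
end architecture rtl;
```

A register after the adder would turn this into a pipeline stage if the timing needs it.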

Good luck!
 
Mike Treseler

ashu said:
My code below uses far too many logic cells (about 600), and the maximum
operating frequency is only 7 MHz, which I think is very low.
Can you please suggest a remedy?

Put your design on an RTL viewer and you
will see the problem. Five large blocks
of combinational logic between the
input pins and the output register.

I will assume your device has no DSP blocks.
To increase Fmax you can infer registers
after each block of logic. For example,
reading the variable s before you assign it
gives you one pipeline register:
end case ;
-- s := conv_std_logic_vector( mul ,17 ) ;   (original position)
b := to_bitvector(s) ;
s := conv_std_logic_vector( mul ,17 ) ;

But don't reuse the identifier s further down; declare s1:
b1 := b srl 9 ;
s1 := to_stdlogicvector( b1 ) ;        -- was s
mul1 := conv_integer ( signed(s1) ) ;  -- was s

Inferring registers costs you nothing here
as they are presently bypassed in your
combinational blocks.

To make this design easier to understand,
consider changing to signed and unsigned
vectors of generic widths and the ieee.numeric_std
functions.
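
A small sketch of the read-before-assign idea using numeric_std, under the assumption that a variable read before it is written in a clocked process holds last cycle's value and so infers a register. The entity pipe_demo, its ports, and the stage names are hypothetical and only stand in for the shift-by-9 step of the original code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical demo: three register stages from one clocked process.
entity pipe_demo is
  port (
    clk : in  std_logic;
    a   : in  signed(16 downto 0);
    y   : out signed(7 downto 0)
  );
end entity pipe_demo;

architecture rtl of pipe_demo is
begin
  process (clk)
    variable stage1 : signed(16 downto 0);
    variable stage2 : signed(7 downto 0);
  begin
    if rising_edge(clk) then
      -- Each variable is read before it is assigned, so each read sees
      -- the value stored on the previous clock edge (a register).
      y      <= stage2;
      stage2 := resize(shift_right(stage1, 9), 8);
      stage1 := a;
    end if;
  end process;
end architecture rtl;
```

Swapping the order of the three statements so that each variable is assigned before it is read would collapse the stages back into one combinational path feeding the output register.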

-- Mike Treseler
 
gj

Hi Ashu,

Instead of performing one large multiplication, it is better to implement
the multiplication and division blocks as components.
Build the multiplication and division blocks for small data widths (say
4 * 4 bits),
then instantiate those blocks in your code as required.
This will reduce your logic cell count and make the design easier to synthesize.

Regards
GJ
 
Regarding bidirectional port

Hello,

I have written a simple ALU and accumulator structure, shown in the code below. My problem is that when I simulate it I get a bus conflict: DATABUS is an inout port, and when I try to put data back onto the data bus it shows the undefined value 'X'. Can anyone help me with this problem? I am using Xilinx tools for the VHDL code and simulating in ModelSim.




library IEEE;
use IEEE.STD_LOGIC_1164.all;

package MP_PACK is

-- Type declarations:

type OPTYPE is (NOP,LDAUNM,ADD,SUB);--OPCODE
end MP_PACK;

top level

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use WORK.MP_PACK.ALL;

entity ALU_AKKU is

generic(DELAY : time := 10 ns);

port ( CLK ,RESET : in bit;
DATABUS : inout std_logic_vector(3 downto 0);
OPCODE : in OPTYPE;
LOADDABUS : in bit;
LOADACBUS : in bit
);

end ALU_AKKU;

architecture Behavioral of ALU_AKKU is

signal ACBUS,ALUOUT : std_ulogic_vector(3 downto 0);

component ALU_ENT

generic(DELAY : time:= 10 ns);
port( DATABUS : in std_ulogic_vector(3 downto 0):= (others=>'0');
ACBUS : in std_ulogic_vector(3 downto 0):= (others=>'0');
OPCODE : in OPTYPE;
ALUOUT : out std_ulogic_vector(3 downto 0):= (others=>'0'));

end component ALU_ENT;

component AKKU_ENT

generic(DELAY : time := 10 ns);
port( CLK : in bit;
RESET : in bit;
ALUOUT : in std_ulogic_vector(3 downto 0):= (others=>'0');
DATABUS : out std_logic_vector(3 downto 0):= (others=>'0');
ACBUS : out std_ulogic_vector(3 downto 0):= (others=>'0');
LOADDABUS : in bit:='0';
LOADACBUS : in bit:='0'
);

end component AKKU_ENT;


begin

COMP_1 : ALU_ENT

generic map (10 ms)
port map(to_stdulogicvector(DATABUS),ACBUS,OPCODE,ALUOUT);

COMP_2 : AKKU_ENT

generic map (10 ms)
port map(CLK,RESET,ALUOUT,DATABUS,ACBUS,LOADDABUS,LOADACBUS);



end Behavioral;

ALU
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use WORK.MP_PACK.ALL;



entity ALU_ENT is

generic(DELAY : time:= 10 ns);
port( DATABUS : in std_ulogic_vector(3 downto 0):= (others=>'0');
ACBUS : in std_ulogic_vector(3 downto 0):= (others=>'0');
OPCODE : in OPTYPE;
ALUOUT : out std_ulogic_vector(3 downto 0):= (others=>'0'));
end ALU_ENT;

architecture ALU_ARCH of ALU_ENT is
begin
ALU_PROCESS:process(DATABUS,ACBUS,OPCODE)

variable ZWERG : std_logic_vector(3 downto 0):= (others=>'0');
begin

ZWERG := (others =>'0');

case OPCODE is

when LDAUNM => ZWERG := to_stdlogicvector(DATABUS);
when ADD => ZWERG := to_stdlogicvector(DATABUS) + to_stdlogicvector(ACBUS);
when others => null;

end case;
ALUOUT <= to_stdulogicvector(ZWERG);

end process ALU_PROCESS;

end ALU_ARCH;

Accumulator

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;



entity AKKU_ENT is

generic(DELAY : time := 10 ns);

port( CLK : in bit;
RESET : in bit:='0';
ALUOUT : in std_ulogic_vector(3 downto 0):="0000";
DATABUS : out std_logic_vector(3 downto 0):="0000";
ACBUS : out std_ulogic_vector(3 downto 0):="0000";
LOADDABUS : in bit:='0';
LOADACBUS : in bit:='0'
);

end AKKU_ENT;

architecture Behavioral of AKKU_ENT is

signal ACCU_INTERN : std_ulogic_vector(3 downto 0):="0000";

begin

DATABUS_LOAD:process(LOADDABUS,ACCU_INTERN)

begin

if LOADDABUS = '1' then

DATABUS <= to_stdlogicvector(ACCU_INTERN) ;

else

DATABUS <= (others=>'Z') ;

end if;

end process DATABUS_LOAD;


AKKU_LOAD: process(CLK,RESET) is

begin

if RESET = '1' then

ACCU_INTERN <= (others=>'0') ;

elsif CLK'event and CLK = '1' then

if LOADACBUS = '1' then

ACCU_INTERN <= ALUOUT ;

end if;
end if;

end process AKKU_LOAD;

ACBUS <= ACCU_INTERN ;

end Behavioral;

Test bench

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
use WORK.MP_PACK.ALL;

ENTITY testbench IS
END testbench;

ARCHITECTURE behavior OF testbench IS

-- Component Declaration
COMPONENT ALU_AKKU
port(CLK,RESET : in bit;
DATABUS : inout std_logic_vector(3 downto 0);
OPCODE : in OPTYPE;
LOADDABUS : in bit;
LOADACBUS : in bit
);
END COMPONENT;

SIGNAL OPCODE : OPTYPE;
SIGNAL DATABUS : std_logic_vector(3 downto 0) := (others=>'0');
signal CLK : bit;
signal RESET : bit;
signal LOADDABUS : bit;
signal LOADACBUS : bit;


BEGIN

-- Component Instantiation
uut: ALU_AKKU PORT MAP(
CLK => CLK,
RESET => RESET,
DATABUS => DATABUS,
OPCODE => OPCODE,
LOADDABUS=>LOADDABUS,
LOADACBUS=> LOADACBUS

);
-- Test Bench Statements
ALU_AKKU_CLK: PROCESS
BEGIN
wait for 100 ns;
CLK <= not CLK;

END PROCESS ALU_AKKU_CLK;

ALU_AKKU_PROCESS:process
begin



DATABUS <= b"0001" ;
OPCODE <= LDAUNM;

for I in 1 to 1 loop
wait until CLK'event and CLk='1';
end loop;

LOADACBUS <= '1';


for I in 1 to 2 loop
wait until CLK'event and CLk='1';
end loop;

LOADACBUS <= '0';
DATABUS <= b"0010";
OPCODE <= ADD ;

for I in 1 to 1 loop
wait until CLK'event and CLk='1';

end loop;

LOADACBUS <= '1';
OPCODE <= NOP;

for I in 1 to 2 loop
wait until CLK'event and CLk='0';
end loop;

LOADACBUS <= '0';
LOADDABUS <= '1';

for I in 1 to 1 loop
wait until CLK'event and CLk='1';
end loop;

LOADDABUS <= '0';
wait;


end process ALU_AKKU_PROCESS;
-- End Test Bench
END;
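
One possible explanation, going only by the testbench as posted: the stimulus process is still driving DATABUS with "0010" when it raises LOADDABUS, so the testbench driver and the accumulator driver are active together, and std_logic resolution yields 'X' wherever they disagree. A minimal self-contained sketch of that effect follows; the entity and signal names are hypothetical and the driven values are only illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity bus_conflict_demo is
end entity bus_conflict_demo;

architecture sim of bus_conflict_demo is
  signal databus    : std_logic_vector(3 downto 0);
  signal tb_enable  : boolean := true;   -- testbench is driving the bus
  signal acc_enable : boolean := false;  -- stands in for LOADDABUS = '1'
begin
  -- Driver 1: the testbench side, releasing the bus with 'Z' when disabled
  databus <= "0010" when tb_enable else (others => 'Z');

  -- Driver 2: stand-in for the accumulator's tri-state output
  databus <= "0011" when acc_enable else (others => 'Z');

  stim : process
  begin
    wait for 10 ns;
    assert databus = "0010" report "expected testbench value" severity error;

    acc_enable <= true;   -- both drivers active: bit 0 is '0' against '1'
    wait for 10 ns;
    assert databus = "001X" report "expected a conflict on bit 0" severity error;

    tb_enable <= false;   -- testbench releases the bus
    wait for 10 ns;
    assert databus = "0011" report "expected accumulator value" severity error;

    wait;
  end process stim;
end architecture sim;
```

In the posted testbench, the corresponding change would be to assign DATABUS <= (others => 'Z') in the stimulus process before setting LOADDABUS <= '1', so that only the accumulator drives the bus during the read-back.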
 
