MAC design in VHDL

Discussion in 'VHDL' started by sksaras@hotmail.com, Sep 1, 2006.

  1. Guest

    Hi,
    I want to design a MAC (multiply-accumulator) and have written the
    following code. The problem is that when I run place & route in Xilinx
    ISE 7.1, I get many timing errors (around 30). My clock is 70 MHz.

    -- Note: the library clauses were not part of the original post. The
    -- Synopsys packages below are assumed because the code uses
    -- conv_std_logic_vector and "+" / "*" on std_logic_vector.
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_arith.all;
    use ieee.std_logic_unsigned.all;

    entity mac is
        generic (
            input_width1    : integer := 16;
            input_width2    : integer := 16;
            output_width    : integer := 36;
            mac_cycle_width : integer := 4
        );
        port (
            RESET : IN  STD_LOGIC;
            CLK   : IN  STD_LOGIC;
            FD    : IN  STD_LOGIC;
            ND    : IN  STD_LOGIC;
            A     : IN  STD_LOGIC_VECTOR(input_width1-1 DOWNTO 0);
            B     : IN  STD_LOGIC_VECTOR(input_width2-1 DOWNTO 0);
            Q     : OUT STD_LOGIC_VECTOR(output_width-1 DOWNTO 0);
            RDY   : OUT STD_LOGIC
        );
    end entity mac;

    architecture rtl of mac is

        signal cycle : STD_LOGIC_VECTOR(mac_cycle_width-1 DOWNTO 0);
        signal rdy1  : std_logic;
        signal sum   : STD_LOGIC_VECTOR(output_width-1 DOWNTO 0);
        signal temp2 : STD_LOGIC_VECTOR(output_width-1 DOWNTO 0);
        signal prod  : STD_LOGIC_VECTOR(input_width1 + input_width2 - 1 DOWNTO 0);

    begin

        -- cycle determines the number of MAC accumulations.
        -- FD indicates the start of a new accumulation.
        process(reset, clk)
        begin
            if reset = '1' then
                cycle <= (others => '0');
            elsif (clk'event and clk = '1') then
                if (fd = '1') then
                    cycle <= conv_std_logic_vector(1, cycle'length);
                else
                    cycle <= cycle + '1';
                end if;
            end if;
        end process;

        -- ND indicates that new data is ready at the input.
        -- The two inputs are multiplied and the product is added
        -- to the previous accumulator result.
        -- In the last cycle the accumulator result is given out and at
        -- the same time the accumulator is reset to zeros.
        -- Here the accumulator is the variable temp1.
        -- SUM holds the final accumulated result and temp2 holds the
        -- intermediate results.
        -- Note that this process is clocked on the falling edge of clk.
        process(reset, clk)
            variable temp1 : STD_LOGIC_VECTOR(output_width-1 DOWNTO 0);
        begin
            if (reset = '1') then
                sum   <= (others => '0');
                prod  <= (others => '0');
                temp1 := (others => '0');
            elsif (clk'event and clk = '0') then
                if (nd = '1') then
                    prod  <= A * (B);
                    temp1 := temp1 + prod;
                    if (cycle = conv_std_logic_vector(0, cycle'length)) then
                        sum   <= temp1;
                        temp1 := (others => '0');
                    end if;
                end if;
            end if;
            temp2 <= temp1;
        end process;

        -- Here Q indicates the accumulator output during all cycles.
        -- SUM holds the final accumulated result and temp2 holds the
        -- intermediate results. Both are combined to form Q.
        process(clk)
        begin
            if (clk'event and clk = '1') then
                if (cycle = conv_std_logic_vector(0, cycle'length)) then
                    q <= sum;
                else
                    q <= temp2;
                end if;
            end if;
        end process;

        --q <= sum;

        -- At the end of the MAC cycle, rdy is generated to indicate
        -- that the MAC output is ready.
        process(reset, clk)
        begin
            if (reset = '1') then
                rdy1 <= '0';
            elsif (clk'event and clk = '1') then -- ori '1'
                if (cycle = conv_std_logic_vector(0, cycle'length)) then
                    rdy1 <= '1';
                else
                    rdy1 <= '0';
                end if;
            end if;
        end process;

        rdy <= rdy1;

    end rtl;
    , Sep 1, 2006
    #1

  2. Guest

    Hi all,
    Is there anything wrong with this design? Functionally, when I tested
    it in ModelSim, it produced correct results. Please help.


    , Sep 4, 2006
    #2

  3. Duane Clark Guest

    Have you taken a look at the timing report (*.twr) to see which paths are
    failing? I assume it is the multiply that is failing. Are you using
    a chip with built-in hardware multipliers? Are they being used?
    Duane Clark, Sep 4, 2006
    #3
  4. Guest

    Hi, thanks a lot.
    You are right. When I looked at the post-place & route static timing
    analyzer, most of the errors are in the multiply path.
    I have now replaced the multiply with the Xilinx multiplier core, and the
    errors are reduced to 4, but the slice occupancy has increased a lot.
    Also, I am not supposed to use any cores. Please suggest other ways to
    design the MAC. Or can I modify (optimise) the above code so that the
    errors are reduced?

    , Sep 5, 2006
    #4
  5. Duane Clark Guest

    What chip are you targeting? The Virtex2 and later chips have built in
    hardware multipliers that consume no slices. Is this a homework problem?

    Again, you need to look at the timing report, see what paths are
    breaking, and then determine what to do to fix them.
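
    One thing worth checking in the report is how much time each failing
    path is actually given. At 70 MHz the period is about 14.3 ns, but the
    posted accumulate process is clocked on the falling edge, so any path
    that starts at a rising-edge register and ends at a falling-edge one
    (or the reverse) only gets about 7.1 ns. A common way to relax this is
    to use one clock edge throughout and put a register after the multiply.
    The following is a minimal sketch of that idea, not a drop-in
    replacement for the posted code: the entity name mac_pipelined and the
    delayed control signals (nd_d1, fd_d1, ...) are made up for
    illustration, unsigned operands are assumed, and the cycle counter and
    RDY logic are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; port names follow the original post.
entity mac_pipelined is
    generic (
        input_width1 : integer := 16;
        input_width2 : integer := 16;
        output_width : integer := 36
    );
    port (
        CLK   : in  std_logic;
        RESET : in  std_logic;  -- synchronous in this sketch
        FD    : in  std_logic;  -- first data of a new accumulation
        ND    : in  std_logic;  -- new data valid
        A     : in  std_logic_vector(input_width1-1 downto 0);
        B     : in  std_logic_vector(input_width2-1 downto 0);
        Q     : out std_logic_vector(output_width-1 downto 0)
    );
end entity mac_pipelined;

architecture rtl of mac_pipelined is
    signal a_reg : unsigned(input_width1-1 downto 0);
    signal b_reg : unsigned(input_width2-1 downto 0);
    signal prod  : unsigned(input_width1+input_width2-1 downto 0);
    signal acc   : unsigned(output_width-1 downto 0);
    -- Control signals delayed to line up with the data pipeline.
    signal nd_d1, nd_d2 : std_logic;
    signal fd_d1, fd_d2 : std_logic;
begin
    process (CLK)
    begin
        if rising_edge(CLK) then
            -- Stage 1: register the operands.
            a_reg <= unsigned(A);
            b_reg <= unsigned(B);
            -- Stage 2: registered product, so the multiplier gets a
            -- full clock period to itself.
            prod <= a_reg * b_reg;

            if RESET = '1' then
                nd_d1 <= '0';
                nd_d2 <= '0';
                fd_d1 <= '0';
                fd_d2 <= '0';
                acc   <= (others => '0');
            else
                nd_d1 <= ND;
                nd_d2 <= nd_d1;
                fd_d1 <= FD;
                fd_d2 <= fd_d1;
                -- Stage 3: accumulate. On the first sample of a new
                -- accumulation the product is loaded instead of added.
                if nd_d2 = '1' then
                    if fd_d2 = '1' then
                        acc <= resize(prod, output_width);
                    else
                        acc <= acc + resize(prod, output_width);
                    end if;
                end if;
            end if;
        end if;
    end process;

    Q <= std_logic_vector(acc);
end architecture rtl;
```

    The price is two extra cycles of latency between the inputs and the
    accumulator, which the surrounding RDY logic would have to account for.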
    Duane Clark, Sep 5, 2006
    #5
  6. Guest

    Hi, I am using a Spartan3 200k chip. I know that this chip has 12
    dedicated multipliers. If I use ' * ' to multiply, will these
    multipliers be used? If it uses these multipliers, then why has the
    slice occupancy increased?

    Also, when I use the Xilinx multiplier core, it should use only these
    built-in multipliers for the implementation, and hence it should occupy
    few or no slices. But the number of slices increases. Why?

    > Again, you need to look at the timing report, see what paths are
    > breaking, and then determine what to do to fix them.


    I have seen the timing report. The problem is only in the multiply. But
    how do I fix it? Please suggest.



    , Sep 6, 2006
    #6
  7. james Guest

    On 6 Sep 2006 00:13:50 -0700, wrote:

    > hi..I am using spartan3 200k chip.I know that this chip has 12
    > dedicated multipliers.If I use ' * ' to multiply ,will these
    > multipliers be used ? If it uses these multipliers then why is the
    > slice occupancy increased?

    Download XAPP467 from the Xilinx web page; it will tell you what you
    need to know about using the dedicated multipliers in the Spartan3
    series of devices. You can set the process constraints so that XST
    will infer use of a dedicated multiplier. Thus A * B will use a
    dedicated multiplier and not one built from LUTs. There is also example
    code that you can look at and adapt to your needs.

    You should make use of these application notes, as they can be helpful
    in understanding how the logic works internally.
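
    As a rough illustration of the kind of code that tends to be inferred
    as a dedicated multiplier, here is a minimal sketch (not taken from
    XAPP467). The entity name mult_reg is made up, unsigned 16-bit operands
    are assumed, and the mult_style attribute is written as I recall it
    from the XST documentation, so check the exact name and allowed values
    against the XST User Guide for your ISE version.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: a registered 16 x 16 multiply.
entity mult_reg is
    port (
        CLK : in  std_logic;
        A   : in  std_logic_vector(15 downto 0);
        B   : in  std_logic_vector(15 downto 0);
        P   : out std_logic_vector(31 downto 0)
    );
end entity mult_reg;

architecture rtl of mult_reg is
    signal prod : unsigned(31 downto 0);

    -- Assumed XST synthesis attribute: ask for a block (dedicated)
    -- multiplier rather than one built from LUTs.
    attribute mult_style : string;
    attribute mult_style of prod : signal is "block";
begin
    process (CLK)
    begin
        if rising_edge(CLK) then
            -- The product is registered on the rising edge only, so the
            -- multiplier output has a register directly behind it.
            prod <= unsigned(A) * unsigned(B);
        end if;
    end process;

    P <= std_logic_vector(prod);
end architecture rtl;
```

    After synthesis, the device utilization summary in the XST report
    should show whether a dedicated multiplier was actually used.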

    james
    james, Sep 13, 2006
    #7