Quartus II compilation too slow for RAM design

sora

When I use the Quartus tools to compile my dual-port RAM design,
it takes hours to compile and synthesize the source file.

A RAM of only 64 bytes compiles successfully after 5 minutes.
The time increases linearly as the RAM size doubles:
128 bytes takes about 10 minutes or more, and so on.

I'm going to design a 4 KB RAM myself for my small project.
Everything is going well except the RAM, whose compilation takes
hours.

That's pretty frustrating.

************************************************
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;

entity RAM128 is
  generic (
    A     : integer := 7;
    WORDS : integer := 128;
    M     : integer := 8
  );
  port (
    clk       : in  STD_LOGIC;
    TxR       : in  STD_LOGIC;
    TxW       : in  STD_LOGIC;
    AddrTx1   : in  STD_LOGIC_VECTOR(A-1 downto 0);
    AddrTx2   : in  STD_LOGIC_VECTOR(A-1 downto 0);
    DataTxIn  : in  STD_LOGIC_VECTOR(M-1 downto 0);
    DataTxOut : out STD_LOGIC_VECTOR(M-1 downto 0);
    AddrRx1   : in  STD_LOGIC_VECTOR(A-1 downto 0);
    AddrRx2   : in  STD_LOGIC_VECTOR(A-1 downto 0);
    RxW       : in  STD_LOGIC;
    RxR       : in  STD_LOGIC;
    DataRxIn  : in  STD_LOGIC_VECTOR(M-1 downto 0);
    DataRxOut : out STD_LOGIC_VECTOR(M-1 downto 0)
  );
end RAM128;

architecture RAM128_arch of RAM128 is

  subtype cell is std_logic_vector(M-1 downto 0);
  type ramArray is array (0 to WORDS-1) of cell;
  signal ram       : ramArray;
  signal AddrMatch : std_logic;

begin
  AddrMatch <= '1' when (AddrTx1 = AddrRx2) else '0';

  process(clk, AddrTx1, AddrRx2, TxW, RxW, AddrMatch)
  begin
    if (clk'event and clk = '1') then
      if (TxW = '1') and (AddrMatch = '0') then
        ram(CONV_INTEGER(unsigned(AddrTx1))) <= DataTxIn;
      else
        if (TxW = '1') and (AddrMatch = '1') and (RxW = '1') then
          ram(CONV_INTEGER(unsigned(AddrTx1))) <= DataTxIn;
        end if;
      end if;

      if (RxW = '1') and (AddrMatch = '0') then
        ram(CONV_INTEGER(unsigned(AddrRx2))) <= DataRxIn;
      else
        if (TxW = '1') and (AddrMatch = '1') and (RxW = '1') then
          ram(CONV_INTEGER(unsigned(AddrTx1))) <= DataTxIn;
        end if;
      end if;
    end if;
  end process;

  process(RxR, AddrRx1, ram)
  begin
    if (RxR = '1') then
      DataRxOut <= ram(CONV_INTEGER(unsigned(AddrRx1)));
    else
      DataRxOut <= (others => '0');
    end if;
  end process;

  process(TxR, AddrTx2, ram)
  begin
    if (TxR = '1') then
      DataTxOut <= ram(CONV_INTEGER(unsigned(AddrTx2)));
    else
      DataTxOut <= (others => '0');
    end if;
  end process;

end RAM128_arch;
******************************************************************

Can any experts here help me tackle this problem?
Are there any good ideas for reducing the compilation time?
This is driving me crazy!
Please post your ideas; they would be much appreciated!
 
Derek Simmons

I'm just curious: since most modern FPGAs include some RAM, why are
you implementing your own?

In Quartus II, under 'Assignments' -> 'Settings' -> 'Analysis
& Synthesis' (left-hand panel) -> 'More Settings...' (button), there
is a dialog that lets you select how it should generate RAM.
Have you tried playing with these settings?

Derek
 
sora

Yes, I know Quartus provides its own RAM, but it is not flexible enough for my project.
I can't find any dual-port RAM in Quartus II with two write ports,
and my project requires one to transmit and receive data like a bidirectional FIFO.
 
Thomas Entner

Can you double the clock rate and write from each source on alternate
cycles into the RAM via its single write port? Implementing your own RAM
will be an enormous waste of resources, especially for 4 KB.
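
A minimal sketch of this time-multiplexing idea follows, under stated assumptions. The entity name `mux_write_ram`, the `clk2x` port, and the `phase`, `we`, `waddr` and `wdata` signals are hypothetical and not from the original design. It assumes `clk2x` runs at twice the system clock and that the Tx and Rx inputs stay stable for two `clk2x` cycles. If both sources write the same address in one system cycle, whichever is served second wins. Whether a given tool maps this onto block RAM is not guaranteed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: two write sources take turns on one RAM write port.
entity mux_write_ram is
  generic (
    A : integer := 12;  -- address bits (4096 words)
    M : integer := 8    -- data bits
  );
  port (
    clk2x    : in  std_logic;  -- assumed: twice the system clock rate
    TxW      : in  std_logic;
    AddrTx1  : in  std_logic_vector(A-1 downto 0);
    DataTxIn : in  std_logic_vector(M-1 downto 0);
    RxW      : in  std_logic;
    AddrRx2  : in  std_logic_vector(A-1 downto 0);
    DataRxIn : in  std_logic_vector(M-1 downto 0);
    RdAddr   : in  std_logic_vector(A-1 downto 0);
    RdData   : out std_logic_vector(M-1 downto 0)
  );
end mux_write_ram;

architecture rtl of mux_write_ram is
  type ram_t is array (0 to 2**A - 1) of std_logic_vector(M-1 downto 0);
  signal ram   : ram_t;
  signal phase : std_logic := '0';  -- '0' serves Tx, '1' serves Rx
  signal we    : std_logic;
  signal waddr : std_logic_vector(A-1 downto 0);
  signal wdata : std_logic_vector(M-1 downto 0);
begin
  -- Select which source drives the single write port in this fast cycle.
  we    <= TxW      when phase = '0' else RxW;
  waddr <= AddrTx1  when phase = '0' else AddrRx2;
  wdata <= DataTxIn when phase = '0' else DataRxIn;

  process (clk2x)
  begin
    if rising_edge(clk2x) then
      phase <= not phase;  -- alternate sources every fast cycle
      if we = '1' then
        ram(to_integer(unsigned(waddr))) <= wdata;
      end if;
      -- Registered (synchronous) read on a single read port.
      RdData <= ram(to_integer(unsigned(RdAddr)));
    end if;
  end process;
end rtl;
```

The sketch uses `ieee.numeric_std` rather than `std_logic_arith`, and it exposes only one read port to keep the example short.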

Thomas

www.entner-electronics.com
 
KJ

The Quartus manual gives examples of how to properly code your design to
infer memory. Perhaps peruse that a bit.

KJ
 
Andy

I don't know about Altera FPGAs, but Xilinx (and maybe others) have
distributed (LUT) 16x1 dual-port RAMs with async (combinatorial) reads.
Your code probably would infer those resources rather quickly, but 4k
would still be a huge number of them to stitch together. I know I've
goofed up a RAM inference if it takes a while to run in Synplify
(because it cannot use the RAMs for some reason, and has to build the
memory out of registers).

Can you redesign to allow a clock cycle of delay on your reads, and thus
infer synchronous-read block RAMs, which both Altera and Xilinx have?
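
As an illustration of the synchronous-read style, here is a minimal sketch of a one-write, one-read RAM whose read data arrives one clock after the address. The entity and port names are hypothetical. The coding pattern is the common one that synthesis tools tend to recognize, but whether a particular tool and device map it to block RAM is an assumption to check in the synthesis report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: simple dual-port RAM with a registered read.
entity sync_read_ram is
  generic (
    A : integer := 12;  -- address bits (4096 words)
    M : integer := 8    -- data bits
  );
  port (
    clk   : in  std_logic;
    we    : in  std_logic;
    waddr : in  std_logic_vector(A-1 downto 0);
    wdata : in  std_logic_vector(M-1 downto 0);
    raddr : in  std_logic_vector(A-1 downto 0);
    rdata : out std_logic_vector(M-1 downto 0)
  );
end sync_read_ram;

architecture rtl of sync_read_ram is
  type ram_t is array (0 to 2**A - 1) of std_logic_vector(M-1 downto 0);
  signal ram : ram_t;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(unsigned(waddr))) <= wdata;
      end if;
      -- Read is clocked, so rdata lags raddr by one cycle. A read of the
      -- address being written in the same cycle returns the old contents.
      rdata <= ram(to_integer(unsigned(raddr)));
    end if;
  end process;
end rtl;
```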

Andy
 
Rob Dekker

Hi Sora,

Your design models 2 simultaneous writes and 2 simultaneous reads.
Under some settings of the control signals, you need simultaneous access to the RAM at addresses AddrTx1, AddrRx2, AddrRx1 and AddrTx2.
That's a quad-port RAM.
Most FPGA RAM models have only 2 ports (3 at most) that can be accessed simultaneously.

Both Altera and Xilinx tools complain about this (they are not able to extract a RAM for this behavior)
and implement the model with flip-flops, decoders and so on.

So if you want a RAM for this, re-code it so that no more than two ports are active.
You can resolve the multiple simultaneous reads by replicating the RAM, but do you really need the
simultaneous write to AddrTx1 and AddrRx2?
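
The replication idea for the two reads can be sketched as below. It assumes a single write port and registered reads. The entity `ram_1w2r` and all signal names are hypothetical. Both copies receive every write, so each copy needs only one write port and one read port.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: one write port, two read ports, built from two
-- identical copies of the memory that are always written together.
entity ram_1w2r is
  generic (
    A : integer := 7;  -- address bits (128 words)
    M : integer := 8   -- data bits
  );
  port (
    clk     : in  std_logic;
    we      : in  std_logic;
    waddr   : in  std_logic_vector(A-1 downto 0);
    wdata   : in  std_logic_vector(M-1 downto 0);
    raddr_a : in  std_logic_vector(A-1 downto 0);
    rdata_a : out std_logic_vector(M-1 downto 0);
    raddr_b : in  std_logic_vector(A-1 downto 0);
    rdata_b : out std_logic_vector(M-1 downto 0)
  );
end ram_1w2r;

architecture rtl of ram_1w2r is
  type ram_t is array (0 to 2**A - 1) of std_logic_vector(M-1 downto 0);
  signal copy_a : ram_t;
  signal copy_b : ram_t;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        -- Same write goes to both copies, keeping them identical.
        copy_a(to_integer(unsigned(waddr))) <= wdata;
        copy_b(to_integer(unsigned(waddr))) <= wdata;
      end if;
      -- Each copy serves exactly one read port.
      rdata_a <= copy_a(to_integer(unsigned(raddr_a)));
      rdata_b <= copy_b(to_integer(unsigned(raddr_b)));
    end if;
  end process;
end rtl;
```

This doubles the memory used, which is the cost of the second read port. It does nothing for the second write port.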

Rob
 
Andy

Rob is correct. You cannot have LUT-based RAMs with two write ports (at
least in Xilinx). You can have a write port, a read port, and an
optional extra read port using the write port's address.

If Altera has the same restrictions, that is probably why it is
creating registers.

There is a way to implement a memory with an arbitrary number of read and
write ports, using multiple RAMs and XOR encoding/decoding of the data
between them. However, it does not support simultaneous writes to the
same address from different ports. The number of RAMs required adds up
fast, though. Also, it only works with combinatorial-read RAMs, but the
combinatorial read capability is maintained, so long as you have time
for the XOR decode on the data.
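
One way such an XOR scheme can look for two write ports and one read port is sketched below. This is a behavioral illustration under assumptions and may differ from the exact construction meant here. The entity `xor_2w1r` and its port names are hypothetical. Each write port owns one bank and stores its data XORed with the other bank's word at the same address. A read XORs both banks, which cancels the other bank's contribution.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: 2-write, 1-read memory using XOR-encoded banks.
entity xor_2w1r is
  generic (
    A : integer := 4;  -- address bits (16 words)
    M : integer := 8   -- data bits
  );
  port (
    clk : in  std_logic;
    we0 : in  std_logic;
    wa0 : in  std_logic_vector(A-1 downto 0);
    wd0 : in  std_logic_vector(M-1 downto 0);
    we1 : in  std_logic;
    wa1 : in  std_logic_vector(A-1 downto 0);
    wd1 : in  std_logic_vector(M-1 downto 0);
    ra  : in  std_logic_vector(A-1 downto 0);
    rd  : out std_logic_vector(M-1 downto 0)
  );
end xor_2w1r;

architecture rtl of xor_2w1r is
  type ram_t is array (0 to 2**A - 1) of std_logic_vector(M-1 downto 0);
  -- Zero initial values keep simulation free of 'U' values.
  signal bank0 : ram_t := (others => (others => '0'));
  signal bank1 : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Port 0 writes only bank0, encoded against bank1 at the same address.
      if we0 = '1' then
        bank0(to_integer(unsigned(wa0))) <=
          wd0 xor bank1(to_integer(unsigned(wa0)));
      end if;
      -- Port 1 writes only bank1, encoded against bank0 at the same address.
      if we1 = '1' then
        bank1(to_integer(unsigned(wa1))) <=
          wd1 xor bank0(to_integer(unsigned(wa1)));
      end if;
    end if;
  end process;

  -- Combinatorial read: XOR of both banks recovers the last value written.
  rd <= bank0(to_integer(unsigned(ra))) xor bank1(to_integer(unsigned(ra)));
end rtl;
```

In this sketch each bank has one write and two combinatorial reads (the other port's write address and `ra`), so a mapping onto one-write, one-read RAMs would need two copies per bank, four RAMs in total. If both ports write the same address in the same cycle, the stored value is corrupted, matching the restriction described above.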

Andy
 
