Design is too large for the device! xc3s400


methi

Hi,

I am currently working with a 1727 bit wide shift register. For my
design requirements, I had to change it to a 3454 bit wide shift
register (2*1727).

When I did this and implemented the design, I got an error in
mapping: the design is too large for the device and package (I am
using an xc3s400 in a tq144 package, Spartan-3).

So instead I am working with two shift registers, each 1727 bits wide,
feeding the output of one shift register into the input of the other.

This would still use the same number of resources, so I am getting the
same mapping error. I tried contacting Xilinx support and am waiting
for some help.

Is there another way to work around this, to minimize the design?

Or I was wondering if I should change the FPGA I am using, but that
would mean a whole lot of other changes on the board, so it would be a
last resort.

Any suggestions or ideas?

Thanks,
Methi
 

John_H

methi said:
Hi,

I am currently working with a 1727 bit wide shift register. For my
design requirements, I had to change it to a 3454 bit wide shift
register (2*1727).


<snip>

Is your design giving you registers for your shift elements?
The SRL16 primitives can cut that utilization by a significant amount.
You can also use BlockRAM to perform a shift register function.

There are many solutions at your disposal, but I have no real idea of
what your overall constraints are.

Is the shift register the only thing in the design?
Are you using a reset for those shift elements?
Is it serial-in, serial-out?
Are there frequency constraints?
Is the shift register fixed in length or variable?
 

methi

Is the shift register the only thing in the design?

No, my design does a whole lot of things, but it was only when I
changed the length of the shift register that I came across the
mapping error.

Are you using a reset for those shift elements?

I am not using any reset. It takes in a clock and shifts one bit on
every rising edge of the clock.

Is it serial-in, serial-out?

Yes, it's serial-in and serial-out.

Are there frequency constraints?

No.

Is the shift register fixed in length or variable?

It's a variable-length shift register. The length is determined by an
input called "right".

The code is as follows:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity shifting_two is
  Port ( shiftin  : in  std_logic;
         clock_in : in  std_logic;
         right    : in  integer;   -- selects which tap drives shiftout
         shiftout : out std_logic);
end shifting_two;

architecture Behavioral of shifting_two is

  signal shift_register : std_logic_vector ( 3454 downto 0 ) := (others => '0');

begin

  process(clock_in)
  begin
    if rising_edge(clock_in) then
      -- shift in one new bit per clock
      shift_register <= shift_register( 3453 downto 0 ) & shiftin;
      -- variable tap: output the bit at index right-1
      shiftout <= shift_register(right-1);
    end if;
  end process;

end Behavioral;


How can I use a BlockRam...?

My Design Summary as shown in the Map report is as follows:

Design Information
------------------
Command Line   : C:/Xilinx/bin/nt/map.exe -intstyle ise -p xc3s400-tq144-4 -cm area -pr b -k 4 -c 100 -tx off -o top_1190_mem_map.ncd top_1190_mem.ngd top_1190_mem.pcf
Target Device  : x3s400
Target Package : tq144
Target Speed   : -4
Mapper Version : spartan3 -- $Revision: 1.16.8.2 $
Mapped Date    : Wed Jul 20 11:37:18 2005

Design Summary
--------------
Number of errors:   1
Number of warnings: 38
Logic Utilization:
  Number of Slice Flip Flops:  5,126 out of 7,168  71%
  Number of 4 input LUTs:      4,150 out of 7,168  57%
Logic Distribution:
  Number of occupied Slices:   4,175 out of 3,584  116% (OVERMAPPED)
  Number of Slices containing only related logic:  3,305 out of 4,175  79%
  Number of Slices containing unrelated logic:       870 out of 4,175  20%
  *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:     4,274 out of 7,168  59%
  Number used as logic:        4,150
  Number used as a route-thru:   124
Number of bonded IOBs:            73 out of 97  75%
  IOB Flip Flops:                 31
Number of Block RAMs:             10 out of 16  62%
Number of GCLKs:                   8 out of 8  100%
Number of DCMs:                    2 out of 4  50%

Number of RPM macros: 1
Total equivalent gate count for design: 738,951
Additional JTAG gate count for IOBs: 3,504
Peak Memory Usage: 120 MB



Thank you,

Methi
 

Vladislav Muravin

Methi,

You can use a circular buffer implemented by BRAM.

What are you doing with this shift register afterwards?

Vladislav
 

Peter Alfke

Methi, when you say "wide", I believe you mean "long" or "deep".
With SRL16s you can cut the size by a factor of 16, but you do not have
parallel access to all the bits in your shift register.
Even more compact is a BlockRAM, where you can pack >16000 bits into
one BlockRAM.
So it all depends on how you use your shift register, and how you have
to control it...
Peter Alfke
 

methi

Thank you, Peter.

It's a 3454-bit shift register. By "wide" I am talking about the
depth.

What I am trying to do is select the output depending on the value of
the variable "right". For example, if right=2300, then the output would
be the 2300th bit of the shift register.

This would mean that I need access to all the bits in the shift
register.

Peter said:
With SRL16s you can cut the size by a factor 16, but you do not have
parallel access to all the bits in your shift register.

Does this mean that I would get access only to the MSB and not to the
individual bits, as I do in my code at present?

Peter said:
Even more compact is a BlockRAM, where you can pack >16000 bits into
one BlockRAM.

How do I use this BlockRAM to work as a shift register?
 

methi

Vladislav said:
You can use a circular buffer implemented by BRAM.

What are you doing with this shift register afterwards?

I am just using the output of my shift register, which is the MSB
(its position changes depending on the value of the variable "right"),
as the input to another component. It's a pulse.
 

methi

I've come across a Xilinx core that is a RAM-based shift register. It
takes a rising-edge clock, a serial input, and an address input (for
variable length), and gives a serial output.

Would using this core instead of the code I have save any resources?

I am wondering if anybody has worked with it before.

Otherwise the options I have so far are:

1) BlockRAM

2) Circular buffer with RAM

3) SRL16


Thanks,

Methi
 

Ray Andraka

methi said:
How can I use a BlockRam...?

For that, you want to use a block RAM, which gives you up to 16K bits
of length in the 16K x 1 configuration. You can either use two address
counters, one for the read side of the memory and one for the write
side, offset so that the read address trails the write address by N,
where N is the shift register length, or you can use a single counter
set up as a modulo-N count (use a loadable down-counter for that, which
keeps it to one level of logic). If you were using an older Xilinx
FPGA, you'd need to delay the read by a clock relative to the write for
this second scheme, because those parts didn't support read-before-write
operation. With the Spartan-3, you can set the attribute on the BRAM
for read first, which allows you to apply the same address to both the
read and the write ports. It is easier if you instantiate the BRAM
rather than trying to let the software figure it out.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email (e-mail address removed)
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
 

methi

Hi Ray,

Are you talking about a BRAM core available from Xilinx?

I've only come across the RAM-based shift register core, which goes up
to 1024 bits (Xilinx 6.3i).

Or should I be working on BRAM code in VHDL?


Thanks,

Methi
 

Ray Andraka

methi said:
Are you talking about a BRAM core available in Xilinx?

I was talking about a VHDL module with an instantiated RAMB16 primitive
in it. Something like this should do the trick for you:

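-- Loadable down-counter with one extra (sign) bit: when the count has
-- gone past 0 to -1, reload it with modulo-2.
nxt_addr <= to_unsigned(modulo-2, abits+1) when addr(abits)='1'
            else addr-1;

process(clk)
begin
  if clk'event and clk='1' then
    if ce='1' then
      addr <= nxt_addr;
    end if;
  end if;
end process;

-- Drop the sign bit and pad the address to the 14 bits of the 16K x 1 port.
a_addr <= std_logic_vector(resize(addr(abits-1 downto 0),14));

-- Single-port primitive, so the generic and port names carry no A/B suffix.
-- DI and DO are 1-bit vectors, so b_in and b_out must be
-- std_logic_vector(0 downto 0).
U1: ramb16_s1
--synthesis translate_off
generic map(
  WRITE_MODE => "READ_FIRST")
--synthesis translate_on
port map(
  DI   => b_in,
  EN   => ce,
  WE   => '1',
  SSR  => '0',
  CLK  => clk,
  ADDR => a_addr,
  DO   => b_out);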

Since your shift register is only 3K+ long, you could even use the other
RAMB16 port for something else by setting the upper address bits to
opposite constant values on each port. The loadable down counter
reloads when it counts past 0 to -1. It is coded this way to give the
synthesis enough of a hint not to stick the load function in as a
multiplexer after the carry chain, which would force it to two levels of
logic.
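
For readers who want something they can simulate without the Xilinx primitive library, here is a minimal behavioral sketch of the same single-address, read-first scheme. It is only an illustration under stated assumptions: the entity name bram_delay_line, the generic ABITS, and the ports length, d_in and d_out are made up for this example, and the length is taken from a port rather than a constant. Whether a given synthesis tool maps this onto a block RAM is not guaranteed, which is the reason for instantiating the primitive as described above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity (not from the thread): 1-bit delay line built on a
-- single-address, read-first memory and a loadable down-counter.
entity bram_delay_line is
  generic (
    ABITS : positive := 12   -- address bits; 12 covers lengths up to 4095
  );
  port (
    clk    : in  std_logic;
    ce     : in  std_logic;
    length : in  unsigned(ABITS-1 downto 0);  -- assumed 1 .. 2**ABITS - 1
    d_in   : in  std_logic;
    d_out  : out std_logic
  );
end entity bram_delay_line;

architecture rtl of bram_delay_line is
  type ram_t is array (0 to 2**ABITS - 1) of std_logic;
  signal ram  : ram_t := (others => '0');
  -- One extra bit: the MSB is a sign bit that flags a count of -1.
  signal addr : unsigned(ABITS downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if ce = '1' then
        -- Read first: the old bit is read out before d_in overwrites it.
        d_out <= ram(to_integer(addr(ABITS-1 downto 0)));
        ram(to_integer(addr(ABITS-1 downto 0))) <= d_in;
        -- Modulo-length count: reload after the count has gone negative.
        if addr(ABITS) = '1' then
          addr <= resize(length, ABITS + 1) - 2;
        else
          addr <= addr - 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

With a steady length, a bit sampled on one enabled clock edge shows up on d_out length enabled clock edges later. Changing length while running disturbs the bits already stored, so this sketch assumes the length is set before data of interest arrives.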

 

Mike Treseler

methi said:
I am just using the output of my shift register...which is the MSB
(this keeps changing depending on the value of the variable
"right")...as input to another component....its a pulse...

If you are delaying a single pulse/edge for 3454 ticks,
there are easier ways to do it than with a shift register.

-- Mike Treseler
 

Peter Alfke

Methi,
Take a BlockRAM, with both ports configured as 16K x 1.
Make one port Write and the other one Read. Clock both ports with your
data clock.
Drive the Write address with a counter that you increment with the data
clock.
Drive the Read address from a subtractor circuit that subtracts the
length N of your shift register from the Write address.
Now you have a programmable-length shift register from the D input of
the write port to the Q output of the Read port.
And you get up to 16K bit length in a single BRAM plus four CLBs (14
bit counter plus 14-bit subtractor).
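
The steps above can be sketched in behavioral VHDL roughly as follows. This is an illustration only, not a tested Spartan-3 design: the entity name dp_delay_line and the ports n, d_in and d_out are invented for the example, the memory is written as a plain array rather than an instantiated BlockRAM, and n is assumed to stay within 1 to 16383 (with n = 0 both ports would address the same location in the same clock).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity (not from the thread): write counter plus a
-- subtractor that derives the read address, around a 16K x 1 memory.
entity dp_delay_line is
  port (
    clk   : in  std_logic;
    n     : in  unsigned(13 downto 0);  -- assumed 1 .. 16383
    d_in  : in  std_logic;
    d_out : out std_logic
  );
end entity dp_delay_line;

architecture rtl of dp_delay_line is
  type ram_t is array (0 to 2**14 - 1) of std_logic;  -- 16K x 1
  signal ram     : ram_t := (others => '0');
  signal wr_addr : unsigned(13 downto 0) := (others => '0');
  signal rd_addr : unsigned(13 downto 0);
begin
  -- Subtractor: the read address trails the write address by n,
  -- wrapping modulo 16K.
  rd_addr <= wr_addr - n;

  process (clk)
  begin
    if rising_edge(clk) then
      ram(to_integer(wr_addr)) <= d_in;     -- write port
      d_out   <= ram(to_integer(rd_addr));  -- read port
      wr_addr <= wr_addr + 1;               -- free-running write counter
    end if;
  end process;
end architecture rtl;
```

In this model a bit written on one clock edge appears on d_out n clock edges later, and n can be changed on the fly because the stored data does not move.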
Peter Alfke
 

Andy Peters

Mike said:
If you are delaying a single pulse/edge for 3454 ticks,
there are easier ways to do it than with a shift register.

Exactly. He should use a counter that's initialized to the required
delay and enabled when he sees his input pulse. It counts down, and
when it hits zero, an output pulse is generated and the counter is
preset back to its initial value.

There's nothing like getting set on one solution and stubbornly
pursuing it to the point where you're blind to other, simpler,
solutions.

-a
 

Ray Andraka

Peter said:
Methi,
Take a BlockRAM, with both ports configured as 16K x 1.
Make one port Write and the other one Read. Clock both ports with your
data clock.
Drive the Write address with a counter that you increment with the data
clock.
Drive the Read address from a subtractor circuit that subtracts the
length N of your shift register from the Write address.
Now you have a programmable-length shift register from the D input of
the write port to the Q output of the Read port.
And you get up to 16K bit length in a single BRAM plus four CLBs (14
bit counter plus 14-bit subtractor).
Peter Alfke
Peter, with Spartan-3, he can do it with one port of the BRAM if he uses
a modulo-N count instead of a straight 14-bit binary count. I showed
this in the code I posted earlier. The modulo-N count is easy if you do
it as a loadable down-count that reloads itself when it goes negative.

 

Ray Andraka

Mike said:
If you are delaying a single pulse/edge for 3454 ticks,
there are easier ways to do it than with a shift register.

-- Mike Treseler

Mike, I assumed this was more like a line buffer where he needed to
delay a sequence of bits by the 3454 clocks. If it is indeed just a
delay from a single pulse and you can guarantee that another pulse does
not occur until the first one has propagated out, then it can be done
with just a counter.

 

methi

I am trying to delay a pulse by N ticks, where N takes a maximum value
of 3454. N is a variable here.
 

Peter Alfke

Ray, pretty clever.
But more difficult to understand, or to modify.
And what do I do with the unused port?

But nevertheless, hats off to a smart solution...
Peter
 

Kolja Sulimma

methi said:
shift_register <= shift_register( 3453 downto 0 ) & shiftin;

This alone would probably synthesize to 256 LUTs configured as SRL16.
No problem there.

methi said:
shiftout <= shift_register(right-1);

But this is a 3453-to-1 multiplexer. Without any tricks you need about
2k LUTs to implement it, and it will be really slow.

So you should use the BRAM as suggested by others.

Kolja Sulimma
 

Kolja Sulimma

methi said:
I am trying to delay a pulse by N ticks where N takes a maximum value
of 3454...N is a variable here....

Yes, but a single pulse, or many pulses? That was Ray's question.
Can you guarantee that there are at most M pulses within 3454 clock
cycles (with a small value of M)?

Kolja Sulimma
 
