One-Cycle Write Delay Problem with 'Register File' when Simulating with mini-MIPS

Muhammad Awais

Hi,

I am a little new to the world of VHDL. I am working on a project for
mini-MIPS. I have a problem with the Register File behavior. When I simulate
the Reg file stand-alone, it gives me good behavior, i.e. when writing, it
will write the data on the same clk edge. But when integrated with
other components, it acts weird, i.e. it writes after one cycle.

I am attaching two snapshots of my simulations

<http://users.encs.concordia.ca/~m_humayu/Screenshot-regfile.png>
The 1st snapshot is the stand-alone simulation of the Reg File. We
can see that as soon as Write Enable (wr_en) is '1' and clk = '1', the write
occurs into the corresponding register (wr_add), without any delay.

<http://users.encs.concordia.ca/~m_humayu/Screenshot-mips-regfile.png>
In the 2nd snapshot, the same component integrated into the MIPS behaves
differently. The signals shown are from the regfile component. At
time = 40 ns (the write-back stage of the first instruction), the
write-back data (x08) is ready at the start of the clk edge, the write is
enabled, and the write-back address (01) is held as well, yet the write
occurs into reg(0) on the next clock cycle (time = 50). Why? And what is a
possible solution?

Following is my Reg file code, which is a 32 x 32-bit register file.

Attached is my code of mini-MIPS, just for reference:
<http://users.encs.concordia.ca/~m_humayu/miniMIPS32.zip>
regfile.vhd is used in ID_mips32.vhd (Instruction Decode). Top level
entity is <complete.vhd> and testbench is <tb_complete_mips.vhd>

Thanks

__________________________
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity regfile is
  port(
    rst     : in  std_logic;
    clk     : in  std_logic;
    rd_1    : in  std_logic_vector(4 downto 0);
    rd_2    : in  std_logic_vector(4 downto 0);
    data_in : in  std_logic_vector(31 downto 0);
    wr_en   : in  std_logic;
    wr_add  : in  std_logic_vector(4 downto 0);
    out_1   : out std_logic_vector(31 downto 0);
    out_2   : out std_logic_vector(31 downto 0)
  );
end entity regfile ;

architecture rtl of regfile is

  type reg_array is array (0 to 31) of std_logic_vector(31 downto 0);
  signal reg          : reg_array;
  signal wr_add_temp  : std_logic_vector(4 downto 0);
  signal data_in_temp : std_logic_vector(31 downto 0);
  signal wr_en_temp   : std_logic;
  --signal s_out_1, s_out_2 : std_logic_vector(31 downto 0);

begin

  readprocess:
  out_1 <= reg(conv_integer(rd_1));
  out_2 <= reg(conv_integer(rd_2));

  wr_add_temp  <= wr_add;
  data_in_temp <= data_in;
  wr_en_temp   <= wr_en;

  writeprocess:
  process(clk,rst,wr_en,wr_add,reg)
  begin
    if rst = '1' then
      for index in 0 to 31 loop
        reg(index) <= ( Others => '0' );
      end loop;
    elsif (RISING_EDGE(clk)) then
      if wr_en_temp = '1' then
        if wr_add_temp /= "00000" then
          reg(conv_integer(wr_add_temp)) <= data_in_temp;
        end if;
      end if;
    end if;
  end process;

end architecture rtl;

_____________________________________________________________
 
Uncle Noah

Hi,

> I am a little new to the world of VHDL.

So it seems.

> I have a problem with the Register File behavior. When I simulate
> the Reg file stand-alone, it gives me good behavior, i.e. when writing, it
> will write the data on the same clk edge. But when integrated with
> other components, it acts weird, i.e. it writes after one cycle.

My guess is that when you simulate the regfile alone, you just look
when the data are written. But when you simulate it as a component you
get to see that the new values are available on the rising edge of the
following clock cycle.

There is nothing strange in your VHDL model.

This is a very simple (and certainly statically verifiable) case of
behavior. Looking at timing diagrams for such a simple thing is a
certain waste of time. Manual observations can hinder productivity. It
is important to know when to start looking and wasting time.

BTW for a register file, use a template. Your code has some problems
(see below).

Thanks for the code. I am a bone collector; a code troll ^_^

> wr_add_temp <= wr_add;
> data_in_temp <= data_in;
> wr_en_temp <= wr_en;

Why do you need these? And why are they not on the sensitivity list of
the process?

Cheers,
Nikolaos Kavvadias
 
Uncle Noah

> I am a little new to the world of VHDL.

So it seems.

> I have a problem with the Register File behavior. When I simulate
> the Reg file stand-alone, it gives me good behavior, i.e. when writing, it
> will write the data on the same clk edge. But when integrated with
> other components, it acts weird, i.e. it writes after one cycle.

My guess is that when you simulate the regfile alone, you just look
when the data are written. But when you simulate it as a component you
get to see that the new values are available on the rising edge of the
following clock cycle.

There is nothing strange in your VHDL model.

> I am attaching two snapshots of my simulations

This is a very simple (and certainly statically verifiable) case of
behavior. Looking at timing diagrams for such a simple thing is a
certain waste of time. Manual observations can hinder productivity. It
is important to know when to start manual observations and waste time.

BTW for a register file, you can use a template.

Thanks for the code. I am a bone collector; a code troll ^_^

> wr_add_temp <= wr_add;
> data_in_temp <= data_in;
> wr_en_temp <= wr_en;

Why do you need these? What really needs to be included in
the sensitivity list?

Cheers,
Nikolaos Kavvadias
 
Mike Treseler

Muhammad said:
> I have a problem with the Register File behavior. When I simulate
> the Reg file stand-alone, it gives me good behavior, i.e. when writing, it
> will write the data on the same clk edge. But when integrated with
> other components, it acts weird, i.e. it writes after one cycle.

It is normal that a registered output occurs one tick later.

> process(clk,rst,wr_en,wr_add,reg)

process(clk,rst)
may give your sim a better match with synthesis.

related example:
http://home.comcast.net/~mike_treseler/stack.vhd
http://home.comcast.net/~mike_treseler/stack.pdf

-- Mike Treseler
 
Muhammad Awais

Thanks Mike and Noah.

Mike: I tried putting only clk and rst in the sensitivity list, but there
was no change in the resulting output.
process(clk,rst)

I found one solution, i.e. delaying the clk to the register file:
---------------------------------
clk_temp <= clk after 1 ns ;

writeprocess:
process(clk_temp,rst,clk)
begin
  if rst = '1' then
    for index in 0 to 31 loop
      reg(index) <= ( Others => '0' );
    end loop;
  elsif (RISING_EDGE(clk_temp)) then
    if wr_en_temp = '1' then
      if wr_add_temp /= "00000" then
        reg(conv_integer(wr_add_temp)) <= data_in_temp;
      end if;
    end if;
  end if;
end process;

end architecture rtl;
---------------------------------------------

It solves my problem for now - but I still would like to understand
the physics behind this weird behavior at RTL level.
<http://users.encs.concordia.ca/~m_humayu/Screenshot-Regfile-delay-clk.png>
 
KJ

> Thanks Mike and Noah.
>
> I found one solution, i.e. delaying the clk to the register file

I guess by 'solution' then you don't expect this to be synthesizable
into a physical part.

> It solves my problem for now - but I still would like to understand
> the physics behind this weird behavior at RTL level.

There is no physics behind it; you have a misconception about when
things are supposed to happen and are about to learn about simulation
delta delays. First off, the real solution to your problem is to
eliminate the following lines of code

clk_temp <= clk after 1 ns ;
wr_add_temp <= wr_add;
data_in_temp <= data_in;
wr_en_temp <= wr_en;

and then change your process to use the signals without the "_temp"
suffix.
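
A minimal sketch of how the register file might look after that cleanup,
assuming the same port names as the original post. Using numeric_std and
to_integer in place of std_logic_unsigned and conv_integer is a substitution
made here rather than something requested in the thread, and the label
write_proc is only illustrative:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- assumption: swapped in for std_logic_unsigned

entity regfile is
  port (
    rst     : in  std_logic;
    clk     : in  std_logic;
    rd_1    : in  std_logic_vector(4 downto 0);
    rd_2    : in  std_logic_vector(4 downto 0);
    data_in : in  std_logic_vector(31 downto 0);
    wr_en   : in  std_logic;
    wr_add  : in  std_logic_vector(4 downto 0);
    out_1   : out std_logic_vector(31 downto 0);
    out_2   : out std_logic_vector(31 downto 0)
  );
end entity regfile;

architecture rtl of regfile is
  type reg_array is array (0 to 31) of std_logic_vector(31 downto 0);
  signal reg : reg_array;
begin

  -- Combinational read ports, as in the original code.
  out_1 <= reg(to_integer(unsigned(rd_1)));
  out_2 <= reg(to_integer(unsigned(rd_2)));

  -- Clocked write port: only clk and rst are in the sensitivity list,
  -- and the inputs are used directly, without the "_temp" copies.
  write_proc : process (clk, rst)
  begin
    if rst = '1' then
      reg <= (others => (others => '0'));
    elsif rising_edge(clk) then
      -- Register 0 is never written, so it keeps its reset value.
      if wr_en = '1' and wr_add /= "00000" then
        reg(to_integer(unsigned(wr_add))) <= data_in;
      end if;
    end if;
  end process write_proc;

end architecture rtl;
```

With this version a write is still sampled on the rising edge of clk, so
inputs that are themselves produced by clk-driven registers take effect on
the following edge.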

Also, whatever is generating the various inputs (wr_add, data_in and
wr_en) is changing them precisely on the rising edge of clk...which is a
testbench issue. No input should change precisely at the rising edge
of the clock that is supposedly generating it. Change your testbench
(or how you force the signals) so that all of the inputs change AFTER
the rising edge of the clock; that will model the reality of actual
devices. Get that working functionally how you want it to be working.
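
A hedged sketch of such a testbench, assuming the regfile entity from the
original post is compiled into library work. The entity name tb_regfile, the
10 ns clock period, the done flag and the test values are all illustrative
assumptions:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_regfile is
end entity tb_regfile;

architecture sim of tb_regfile is
  signal clk     : std_logic := '0';
  signal rst     : std_logic := '1';
  signal wr_en   : std_logic := '0';
  signal wr_add  : std_logic_vector(4 downto 0)  := (others => '0');
  signal rd_1    : std_logic_vector(4 downto 0)  := (others => '0');
  signal rd_2    : std_logic_vector(4 downto 0)  := (others => '0');
  signal data_in : std_logic_vector(31 downto 0) := (others => '0');
  signal out_1   : std_logic_vector(31 downto 0);
  signal out_2   : std_logic_vector(31 downto 0);
  signal done    : boolean := false;
begin

  -- 10 ns clock that stops once the stimulus process has finished.
  clk <= not clk after 5 ns when not done else '0';

  dut : entity work.regfile
    port map (
      rst => rst, clk => clk,
      rd_1 => rd_1, rd_2 => rd_2,
      data_in => data_in, wr_en => wr_en, wr_add => wr_add,
      out_1 => out_1, out_2 => out_2
    );

  stimulus : process
  begin
    wait for 12 ns;
    rst <= '0';

    -- Inputs are assigned in response to the clock edge, so they change
    -- one delta cycle AFTER it, never at the same instant as the edge.
    wait until rising_edge(clk);
    wr_en   <= '1';
    wr_add  <= "00001";
    data_in <= x"00000008";

    -- The register file samples the write on this edge.
    wait until rising_edge(clk);
    wr_en <= '0';
    rd_1  <= "00001";

    wait until rising_edge(clk);
    assert out_1 = x"00000008"
      report "register 1 was not written" severity error;

    done <= true;
    wait;
  end process stimulus;

end architecture sim;
```

Because every stimulus assignment follows a wait on the clock, none of the
inputs can coincide with the edge that is supposed to sample them.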

Kevin Jennings
 
jens

(just noticed that Kevin beat me to a reply, just ignore this if
anything is duplicated...)

As Mike mentioned, the 2nd snapshot is indeed normal. I assume the
signals being used inside rising_edge(clk) in writeprocess are
synchronous with clk (which they should be for a good design). Even
though it's not obvious from the simulation, those signals change
after the rising edge of clk - delayed by delta delay(s) in simulation
and gate delay(s) in a real part. So reg() won't change until the
next clock cycle. What you did by adding the delay was to generate
the signals some delta delay(s) after clk, then use them 1ns later
instead of 1 clock cycle later. Simulation won't match synthesis in
this case. You'll need to remove the 1ns delay, and deal with the one
clock cycle or remove that delay from the design another way.

Something I like to do for unit test simulations is to have the
testbench generate input signals (which would normally change on the
rising edge of clk) on the falling edge instead. That way it makes
timing more obvious, as opposed to not easily knowing (from the
simulation waveform) where those signals are changing in relation to
the rising edge of clock (which is part of what is causing the
confusion in this example).
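
A small sketch of that falling-edge style, using a single rising-edge
register as a stand-in for the unit under test. All names here
(tb_falling_edge_stimulus, d, q, done) and the 10 ns clock period are
assumptions made for the illustration:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_falling_edge_stimulus is
end entity tb_falling_edge_stimulus;

architecture sim of tb_falling_edge_stimulus is
  signal clk  : std_logic := '0';
  signal d    : std_logic := '0';
  signal q    : std_logic := '0';
  signal done : boolean   := false;
begin

  -- 10 ns clock that stops once the stimulus process has finished.
  clk <= not clk after 5 ns when not done else '0';

  -- Stand-in for the unit under test: one rising-edge register.
  dut : process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process dut;

  stimulus : process
  begin
    -- Change the input mid-cycle, half a period before the sampling edge,
    -- so the waveform shows clearly which value gets clocked in.
    wait until falling_edge(clk);
    d <= '1';

    -- By the next falling edge the rising edge in between has sampled d.
    wait until falling_edge(clk);
    assert q = '1'
      report "q did not capture d on the rising edge" severity error;

    done <= true;
    wait;
  end process stimulus;

end architecture sim;
```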
 
Tricky

> Also, whatever is generating the various inputs (wr_add, data_in and
> wr_en) is changing them precisely on the rising edge of clk...which is a
> testbench issue. No input should change precisely at the rising edge
> of the clock that is supposedly generating it. Change your testbench
> (or how you force the signals) so that all of the inputs change AFTER
> the rising edge of the clock; that will model the reality of actual
> devices. Get that working functionally how you want it to be working.

This will often happen if your testbench uses physical time delays to
generate the stimulus, rather than having the stimulus react to the
clock. What is probably happening is that the write enable is changing
as the clock is changing, so that when the design sees a clock edge,
it also sees that write enable is 1.

Personally, I find that having all stimulus react to the opposite clock edge
from the design helps eliminate these problems, and it can make it clearer
what inputs are actually going into a system. Even if everything
reacts to the rising edge properly, in simulators like ModelSim, if you
read the cursor value it will tell you what the new value is, rather
than the value that was clocked into the system. With the inputs
valid between falling edges, the cursor will tell you exactly what
value was clocked in at that moment in time.
 
Martin Thompson

KJ said:
> Also, whatever is generating the various inputs (wr_add, data_in and
> wr_en) is changing them precisely on the rising edge of clk...which is a
> testbench issue. No input should change precisely at the rising edge
> of the clock that is supposedly generating it.
>
> Change your testbench (or how you force the signals) so that all of
> the inputs change AFTER the rising edge of the clock; that will
> model the reality of actual devices.

Hi KJ,

Do you really do that (time delay the input signals to the test system
in the testbench)? Do you also delay the outputs from the test
system?

Or do you mean delta delays?

The vast majority of my testbenches provide their inputs in a
synchronous fashion, using the same clock as the unit under test.

Maybe I've been staring at waveforms long enough that I "see" the
delta delays :)

Cheers,
Martin
 
KJ

> Hi KJ,
>
> Do you really do that (time delay the input signals to the test system
> in the testbench)? Do you also delay the outputs from the test
> system?
>
> Or do you mean delta delays?

If a signal 'some_input' is supposed to be a synchronous input to the
design, then the testbench model will be something of the form
...
wait until rising_edge(Clock);
some_input <= '1';
...

The signal 'some_input' then won't change until after Clock has
already gone from 0 to 1. Anything in the design that is looking for
'rising_edge(Clock)' will have already occurred prior to that point
(which is why creating a temporary signal 'Clock_temp <= Clock' and
then using 'Clock_temp' in the design is prone to problems since now
'some_input' will change at the same time as 'Clock_temp').
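
The effect of such a copied clock can be reproduced in a few lines. The
sketch below is only an illustration under assumed names (tb_clk_temp_race,
d, q_clk, q_temp, done) and an assumed 10 ns clock: two identical registers
sample the same input, one on clk and one on clk_temp. Under the standard
delta-cycle rules the two assertions should pass, showing that the register
on clk_temp captures the new value one clock cycle earlier than the
register on clk.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_clk_temp_race is
end entity tb_clk_temp_race;

architecture sim of tb_clk_temp_race is
  signal clk      : std_logic := '0';
  signal clk_temp : std_logic := '0';
  signal d        : std_logic := '0';
  signal q_clk    : std_logic := '0';
  signal q_temp   : std_logic := '0';
  signal done     : boolean   := false;
begin

  -- 10 ns clock that stops once the stimulus process has finished.
  clk <= not clk after 5 ns when not done else '0';

  -- The copy is updated one delta cycle after clk.
  clk_temp <= clk;

  reg_on_clk : process (clk)
  begin
    if rising_edge(clk) then
      q_clk <= d;
    end if;
  end process reg_on_clk;

  reg_on_clk_temp : process (clk_temp)
  begin
    if rising_edge(clk_temp) then
      q_temp <= d;
    end if;
  end process reg_on_clk_temp;

  stimulus : process
  begin
    wait until rising_edge(clk);
    d <= '1';  -- takes effect one delta after the clk edge

    -- Check mid-cycle, before the next rising edge.
    wait until falling_edge(clk);
    assert q_clk = '0'
      report "q_clk captured d on the same edge" severity error;
    assert q_temp = '1'
      report "q_temp did not capture d on the same edge" severity error;

    done <= true;
    wait;
  end process stimulus;

end architecture sim;
```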

So it's not necessarily an explicit delay that you see in the
testbench, what it is though is simply modelling the actual
environment. As others have stated, they prefer to have synchronously
generated signals occur on the falling edge of the clock so they can
more easily see what's going on. While that's nice, it won't help
when debugging internal interfaces in your actual design since those
signals will be changing (presumably) only on the rising edge of the
clock input...so you might as well get used to seeing signals appear
to be changing instantaneously while keeping in mind that there really
is a simulation delta delay in there (which you can see if you drag
the signals over into a list window where time gets displayed as real
time and with the simulation delta).

In general, the testbench model models actual things. For some
subunits of the design, that testbench will model whatever other
subunit it is that will be talking to it and at the PCBA level, I'll
have models for all or most of the actual devices on the board. Given
that mindset, if the input to some design comes out of a 74F374, then
the 74F374 will be modelled...that model will not be changing outputs
on the falling edge of the clock to make it easier to 'see' them, it
will change the outputs in response to the rising edge since that's
what the part actually does.

> The vast majority of my testbenches provide their inputs in a
> synchronous fashion, using the same clock as the unit under test.

My testbenches model whatever the reality is of the environment that
the unit under test will be operated in.

KJ
 
Martin Thompson

KJ said:
> If a signal 'some_input' is supposed to be a synchronous input to the
> design, then the testbench model will be something of the form
> ...
> wait until rising_edge(Clock);
> some_input <= '1';
> ...

Right, that is like what I do. Although I have a utility function to
save typing:
tick(clk);
sig<='1';

:)
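
The post does not show how tick is written. One hypothetical way to write
such a helper is as a procedure in a testbench package (a wait statement is
not allowed inside a VHDL function). The package name tb_util and the
optional cycle count n are assumptions added for this sketch:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package tb_util is
  -- Hypothetical helper: blocks the calling process for n rising edges.
  procedure tick (signal clk : in std_logic; constant n : in positive := 1);
end package tb_util;

package body tb_util is
  procedure tick (signal clk : in std_logic; constant n : in positive := 1) is
  begin
    for i in 1 to n loop
      wait until rising_edge(clk);
    end loop;
  end procedure tick;
end package body tb_util;
```

After "use work.tb_util.all;" a stimulus process could then call tick(clk);
followed by sig <= '1';, which behaves like the explicit wait shown in the
quoted text.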

> My testbenches model whatever the reality is of the environment that
> the unit under test will be operated in.

Agreed - it just happens that most of my testbenches are modelling
other bits of FPGA, so synchronous-ness is a given :)

I've also done PCB level simulation of a RAM interface including PCB
track delays with the gate-level model of the FPGA. I'm not sure I'd
do that again though - you don't learn much about your design and
there are better ways of ensuring the timing is OK. It was a useful
exercise though!

Cheers,
Martin
 
