Unassigned register decode

trescot · Jan 10, 2009

I have a question regarding unassigned values. I have a line of code,

fmt <= '1' when regA = "000" else '0';

Now if regA = "XXX" (unassigned), I thought fmt would also be X. Or does
it depend on the simulator? I tried this with VCS and it shows
fmt = 0.

Thanks,
Trescot

jeppe · Jan 10, 2009

Well a quick and short answer.

1) In real-world applications (read: hardware) there's no such thing as an unassigned signal. The values will always be 0 or 1.

2) In the simulation world, however, you will get an output that reflects the X (probably X or U).

Hope this was useful.
Jeppe
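
A minimal simulation sketch of the original question, assuming regA is a plain std_logic_vector and only std_logic_1164 is in use. In that case "=" is the predefined array equality, which compares the nine std_ulogic values literally, so "XXX" = "000" is simply false and fmt becomes '0', which agrees with what trescot saw in VCS. The entity name x_compare_demo is made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity x_compare_demo is
end entity x_compare_demo;

architecture sim of x_compare_demo is
  signal regA : std_logic_vector(2 downto 0) := "XXX";
  signal fmt  : std_logic;
begin
  -- Predefined equality on the array type: 'X' is just another
  -- enumeration value, so "XXX" = "000" evaluates to false.
  fmt <= '1' when regA = "000" else '0';

  check : process
  begin
    wait for 1 ns;
    assert fmt = '0'
      report "expected fmt = '0' while regA is XXX" severity failure;
    report "fmt is '0' while regA is XXX";
    wait;
  end process check;
end architecture sim;
```

Other types can behave differently: the numeric_std comparison operators for unsigned and signed, for example, are separate overloads with their own handling of metavalues.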

kennheinrich · Jan 11, 2009

I like this approach though it is a little wordy for each use... it
seems to me you could embed all this functionality in, er, a function,
and write

fmt <= has_value(regA,"000");

- Brian

(actually writing the function is left as an exercise; but a pretty
straightforward one given the process above)

std_match() ?

- Kenn

kennheinrich · Jan 11, 2009

std_match() ?

- Kenn

Hmmm. Scratch that. But being a fan of the *nix tool philosophy (many
small building blocks), I'd probably use some functions that made the
operations clear when reading the code:

fmt <= bool2slv( std_match(fail_if_X(regA), "000"));

where fail_if_X is the identity function but can throw an exception
(assert fails).

Of course, one man's clear is another VHDL 101 student's nightmare

- Kenn
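
One way the two helpers named above could be written, as a sketch only. The package name x_check_pkg is an assumption, and bool2slv is assumed to return a single std_logic (to match the scalar fmt in the original question) even though its name suggests a vector. is_x comes from std_logic_1164 and std_match from numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package x_check_pkg is
  -- Identity function that complains when the operand has metavalues.
  function fail_if_X (v : std_logic_vector) return std_logic_vector;
  -- Maps true/false to '1'/'0'.
  function bool2slv (b : boolean) return std_logic;
end package x_check_pkg;

package body x_check_pkg is

  function fail_if_X (v : std_logic_vector) return std_logic_vector is
  begin
    -- is_x is true if any element is 'U', 'X', 'Z', 'W' or '-'.
    assert not is_x(v)
      report "fail_if_X: operand contains a metavalue"
      severity error;
    return v;
  end function fail_if_X;

  function bool2slv (b : boolean) return std_logic is
  begin
    if b then
      return '1';
    else
      return '0';
    end if;
  end function bool2slv;

end package body x_check_pkg;
```

With ieee.numeric_std and this package visible, the line from the post, fmt <= bool2slv( std_match(fail_if_X(regA), "000")); should then compile as written, and the assertion fires in simulation whenever regA is not fully resolved to 0/1/L/H.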

Mike Treseler · Jan 11, 2009

Jonathan said:
Although the resulting model gives the simulator quite
a lot of work to do (linear search through the register
map for every access), in synthesis it builds really
nice logic.

The Time Bandit strikes again

Thanks for the posting.
Much food for thought.

I'll bet modelsim is pretty good at linear search.
I can easily get a faster computer,
but I can't get a faster brain.

Meanwhile, this
approach gives me the most flexible and clear way I've
yet discovered to create the register map of a medium-
sized design.

I agree.
I like how you pull the static, database stuff out of time.

It might be possible to extend this idea
to infer the registers in the process that
consumes the data.

-- Mike Treseler

KJ · Jan 12, 2009

Jonathan Bromley said:
On Sat, 10 Jan 2009 17:16:33 -0800 (PST), wrote:
Of course, it may be that
everyone else already has a better way of doing it, but
if so could they please speak up and tell us?

'Better' is always in the eye of the beholder though

Now we're ready to go. Please note that everything above,
with the exception of the very first package containing
the list-of-registers enumeration, is completely invariant
across different designs.

And there is the first bit of a rub from the standpoint of straight code re-use:
the package. If you keep the package, entity and architecture all together
and source control them, then each new design, since it will have a
different address map, will need its own package. But updating the source
file for the new design now breaks things for the old design, which still
needs to be maintained. As the creator of such source, you'd want to
provide the 'package' in the form of commented-out sample package code that
the user would need to create on their own. That way the code that is
invariant across all designs will not need any modification and could simply
be used.

The second rub, which you're already aware of, is that the package effectively
prevents you from using a second instance with different data widths in the
same design. Most of the time this is not a problem, since many designs simply
have one processor type of bus so only one instance is needed, but it does
indicate where the design doesn't scale. Maybe that's what you meant when you
mentioned something about using it for a medium-sized design.

1) Instancing a peripheral couldn't be easier:

the_widget_register: entity work.widget
port map
( ...(
, select_me => reg_sel(reg_widget)
, my_read_data => reg_rd_D(reg_widget)
...
--- all other signals are common to all peripherals
...

Ummm....the reason one has read/write ports in a design in the first place
is because you need direct access to all of the bits in the ports at all
times. If you only need access to one port at any given time then what you
need is simple dual port memory. So you need an additional output that is
an array of T_reg_readback as an output of the entity as well.

A disadvantage: the register map information is in a
package, and therefore is global. VHDL-2008 package
generics could be used to fix this, but aren't yet
supported by any synth tool I know. Meanwhile, this
approach gives me the most flexible and clear way I've
yet discovered to create the register map of a medium-
sized design.
--

I use a two dimensional array of std_ulogic, that way data width and the
register list length are not limited by something in the package. The
package contains functions to convert to/from the 2d array to get a specific
register element. These functions are invariant across all designs. Within
each design there may also be an enumeration list of ports, which then
requires conversion between the enumeration and an integer using 'val and
'pos attributes. Also, I don't actually implement the read/write ports in
that entity for a couple of reasons:

- Many designs have a bit or two in some port(s) that need to reset with
application of a reset signal. By implementing the ports themselves, you
either need to make a global decision to reset everything for every design
(which wastes routing resources, but some probably think is OK) or bring in
some generic parameters to specify which bits in which ports need to be
affected by reset and which do not (which makes the entity somewhat more
difficult to use). Neither problem is a *big* issue to deal with though.

- Sometimes the 'port select' is really a decode to some other external hunk
of code. In other words, the whole idea of implementing a global design
read/write port entity is not what you want in the first place. By simply
implementing the decoder/encoder to route data between the master and the
various slave devices (or ports) you get something that has no package
created restrictions and can scale as needed. For example, long lists of
registers running at a high clock speed will likely require multiple clocks
to do the muxing back to read data (which is going to be the critical timing
path element for this entity). Having a generic to control how many ticks
of delay there are gives the user control of that function for dealing with
the clock cycle performance of the module.

Lastly, a minor nit is the defaulting of the register read back data to '0'.
This is not needed and just adds additional routing and can slow down
performance in the (likely) critical path for muxing the data.

Thanks for the posting, nice to see higher level things than "my design
doesn't work...help"

Kevin Jennings
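
A rough sketch of the two-dimensional std_ulogic array idea described in this post, under the assumption that the first index selects the register and the second selects the bit. The names reg_array_pkg, t_reg_array and get_reg are invented for illustration and are not taken from KJ's code; only the "from" direction is shown, the "to" direction would be the mirror image.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package reg_array_pkg is
  -- Unconstrained in both dimensions, so neither the number of registers
  -- nor the data width is fixed by the package.
  type t_reg_array is
    array (natural range <>, natural range <>) of std_ulogic;

  -- Extract one register (row idx) as an ordinary vector.
  function get_reg (regs : t_reg_array; idx : natural)
    return std_ulogic_vector;
end package reg_array_pkg;

package body reg_array_pkg is

  function get_reg (regs : t_reg_array; idx : natural)
    return std_ulogic_vector is
    -- The result takes its range from the second dimension of the array.
    variable result : std_ulogic_vector(regs'range(2));
  begin
    for b in regs'range(2) loop
      result(b) := regs(idx, b);
    end loop;
    return result;
  end function get_reg;

end package body reg_array_pkg;
```

With a per-design enumeration such as a hypothetical type t_port is (reg_widget, reg_gadget), a register would be picked out with get_reg(all_regs, t_port'pos(reg_widget)), which is the 'pos conversion mentioned above.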

Petrov_101 · Jan 12, 2009

> I've struggled for years with the problem of how to
do register decode and readback in an elegant, flexible
way and at long last I think I have a reasonably nice
(partial, as ever) solution. Of course, it may be that
everyone else already has a better way of doing it, but
if so could they please speak up and tell us?

Here's what I have begun to do:

<much interesting text snipped>

I'll have to review your approach in more detail but it looks pretty
snazzy. I've taken a different approach to create banks of
registers. My desire was to move away from VHDL and towards a
specification template of sorts that is parsed by software and output
as both synthesizable VHDL and formatted documentation.

Example Input file

#-------------------------------------------------------------------------------
# Revision Register
#-------------------------------------------------------------------------------
NAME: REV
ADDR: 0x0
DIR: R
BITS: 7:0 Revision

#-------------------------------------------------------------------------------
# Control Register
#-------------------------------------------------------------------------------
NAME: CONTROL
ADDR: 0x0
DIR: W
SBIT: 2 StartSmStrobe
BIT: 1 AdcSmEn
BIT: 0 DacSmEn

#-------------------------------------------------------------------------------
# DAC Start Address Register
#-------------------------------------------------------------------------------
NAME: DACSTART
ADDR: 0x1
DIR: W
BITS: 11:0 DacStartAddr

etc...

The software groups the above registers together, determines the
maximum bus size required to read/write the largest register, creates
input and output ports for all signals and outputs a vhdl module that
you can instantiate in your code.

I need to add an optional comment field to be parsed and sent to my
documentation tools but the vhdl code generation works well enough to
be useful.

Pete

Mike Treseler · Jan 12, 2009

Example Input file .. . .
#-------------------------------------------------------------------------------
# DAC Start Address Register
#-------------------------------------------------------------------------------
NAME: DACSTART
ADDR: 0x1
DIR: W
BITS: 11:0 DacStartAddr

etc...

The software groups the above registers together, determines the
maximum bus size required to read/write the largest register, creates
input and output ports for all signals and outputs a vhdl module that
you can instantiate in your code.

Interesting. It would seem that if an algorithm exists
to write synthesis code from that structure,
it ought to be possible to package a similar
vhdl record constant along with vhdl functions
and a looping synthesis process to do the same thing
at elaboration time.

Thanks for the posting.

-- Mike Treseler
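
A small sketch of what such a record constant might look like, using the three registers from Pete's example input file. Everything here (reg_map_pkg, t_reg_entry, REG_MAP, max_width) is a hypothetical naming choice, and only the address, direction and top bit of each register are captured; the individual bit names are left out.

```vhdl
package reg_map_pkg is
  type t_dir is (DIR_R, DIR_W);

  type t_reg_entry is record
    addr : natural;   -- register address
    dir  : t_dir;     -- read or write
    msb  : natural;   -- highest bit index used by the register
  end record;

  type t_reg_map is array (natural range <>) of t_reg_entry;

  -- Entries transcribed from the example input file.
  constant REG_MAP : t_reg_map := (
    (addr => 16#0#, dir => DIR_R, msb => 7),    -- REV
    (addr => 16#0#, dir => DIR_W, msb => 2),    -- CONTROL
    (addr => 16#1#, dir => DIR_W, msb => 11)    -- DACSTART
  );

  -- Bus width needed to read or write the widest register in the map.
  function max_width (m : t_reg_map) return natural;
end package reg_map_pkg;

package body reg_map_pkg is

  function max_width (m : t_reg_map) return natural is
    variable w : natural := 0;
  begin
    for i in m'range loop
      if m(i).msb + 1 > w then
        w := m(i).msb + 1;
      end if;
    end loop;
    return w;
  end function max_width;

end package body reg_map_pkg;
```

A design unit using this package could size its data bus with max_width(REG_MAP) and loop over REG_MAP'range in a generate statement or process to build the decode at elaboration time.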

Petrov_101 · Jan 12, 2009

Interesting. It would seem that if an algorithm exists
to write synthesis code from that structure,
it ought to be possible to package a similar
vhdl record constant along with vhdl functions
and a looping synthesis process to do the same thing
at elaboration time.

Thanks for the posting.

-- Mike Treseler

The parser wasn't that difficult to write. It's just a TCL script.
The output vhdl is nicely formatted but obviously written by a
computer program. As long as the synthesis results were good, I
didn't worry too much about writing compact VHDL.

I've been trying to cut down the time it takes to generate VHDL code
for the more common functions that appear in my designs. This
includes state machines (vhdl and pdf diagrams) and testbench code. I
auto-generate from scripts as much as possible these days.

I've also been trying to get the documentation for code written
coincident with development. Documentation at the end of a project is
tedious and prone to error. Frankly, by the time the design is
complete I'm ready to move on to the next challenge. Unless required,
documentation tends to end up on the back burner indefinitely. I
figure if I can get a concise, specification template to spit out vhdl
*and* formatted documentation I'll be killing two birds with one
stone.

kennheinrich · Jan 13, 2009

Thanks to both KJ and Jonathan for the good ideas. Agreed - some high-
level design is nice to see! I like some of the aspects that
Jonathan's approach brings. Having wrestled the register problem
myself, and in the spirit of really diving in, I'll throw the
following into the ring.

It seems to me that the main aspects his structure brings to the table
are

1) embedding of the register names into enum literals, adding to the
readability
2) a centralized and reusable decoder/mux through which all register
accesses are funnelled
3) a nice way of parameterizing and describing the registers to allow
(2) to be created

As has also been discussed, using a higher-level tool like Denali's
Blueprint, or other open source tools, can somewhat assist with point
(1) by automatically creating VHDL code which defines the constants
for you, (or with some code hacking, could probably build an enum).
I've seen an open-source tool in use (the name eludes me) which has a
nice HTML output so you can print out a nicely formatted 20 page
register spec as soon as your VHDL is compiled... it's very nice. You
can also get "C" language header files with all the right #define's to
make the software guys happy.

The centralized mux (item 2) along with the descriptors (3) are good
for describing *where* the registers are, but not so much for
describing *what* the registers are. Clearly, they will be domain-
specific, but may have some common characteristics.

In a lot of my designs, I have a clear delineation of register
classes: (a) simple read/write, like for static mode bits etc, (b)
read-only (like status and packet counters), (c) sticky alarm bits
with uniform "write-a-'1'-back-to-clear-it" semantics. I implemented
a framework which addresses aspect (3) above as well as the register
classifications. It's (yet again) partly orthogonal and partly
overlapping to both the above schemes.

I "crack" the incoming bus transaction into a single record type as
soon as the external bus hits the pins. There's always someone who
forgets whether CS is active H or L, and forgets whether or not to
gate WE with CS!

-- instead of separate cs, re, we flags we will use this
type TRANSACTION_KIND is ( T_IDLE, T_READ, T_WRITE);

-- this TYPE is used for interfaces from cracked external bus to
-- sub-blocks and direct registers. 'data' ignored for reads.
type TRANSACTION
is record
    addr   : std_logic_vector(ADDRWIDTH-1 downto 1);
    data   : std_logic_vector(BITS_PER_WORD - 1 downto 0);
    action : TRANSACTION_KIND;
end record;

Then I set up a similar range descriptor, but used a bitmask to define
which bits are ignored when decoding the address.

type MEM_RANGE_DESCRIPTOR
is record
    mask  : std_logic_vector(ADDRWIDTH-1 downto 0);
    match : std_logic_vector(ADDRWIDTH-1 downto 0);
end record;

Some simple predicates on transactions (is_in_range, is_read,
is_write, etc) make the later code fairly readable.
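
The predicates themselves are not shown in the post; a plausible sketch follows. It makes several assumptions that are not in the original: the package name txn_pred_pkg, the values of ADDRWIDTH and BITS_PER_WORD, an addr field running down to bit 0 (so that it lines up with mask and match), and the convention that a '1' in mask means "compare this address bit".

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package txn_pred_pkg is
  constant ADDRWIDTH     : positive := 21;  -- assumed value
  constant BITS_PER_WORD : positive := 16;  -- assumed value

  type TRANSACTION_KIND is (T_IDLE, T_READ, T_WRITE);

  type TRANSACTION is record
    addr   : std_logic_vector(ADDRWIDTH-1 downto 0);  -- assumption: full width
    data   : std_logic_vector(BITS_PER_WORD-1 downto 0);
    action : TRANSACTION_KIND;
  end record;

  type MEM_RANGE_DESCRIPTOR is record
    mask  : std_logic_vector(ADDRWIDTH-1 downto 0);
    match : std_logic_vector(ADDRWIDTH-1 downto 0);
  end record;

  function is_read     (trans : TRANSACTION) return boolean;
  function is_write    (trans : TRANSACTION) return boolean;
  function is_in_range (trans : TRANSACTION;
                        desc  : MEM_RANGE_DESCRIPTOR) return boolean;
end package txn_pred_pkg;

package body txn_pred_pkg is

  function is_read (trans : TRANSACTION) return boolean is
  begin
    return trans.action = T_READ;
  end function is_read;

  function is_write (trans : TRANSACTION) return boolean is
  begin
    return trans.action = T_WRITE;
  end function is_write;

  -- Bits where mask is '0' are ignored; the remaining bits must equal match.
  function is_in_range (trans : TRANSACTION;
                        desc  : MEM_RANGE_DESCRIPTOR) return boolean is
  begin
    return (trans.addr and desc.mask) = (desc.match and desc.mask);
  end function is_in_range;

end package body txn_pred_pkg;
```

Because the comparison is a literal one, an address containing X or U in a compared bit simply fails to match in simulation.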

Now, if I want to build a simple read/write register I can use a
common procedure such as this:

-- ------------------------------------------------------------
--
-- procedure to create a regular read/write micro register
--
-- the register is a vector of D flops: loaded with value on write,
--
-- register is automatically decoded from address & bit index given
-- ------------------------------------------------------------
procedure micro_rw_reg (
    constant reg_desc : in    MEM_RANGE_DESCRIPTOR;
    constant trans    : in    TRANSACTION;
    signal   reg      : inout std_logic_vector(BITS_PER_WORD-1 downto 0)
)
is begin
    if (is_in_range(trans, reg_desc)) and is_write(trans) then
        reg <= trans.data;
    end if;
end micro_rw_reg;

Or if I want to create only one bit of a register (say a register with
16 individual control bits) I can call the following procedure sixteen
times:

procedure micro_rw_bit (
    constant reg_desc : in    MEM_RANGE_DESCRIPTOR;
    constant trans    : in    TRANSACTION;
    signal   reg      : inout std_logic_vector(BITS_PER_WORD-1 downto 0);
    constant index    : in    integer
)
is begin
    if (is_in_range(trans, reg_desc)) and is_write(trans) then
        reg(index) <= trans.data(index);
    end if;
end micro_rw_bit;
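
For the sticky alarm class (c) mentioned earlier, a procedure in the same spirit might look like the sketch below. It is not from the post: the package name sticky_pkg, the procedure name micro_w1c_bit and its parameters are all invented, the address decode is reduced to a single boolean so the example stands alone, and giving a new alarm event priority over a simultaneous clear is just one possible design choice.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package sticky_pkg is
  -- Intended to be called from inside a clocked process, once per alarm bit.
  procedure micro_w1c_bit (
    constant wr_hit  : in    boolean;    -- a write was decoded to this register
    constant wr_data : in    std_logic;  -- the data bit being written
    constant event   : in    std_logic;  -- alarm condition, '1' sets the bit
    signal   reg_bit : inout std_logic
  );
end package sticky_pkg;

package body sticky_pkg is

  procedure micro_w1c_bit (
    constant wr_hit  : in    boolean;
    constant wr_data : in    std_logic;
    constant event   : in    std_logic;
    signal   reg_bit : inout std_logic
  ) is
  begin
    if event = '1' then
      reg_bit <= '1';                      -- set wins over clear (assumption)
    elsif wr_hit and wr_data = '1' then
      reg_bit <= '0';                      -- writing '1' clears the sticky bit
    end if;                                -- otherwise the bit holds its value
  end procedure micro_w1c_bit;

end package body sticky_pkg;
```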

Then the actual creation of, say, an interrupt source control
register could be built like this:

-- ----------------------------------------------------------------------
--
-- irq source register
--
-- ----------------------------------------------------------------------

src_reg_proc: process (clk)
begin
    if rising_edge(clk) then
        irq_src_reg(15 downto 2) <= (others => '0');
        if reset = '1' then
            irq_src_reg <= (others => '0');
        else
            micro_rw_bit( IRQ_SRC_REG_DESC, w_trans, irq_src_reg, 1);
            micro_rw_bit( IRQ_SRC_REG_DESC, w_trans, irq_src_reg, 0);
        end if;
    end if; -- rising edge
end process src_reg_proc;

Where I had earlier defined a signal for irq_src_reg as well as a
descriptor:

constant IRQ_SRC_REG_DESC : MEM_RANGE_DESCRIPTOR :=
    ( mask => '0' & x"0FFFF", match => '0' & x"00018");

Whereas I could build the three word-wide foobar, fiddle, and faddle
registers by writing this:

constant FOOBAR_REG_DESC : MEM_RANGE_DESCRIPTOR :=
    ( mask => '0' & x"0FFFF", match => '0' & x"00000");

constant FIDDLE_REG_DESC : MEM_RANGE_DESCRIPTOR :=
    ( mask => '0' & x"0FFFF", match => '0' & x"00004");

constant FADDLE_REG_DESC : MEM_RANGE_DESCRIPTOR :=
    ( mask => '0' & x"0FFFF", match => '0' & x"00008");

reg_trinity_proc: process (clk)
begin
    if rising_edge(clk) then
        if reset = '1' then
            foobar_reg <= (others => '0');
            fiddle_reg <= (others => '0');
            faddle_reg <= (others => '0');
        else
            micro_rw_reg( FOOBAR_REG_DESC, w_trans, foobar_reg);
            micro_rw_reg( FIDDLE_REG_DESC, w_trans, fiddle_reg);
            micro_rw_reg( FADDLE_REG_DESC, w_trans, faddle_reg);
        end if;
    end if; -- rising edge
end process reg_trinity_proc;

The approach to building the central decoder/read mux is similar to
Jonathan's but instead of decoding directly from the descriptors, I
decoded the main bus into fixed-size address segments (say on 64K
boundaries, which matches the mask of "ffff" above). Each fixed-size
segment gets routed to one "submodule" in the design; say one
submodule has a set of registers for an interrupt controller, one
submodule has registers for a video decoder, one submodule has the
chip common version, ID, and master reset registers, etc. The main
bus interface has N transaction output ports and N data readback
ports:

entity main_bus_interface is
    generic ( ADDRESS_CHUNK_SIZE_IN_BITS : integer;
              NPORTS                     : integer );
    port ( -- ...
           -- external bus pins
           -- ...
           txn   : out TRANSACTION_VECTOR (0 to NPORTS); -- e.g. 64K per port
           rdata : in  std_logic_vector(BITS_PER_WORD downto 0) );
end entity main_bus_interface;

I can build a subfunction (like a video processor, a packet interface,
or FIFO portal) in an entity which talks to the main_bus_interface
through a standard transaction bus. I can put down individual
registers fairly simply within each submodule using the procedures
described above. And furthermore, I can put down multiple subfunctions
(say three interrupt controllers and four video processors) by
instantiating multiple entities attached to the main_bus_interface;
each identical submodule then gets an identical set of registers
created at (for example) 64K offsets from each other by virtue of the
decoding performed in the main decoder. On the main decoder txn
ports, there would only be one port with a non-IDLE transaction
occurring at any given time.

So the clear drawbacks of this are:

- hard coding of ADDRWIDTH and BITS_PER_WORD - but they can be defined
in a package per FPGA
- lacking the elegance of identifying registers with enum literals
- the registers per se are not created automagically; they still have
to be written with procedures
- pipelining and delay through the central entity have to be managed
somehow - but that's true always

But it kind of bridges the gap: I can simplify the act of defining a
register to some fairly well-named procedures, and force the designer
to declare some nice descriptors which help the VHDL documentation.
Plus I have a fairly reusable central bus entity.

My two bits.

- Kenn

kennheinrich · Jan 13, 2009

Oops.. rdata should be a VECTOR 0 to NPORTS of words, not a single
word, where each word is an slv(BITS_PER_WORD downto 0).
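
With that correction applied, the declarations might be sketched as follows. This is not Kenn's code: bus_types_pkg, WORD and WORD_VECTOR are invented names, the constant and generic values are placeholders, the port directions are inferred from the description of "transaction output ports" and "data readback ports", and the ranges use NPORTS-1 and BITS_PER_WORD-1 as an assumed tidy-up so that there are exactly NPORTS ports of BITS_PER_WORD bits each.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package bus_types_pkg is
  constant ADDRWIDTH     : positive := 21;  -- assumed value
  constant BITS_PER_WORD : positive := 16;  -- assumed value

  type TRANSACTION_KIND is (T_IDLE, T_READ, T_WRITE);

  type TRANSACTION is record
    addr   : std_logic_vector(ADDRWIDTH-1 downto 1);
    data   : std_logic_vector(BITS_PER_WORD-1 downto 0);
    action : TRANSACTION_KIND;
  end record;

  -- One transaction per downstream submodule.
  type TRANSACTION_VECTOR is array (natural range <>) of TRANSACTION;

  -- One readback word per downstream submodule.
  subtype WORD is std_logic_vector(BITS_PER_WORD-1 downto 0);
  type WORD_VECTOR is array (natural range <>) of WORD;
end package bus_types_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.bus_types_pkg.all;

entity main_bus_interface is
  generic ( ADDRESS_CHUNK_SIZE_IN_BITS : integer := 16;  -- placeholder default
            NPORTS                     : integer := 4 ); -- placeholder default
  port ( -- external bus pins omitted in this sketch
         txn   : out TRANSACTION_VECTOR(0 to NPORTS-1);
         rdata : in  WORD_VECTOR(0 to NPORTS-1) );
end entity main_bus_interface;
```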

KJ · Jan 14, 2009

Jim Lewis said:
KJ,

I do this in all of my blocks / cores that are integrated into a chip.
This makes your top level mux an "OR" gate. Should you have a performance
problem, an OR at the top level should be easier to restructure than
a multiplexer with an address going to it.

My only point here was that a straight mux will be the minimal logic/routing
resource usage solution. Adding the default to zero if nothing is being
addressed (or any other solution really) will use at least as much or more
resources as compared to the straight mux, it will not use less than the mux
unless that other solution is able to map into some special hardware
resource of some kind in the device. If there is no such special beast, and
this data path is in the critical timing path, you might find that zeroing
the data bus when an unsupported address is being indexed is a luxury that
is not worth the price...whether or not it is a problem in a given design
depends on how many ports are addressable and the logic block resources of
the FPGA that is implementing them...like I said, it's a minor nit,
something to be aware of when the situation arises.

Kevin Jennings
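
To make the two readback styles under discussion concrete, here is a small side-by-side sketch. It is an illustration only, with invented names (readback_pkg, readback_compare) and an arbitrary size of four 8-bit registers in an eight-entry address space; which one synthesizes smaller or faster depends on the tool and the target, as the posts in this thread point out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package readback_pkg is
  subtype t_word is std_logic_vector(7 downto 0);
  type t_word_array is array (natural range <>) of t_word;
end package readback_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.readback_pkg.all;

entity readback_compare is
  port ( sel      : in  natural range 0 to 7;   -- only 0..3 are populated
         regs     : in  t_word_array(0 to 3);
         rd_mux   : out t_word;
         rd_andor : out t_word );
end entity readback_compare;

architecture rtl of readback_compare is
begin

  -- Straight mux: unpopulated addresses 4..7 alias onto 0..3, i.e. the
  -- value read from an unsupported address is treated as a don't care.
  rd_mux <= regs(sel mod 4);

  -- AND-OR: each register is gated by its own select and the results are
  -- ORed together, so an unpopulated address reads back as all zeros.
  and_or : process (sel, regs)
    variable acc : t_word;
  begin
    acc := (others => '0');
    for i in regs'range loop
      if sel = i then
        acc := acc or regs(i);
      end if;
    end loop;
    rd_andor <= acc;
  end process and_or;

end architecture rtl;
```

The two outputs agree for sel in 0..3 and differ only for the unsupported addresses, which is exactly the freedom KJ argues the mux keeps and the zeroing gives away.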

Petrov_101 · Jan 14, 2009

It's worth noting that you can sometimes get Tcl to *be* the
parser for you. Hierarchically-structured things like
register definitions can map nicely on to a Tcl *script*:

<snip>

Quite true... one of the reasons I like the language so much. I
designed a VHDL processor years ago and needed an assembler for it. I
defined each mnemonic as a tcl procedure. My make file would concat
the mnemonic file with my assembly program and run it through the tcl
shell. The output was a xilinx block ram preloaded with machine
instructions.

Andreas Ehliar · Jan 14, 2009

I've only seen the and-or take the same or more, never seen it take
less. I'd be interested in seeing an example of one where it took
less resources with the and-or than with the mux to perform the same
function.

I see the same results in a quick test where a mux-like structure was
compared with an or-type structure. Registers were used on both the
inputs and the outputs (and the design was coded so that the reset
inputs of the input flip-flops should be used).

When optimized for speed, the areas of the two solutions were very similar
(within a few percent) but when synthesized for area, the or-based
structure was 8 percent larger than the plain mux.

(This is for an ASIC and not an FPGA. Other tradeoffs can sometimes
apply in an FPGA, especially if you need those registers on the inputs
anyway. And it might also be slightly different in another ASIC process
than the one I tested this on. And the synthesis tool could do something
different in a real design and not a synthetic test, etc. You probably
shouldn't base company-critical decisions on this very quick analysis.)
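
For anyone wanting to repeat this kind of comparison, here is a rough sketch of the two structures as I understand them from the description above. It is not the code that was actually synthesized: the entity name, the 4-port/8-bit sizing and the use of a synchronous clear on the input registers are all assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test case: four 8-bit sources, registered inputs and outputs.
entity readback_cmp is
  port (
    clk            : in  std_logic;
    sel            : in  std_logic_vector(1 downto 0);
    d0, d1, d2, d3 : in  std_logic_vector(7 downto 0);
    q_mux          : out std_logic_vector(7 downto 0);
    q_or           : out std_logic_vector(7 downto 0)
  );
end entity readback_cmp;

architecture rtl of readback_cmp is
  signal sel_r          : std_logic_vector(1 downto 0);
  signal r0, r1, r2, r3 : std_logic_vector(7 downto 0);  -- plain input registers
  signal z0, z1, z2, z3 : std_logic_vector(7 downto 0);  -- input registers with clear
begin

  -- Mux-like structure: register everything, then select.
  mux_style : process (clk)
  begin
    if rising_edge(clk) then
      sel_r <= sel;
      r0 <= d0; r1 <= d1; r2 <= d2; r3 <= d3;
      case sel_r is
        when "00"   => q_mux <= r0;
        when "01"   => q_mux <= r1;
        when "10"   => q_mux <= r2;
        when others => q_mux <= r3;
      end case;
    end if;
  end process mux_style;

  -- Or-type structure: an unselected input register is cleared
  -- (intended to map onto the flip-flop's reset input), so a plain OR
  -- of all registers yields the selected value.
  or_style : process (clk)
  begin
    if rising_edge(clk) then
      if sel = "00" then z0 <= d0; else z0 <= (others => '0'); end if;
      if sel = "01" then z1 <= d1; else z1 <= (others => '0'); end if;
      if sel = "10" then z2 <= d2; else z2 <= (others => '0'); end if;
      if sel = "11" then z3 <= d3; else z3 <= (others => '0'); end if;
      q_or <= z0 or z1 or z2 or z3;
    end if;
  end process or_style;

end architecture rtl;
```

Both outputs have the same two-cycle latency, so the two halves can be synthesized separately and their area and timing reports compared.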

/Andreas

KJ · Jan 14, 2009

OTOH, I tend to like to use separate re-usable cores that each
are independent and integrate them into a chip. As a result,
I have separate readback logic within each block and then another
level at the chip-level.

I do the same.

In this case, zeroing the outputs
(or creating AND-OR logic at the lowest level - which is more
directly what I do) generates about the same amount of
total logic that the multiplexers create (sometimes less,
sometimes more).

I've only seen the and-or take the same or more, never seen it take
less. I'd be interested in seeing an example of one where it took
fewer resources with the and-or than with the mux to perform the same
function.

On a somewhat more abstract level, the forcing of zeros to the data
while reading from an unsupported address removes an optimization that
would allow for a don't care (thinking more of Logic 101 type of don't
care here than the VHDL synthesis tool interpretation of don't care
'-') so for the case of a purely combinatorial address decode -> read
data function the and-or wouldn't do better than the mux. At best it
would use the same number of logic blocks but that is because of the
granularity of what a typical block can work with. Accepting for the
moment that the and-or does not outperform the mux for a purely
combinatorial path, then that would leave open whether it is better
for a pipelined path. Even in that situation though I haven't run
across the case in a typical design where the mux used more resources
so like I said it would be interesting to see a realistic type of
example.
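
To make the don't-care point concrete, here is a small sketch with made-up names: three readable registers behind a 2-bit address, so address "11" is unsupported. The first output resolves the don't care by hand (the unused code is folded into reg_c), the second forces zeros. How a given synthesis tool maps either one will vary, so treat it as an illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical read path: three registers, address "11" is unsupported.
entity read_mux_dc is
  port (
    addr                : in  std_logic_vector(1 downto 0);
    reg_a, reg_b, reg_c : in  std_logic_vector(7 downto 0);
    rd_dc               : out std_logic_vector(7 downto 0);
    rd_zero             : out std_logic_vector(7 downto 0)
  );
end entity read_mux_dc;

architecture rtl of read_mux_dc is
begin

  -- Unsupported address treated as a don't care: "11" returns reg_c,
  -- so reg_c is selected by addr(1) alone and addr(0) drops out of
  -- that term of the decode.
  with addr select rd_dc <=
    reg_a when "00",
    reg_b when "01",
    reg_c when others;

  -- Unsupported address forced to zero: "10" and "11" must now be
  -- told apart, so the full address is decoded for every term.
  with addr select rd_zero <=
    reg_a           when "00",
    reg_b           when "01",
    reg_c           when "10",
    (others => '0') when others;

end architecture rtl;
```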

However, when zeroing outputs of blocks, then at the chip level
(interconnecting cores), only "OR" gates are needed. So the only
routing we have is data. OTOH, with multiplexers, the chip level
also uses multiplexers with control signals - which impact both
the size and routing of this logic.
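
A minimal sketch of what that chip-level combining could look like, assuming each core already drives all zeros on its read bus whenever it is not the one being addressed (the entity and port names are invented for the example):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical chip-level readback: no select signals reach this level.
entity chip_readback is
  port (
    core_a_rd, core_b_rd, core_c_rd : in  std_logic_vector(7 downto 0);
    chip_rd                         : out std_logic_vector(7 downto 0)
  );
end entity chip_readback;

architecture rtl of chip_readback is
begin
  -- Valid only if at most one core returns non-zero data at a time.
  chip_rd <= core_a_rd or core_b_rd or core_c_rd;
end architecture rtl;
```

With a mux at this level instead, the address or decoded selects would also have to be routed here.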

You also may want to contemplate how the multiplexer control
logic will impact optimization.

I have...and also recognize that synthesis tool optimizations aren't
obstructed by what we refer to as 'chip level' logic or 're-usable core
level' logic.

Good points, thanks.

Kevin Jennings

KJ · Jan 14, 2009

Yes it is needed. The code I presented is an AND-OR mux,
and doesn't work without the zero default.

I only meant 'not needed' in the sense that you likely have no
functional design requirement to output all zeros on the data bus when
being presented with an invalid address input. Your decision to meet
the design requirements by implementing an and-or structure then does
force the need for a default. Sorry about the confusion.
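
For readers who missed the earlier code, here is a stripped-down sketch of why an and-or structure needs that default. This is not the code posted earlier in the thread; the names and sizes are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical AND-OR readback over three registers.
entity and_or_read is
  port (
    addr                : in  std_logic_vector(1 downto 0);
    reg_a, reg_b, reg_c : in  std_logic_vector(7 downto 0);
    rd_data             : out std_logic_vector(7 downto 0)
  );
end entity and_or_read;

architecture rtl of and_or_read is
  type reg_array_t is array (0 to 2) of std_logic_vector(7 downto 0);
begin
  process (addr, reg_a, reg_b, reg_c)
    variable regs : reg_array_t;
    variable acc  : std_logic_vector(7 downto 0);
  begin
    regs := (reg_a, reg_b, reg_c);
    -- The zero default: every term is OR-ed into acc, so acc must
    -- start from all zeros on each evaluation. Without this line the
    -- variable would carry over its previous value.
    acc := (others => '0');
    for i in regs'range loop
      if unsigned(addr) = i then      -- the "AND" half: gate by address match
        acc := acc or regs(i);        -- the "OR" half: accumulate
      end if;
    end loop;
    rd_data <= acc;                   -- address "11" matches nothing: zeros
  end process;
end architecture rtl;
```

So the zeros on an invalid address fall out of the structure rather than being a requirement in their own right.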

Kevin Jennings

Amal · Jan 15, 2009

Clever coding Jonathan. Now try doing the same thing in Synthesizable
SystemVerilog! I have had so many problems with SystemVerilog for
design. Even though it has improved a lot over Verilog and has
verification features, it still lacks many great features of VHDL,
including unconstrained arrays as function parameters and return
values and some of the clever coding that you did above.
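
As a small illustration of the VHDL feature meant here (the package and function names are made up for the example), a function can take and return a vector whose length is only fixed at the point of call:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package vec_util is
  -- Bit-reverse a vector of any length; the result has the same
  -- range as the argument.
  function reverse (v : std_logic_vector) return std_logic_vector;
end package vec_util;

package body vec_util is
  function reverse (v : std_logic_vector) return std_logic_vector is
    variable r : std_logic_vector(v'range);  -- sized from the actual argument
  begin
    for i in v'range loop
      r(i) := v(v'low + v'high - i);         -- mirror the index within the range
    end loop;
    return r;
  end function reverse;
end package body vec_util;
```

The same function then works unchanged on an 8-bit or a 32-bit signal, e.g. `y <= reverse(x);`.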

I like VHDL not because I like to just blindly defend a language, but
I like it because writing generic code (synthesizable) is a lot easier
than in Verilog and SystemVerilog. I would like to get your ideas on
the two languages and your experiences with SystemVerilog.

-- Amal
