Quartus II inferred latches

jacko · Aug 13, 2008

KJ said:
Actually the above is not the way you'd specify the don't care on the
input either. You'd quickly find that this would fail your
simulation. What you want is the std_match function when you have
don't cares. That function will properly 'skip over' the don't cares
whereas "=" will not because the "=" function for comparing
std_logic_vectors is not overridden in that manner.

foo <= '1' when (addr = "0000") else
       '1' when std_match(addr, "100-") else
       '0';

KJ
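
For reference, the decode above could be written as a complete design unit along the lines of the sketch below. The entity name addr_decode and the 4-bit port width are assumptions for illustration only; std_match itself comes from ieee.numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- provides std_match for std_logic_vector operands

-- Hypothetical entity wrapping the decode from the post above.
entity addr_decode is
  port (
    addr : in  std_logic_vector(3 downto 0);
    foo  : out std_logic
  );
end entity addr_decode;

architecture rtl of addr_decode is
begin
  -- "100-" matches both "1000" and "1001": std_match skips the '-' position,
  -- whereas addr = "100-" would only be true if addr literally held '-'.
  foo <= '1' when addr = "0000" else
         '1' when std_match(addr, "100-") else
         '0';
end architecture rtl;
```

With this form, addr(0) does not need to reach the decode logic for the second branch, which is the saving being discussed.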

I came to the conclusion that zero assignment in the don't care cases
makes the best logic, as the logic seems to implement as sums of
products. I must put all the zeros into the design, and fix the q register
getting the ir register on a call instruction.

cheers
jacko

KJ · Aug 13, 2008

jacko said:
I came to the conclusion that zero assignment in the don't care cases
makes the best logic, as the logic seems to implement as sums of
products. Must put all the zeros in the design, and fix q register
getting ir register on call instruction.

Putting '0' in instead of using don't cares and the std_match function
can use more routing and more logic. It will never use fewer LUTs or
less routing, so it's hard to see how using '0' could be the 'best'
logic. Comparing a bit to see if it is '0' (or '1') costs more than
not comparing it at all since, at a minimum, the bit must be routed to
the LUT.

KJ

Mike Treseler · Aug 13, 2008

jacko said:
I came to the conclusion that zero assignment in the don't care cases
makes the best logic, as the logic seems to implement as sums of
products.

If you are saying that you would choose
the value "1000" over "100-", I will agree,
but for these reasons:

1. The logic description,
and test requirements,
and bus interface spec will be simpler.

2. Changes to the map are less trouble.

Now, if I were down to my last LUT,
I might negotiate, but that hasn't happened
since I was targeting a 22v10.

If the values were internal states
rather than a public address, I would use an enumeration
and let synthesis pack the gates.

The argument that '0' is a better choice than '1'
would have to be one of style.
For a 4-bit decode on an FPGA, a sum of products is not needed.

-- Mike Treseler

rickman · Aug 14, 2008

Mike Treseler said:
If you are saying that you would choose
the value "1000" over "100-", I will agree,
but for these reasons:

1. The logic description,
and test requirements,
and bus interface spec will be simpler.

2. Changes to the map are less trouble.

Now, if I were down to my last LUT,
I might negotiate, but that hasn't happened
since I was targeting a 22v10.

But LUT usage is what this is about. The point is not that there is
another product term in the equation; the point is that there is
another *input* to the term in the equation. This can often cause
even more than one LUT to be used depending on the complexity of the
rest of the logic producing this output. I have had state machine
logic fan out in a lot of complexity. One input would go to some four
or five LUTs in the tree. If I coded it so that a given input was a
don't care in some of the terms, that could allow better packing of
the LUTs and lower the LUT usage noticeably.

But of course, this depends on the logic being coded.

If the values were internal states
rather than a public address, I would use an enumeration
and let synthesis pack the gates.

I have seen some really poor implementations for state machines. I
prefer to use one-hot encoding and manually specify the transitions.
Then I *know* the logic is fairly optimal. BTW, when using one-hot
encoding, the transitions between states map to the bits used to
specify the states. So you can create a signal for each state and
specify all of the input transitions for that state and *not* the
output transitions.

process (clk, reset) begin
  if (reset = '1') then
    foo        <= '1';  -- assumed initial state: one state bit must be hot after reset
    bar        <= '0';
    ralph      <= '0';
    applesauce <= '0';
  elsif (rising_edge(clk)) then
    -- each state signal lists only the transitions *into* that state
    foo <= (bar and input1) or
           (ralph and input2) or
           (applesauce and (input3 or input4)) or
           (foo and not (input5 or input6 or input7));

    bar <= (foo and input5) or
           (bar and not (input6 or input7));

    -- etc. for ralph and applesauce

  end if;
end process;

The argument that '0' is a better choice than '1'
would have to be one of style.
For a 4-bit decode on an FPGA, a sum of products is not needed.

Yes, but this is potentially just a part of an output specification.
Otherwise why bother even thinking about it.

Rick

Andy · Aug 14, 2008

I too have seen some really poor state machine implementations/
optimizations, but if they meet resource and timing requirements, I'd
rather write/read/maintain functionally clear, inefficiently
implemented code than functionally cryptic, efficiently implemented
code.

The code style above (individual signals for one hot states) leads to
unintentional zero- or multiple-hot states that you can't easily see.
I've been there before (that's how I used to implement state machines
in FPGA schematics a LONG time ago), and I don't want to go there
again, unless performance and/or resources absolutely dictate it.
Traditional state machine descriptions can similarly lead to
unreachable states, but at least the synthesis tool will warn you
about them.

Andy
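
One way to make the zero- or multiple-hot condition described above visible in simulation is to count the hot bits and assert that exactly one is set. The following is a minimal sketch, not anyone's actual code from this thread: the package name one_hot_check and the function hot_count are hypothetical, and the signal names in the usage comment are borrowed from the one-hot example earlier in the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package for checking one-hot state vectors.
package one_hot_check is
  -- Returns the number of elements of v that are exactly '1'.
  function hot_count(v : std_logic_vector) return natural;
end package one_hot_check;

package body one_hot_check is
  function hot_count(v : std_logic_vector) return natural is
    variable n : natural := 0;
  begin
    for i in v'range loop
      if v(i) = '1' then
        n := n + 1;
      end if;
    end loop;
    return n;
  end function hot_count;
end package body one_hot_check;

-- Possible usage inside the clocked branch of the state process
-- (assumes the four state signals are std_logic):
--
--   assert hot_count(std_logic_vector'(foo & bar & ralph & applesauce)) = 1
--     report "state signals are not one-hot" severity error;
```

The assertion only helps in simulation; it does not stop the hardware from entering such a state.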

jacko · Aug 14, 2008

Quartus does infer a one-hot encoding, but not on the signal I expected
it to. The zeros in question are an output specification, as there is no
need for output if the processor is performing a read. Yes, an output
don't care would be best.

cheers
jacko
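
An output don't care of the kind mentioned above is usually written by assigning '-' to the output in the cases where its value does not matter. The sketch below assumes a hypothetical entity write_data_mux with a read flag and a 16-bit data path; none of these names or widths come from the actual design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical output stage: data_out only matters when not reading.
entity write_data_mux is
  port (
    is_read  : in  std_logic;                       -- assumed flag: '1' during a read
    reg_data : in  std_logic_vector(15 downto 0);   -- assumed width
    data_out : out std_logic_vector(15 downto 0)
  );
end entity write_data_mux;

architecture rtl of write_data_mux is
begin
  -- Assign '-' instead of '0' when the output is unused, leaving the
  -- choice of value to the synthesis tool.
  data_out <= (others => '-') when is_read = '1' else reg_data;
end architecture rtl;
```

In simulation the output shows '-' during reads. Synthesis tools generally treat an assigned '-' as a don't care, but whether it actually reduces the logic for a given design is something to check in the fitter report.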

Mike Treseler · Aug 14, 2008

rickman said:
But LUT usage is what this is about.

I understand that.
My point was that these LUTs are pennies in my pocket.

But of course, this depends on the logic being coded.

Exactly.

I like to break down large state enumerations
into a smaller state register and a separate
counter or shifter register, for example.
This sometimes allows high-level simplifications
that might not be obvious otherwise.

Yes, but this is potentially just a part of an output specification.
Otherwise why bother even thinking about it.

I like to keep my output and state registers separate.
But this would be another thread,
and I think we've done it before

-- Mike Treseler

jacko · Aug 14, 2008

Mike Treseler said:
I understand that.
My point was that these LUTs are pennies in my pocket.

All pennies in one pocket are not in another ...

Exactly.

I like to break down large state enumerations
into a smaller state register and a separate
counter or shifter register, for example.
This sometimes allows high-level simplifications
that might not be obvious otherwise.

Yes, I have now split the indirect and direct assignments into
different enumerations, and eliminated most of the don't cares needed.
Two remain. The exact combination of two differing ones out of q, r, s,
and c is unknown, but any two would be better than zeros.

Now if only I could get the state machine processing to recognize the
full one-hot state machine I want, (cycle.indirect.dir), and have it
eliminate the never-entered hot states out of the 80, maybe things
would get even smaller.

cheers
jacko

Mike Treseler · Aug 14, 2008

jacko said:
Now if only I could get the state machine processing to recognize the
full state machine one hot wanted of (cycle.indirect.dir) and have it
eliminate never entered hot states out of the 80, maybe things would
get even smaller.

I'll bet you a beer that things
would get even smaller if you changed
your state register to an enumerated type
and let synthesis work out the details.

Last I checked, Quartus and ISE
use one-hot encoding for vectors
wider than two bits. The point is
that the tool takes care of the fussy
decoding details you mention and always
gets it right.

-- Mike Treseler
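
As an illustration of the enumerated-type style suggested above, here is a minimal sketch of a state machine whose encoding is left entirely to synthesis. The entity name enum_fsm, the states idle/run/finish, and the ports go, done and busy are all hypothetical and are not taken from the design under discussion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical three-state machine using an enumerated state type.
entity enum_fsm is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;
    go    : in  std_logic;
    done  : in  std_logic;
    busy  : out std_logic
  );
end entity enum_fsm;

architecture rtl of enum_fsm is
  type state_type is (idle, run, finish);
  signal state : state_type;  -- encoding (one-hot or otherwise) chosen by the tool
begin
  process (clk, reset)
  begin
    if reset = '1' then
      state <= idle;
    elsif rising_edge(clk) then
      case state is
        when idle =>
          if go = '1' then
            state <= run;
          end if;
        when run =>
          if done = '1' then
            state <= finish;
          end if;
        when finish =>
          state <= idle;
      end case;
    end if;
  end process;

  -- Output decoded from the state, independent of the encoding chosen.
  busy <= '0' when state = idle else '1';
end architecture rtl;
```

Because the state is a single signal of one enumerated type, it always holds exactly one state value, so the zero- or multiple-hot descriptions possible with individual state signals cannot be written by accident.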

jacko · Aug 14, 2008

At present these are my enumerations:

-- Build an enumerated type for the state machine
type cycle_type is (fetch, execute);

-- Indirection selector
type reg_seli is (indp, indq, indr, inds, indqq);
-- Direct selector
type reg_seld is (dirp, dirq, dirr, dirs, dirc, dirno, dircq, dirad);
type mem_op is (rd, wr);

Only reg_seli is picked up by Quartus's automatic state machine
processing. It does not combine them into one total state machine and
reduce it. A reg_seli signal is used first in the code sequence, so I
assume this is why it is picked.

cheers
jacko

jacko · Aug 14, 2008

Hi

Now at 332 LEs, by changing how the conditional read was done in the
GO test. I don't have any more ideas for how to lower this. Looks like
version 11 is the one. If the maths adds up, not using the read
register and full-width memory access would reduce this to around 250
LEs. The design is done. Forward onto the IO and useful functioning of
the rest of the MAX II.

cheers
jacko
