Hardware book like "Code Complete"?

KJ

Tommy Thorn said:
Really? Which of his arguments do you disagree with?

I always thought of the two-process style as being redundant, but after
reading Dr. Chu's argument, I'm revising my thinking. For one thing,
this style makes it much less disruptive to change a Moore output to a
Mealy and vice versa.

My thanks to S.C. for the reference. Good one.

But in practice one doesn't much care if any outputs are 'Mealy' or 'Moore'.
What one has is a function that needs to be implemented within specific area
(or logic resource) constraints and performance (i.e. clock cycle, Tpd, Tsu,
Tco) constraints.

Breaking what can be accomplished in one process into two (or more)
logically equivalent processes should be considered for code clarity which
can aid in support and maintenance of the code during its lifetime as well
as for potential design reuse (although realistically re-use of any single
process is probably pretty low). Re-use happens more often at the
entity/architecture level, but the 'copy/paste/modify' type of re-use
probably happens more at the process level when it does happen.

Breaking a process into two just to have a combinatorial process to describe
the 'next' state and a separate process to clock that 'next' state into the
'current' state has no particular value when using design support and
maintenance cost as a metric of 'value'. Since the different methods
produce the final end logic there is no function or performance advantage to
either approach. On the other hand, there are definite drawbacks to
implementing combinatorial logic in a VHDL process. Two of these are
- Introduction of 'unintended' latches
- Missing signals in the sensitivity list that result in different
simulation versus synthesis results

Both of these drawbacks have manual methods that can be used to try to
keep them from happening, but the bottom line is that extra effort
(a.k.a. cost, or negative value) must be incurred to do this....all of which
is avoided by simply not using the VHDL process to implement combinatorial
logic (i.e. the 'next' state computation).

So as far as the VHDL language is concerned, there are real costs that will
be incurred every time the two process method is used but no real value
add....or at least that's my 2 cents.....

KJ
 
Mike Treseler

KJ said:
Breaking what can be accomplished in one process into two (or more)
logically equivalent processes should be considered for code clarity

I find the idea of describing both the "Q side" state
and the "D side" next state more confusing than clarifying.
which
can aid in support and maintenance of the code during its lifetime as well
as for potential design reuse (although realistically re-use of any single
process is probably pretty low). Re-use happens more often at the
entity/architecture level,

....one reason that I favor single process entities.
but the 'copy/paste/modify' type of re-use
probably happens more at the process level when it does happen.

....for this type of reuse, I use procedures.
This provides some modularity and built-in documentation.
Even if the operation is not reused, well-named
procedures in process scope make code easier to read and maintain.
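A minimal sketch of the idea (the names and the toy 'CRC' step are invented for illustration): a well-named procedure declared in process scope replaces copy/paste within the process.

```vhdl
process(clk)
   variable crc_a, crc_b : std_logic_vector(7 downto 0);
   -- Procedure in process scope: reusable and self-documenting.
   procedure update_crc(crc : inout std_logic_vector(7 downto 0);
                        din : in    std_logic) is
   begin
      crc := crc(6 downto 0) & (crc(7) xor din);  -- toy shift/xor step
   end procedure;
begin
   if rising_edge(clk) then
      update_crc(crc_a, data_a);   -- same operation reused twice,
      update_crc(crc_b, data_b);   -- no duplicated logic
      crc_out <= crc_a xor crc_b;  -- registered output
   end if;
end process;
```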

Breaking a process into two just to have a combinatorial process to describe
the 'next' state and a separate process to clock that 'next' state into the
'current' state has no particular value when using design support and
maintenance cost as a metric of 'value'. Since the different methods
produce the final end logic there is no function or performance advantage to
either approach. On the other hand, there are definite drawbacks to
implementing combinatorial logic in a VHDL process. Two of these are
- Introduction of 'unintended' latches
- Missing signals in the sensitivity list that result in different
simulation versus synthesis results

Both of these drawbacks have manual methods that can be used to try to
minimize them from happening but the bottom line is that extra effort
(a.k.a. cost, or negative value) must be incurred to do this....all of which
is avoided by simply not using the VHDL process to implement combinatorial
logic (i.e. the 'next' state computation).

So as far as the VHDL language is concerned, there are real costs that will
be incurred every time the two process method is used but no real value
add....or at least that's my 2 cents.....

Well said. Thanks for the posting.

-- Mike Treseler
 
KJ

Breaking what can be accomplished in one process into two (or more)
I find the idea of describing both the "Q side" state
and the "D side" next state more confusing than clarifying.

I find the 'D/Q' side coding less useful too. I was referring more to
having multiple clocked processes, rather than the 'two process' discussion
that is going on here. Even though the multiple clocked processes I have in
mind can all be logically thought of as one process, I tend to break them up
simply to group together somewhat related functions. Somewhat like the way
a story/chapter is broken up into multiple paragraphs that each express a
particular thought.

KJ
 
Ian Bell

Davy said:
Hi all,

Is there some hardware RTL book like "Code Complete" by Steve
McConnell?

Unlikely. Creating software is far less complex and variable than creating
hardware.

Ian
 
KJ

But using lower level of abstraction (i.e., separating comb logic and
register) has its advantage. For example, if we want to add scan chain
(for testing) to a synchronous system after completing the design, we
can simply replace the regular D FFs with dual-input scan FFs in a
two-process code. This will be messy for one-process code.

Maybe I'm missing something but in the one process code you would add the
following...
if (Scan_Chain_Active = '1') then
   -- Assign the clocked signals here per how it's needed for scan chain
else
   -- Assign the clocked signals here per how the design needs it to function
   -- (i.e. this was what was here prior to adding scan)
end if;

I don't see that as any particular burden or how that would be any different
in the clocked process of the two process approach. It certainly isn't any
messier.
Another example is to use a metastability-hardened D FF for
synchronization.

Kinda lost me on what you mean or how the two process approach would
have any advantage over the one process approach.
They even go to the
extreme of suggesting that all memory elements should be separately
instantiated. This may be one reason that the "Reuse Methodology
Manual" I mentioned in an earlier message also recommends two-process
style.

And I think that is the crux of the debate between the 'one process' versus
'two process' camps, the 'two process' folks can't explicitly state any real
advantage to their approach even though they recognize real costs to doing
so....to be blunt, you're paying a price and getting nothing in return. In
this particular case, can you explain what benefit is received for
separately instantiating each memory element?

In reality, since either approach yields the exact same function and
performance, the debate centers solely on productivity....which method can
one be more productive with, and why? Productivity is economics, and
measuring productivity means using currency ($$, Euros, etc.) as your basis
for comparison. You can't really measure productivity without also bringing
up which tools are being used, either, since use of a 'better' tool may
involve being somewhat less efficient on one end with the overall goal that
the entire process is more efficient.

Comparing the two process approach to the one process approach, the basic
difference is that you have several more lines of code to write:
- Declarations for the additional set of signals (i.e. the 'D' and 'Q'
instead of just the 'Q')
- Adding the 'default' values for each of the 'D' equations so as to not
create unintended latches.
- The 'Q <= D;' code in the second (i.e. clocked) process of the two process
approach.
- Process sensitivity list maintenance (in VHDL)
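To make the line-count comparison concrete, here is the same trivial enabled counter written both ways (a hypothetical sketch; the names are invented):

```vhdl
-- One-process version: one signal (count_q), one process.
process(clk)
begin
   if rising_edge(clk) then
      if inc = '1' then
         count_q <= count_q + 1;
      end if;
   end if;
end process;

-- Two-process version of the same counter: an extra 'D' signal
-- (count_d), a default assignment to avoid a latch, a sensitivity
-- list to maintain, and the Q <= D process.
process(count_q, inc)       -- must be kept complete by hand
begin
   count_d <= count_q;      -- default, else a latch is inferred
   if inc = '1' then
      count_d <= count_q + 1;
   end if;
end process;

process(clk)
begin
   if rising_edge(clk) then
      count_q <= count_d;
   end if;
end process;
```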

Now take design creation and debug as an activity to measure cost. For
every line of code written, one can assume that there is a certain
probability that it will be wrong. For a given designer using a given
language and a given coding technique, one will likely tend to have some
particular error rate measured in errors per line of code. Since errors
need to be fixed at some point you incur some cost to doing so. If nothing
else, the two process approach encourages more lines of code and therefore a
higher probability of having an error than the one process approach so what
does the two process approach bring to the table that would lower the error
rate that might somehow offset the increased lines of code?

One can take other tasks like scan chain insertion as you did, or test
creation, life cycle support etc. whatever tasks you can think of and try to
compare costs of those tasks using the two approaches and go through the
same process to come up with a $$ measure for each task. Total up the tasks
and see which approach is better.

Now if someone in the two process camp can walk through a single task and
show how even that one task is more efficient (i.e. productive) than the one
process approach, you might be able to start convincing readers. Until
then....well, one can always dream.

KJ
 
KJ

The two-process style is infinitely more prone to latches.
Not if you assign default values to all signals in the beginning of the
process.
On the other hand, in one-process code every signal within the clocked
process infers an FF, regardless of whether you need it or not. A variable
within the clocked process may infer an FF (if used before assigned) or may
not (if assigned before used).

Translation: More work to accomplish the same thing with no benefit.
Missing a signal in sensitivity list for combinational circuit is a bad
practice and has nothing to do with one or two processes.

Again, more work is required (sensitivity list maintenance) to
accomplish the same thing with no benefit.
This is true. It is a small price for clarity.
Once again more work is required (the additional signal declarations
result in more lines of code) to accomplish the same thing with no
benefit.

As an aside, the signal declaration price can be minimized using the
approach outlined below. I don't really recommend it because it makes
debugging in a simulator (or perhaps just Modelsim) much tougher and
you still have more lines of code (translation: more chances for
error) than with the one process approach (using either variables or
concurrent signal assigns for the combinatorial outputs).

Instead of having discrete signals (i.e. A, B, etc.) and then doubling
that number to have a 'D' and a 'Q' version like this...

signal A_d, A_q: std_ulogic;
signal B_d, B_q: std_ulogic;

Instead have the following types....

type t_THE_SIGS is record                --***
   A : std_ulogic;
   B : std_ulogic;
end record;                              --***
type t_PRE_POST_FF is record             --***
   D : t_THE_SIGS;                       --***
   Q : t_THE_SIGS;                       --***
end record;                              --***
signal All_My_Sigs : t_PRE_POST_FF;      --***

Then the 'unclocked' process in the two process model calculates the
signal All_My_Sigs.D; the 'clocked' process is simply:
process(Clock)
begin
   if rising_edge(Clock) then
      All_My_Sigs.Q <= All_My_Sigs.D;    --***
   end if;
end process;

This makes the overhead related to handling double the number of
signals exactly 8 lines of code (the ones that I marked with
"--***"), regardless of how many signals we're talking about. 8 extra
is still 8 extra, not 8 less, so it's still more work. There are some
things that look nice about code written this way, but actually using
this approach has a few drawbacks that are killers:
- Can be cumbersome to debug (try it with your fav simulator and you'll
probably see why)
- It generally cannot be used at the top level of the design at all, so
there will need to be additional code to connect up the top level
ports.
- Even on internal entities it can be difficult to use because you'll
generally need to split things further into 'in', 'out' and 'inout' for
each 'D' and 'Q'. Certain special cases might work but in general it
will mean additional code just to connect up a higher level entity to a
lower level.

I believe the constant of 8 extra lines of code is about the best that
one could hope for, so I've minimized one aspect of the drawbacks only
to have it pop out as yet more code and cumbersome quirks. As a
general rule, I don't think having two sets of signals where only one
is actually needed is in any way a 'small price' to pay if this is code
written by a human. If it's machine-generated code and it confers some
advantage to the tool to generate code in that manner, then I have no
trouble with that.
The two-process style surely simulates slower. However, since the code
is in RTL level, not synthesized cell net list, it should be tolerable.
So after putting in the extra work....the simulation runs slower....but
probably not intolerably slower....not exactly a benefit here either.

KJ
 
Andy

Although scan chains can be inserted at the RTL source level, they are
usually best handled at the gate level, and are thus irrelevant to RTL
coding styles. Besides, as KJ has pointed out, the same structure that
infers the scan chain in a separate clocked process can be applied in a
combined process, with little or no impact to the functional code (i.e.
surround it with an if/then/else). Choice of types of registers
(metastable-hardened, etc.) can be controlled by constraints/attributes
anyway, no matter how you infer them.

Any verification guide written for Verilog will naturally have a bias
towards the flexibility of two-process descriptions, since Verilog
lacks a safe way to perform blocking assignments (handled admirably
with variables in VHDL), which restore the "lost" flexibility in single
process descriptions.

The reuse guide has its foundations in tool capabilities (or
limitations) that are more than 10 years old! Separate combinatorial
and clocked descriptions were the only method available in the first
synthesis tools, when registers could not be inferred from RTL anyway,
and had to be instantiated. Thus, the combinatorial logic had to be
separated out, and it was a smaller leap for the tools (and their
users) to inferring storage from simple, Q <= D clocked processes,
still separate from the combinatorial logic. Mainstream synthesis tools
have progressed far beyond that (OTOH, the last time I looked, Synopsys
still could not infer RAM from RTL arrays!)

I prefer not to limit my VHDL descriptions to the manner in which I
would write them in Verilog.

Andy
 
radarman

KJ said:
But in practice one doesn't much care if any outputs are 'Mealy' or 'Moore'.
What one has is a function that needs to be implemented within specific area
(or logic resource) constraints and performance (i.e. clock cycle, Tpd, Tsu,
Tco) constraints.

Breaking what can be accomplished in one process into two (or more)
logically equivalent processes should be considered for code clarity which
can aid in support and maintenance of the code during its lifetime as well
as for potential design reuse (although realistically re-use of any single
process is probably pretty low). Re-use happens more often at the
entity/architecture level, but the 'copy/paste/modify' type of re-use
probably happens more at the process level when it does happen.

Breaking a process into two just to have a combinatorial process to describe
the 'next' state and a separate process to clock that 'next' state into the
'current' state has no particular value when using design support and
maintenance cost as a metric of 'value'. Since the different methods
produce the final end logic there is no function or performance advantage to
either approach. On the other hand, there are definite drawbacks to
implementing combinatorial logic in a VHDL process. Two of these are
- Introduction of 'unintended' latches
- Missing signals in the sensitivity list that result in different
simulation versus synthesis results

Both of these drawbacks have manual methods that can be used to try to
minimize them from happening but the bottom line is that extra effort
(a.k.a. cost, or negative value) must be incurred to do this....all of which
is avoided by simply not using the VHDL process to implement combinatorial
logic (i.e. the 'next' state computation).

So as far as the VHDL language is concerned, there are real costs that will
be incurred every time the two process method is used but no real value
add....or at least that's my 2 cents.....

KJ

After reading the arguments here, I have started using a mixed
approach, but I still use the two process model for state machines and
other complex logic. There are just too many times when I don't want to
wait until the next clock for an output to take effect.

At work, we have a hard requirement (as in, it won't pass a peer
review) to write in the two process model. Another hard requirement
(that I agree with) is that there should only be one clocked process
per clock. The guy that came up with the requirement predates HDLs in
general - and I'm sure there was a good reason for it at one time.

However, for my home projects, I tend to mix the one and two-process
models based on what is most convenient.

Having done so, I don't see that the two-process model is that terribly
inconvenient. I simply place a default condition at the beginning of
the process, and override the default as needed. For most processes,
this adds maybe 1-10 "extra" lines.

Perhaps it's because I was taught in the two-process model, but I find
it easier to understand what is going on when I use it, so anything
that requires me to think, I use a separate combinatorial process for.
Simple logic, like counters, pipeline registers, etc. goes into the
appropriate clocked process.

For me, this mixed approach works pretty well.
 
KJ

radarman said:
After reading the arguments here, <snip> There are just too many times
when I don't want to wait until the next clock for an output to take effect.

Stop being so impatient ;)
Another hard requirement
(that I agree with) is that there should only be one clocked process
per clock. The guy that came up with the requirement predates HDLs in
general - and I'm sure there was a good reason for it at one time.
This struck me as kind of odd. I understand that if there is a hard
requirement to do it this way then you either will do it that way or
seek other employment. I also don't find it hard to believe that the
reason for the requirement may predate HDLs and that there was a good
reason for it at one time.

What I find odd though is that you agree with it. Based on your other
posts to this group I could see that you could 'accept' that this is
how you have to do it, but I guess I'm surprised that you would 'agree
with' something when you don't know the reasoning behind it...doesn't
seem like 'radarman' talking.

By the way, if you do find the 'good' reason for having physically only
one clocked process I'd be curious to hear what it is. Although I put
myself in the 'one process' camp my 'one process' tends to be several
physical processes all clocked by the same clock. From a logic
synthesis/simulation perspective those multiple processes are all
logically 'one' process but breaking them up into physically separate
processes I find makes it easier to understand and debug.

KJ
 
mcgrigor

Hi Andy et al.,

I am relatively new to VHDL coding and have used the two-process
approach for state machines for all that time. Where can I find a good
reference on the one-process approach or a good example of it? I'm
interested to learn more about it.

Thanks.
 
J o h n _ E a t o n (at) hp . com (no spaces)

Mike said:
Eric wrote:

A few hundred copies would fly off the shelves.

There's probably about 10,000 digital designers
in the US. Not all of those do hardware description
and not all of those write their own RTL.
Those are not numbers that would excite
a major publisher.

It can't justify the cost and delay of hard-copy but
could easily be sold as a pdf. Forget the idea of
publishing a single work. This should be a work
in progress that is always adding new chapters and
ideas and evolving with the industry.


Writing and editing a book is two long years
of work, whatever the subject.

-- Mike Treseler


Two long man years. Quadruple that if you have multiple
authors that need to coordinate their works and then
divide by the number of authors on the project.

Many hands make light work. Get a couple of dozen
experienced designers, a bunch of proofreading fact-checkers
and one decent editor, and you've got yourself
an open source book writing project.

Put it out on sourceforge for free and make it useful for any
digital designer at any stage in their career or hobby. Do
a really good job and it can become the "bible" of the
industry that everyone has in the library.


Do we have the critical mass to pull something like this off?


Let's do a survey. If we did a cooperative book on digital
design guidelines, then what chapters would you like to see
and/or contribute?


my list:


1) Reset systems and how they work.

2) Designing logic for scan testing (atpg)

3) Designing bist test logic

4) Designing and using jtag test logic

5) Designing and using in-chip diagnostic and debugging logic

John Eaton
 
Andy

With variables, you don't have to wait an extra clock in single process
descriptions.

Andy
 
Andy Glew

My impression is that hardware people don't like to write much, and
even if they do, they don't have time to sit down and document all of
the important "big issues" that new people need to learn in order to be
effective.

But if anyone writes a book like this it will fly off the shelves!

Care to estimate the size of the market?

I.e. how much would the author expect to make, given typical publishing contracts?

(I've long wanted to write such a book, but have trouble with the
business case - i.e. persuading my wife. And, of course, I cannot
write it as an employee of Intel.)
 
Mike Treseler

mcgrigor said:
I am relatively new to VHDL coding and have used the two-process
approach for state machines for all that time. Where can I find a good
reference on the one-process approach or a good example of it? I'm
interested to learn more about it.

See the procedure tx_state in the reference design source here:

http://home.comcast.net/~mike_treseler/

-- Mike Treseler
 
radarman

Another hard requirement
This struck me as kind of odd. I understand that if there is a hard
requirement to do it this way then you either will do it that way or
seek other employment. I also don't find it hard to believe that the
reason for the requirement may predate HDLs and that there was a good
reason for it at one time.

What I find odd though is that you agree with it. Based on your other
posts to this group I could see that you could 'accept' that this is
how you have to do it but I guess I'm surprised that you would 'agree
with' something that you don't know what the reasoning is...doesn't
seem like 'radarman' talking.

I've been bitten in the hindquarters by cross-clock domain problems
too many times. I now have one clocked process per clock. While I try
to avoid cross-clock asynchronous resets, occasionally (especially in
CPLDs) there is no other choice.

I find that by doing this, I catch clock domain crossing errors more
easily. For example, on one design I had a bus interface running on a
fixed clock (32MHz), internal logic on another fixed clock (40MHz), a
state machine that controlled a digital synthesizer (DDS 9858) that ran
on the DDS utility clock (40MHz, but asynchronous to the previous
clock), and a whole other section of code that ran on the clock
generated by the synthesizer (approximately 40MHz, but variable within
a band).

To summarize:
Local bus clock -> 32MHz.
Local bus I/F and control -> 40MHz (fixed)
DDS controller -> 40MHz (frequency locked, but not phase locked to the
previous)
"special logic" -> 38MHz < f < 42MHz from DDS.

Now, in most cases I was able to limit the crossings. There were a few
F40 <-> V40 crossings due to the nature of the design, but I tried to
restrict those to a handful of sentinel signals with strict timing, and
used rate-change FIFOs where timing wasn't critical. To simplify
things a bit, I created two copies of the bus interface - one for the
40MHz fixed, and another for the 40MHz variable.

However, in the actual DDS controller, I had all three clocks in play
at once. My design had to be able to frequency and phase lock the DDS
output frequency to the fixed 40 (which is quite doable, even if the
design to do so "looks" terrible). To make things more fun, it had to
issue the controls to the DDS aligned with the DDS utility clock - so
the state machine ran on that clock domain. Thankfully, the DDS utility
clock was reliably 90 degrees out of phase with the F40 domain, so I
could design a proper clock boundary crossing circuit.

The one time I fooled around with multiple clocked processes and
multiple clock domains, I ended up spending days (weeks?) debugging a
pernicious little problem that took hours of operation to fail. I
accidentally clocked something that was in the fixed 40MHz domain with
the variable 40MHz. It took a long time to debug that problem because
the per-operation error was so small. Unfortunately, even though the
error was only +/- a few nanoseconds worst case, it was cumulative, and
would cause the system to fail eventually.

While I would have been given leave to violate the design guidelines
with that module, given the complexity of the clock domain crossings, I
found it was better to follow them. Eventually, it dawned on me that a
certain register was in the wrong process. By putting all of the
registers for each domain in a single process, I was able to wrap my
head around the problem more easily.

As a result of that experience, I also started adding the clock domain
as a suffix to the signal name in complex designs. It leads to long
names, but it is worth the effort.
 
Mike Treseler

KJ wrote:

By the way, if you do find the 'good' reason for having physically only
one clocked process I'd be curious to hear what it is.

No signal declarations or direction
conflicts to worry about.
Output register variable values
are assigned directly to out ports.

-- Mike Treseler
 
Mike Treseler

Andy said:
Care to estimate the size of the market?

2000-4000 copies over 2 years.
I.e. how much would the author expect to make, given typical publishing contracts?

Maybe $4 per book.
(I've long wanted to write such a book, but have trouble with the
business case

There isn't one.
You would have to do it for love, not money.
And, of course, I cannot
write it as an employee of Intel.

I expect that would be negotiable.
RTL design is hardly proprietary.

-- Mike Treseler
 
Dean Kent

At least one. I'd buy it. ;-).

Suggestion: Do it like the guy who wrote "Thinking in Java" did. As he
finished a chapter, he posted it on his website for people to review. He
got tons of corrections and feedback, which he then used to update the
chapter. At the end of it all, he published a hardcopy and to his surprise
found out he sold more than expected - because everyone who had read it
online wanted a copy on his/her desk, and because it was online originally
it was highly anticipated when it was actually published.

Seems counter-intuitive, but I believe it would work for a tome such as
this... (and if you did it right, you could generate a few bucks before it
ever gets published, wink-wink, nudge-nudge, hint-hint,
know-what-I-mean--know-what-I-mean?).

Regards,
Dean
 
David Kanter

[snip]

I think what Dean means to say is that he knows an excellent, reputable
website that would put something online and attract a lot of good
readers : )

And if you do a good job, I bet you could persuade Intel to start
buying up copies in bulk for NCGs.

DK
 
