Dual data rate in Xilinx WebPACK 7.1

Rafal Pietrak

Hi All,

I have some experience in programming (software). Now, I fetched some
books and some sources from the internet, and for my week-end
entertainment, I try to get my first experience in HDL design.... and I
fail so miserably here. HELP! pls :)

Below you'll find a skeleton of my 'processor' design. What I've tried to
achieve here is 'two-stage' buffering: data is fetched on the falling clock
edge and output on the rising edge. In other words: two D flip-flops in
series, with their clock signals in opposite phases (the clock provided to
one FF is inverted before it drives the other FF).

But WebPACK says, it cannot synthesize the TMP signal!! Why??

Is such a simple circuit not implementable in an FPGA at all?? Well, I know
that some CPLDs *may* have just one clock polarity available, but
WebPACK fails at the synthesis stage here.

Any hints how can I achieve this sort of 'DoubleDataRate'?

--------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity master is
  Port ( data  : inout std_logic_vector(3 downto 0);
         addr  : out   std_logic_vector(7 downto 0);
         rd    : out   std_logic;
         wr    : out   std_logic;
         ale   : out   std_logic;
         clk   : in    std_logic;
         reset : in    std_logic);
end master;

architecture Behavioral of master is
  signal tmp: std_logic_vector(7 downto 0);
begin
  process(clk,reset)
  begin
    if reset = '0' then              -- asynchronous, active-low reset
      rd <= 'Z';
      wr <= 'Z';
      ale <= 'Z';
      tmp <= (others => '0');
    elsif rising_edge(clk) then      -- second stage: output
      addr <= tmp;
    elsif falling_edge(clk) then     -- first stage: capture
      tmp <= "0000" & data;
    end if;
  end process;

end Behavioral;
 
Brian Drummond

Hi All,

I have some experience in programming (software). Now, I fetched some
books and some sources from the internet, and for my week-end
entertainment, I try to get my first experience in HDL design.... and I
fail so miserably here. HELP! pls :)

Below you'll find a skeleton of my 'processor' design. What I've tried to
achieve here is 'two-stage' buffering: data is fetched on the falling clock
edge and output on the rising edge. In other words: two D flip-flops in
series, with their clock signals in opposite phases (the clock provided to
one FF is inverted before it drives the other FF).

But WebPACK says, it cannot synthesize the TMP signal!! Why??

Combining them both in the same process is the problem.
Keep each process down to a single clock edge; and find a way of
interconnecting them with signals and unclocked processes that
accomplishes what you wish.
process(clk,reset)
begin
  if reset = '0' then
    rd <= 'Z';
  elsif rising_edge(clk) then
    addr <= tmp;
  elsif falling_edge(clk) then
    tmp <= "0000" & data;
  end if;
end process;
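
For readers following along, the split suggested above might look like the
sketch below. It is only an illustration: the entity name two_stage is made
up, the ports are cut down from the original 'master' entity, and 'data' is
treated as a plain input rather than an inout.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, cut-down version of the 'master' entity: only the ports
-- needed to show the two-stage buffering are kept.
entity two_stage is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;                     -- active low, as in the thread
    data  : in  std_logic_vector(3 downto 0);
    addr  : out std_logic_vector(7 downto 0)
  );
end entity two_stage;

architecture rtl of two_stage is
  signal tmp : std_logic_vector(7 downto 0);
begin
  -- Stage 1: capture on the falling edge, asynchronous active-low reset.
  capture : process (clk, reset)
  begin
    if reset = '0' then
      tmp <= (others => '0');
    elsif falling_edge(clk) then
      tmp <= "0000" & data;
    end if;
  end process capture;

  -- Stage 2: launch on the rising edge (no reset on this register).
  launch : process (clk)
  begin
    if rising_edge(clk) then
      addr <= tmp;
    end if;
  end process launch;
end architecture rtl;
```

Each process now contains exactly one edge test, so each one maps onto an
ordinary flip-flop; the signal tmp is the wire between them.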

- Brian
 
Rafal Pietrak

Combining them both in the same process is the problem.
Keep each process down to a single clock edge; and find a way of
interconnecting them with signals and unclocked processes that
accomplishes what you wish.

How surprising. It works when process is split!!!

Still, I used the 'combined' construct in a single process because
earlier, when I split the 'reset' and 'clk' if-branches into separate
processes, the synthesizer built something *entirely* out of line;
but here I have to split the if-branch for a correct result .....
apparently the semantics of HDL are not what I'm acquainted with (from
work in software)... I have a lot to learn :(.

Thanks for the help.
 
Ralf Hildebrandt

Rafal Pietrak wrote:

How surprising. It works when process is split!!!

Not surprising. ;-)

Users mostly write to the same signal in both edge-triggered if-branches.
That is equivalent to a dual-edge flip-flop, or a flip-flop with two clock
inputs. Such things exist, but not in normal standard-cell libraries and
FPGAs. Therefore synthesizers refuse to even look at your code if it
contains more than one edge event.

Your example is very special, because you don't write to one signal in
both if-branches, so a very very smart synthesizer could handle this.
But I guess no one would spend time to write an exception for such a
special case.
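
As an illustration of the common case described above, here is a small
sketch in which one signal is written in both edge-triggered branches. The
entity name dual_edge_ff is made up for this example. It should simulate as
a flip-flop that updates on both clock edges, but a typical FPGA synthesis
tool would be expected to reject it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: the same signal 'q' is assigned on both edges,
-- which describes a dual-edge flip-flop (simulation only).
entity dual_edge_ff is
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity dual_edge_ff;

architecture sim_only of dual_edge_ff is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    elsif falling_edge(clk) then
      q <= d;                  -- second edge event on the same register
    end if;
  end process;
end architecture sim_only;
```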



Addendum: I said, that dual edge flipflops are not available in common
cell libraries, but dual-edge behavior is possible (but not
recommended). But this is another topic.

Ralf
 
Rafal Pietrak

Rafal said:
How surprising. It works when process is split!!!

Not surprising. ;-)
[..text deleted ...]
Addendum: I said, that dual edge flipflops are not available in common
cell libraries, but dual-edge behavior is possible (but not
recommended). But this is another topic.

Yes. I think I'm a bit closer to understanding it now.

My final comment would be: I think that I've misinterpreted what VHDL
'process' is. I thought of it as an 'abstract expression' of a 'circuit
functionality' (as opposed to 'entity', which would be: a hardware
component description/reference) ... surprisingly, synthesizer treats the
'process' as a single hardware box, too; Just written down in different
words (different HDL sentences). Natural consequence is, that synthesizer
*tries*to*map* my 'process' to existing FPGA 'component', instead of
*building* something, according to my 'process specs' .... from FPGA
resources, like couple of D-FF.

/**-------------------------------------------------**/
In other words: I tried to write an "HDL sentence" that describes circuit
with *two* D-FF connected in series... and I ended up with a sentence
describing a *single* D-FF but with dual-edge clock sensitivity. My
lesson here is: a single 'process' does NOT describe an interconnect with
two FFs.

That's what I find surprising - first read (ok, first browse :) of VHDL
textbook does not give that interpretation, really.
/**-------------------------------------------------**/

PS: I write this long response just for the record. From my previous
experience I know that a lot of 'novice mistakes' (like this one of
mine) are never made by gurus (sometimes they are not even imaginable
to them). I hope that recapping my misunderstanding may help some
future newbies.
 
Ben Jones

Rafal Pietrak said:
On Tue, 28 Feb 2006 16:59:12 +0100, Ralf Hildebrandt wrote:

In other words: I tried to write an "HDL sentence" that describes circuit
with *two* D-FF connected in series... and I ended up with a sentence
describing a *single* D-FF but with dual-edge clock sensitivity. My
lesson here is: a single 'process' does NOT describe an interconnect with
two FFs.

That's not entirely true though.

Your original process was (approximately):

process(clk,reset)
begin
  if reset = '0' then
    tmp <= (others => '0');
  elsif rising_edge(clk) then
    addr <= tmp;
  elsif falling_edge(clk) then
    tmp <= "0000" & data;
  end if;
end process;

You should be able to write this as follows and get the behaviour you want:

process(clk,reset)
begin
  if rising_edge(clk) then
    addr <= tmp;
  end if;

  if reset = '0' then
    tmp <= (others => '0');
  elsif falling_edge(clk) then
    tmp <= "0000" & data;
  end if;
end process;

This is a single process, describing what I think you mean by "an
interconnect with two FFs". It is not unusual for a VHDL process to
represent many, many interconnected registers. If one had to write a
separate process for every flip-flop in a design, it would get very verbose
indeed! It's just unusual to have both rising- and falling-edge registers in
the same process.

(Actually, there is a subtle difference between my version and your original
code. Consider what should (or rather shouldn't) happen to "addr" on a
rising clock edge when reset = '0'. What does this imply in terms of
hardware? Is that what you meant? Does it matter?)

The real problem, I think, is that the synthesis tool doesn't realise that
rising_edge and falling_edge cannot both be true at the same time, and
therefore reads too much into your if-elsif construct. Splitting the "if"
statement up, even within a single process, should get around that.

It looks like you're thinking about this the right way, though, which is
refreshing. Good luck!

-Ben-
 
Rafal Pietrak

You should be able to write this as follows and get the behaviour you want:

process(clk,reset)
begin
  if rising_edge(clk) then
    addr <= tmp;
  end if;

  if reset = '0' then
    tmp <= (others => '0');
  elsif falling_edge(clk) then
    tmp <= "0000" & data;
  end if;
end process;

OK.... Works!
The real problem, I think, is that the synthesis tool doesn't realise that
rising_edge and falling_edge cannot both be true at the same time, and
therefore reads too much into your if-elsif construct. Splitting the "if"

Probably, my "how surprising" is more applicable at this point :)

Thanks again. Still, I expect I'll be back - after some experimenting :)

-R
 
Klaus Falser

....

You should be able to write this as follows and get the behaviour you want:

process(clk,reset)
begin
  if rising_edge(clk) then
    addr <= tmp;
  end if;

  if reset = '0' then
    tmp <= (others => '0');
  elsif falling_edge(clk) then
    tmp <= "0000" & data;
  end if;
end process;

This is a single process, describing what I think you mean by "an
interconnect with two FFs". It is not unusual for a VHDL process to
represent many, many interconnected registers. If one had to write a
separate process for every flip-flop in a design, it would get very verbose
indeed! It's just unusual to have both rising- and falling-edge registers in
the same process.
....

-Ben-

Is it not better to split this up into 2 processes?
For simulation this is certainly OK, but does it fit the pattern that a
synthesis compiler needs to understand (according to the IEEE
standard)?

Klaus
 
Rafal Pietrak

Couldn't help it :) Gave it another thought.... HDL semantics (or may be
just WebPACK synthesizer implementation) look even stranger to me, now.

The point is, that following your advice: if I were *reading* the HDL
code, I'd imagine a 'priority decoder':

1) when reset_active --> do something...
2) when *it's*not*, but rising_edge(clk) occur --> do something else ...
3) when this isn't happening either, but there is a falling_edge() --> do
yet another thing.

Priority encoders do exist.

There is a strong chance that a person writing such code meant just
that (I did). Still, as rising/falling edges don't happen concurrently,
the priority encoder can possibly be identified by the optimizer as
redundant (which IMHO should be easier than identifying it at the
synthesizer stage) and purged before final output. But if not optimized
and purged, the circuitry, even if suboptimal, would be created correctly
nonetheless (i.e. according to the author's intentions).

So it looks like I have to be *very* *very* cautious when writing HDL.
 
Chris S

Hi Rafal,

With respect...

The problems of understanding that you're having are typical of what
happens when S/W engineers start designing hardware in HDLs.

HDLs don't work like software; you have to "think in hardware".

Fluency comes with experience. I'd be floundering in a similar way if I
were writing a large S/W project; that's why I'm a hardware designer.

For the record... I've done a few DDR and QDR memory interface designs.
My advice is to create independent clock phases for your capture and
launch FFs and mux the datapath using a core clock running at double
the datapath frequency. You don't mention absolute clock frequencies,
but creating a high-speed core clock and dividing it down to create
derived clocks for I/O and so on is straightforward unless the design
has to run at really high speed.

I don't know if you're doing an academic project or something for a
real application, but if it is for real, then I'd stick with a
conventional approach so that you can be sure that the Synthesis & STA
tools will work correctly.

Do you understand what RTL (Register Transfer Level) means? If not, I'd
suggest that you read up a bit on the RTL coding style, and things
will be easier to visualise.

Of course if you're dreaming up a new technique for a paper or
something then good luck to you and I'd be interested to hear about
whatever you come up with.

Sorry if you think this post is patronising; it's not intended to be,
just a bit of honest advice from an experienced engineer.

Chris
 
Ben Jones

Rafal Pietrak said:
Couldn't help it :) Gave it another thought.... HDL semantics (or may be
just WebPACK synthesizer implementation) look even stranger to me, now.

That is usually a sign of progress. :)
The point is, that following your advice: if I were *reading* the HDL
code, I'd imagine a 'priority decoder':
1) when reset_active --> do something...
2) when *it's*not*, but rising_edge(clk) occur --> do something else ...
3) when this isn't happening either, but there is a falling_edge() --> do
yet another thing.
Priority encoders do exist.

Granted. However, there's a world of difference between edges and levels
when it comes to synthesis. A priority encoder is a bit of combinatorial
logic that works as you describe above, but it works on levels, not edges:

if reset = '1' then
  output <= "00";
elsif heads = '1' then
  output <= "01";
elsif tails = '1' then
  output <= "10";
else
  output <= "11";
end if;

The difference is that combinatorial logic can be imagined as executing in
zero time, whereas rising_edge() and falling_edge() immediately imply some
notion of the passing of time. When a synthesis tool wants to build a piece
of combinatorial logic, it can basically use any combinatorial building
block that it wants. However, the synchronous building blocks in FPGAs (and
most digital systems) are rather more limited.

The task "transfer the value of signal A to signal B on the rising edge of
signal Clk" requires a register element - nothing else can do the job. If
you add to that "unless reset = '1', in which case assign '000' to signal
B", then a synchronous reset of that register is required. If instead you
prepend "if reset = '1' then B <= '000' else", then an asynchronous reset of
that register is required. If you go asking for
resets/sets/clears/clock-enables that aren't available, then the tool will
either try to emulate them (e.g. by adding AND gates) or will give up. The
same is true if you try to describe a transparent latch, or a double-edged
register, when such a thing isn't available in the technology.
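
To make the two reset flavours above concrete, here is a hedged sketch.
The entity name reset_styles and the port names b_sync and b_async are
invented for this illustration; A and B from the paragraph above become
3-bit vectors.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: the same register written with a synchronous
-- and with an asynchronous reset.
entity reset_styles is
  port (
    clk     : in  std_logic;
    reset   : in  std_logic;                     -- active high
    a       : in  std_logic_vector(2 downto 0);
    b_sync  : out std_logic_vector(2 downto 0);
    b_async : out std_logic_vector(2 downto 0)
  );
end entity reset_styles;

architecture rtl of reset_styles is
begin
  -- "... unless reset = '1'": the reset test sits inside the clocked
  -- branch, so it only takes effect on a rising edge (synchronous reset).
  sync_reg : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        b_sync <= "000";
      else
        b_sync <= a;
      end if;
    end if;
  end process sync_reg;

  -- "if reset = '1' then ... else": the reset test comes first and does
  -- not wait for the clock (asynchronous reset).
  async_reg : process (clk, reset)
  begin
    if reset = '1' then
      b_async <= "000";
    elsif rising_edge(clk) then
      b_async <= a;
    end if;
  end process async_reg;
end architecture rtl;
```

In simulation, b_async clears as soon as reset goes high, while b_sync
keeps its old value until the next rising edge of clk.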
So it looks like I have to be *very* *very* cautious when writing HDL.

Don't be too cautious - you'll never get any work done! Mostly, it comes
down to learning the necessary idioms of the language (in this case,
synthesizable VHDL). Once you've written, simulated and synthesized a few
circuits you'll wonder what you ever thought was hard about it... :)

Cheers,

-Ben-
 
Brian Drummond

My final comment would be: I think that I've misinterpreted what VHDL
'process' is. I thought of it as an 'abstract expression' of a 'circuit
functionality' (as opposed to 'entity', which would be: a hardware
component description/reference) ...
/**-------------------------------------------------**/
In other words: I tried to write an "HDL sentence" that describes circuit
with *two* D-FF connected in series... and I ended up with a sentence
describing a *single* D-FF but with dual-edge clock sensitivity. My
lesson here is: a single 'process' does NOT describe an interconnect with
two FFs.

That's what I find surprising - first read (ok, first browse :) of VHDL
textbook does not give that interpretation, really.

This is quite a good observation...

VHDL can describe those abstract expressions very well ... and a good
simulator like Modelsim will implement them correctly. Which is useful
for testbenches among other things - for example, you can express your
intent at the highest or cleanest level and verify its function.

But it's not so useful for actually implementing the design in hardware,
if the synthesis tool doesn't have a means of physically realising the
design. There, you need to transform the design into a lower level
description, which IS physically realisable.

The higher level description is still valuable - one use for it is to
run both implementations in parallel in a testbench, with a comparator
across their outputs to highlight any errors in the low level
implementation.
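
A rough sketch of that comparison idea, using the circuit from this thread.
All names here (buf2, behav, rtl, buf2_compare_tb) are invented for the
example. The stimulus releases reset before the first clock edge, because
the two descriptions are not meant to match while reset is active.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity buf2 is
  port (
    clk, reset : in  std_logic;                     -- reset is active low
    data       : in  std_logic_vector(3 downto 0);
    addr       : out std_logic_vector(7 downto 0)
  );
end entity buf2;

-- High-level reference: the if/elsif form from the start of the thread.
architecture behav of buf2 is
  signal tmp : std_logic_vector(7 downto 0);
begin
  process (clk, reset)
  begin
    if reset = '0' then
      tmp <= (others => '0');
    elsif rising_edge(clk) then
      addr <= tmp;
    elsif falling_edge(clk) then
      tmp <= "0000" & data;
    end if;
  end process;
end architecture behav;

-- Low-level version: one clock edge per process.
architecture rtl of buf2 is
  signal tmp : std_logic_vector(7 downto 0);
begin
  capture : process (clk, reset)
  begin
    if reset = '0' then
      tmp <= (others => '0');
    elsif falling_edge(clk) then
      tmp <= "0000" & data;
    end if;
  end process capture;

  launch : process (clk)
  begin
    if rising_edge(clk) then
      addr <= tmp;
    end if;
  end process launch;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity buf2_compare_tb is
end entity buf2_compare_tb;

architecture sim of buf2_compare_tb is
  signal clk   : std_logic := '0';
  signal reset : std_logic := '0';
  signal data  : std_logic_vector(3 downto 0) := "0000";
  signal addr_model, addr_rtl : std_logic_vector(7 downto 0);
begin
  model_i : entity work.buf2(behav)
    port map (clk => clk, reset => reset, data => data, addr => addr_model);
  dut_i : entity work.buf2(rtl)
    port map (clk => clk, reset => reset, data => data, addr => addr_rtl);

  stimulus : process
    type pattern_t is array (natural range <>) of std_logic_vector(3 downto 0);
    constant patterns : pattern_t := ("1010", "0101", "1111", "0011");
  begin
    wait for 5 ns;
    reset <= '1';                 -- release reset before any clock edge
    for i in patterns'range loop
      data <= patterns(i);
      wait for 5 ns;
      clk <= '1';                 -- rising edge: addr <= tmp
      wait for 5 ns;
      clk <= '0';                 -- falling edge: tmp <= data
      wait for 5 ns;
      -- Comparator: both descriptions must agree after every cycle.
      assert addr_model = addr_rtl
        report "behav and rtl outputs differ" severity failure;
    end loop;
    report "comparison finished";
    wait;
  end process stimulus;
end architecture sim;
```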

I have to agree that VHDL texts often highlight only one of these
aspects of VHDL - either the low-level aspect (so some people don't
realise VHDL has pretty good abstraction mechanisms) or the programming
language aspects - so others don't realise the compromises that have to
be made when targeting hardware...

I recommend a "VHDL synthesis guide" from your FPGA vendor of choice as
a starting point on the lower level aspects.

- Brian
 
Brian Drummond

Couldn't help it :) Gave it another thought.... HDL semantics (or may be
just WebPACK synthesizer implementation) look even stranger to me, now.

The point is, that following your advice: if I were *reading* the HDL
code, I'd imagine a 'priority decoder':

1) when reset_active --> do something...
2) when *it's*not*, but rising_edge(clk) occur --> do something else ...
3) when this isn't happening either, but there is a falling_edge() --> do
yet another thing.
[...]

So it looks like I have to be *very* *very* cautious when writing HDL.

Not *very* cautious.

The synthesiser will be pretty reliable at (a) delivering a reliable
hardware implementation of your design OR (b) informing you of something
it can't implement.

IMO it's not actually *wrong* to express your design at an abstract
level and in many ways it's actually a good thing - VHDL the _language_
can indeed "work like software". But be aware that VHDL the hardware
implementation tool often doesn't; it can either fail to implement the
design (which is good because it will report the error) or implement
something sub-optimal - far too large or far too slow - which is less
good.

Then you have to learn what to do differently.

The "RTL" approach is a good safe way of expressing a hardware design in
VHDL, and in some cases it is necessary. I'd second the suggestion you
learn it, as a useful tool.

- Brian
 
Rafal Pietrak

Not *very* cautious.

Well. From the two weeks of exposure to HDL that I currently have, I'd
stick with the *very* :)

One reason is that I don't have access to a simulator, so I try to build
*simple* circuits and then look at the RTL schematics I get from the
synthesizer - more often than not, I get interconnects I didn't expect
:( So at least for now, I have to be cautious.
The synthesiser will be pretty reliable at (a) delivering a reliable
hardware implementation of your design OR (b) informing you of something
it can't implement.

I think that my current problem is that I cannot reliably 'speak' to the
synthesizer (in VHDL). Although the synthesizer is probably very reliable,
I've only started to learn how to control it. You probably don't remember
what it was like when you started with HDL, but I assure you that at my
stage of learning, even an 'innocent' synthesizer warning makes you dig
into the RTL schematics looking for errors. And even in cases when I cannot
find errors, I worry whether the synthesis was correct. Just today I
created another thread with a call for help on one such case (RAM block).

So again, sticking with *very*, for now.
IMO it's not actually *wrong* to express your design at an abstract
level and in many ways it's actually a good thing - VHDL the _language_
can indeed "work like software". But be aware that VHDL the hardware
implementation tool often doesn't; it can either fail to implement the
design (which is good because it will report the error) or implement
something sub-optimal - far too large or far too slow - which is less
good.

Well, in that case, VHDL does not look too promising (at least from my
infant experience). If in real life (that is: in real designs) the
"abstract_description vs. real_hardware" 'duality' is an omnipresent
engineering task, I'd expect an HD-language to have at least minimal
provisions for conditional synthesis (much like font descriptions have
hinting for low-resolution rendering), but I haven't spotted any on my
first read of a VHDL textbook. Are there any?

-R
 
Mike Treseler

Rafal said:
My final comment would be: I think that I've misinterpreted what VHDL
'process' is. I thought of it as an 'abstract expression' of a 'circuit
functionality' (as opposed to 'entity', which would be: a hardware
component description/reference) ... surprisingly, synthesizer treats the
'process' as a single hardware box, too; Just written down in different
words (different HDL sentences). Natural consequence is, that synthesizer
*tries*to*map* my 'process' to existing FPGA 'component', instead of
*building* something, according to my 'process specs' .... from FPGA
resources, like couple of D-FF.

You will get lots of opinions on this subject,
but mine is that the average design entity
ought to have exactly *one* synchronous process
composed of an if statement separating
initialization, update, and output procedures.

See the reference design here for an example of this style.

http://home.comcast.net/~mike_treseler/

Only with this style can 'abstract expression' be used
effectively for synthesis, because this requires multiple
process variables with entity scope.
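
For readers without access to the reference design, a single-process
entity in roughly this style might look like the sketch below. It is not
taken from the linked page: the entity name one_process, the procedure
names init_regs, update_regs and update_ports, and the 4-bit counter are
all assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example of one synchronous process built from three
-- procedures: initialization, update and output.
entity one_process is
  port (
    clock : in  std_logic;
    reset : in  std_logic;                       -- active high (assumed)
    count : out std_logic_vector(3 downto 0)
  );
end entity one_process;

architecture synth of one_process is
begin
  main : process (reset, clock) is
    -- Register state is held in a process variable.
    variable count_v : unsigned(3 downto 0);

    procedure init_regs is
    begin
      count_v := (others => '0');
    end procedure init_regs;

    procedure update_regs is
    begin
      count_v := count_v + 1;
    end procedure update_regs;

    procedure update_ports is
    begin
      count <= std_logic_vector(count_v);
    end procedure update_ports;
  begin
    if reset = '1' then
      init_regs;
    elsif rising_edge(clock) then
      update_regs;
    end if;
    update_ports;
  end process main;
end architecture synth;
```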

I agree with you, that using multiple processes makes the same
block diagram as using multiple entities. In both cases
multiple instances are wired together. This fits the
schematic orientation that many designers prefer, but
outside the blocks, all is a netlist.

-- Mike Treseler
 
Allan Herriman

!! :)

Oh, that will help!

Any recommendations on which one to try first? ... based on assumption,
that I've made my first steps using Xilinx WebPACK and I'd really love to
get back from MS-Windows to Linux (WebPACK fails to install on Debian).

It might be worth staying with Windows if you are interested in the
low cost tools.

I can recommend Simili:
http://www.symphonyeda.com/editions.htm
You can run it in 'free' mode which doesn't require a license, but is
crippled.

The crippled versions of Modelsim also work (to a degree). These are
available from Xilinx and Altera as part of low cost packages.

Modelsim is also available in a native Linux version, but I don't
believe there is any way of getting that at less than full price (some
thousands of dollars for a time limited license).


If you're serious about designing with HDL, it's worth paying for a
simulator. You will use this tool more than any other (except for
your text editor).

Regards,
Allan
 
Mike Treseler

Allan said:
It might be worth staying with Windows if you are interested in the
low cost tools.

Simili is available for linux or windows here:
http://www.symphonyeda.com/proddownloads.htm

Either will run unlicensed in a slow, restricted mode,
but this is good enough to learn with, and it is free.
You can fully license either for $300 per year,
which is the cheapest usable linux vhdl simulator that I know of.
The crippled versions of Modelsim also work (to a degree). These are
available from Xilinx and Altera as part of low cost packages.

For a low-cost windows version, I would recommend the modelsim version
that comes with Quartus. For functional simulation, there is really
nothing Altera-specific about it.
Modelsim is also available in a native Linux version, but I don't
believe there is any way of getting that at less than full price (some
thousands of dollars for a time limited license).

That's the SE version. Many many thousands I'm afraid.
But worth the money for commercial use.
If you're serious about designing with HDL, it's worth paying for a
simulator. You will use this tool more than any other (except for
your text editor).

True. Trial and error synthesis is very tedious,
and I can't even remember how the logic analyzer
works anymore.

-- Mike Treseler
 
Andy

Mike,

I've seen your template, and although I prefer not to split everything
out into separate procedures for init, update, and output, I really
like the concept of a single process, even/especially for combo
outputs.

However, if there are storage elements that you do not want/need to
reset (like CLB RAMs that cannot be reset), leaving them out of the
"init" process/clause will result in a clock disable on reset (using
Synplicity at least) for those elements, which is seldom wanted either.
Synplicity warns you when this happens, but it cannot be avoided in
your template. Note that the clock disable is required to generate
hardware that behaves like the RTL in such cases, since if the reset is
active, the clocked clause will not execute.

I have found that putting the init code at the end of the process, in
its own if-then clause, allows you to specify only those storage
elements that need reset, without the clock disable on those elements
left unreset. For example:

process (reset, clock) is
....
begin
  if rising_edge(clock) then
    update; -- update internal variables (regs or combos)
  end if;
  if reset then
    init; -- reset desired flops
  end if;
  output; -- assign output signals
end process;

Note that you don't get a friendly reminder during synthesis that you
dropped a register from your init clause, but at least it works without
the clock disable.
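
A concrete, simplified sketch of the reset-at-the-end idea is shown below.
The entity name reset_last and its ports are made up for this example, and
reset is assumed to be an active-high std_logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: q_reset is reset, q_plain is deliberately left
-- out of the reset clause.
entity reset_last is
  port (
    clock   : in  std_logic;
    reset   : in  std_logic;                     -- active high (assumed)
    d       : in  std_logic_vector(7 downto 0);
    q_reset : out std_logic_vector(7 downto 0);
    q_plain : out std_logic_vector(7 downto 0)
  );
end entity reset_last;

architecture rtl of reset_last is
begin
  process (reset, clock) is
  begin
    if rising_edge(clock) then
      q_reset <= d;
      q_plain <= d;              -- still loads while reset is active
    end if;
    if reset = '1' then
      q_reset <= (others => '0');  -- last assignment wins during reset
    end if;
  end process;
end architecture rtl;
```

Because the clocked branch is not nested under the reset test, q_plain
keeps loading d on every rising edge, even while reset is high.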

Andy Jones
 
