Beginner question: What trigs processes

Discussion in 'VHDL' started by Jerker Hammarberg, Jul 17, 2003.

  1. (I'm sorry if this message appears several times - problems with newsgroups client)

    Hello all! I'm learning VHDL but despite my thorough search through several
    books I can't find the answer to the following pretty basic question: What
    exactly causes a process (with a sensitivity list) to run, and can it be run
    several times in the same point in time?

    Consider the following example:
    (Multiplier has two 16-bit inputs a and b and one 32 bit output p)

    architecture RTL of Multiplier is
      signal multin1: integer range -32768 to 32767;
      signal multin2: integer range -32768 to 32767;
      signal multout: integer;
    begin
      multout <= multin1 * multin2;
      process (a, b, multout)
        variable c: integer := 0;
        variable d: integer := 0;
      begin
        multin1 <= a;
        multin2 <= b;
        p <= multout;
        c := c + 1;
        d := d + p;
      end process;
    end;

    (I know it's stupid but let's use it for the purpose of explanation.) Let's
    say that new data arrive to a and b at a certain point in time. Will the
    process run once or twice? What will c and d end up being? My guess is c = 2
    and d = undefined, but then I wonder if this still holds in the synthesized
    system?

    /Jerker
     
    Jerker Hammarberg, Jul 17, 2003
    #1

  2. > Hello all! I'm learning VHDL but despite my thorough search through
    > several books I can't find the answer to the following pretty basic
    > question: What exactly causes a process (with a sensitivity list) to run,
    > and can it be run several times in the same point in time?


    Ooooh, lemme try to answer this one. :)
    There are two notions of time in VHDL: normal time (in seconds, nanoseconds,
    etc.) and delta time. Delta cycles are basically sequential execution steps
    at the same point in normal time. The following happens:

    LOOP
      LOOP while there is activity on any of the signals
        all processes are checked for activity on any of the signals in
        their sensitivity lists, and are executed accordingly
        increase time t to t + 1 delta
      END LOOP
      time t advances to the next point in normal time at which something
      is scheduled (a nanosecond later, a picosecond later, whatever)
    END LOOP

    In your example, the process will run once or twice, depending on when a and
    b change. Example:

    a <= '0', '1' AFTER 2 ns;
    b <= '1', '0' AFTER 2 ns;

    In this case, the process will run once, since both a and b change at
    time = 2 ns.

    a <= '0', '1' AFTER 2 ns;
    b <= NOT( a );

    In this case, the process will run twice, since a changes at time = 2 ns,
    and b changes at time = 2 ns + 1 delta.
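
    The second case can be checked in a simulator with a small self-checking
    testbench. This is only a sketch: the entity name delta_demo, the signal
    runs and the process labels are made up for illustration and do not come
    from the thread. The process counts its own activations, and the checker
    compares the count just before and just after 2 ns.

```vhdl
-- Hypothetical testbench; all names are illustrative assumptions.
entity delta_demo is
end entity delta_demo;

architecture sim of delta_demo is
  signal a    : bit := '0';
  signal b    : bit := '1';
  signal runs : natural := 0;  -- number of activations of proc
begin
  a <= '0', '1' after 2 ns;    -- a changes at 2 ns
  b <= not a;                  -- b follows one delta cycle later

  -- Sensitive to a and b only, so it wakes up once per event on either.
  proc : process (a, b)
  begin
    runs <= runs + 1;
  end process proc;

  check : process
    variable n0 : natural;
  begin
    wait for 1 ns;
    n0 := runs;                -- activations so far (initial run only)
    wait for 2 ns;             -- now at 3 ns, past the change at 2 ns
    assert runs - n0 = 2
      report "expected two activations at 2 ns" severity failure;
    report "proc was activated twice at 2 ns";
    wait;
  end process check;
end architecture sim;
```

    Replacing the assignment to b with "b <= '1', '0' after 2 ns;" should
    bring the difference down to one activation, matching the first case.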

    Hope this helps. :)

    Regards,

    Pieter Hulshoff
     
    Pieter Hulshoff, Jul 17, 2003
    #2

  3. Oh wait... let me answer my own question. Since multin1 and multin2 don't
    change at delta 2, there is no activity that can trigger the multiplication in
    the next delta and therefore it will stop.

    So I guess I'm back at my original question: Is it true that the process
    will run twice (although a and b arrive at the same simulation time)? And
    if so, how will the synthesis tool implement this - will it place two copies
    of the logic of the process on the chip?

    /Jerker
     
    Jerker Hammarberg, Jul 17, 2003
    #3
  4. > Oh wait... let me answer my own question. Since multin1 and multin2 don't
    > change at delta 2, there is no activity that can trigger the multiplication in
    > the next delta and therefore it will stop.


    That is correct.

    > So I guess I'm back at my original question: Is it true that the process
    > will run twice (although a and b arrives at the same simulation time)?


    Yes, it does. I had not taken the multout process into account in the
    example I gave you.
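
    A simulation sketch of the original example makes the two activations
    visible. Assumptions, none of which are in the thread: a, b and p are
    modelled as integer signals, every signal is initialised to 0 to keep
    the start-up phase quiet, and the counter variable c is replaced by a
    signal called runs so that a second process can read it.

```vhdl
-- Hypothetical testbench; names and initial values are assumptions.
entity mult_runs is
end entity mult_runs;

architecture sim of mult_runs is
  signal a, b    : integer := 0;
  signal multin1 : integer := 0;
  signal multin2 : integer := 0;
  signal multout : integer := 0;
  signal p       : integer := 0;
  signal runs    : natural := 0;  -- plays the role of variable c
begin
  multout <= multin1 * multin2;

  proc : process (a, b, multout)
  begin
    multin1 <= a;
    multin2 <= b;
    p       <= multout;
    runs    <= runs + 1;
  end process proc;

  stim : process
    variable n0 : natural;
  begin
    wait for 5 ns;
    n0 := runs;
    a <= 3;                       -- a and b change at the same time,
    b <= 4;                       -- in the same delta cycle
    wait for 5 ns;
    -- First activation: events on a and b.
    -- Second activation: event on multout two deltas later.
    assert runs - n0 = 2
      report "expected two activations" severity failure;
    assert p = 12
      report "expected p = 3 * 4" severity failure;
    report "process ran twice, p = 12";
    wait;
  end process stim;
end architecture sim;
```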

    > And if so, how will the synthesis tool implement this - will it place two
    > copies of the logic of the process on the chip?


    Now we're in a different ballpark: how does synthesis handle this?

    First of all, as c and d are never read, they will be ignored, and their
    logic removed. Your process is little more than signal renaming, and
    will most likely be optimized away as well, though the signals might show
    up as wires in the netlist. Your multout assignment is also combinatorial,
    so it will be implemented as a straight multiplier. Basically what you'll
    end up with is a combinatorial (no clock) multiplier, probably just like
    you intended. :)
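
    In other words, the synthesized result should be roughly equivalent to
    the following sketch. The port types are an assumption (the thread only
    says two 16-bit inputs and one 32-bit output); signed from
    ieee.numeric_std is used here for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Assumed port types; the original entity declaration is not shown
-- in the thread.
entity multiplier is
  port (
    a, b : in  signed(15 downto 0);
    p    : out signed(31 downto 0)
  );
end entity multiplier;

architecture rtl of multiplier is
begin
  -- Purely combinatorial: numeric_std "*" on two 16-bit operands
  -- returns a 32-bit result, so no resizing is needed.
  p <= a * b;
end architecture rtl;
```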

    Regards,

    Pieter Hulshoff
     
    Pieter Hulshoff, Jul 18, 2003
    #4
  5. > What if c and d WERE read? Let's place a "c_out <= c;" statement at the
    > end of the process, where c_out is a 32 bit output port. Now, will the
    > implemented system have two copies of the logic for c inside? Or maybe
    > it's not even synthesizable at all? If so, is there a "synthesizability
    > rule" so that I can predict this?


    Well, I can't really think of a proper application for this, but I doubt it
    would synthesize. If you can give me a proper application for something
    like this though (describe the behaviour of what you'd want to build) I
    should be able to give you proper code for it. :)

    I don't know of a general rule of thumb for what is synthesizable, although
    there are many rules about things that don't synthesize. Pure combinatorial
    logic that needs to hold a value usually leads to interesting results
    though. :)
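
    A typical case of "combinatorial logic that needs to hold a value" is an
    if statement without an else branch in an unclocked process. The sketch
    below contrasts it with the clocked equivalent; the entity hold_demo and
    its ports are invented for illustration, and the exact outcome depends
    on the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are assumptions.
entity hold_demo is
  port (
    clk, en, d : in  std_logic;
    q_latch    : out std_logic;
    q_ff       : out std_logic
  );
end entity hold_demo;

architecture rtl of hold_demo is
begin
  -- No else branch: q_latch has to keep its value while en = '0',
  -- so a synthesis tool will typically infer a level-sensitive latch.
  comb : process (en, d)
  begin
    if en = '1' then
      q_latch <= d;
    end if;
  end process comb;

  -- Same hold behaviour inside a clocked process: this is normally
  -- implemented as a flip-flop with an enable.
  reg : process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        q_ff <= d;
      end if;
    end if;
  end process reg;
end architecture rtl;
```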

    Regards,

    Pieter Hulshoff
     
    Pieter Hulshoff, Jul 18, 2003
    #5
  6. Jerker Hammarberg wrote:

    > If so, is there a "synthesizability rule" so that I can predict this?



    A minimal synthesis is an entity port assigned to a constant:
    my_port_pin <= '1';

    Input assignments and any subsequent processing that affect
    no output pins on the device synthesize to nothing.


    I agree with Pieter that "What will these equations make?"
    is the wrong question.


    -- Mike Treseler
     
    Mike Treseler, Jul 18, 2003
    #6
  7. > Well, I can't really think of a proper application for this, but I doubt
    > it would synthesize. If you can give me a proper application for something
    > like this though (describe the behaviour of what you'd want to build) I
    > should be able to give you proper code for it. :)


    You're damn right, the example doesn't really make sense! Well basically
    what I'm actually trying to do is to implement a complicated mathematical
    function containing several multiplications, additions etc. To save chip
    space, I use only one multiplier and a state machine to control the access
    to it. So in the first state, the FPGA will sample the input variable, and
    some clock cycles later it will have reached the last state and will output
    the result. Then it starts anew. There's also feedback involved, so some
    partial results during the calculation will be used for the next round.

    I won't state the whole function here, but I can give a minimal example that
    should make sense: Let's say three 16-bit integers a, b and c are repeatedly
    sampled, multiplied and accumulated to a 32 bit accumulator o as follows:

    o = (last value of o) + a * b * c

    Since I only want to use one multiplier, the calculation will be done over
    two clock cycles. Then I would like to write as follows:

    architecture RTL of Function is
      signal multin1: integer range -32768 to 32767;
      signal multin2: integer range -32768 to 32767;
      signal multout: integer;
      signal state: bit := '0';
      signal next_state: bit;
    begin
      multout <= multin1 * multin2;
      process (a, b, multout, state)
        variable c_saved: integer range -32768 to 32767;
        variable o_accum: integer := 0;
        variable axb: integer;
      begin
        case state is
          when '0' =>
            c_saved := c;
            multin1 <= a;
            multin2 <= b;
            axb := multout;
            next_state <= '1';
          when '1' =>
            multin1 <= axb;
            multin2 <= c_saved;
            o_accum := o_accum + multout;
            o <= o_accum;
            next_state <= '0';
        end case;
      end process;
      process (clk)
      begin
        if clk'event and clk = '1' then
          state <= next_state;
        end if;
      end process;
    end;

    But I understand now that I can't do it like this, because when state '1' is
    clocked in, the process will be run twice, thus adding first the old, then
    the new value of multout to o_accum. Just out of curiosity I would like to
    know if this actually happens in the implemented system too?

    And maybe I'm asking too much now but... I would be really grateful to see
    the best way to rewrite this so that it works correctly!

    /Jerker
     
    Jerker Hammarberg, Jul 18, 2003
    #7
  8. I can't even begin to think of what your code would synthesize into, if it
    would synthesize at all, which I highly doubt. :)

    Ok, let's see here:

    First step: any value you need to store for a 2nd run needs to be in a
    clocked process.

    Second step: you need an indication of when your two step process starts.
    Initial values don't synthesize, so you need some kind of enable to start
    your process and/or resynchronize it. I'll assume a, b, and c are available
    for both clock cycles, and that their FlipFlops don't have any logic
    between them and this design unit.

    Third step: avoid combinatorial loops like the plague! Don't have a signal
    in a combinatorial process loop back to itself. It's deadly.
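
    As an illustration of the first and third steps, here is a sketch of an
    accumulator whose feedback path goes through a register. The entity
    accum, its ports and the value ranges are hypothetical and chosen only
    to keep the example small.

```vhdl
-- Hypothetical example entity; names and ranges are assumptions.
entity accum is
  port (
    clk, reset : in     bit;
    x          : in     integer range 0 to 255;
    acc        : buffer integer range 0 to 65535
  );
end entity accum;

architecture rtl of accum is
begin
  -- To avoid: a concurrent "acc <= acc + x;" would feed acc straight
  -- back into itself without a register (a combinatorial loop).

  -- Registered feedback: acc is only updated on the rising clock edge.
  reg : process (clk)
  begin
    if clk'event and clk = '1' then
      if reset = '1' then
        acc <= 0;
      else
        acc <= (acc + x) mod 65536;  -- wrap around to stay in range
      end if;
    end if;
  end process reg;
end architecture rtl;
```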

    As I like to integrate combinatorial and clocked logic, I'd build your
    function like this (using the integers, though I personally prefer using
    signed and unsigned). I also like using port type BUFFER, as this is common
    practice in our company. I'll even add a synchronous reset for you:

    LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;

    -- Named "func" here because "function" is a reserved word in VHDL.
    ENTITY func IS
      PORT
      (
        clk    : IN std_logic;
        enable : IN std_logic;
        reset  : IN std_logic;
        a      : IN integer RANGE -32768 TO 32767;
        b      : IN integer RANGE -32768 TO 32767;
        c      : IN integer RANGE -32768 TO 32767;
        o      : BUFFER integer
      );
    END ENTITY func;

    ARCHITECTURE rtl OF func IS
      SIGNAL state : std_logic;
      SIGNAL axb   : integer;
    BEGIN
      PROCESS
      BEGIN
        WAIT UNTIL clk = '1';
        IF enable = '1' OR state = '0' THEN
          -- First cycle: multiply a and b
          state <= '1';
          axb <= a*b;
        ELSE
          -- Second cycle: multiply by c and accumulate
          state <= '0';
          o <= o + axb * c;
        END IF;
        -- Synchronous reset overrides the assignments above
        IF reset = '1' THEN
          state <= '0';
          o <= 0;
        END IF;
      END PROCESS;
    END ARCHITECTURE rtl;

    Hope this helps.

    Regards,

    Pieter Hulshoff
     
    Pieter Hulshoff, Jul 18, 2003
    #8
  9. You should define c_saved, axb and o_accum as DFFs (in a clocked process)
    and not as latches (in a combinatorial process).

    FE

    FE, Jul 19, 2003
    #9
  10. Hi Jerker!


    > o = (last value of o) + a * b * c
    >
    > Since I only want to use one multiplier, the calculation will be done over
    > two clock cycles. Then I would like to write as follows:
    >
    > architecture RTL of Function is
    > signal multin1: integer range -32768 to 32767;
    > signal multin2: integer range -32768 to 32767;
    > signal multout: integer;
    > signal state: bit := '0';
    > signal next_state: bit;
    > begin
    > multout <= multin1 * multin2;
    > process (a, b, multout, state)
    > variable c_saved: integer range -32768 to 32767;
    > variable o_accum: integer := 0;
    > variable axb: integer;
    > begin
    > case state is
    > when '0' =>
    > c_saved := c;
    > multin1 <= a;
    > multin2 <= b;
    > axb := multout;
    > next_state <= '1';
    > when '1' =>
    > multin1 <= axb;
    > multin2 <= c_saved;
    > o_accum := o_accum + multout;
    > o <= o_accum;
    > next_state <= '0';
    > end case;
    > end process;
    > process (clk)
    > begin
    > if clk'event and clk = '1' then
    > state <= next_state;
    > end if;
    > end process;
    > end;
    >
    > But I understand now that I can't do it like this, because when state '1' is
    > clocked in, the process will be run twice, thus adding first the old, then
    > the new value of multout to o_accum. Just out of curiosity I would like to
    > know if this actually happens in the implemented system too?


    Yes, it will. I have written such code several times because of "idiotic
    typing errors". The simulator runs into an "infinite loop". Synthesis will
    probably not complain, because it's not the task of synthesis to detect
    such loops. (Some synthesis tools will warn you: "timing loop detected".)

    I would make a copy of signal o_accum, before the accumulation is done.

    Your (latch based) code may lead to hazards, because if you make a copy
    of o_accum in state 0, and the state machine switches to state 1, the
    enable-signal for the latch, that contains the copy of o_accum may not
    be disabled.
    Solution: Merge the "state-change"-process with the
    "while-state-is"-process. Then all registers become flipflops:

    if clk'event and clk = '1' then
      case state is
        when '0' =>
          c_saved := c;
          multin1 <= a;
          multin2 <= b;
          axb := multout;
          state <= '1';
        when '1' =>
          multin1 <= axb;
          multin2 <= c_saved;
          o_accum := o_accum + multout;
          o <= o_accum;
          state <= '0';
      end case;
    end if;


    ... well I hope this is correct. I have no VHDL-compiler at this PC to
    check it.

    I would recommend the flip-flop-based solution, but it should be possible
    to optimize it to use a mixed latch- and FF-based solution or even a
    pure latch-based solution. Because optimizing it is not your question,
    the FF-based solution should be OK. ;-)

    Ralf
     
    Ralf Hildebrandt, Jul 19, 2003
    #10
  11. Thank you all for your suggestions! You have taught me to put everything in
    the clocked process to avoid latches. However, none of the designs that you
    suggested seems to produce what I wanted. First Pieter's design:

    WAIT UNTIL clk = '1';
    IF enable = '1' OR state = '0' THEN
      state <= '1';
      axb <= a * b;
    ELSE
      state <= '0';
      o <= o + axb * c;
    END IF;
    IF reset = '1' THEN
      state <= '0';
      o <= 0;
    END IF;

    It's very elegant, and I had no idea that one could accumulate to signals
    like that. But the whole point with going through the multin-multout thing
    was to share the multiplier to save FPGA area. Maybe a good optimizer would
    find that the two multipliers aren't used at the same time and apply
    resource sharing automatically, but mine (Xilinx) doesn't, so I get two
    multipliers here.

    Here is Ralf's suggestion:

    if clk'event and clk = '1' then
      case state is
        when '0' =>
          c_saved := c;
          multin1 <= a;
          multin2 <= b;
          axb := multout;
          state <= '1';
        when '1' =>
          multin1 <= axb;
          multin2 <= c_saved;
          o_accum := o_accum + multout;
          o <= o_accum;
          state <= '0';
      end case;
    end if;

    If this works, then I'm confused again about the "What trigs processes"
    question, because as far as I understand, it takes two deltas before a * b
    actually reaches multout (in state '0'). Since the process will only execute
    once (at delta 0), axb will never be assigned this value. The same goes for
    o_accum.

    So it seems I'm still stuck... If I want to share a multiplier, I guess I
    HAVE to send the factors out of the process, and then the process HAS to
    execute twice in order to take care of the result in the same clock cycle.
    Or am I wrong here?

    I'm sorry to keep bugging you!

    /Jerker
     
    Jerker Hammarberg, Jul 20, 2003
    #11
  12. > WAIT UNTIL clk = '1';
    > IF enable = '1' OR state = '0' THEN
    > state <= '1';
    > axb <= a * b;
    > ELSE
    > state <= '0';
    > o <= o + axb * c;
    > END IF;
    > IF reset = '1' THEN
    > state <= '0';
    > o <= 0;
    > END IF;


    > Maybe a good optimizer would find that the two multipliers aren't used at
    > the same time and apply resource sharing automatically, but mine (Xilinx)
    > doesn't, so I get two multipliers here.


    Aah, the beauty of compiler limitations. :) Ok, let's try it again then
    shall we?

    -- Entity named "func" here, since "function" is a reserved word in VHDL.
    ARCHITECTURE rtl OF func IS
      SIGNAL multin1 : integer range -32768 to 32767;
      SIGNAL multin2 : integer range -32768 to 32767;
      SIGNAL multout : integer;
      SIGNAL axb     : integer;
      SIGNAL state   : std_logic;
    BEGIN

      -- The single shared multiplier
      multout <= multin1 * multin2;

      -- Combinatorial mux selecting the multiplier inputs per state
      multin_cmb: PROCESS( a, b, c, axb, state, enable )
      BEGIN
        IF enable = '1' OR state = '0' THEN
          multin1 <= a;
          multin2 <= b;
        ELSE
          multin1 <= c;
          multin2 <= axb;
        END IF;
      END PROCESS multin_cmb;

      -- Registers: state, intermediate product and accumulator
      mult_reg: PROCESS
      BEGIN
        WAIT UNTIL clk = '1';
        IF enable = '1' OR state = '0' THEN
          state <= '1';
          axb <= multout;
        ELSE
          state <= '0';
          o <= o + multout;
        END IF;
        IF reset = '1' THEN
          state <= '0';
          axb <= 0;
          o <= 0;
        END IF;
      END PROCESS mult_reg;

    END ARCHITECTURE rtl;

    Hope this works better.

    Regards,

    Pieter Hulshoff
     
    Pieter Hulshoff, Jul 20, 2003
    #12
  13. >> If this works, then I'm confused again about the "What trigs processes"
    >> question, because as far as I understand, it takes two deltas before a *
    >> b actually reaches multout (in state '0'). Since the process will only
    >> execute once (at delta 0), axb will never be assigned this value. The
    >> same goes for o_accum.
    >
    > The process is triggered every time a signal in the sensitivity list
    > changes, but the if-clause is executed only at rising_edge(clk).
    > Therefore it is not necessary to have all input signals in the
    > sensitivity list. Only clk is needed.


    I'm sorry Ralf, but Jerker is correct. In your process:

    if clk'event and clk = '1' then
      case state is
        when '0' =>
          c_saved := c;
          multin1 <= a;
          multin2 <= b;
          axb := multout;

    multin1 and multin2 get their values 1 delta after the rising clock edge.
    multout gets its value 1 delta after that. This means that axb will not get
    the correct value.
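
    The underlying rule (a signal assigned inside a process keeps its old
    value until the process suspends, while a variable changes immediately)
    can be checked with a small simulation. This is a sketch only; the
    entity sig_var_demo and all signal names are made up for illustration.

```vhdl
-- Hypothetical testbench; all names are illustrative assumptions.
entity sig_var_demo is
end entity sig_var_demo;

architecture sim of sig_var_demo is
  signal clk      : bit := '0';
  signal s        : integer := 0;
  signal from_sig : integer := 0;
  signal from_var : integer := 0;
begin
  clk <= '1' after 10 ns;      -- a single rising edge at 10 ns

  proc : process (clk)
    variable v : integer := 0;
  begin
    if clk'event and clk = '1' then
      s        <= 7;           -- scheduled, visible one delta later
      v        := 7;           -- visible immediately
      from_sig <= s;           -- still sees the old value of s
      from_var <= v;           -- sees the new value of v
    end if;
  end process proc;

  check : process
  begin
    wait for 15 ns;
    assert from_sig = 0
      report "from_sig should hold the old value of s" severity failure;
    assert from_var = 7
      report "from_var should hold the new value of v" severity failure;
    assert s = 7
      report "s should have been updated after the edge" severity failure;
    report "signal read back old value, variable read back new value";
    wait;
  end process check;
end architecture sim;
```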

    Regards,

    Pieter Hulshoff
     
    Pieter Hulshoff, Jul 21, 2003
    #13
  14. Thank you Pieter! This one really seems to do it, and everything is clear to
    me now about processes, deltas, latches vs flip-flops and so on - at least
    for now...

    But while we're at it, I think I managed to put together an even simpler
    version. Would you review it for me? It should work too, right? (For the
    sake of simplicity, I removed the reset and enable signals, although I
    understand that at least one of them should be there.)

    architecture RTL of Experiment is
      signal state: std_logic := '0';
    begin
      process
        variable multin1: integer;
        variable multin2: integer;
        variable multout: integer;
      begin
        wait until clk = '1';
        case state is
          when '0' =>
            multin1 := a;
            multin2 := b;
          when '1' =>
            multin1 := multout;
            multin2 := c;
        end case;
        multout := multin1 * multin2;
        case state is
          when '0' =>
            state <= '1';
          when '1' =>
            o <= o + multout;
            state <= '0';
        end case;
      end process;
    end;

    /Jerker
     
    Jerker Hammarberg, Jul 21, 2003
    #14
  15. > But while we're at it, I think I managed to put together an even simpler
    > version. Would you review it for me? It should work too, right? (For the
    > sake of simplicity, I removed the reset and enable signals, although I
    > understand that at least one of them should be there.)


    This one would work fine in my opinion. For personal reasons I just try to
    avoid using variables. They tend to lead to more timing issues (unless you
    know what you're doing), and it's a pain to find them in a netlist. As said
    though: this is just my personal preference. I know plenty of colleagues
    who use them all over the place.

    Regards,

    Pieter Hulshoff
     
    Pieter Hulshoff, Jul 22, 2003
    #15
  16. Jerker Hammarberg wrote:

    > But while we're at it, I think I managed to put together an even simpler
    > version. Would you review it for me? It should work too, right?


    Consider writing a testbench, to prove it.
    I've enjoyed this thread, so I'll get you started.
    Let's add an entity to your process, to make it testable.
    --------------------------------------------
    library ieee;
    use ieee.std_logic_1164.all;

    entity mult is
      port (a, b, c  : in  integer;
            o        : out integer;
            clk, rst : in  std_ulogic);
    end mult;

    architecture synth of mult is

    begin
      this : process (clk, rst) is
        variable step_1  : boolean;
        variable multin1 : integer;
        variable multin2 : integer;
        variable multout : integer;
      begin
        clked : if rst = '1' then
          o      <= 0;
          step_1 := true;
        elsif rising_edge(clk) then
          op : case step_1 is
            when true =>
              multin1 := a;
              multin2 := b;
            when false =>
              multin1 := multout;
              multin2 := c;
          end case op;
          multout := multin1 * multin2;
          step_1  := not step_1;
        end if clked;
      end process this;
    end synth;
    ------------------------------------------------

    Note that this step clarifies what is local
    to the process and what is i/o.

    State variables are normally enumeration types,
    but a boolean will do fine here.
    Using std_logic for state variables would force
    us to consider possible states such as 'H' and 'Z'.

    I know I am in the minority in this group,
    but I would encourage the appropriate use
    of variables. The upside for me is that
    the code is easier to sim in my head, and
    therefore, much more likely to work the first
    time.

    -- Mike Treseler
     
    Mike Treseler, Jul 22, 2003
    #16
  17. Mike Treseler wrote:

    > multout := multin1 * multin2;


    -- make that:

    multout := multin1 * multin2;
    o <= multout;
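
    A possible testbench for this entity, as a sketch only. The design unit
    is repeated (with the correction above applied, and the case statement
    written as an if) so that the file compiles on its own. The testbench
    entity mult_tb, the stimulus values 3, 4 and 5, and the clock period are
    assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mult is
  port (a, b, c  : in  integer;
        o        : out integer;
        clk, rst : in  std_ulogic);
end entity mult;

architecture synth of mult is
begin
  this : process (clk, rst) is
    variable step_1  : boolean;
    variable multin1 : integer;
    variable multin2 : integer;
    variable multout : integer;
  begin
    if rst = '1' then
      o      <= 0;
      step_1 := true;
    elsif rising_edge(clk) then
      if step_1 then
        multin1 := a;
        multin2 := b;
      else
        multin1 := multout;
        multin2 := c;
      end if;
      multout := multin1 * multin2;
      o       <= multout;
      step_1  := not step_1;
    end if;
  end process this;
end architecture synth;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench; names and stimulus values are assumptions.
entity mult_tb is
end entity mult_tb;

architecture sim of mult_tb is
  signal a    : integer := 3;
  signal b    : integer := 4;
  signal c    : integer := 5;
  signal o    : integer;
  signal clk  : std_ulogic := '0';
  signal rst  : std_ulogic := '1';
  signal done : boolean := false;
begin
  dut : entity work.mult
    port map (a => a, b => b, c => c, o => o, clk => clk, rst => rst);

  rst <= '0' after 2 ns;           -- release reset before the first edge

  clock : process                  -- rising edges at 5 ns, 15 ns, ...
  begin
    while not done loop
      clk <= '0'; wait for 5 ns;
      clk <= '1'; wait for 5 ns;
    end loop;
    wait;
  end process clock;

  check : process
  begin
    wait until rising_edge(clk);   -- first step: a * b
    wait for 1 ns;
    assert o = 12 report "expected a * b = 12" severity failure;
    wait until rising_edge(clk);   -- second step: (a * b) * c
    wait for 1 ns;
    assert o = 60 report "expected a * b * c = 60" severity failure;
    report "mult_tb passed";
    done <= true;                  -- stop the clock so simulation ends
    wait;
  end process check;
end architecture sim;
```

    Note that this version outputs the product on every step and does not
    accumulate into o as the earlier designs in the thread do.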
     
    Mike Treseler, Jul 22, 2003
    #17