changes for synthesizable code

Discussion in 'VHDL' started by vizziee@gmail.com, Jul 21, 2005.

  1. Guest

    Hi.

    I am working on parallel implementation of merge-sort (infact, the last
    step of merge-sort where I need to only merge two sorted arrays of
    equal lengths to form a bigger sorted array) in VHDL. I adapted an
    existing program for my implementation and it works well during
    simulation. However, it doesn't get synthesized. What changes should I
    make in the design to make it synthesizable, without changing the
    architecture. Or even if any change in the architecture is required,
    only a few latencies are acceptable. Following is my code and the
    errors I received while synthesis in Quartus II 5.0 (I think the
    'Warning' may be ignored, my issue is more towards the 'Error' part).


    Perhaps to reduce the logic element consumption, I can play in some
    part of the code. But that should not cost higher latencies.

    *****************************************************************************************

    1 library ieee;
    2 use ieee.std_logic_1164.all;
    3
    4 entity msort_c is
    5 generic (
    6 n: integer := 3;
    7 w: integer := 4
    8 );
    9
    10 port (
    11 clock: in std_logic;
    12
    13 reset: in std_logic;
    14
    15 valid: in std_logic;
    16
    17 data_array1: in bit_vector (((n*w)-1) downto 0); -- This is the
    first sorted data
    18 -- array input port of the sorting unit
    19
    20 data_array2: in bit_vector (((n*w)-1) downto 0); -- This is the
    second sorted data
    21 -- array input port of the sorting unit
    22
    23 sorted_array: out bit_vector ((((2*n)*w)-1) downto 0)
    24 );
    25 end entity msort_c;
    26
    27
    28 architecture msort_c_arch of msort_c is
    29 begin
    30 process (reset, clock, valid)
    31 variable x1, x2: natural range 0 to n;
    32 variable pick2: boolean;
    33 begin
    34 if (reset = '1') then
    35 sorted_array <= (others => '0');
    36 elsif (clock = '1' and clock'event) then
    37 if (valid = '1') then
    38
    39 x1 := 0;
    40 x2 := 0;
    41
    42 for i in 0 to ((2*n)-1) loop
    43
    44 -- Decide which list to pick from...
    45 pick2 := FALSE;
    46 if (x1 = n) then
    47 pick2 := TRUE;
    48 elsif (x2 /= n) then
    49 pick2 := data_array1((((x1+1)*w)-1) downto (x1*w))
    > data_array2((((x2+1)*w)-1) downto (x2*w));

    50 end if;
    51
    52 -- Pick
    53 if pick2 then
    54 sorted_array((((i+1)*w)-1) downto (i*w)) <=
    data_array2((((x2+1)*w)-1) downto (x2*w));
    55 x2 := x2 + 1;
    56 else
    57 sorted_array((((i+1)*w)-1) downto (i*w)) <=
    data_array1((((x1+1)*w)-1) downto (x1*w));
    58 x1 := x1 + 1;
    59 end if; --end of 'pick' if loop
    60 end loop; --end of i-choice 'for' loop
    61 else
    62 sorted_array <= (others => '0');
    63 end if;--end of 'if-valid' loop
    64 end if;--end of 'if-reset' loop
    65 end process;
    66
    67 end architecture msort_c_arch;



    ***************************************************************************************************************************************************************************************************************************************************************************
    *****************************************************************************************

    Error: VHDL error at msort_c.vhd(49): left bound of range must be a
    constant
    Warning: VHDL Process Statement warning at msort_c.vhd(30): signal or
    variable "sorted_array" may not be assigned a new value in every
    possible path through the Process Statement. Signal or variable
    "sorted_array" holds its previous value in every path with no new value
    assignment, which may create a combinational loop in the current
    design.
    Error: Can't elaborate top-level user hierarchy
    Error: Quartus II Analysis & Synthesis was unsuccessful. 2 errors, 1
    warning
    Error: Processing ended: Thu Jul 21 09:01:04 2005
    Error: Elapsed time: 00:00:01
    Error: Quartus II Full Compilation was unsuccessful. 2 errors, 1 warning
     
    , Jul 21, 2005
    #1
    1. Advertising

  2. wrote:

    > However, it doesn't get synthesized.

    ....
    > 30 process (reset, clock, valid)


    You don't need valid in the sensitivity list - but does not solve the
    synthesis problem.


    > 49 pick2 := data_array1((((x1+1)*w)-1) downto (x1*w)) ...

    ....
    > Error: VHDL error at msort_c.vhd(49): left bound of range must be a
    > constant


    You know, that the upper and lower bound of data_array1 are connected
    but for the synthesis tool these are just two numbers making the upper
    und lower bound independend from each other. You have to find another
    way a way to describe it.
    A for-loop describing just the same as you wrote may be the solution,
    because then for every bit the synthesis tool gets an exact description,
    what to do.

    But this may be not the final solution. In line 49 you compare two
    vectors, but the length of the vectors is variable. What would that be
    in hardware? I guess /several/ comparators - one for every vector width.
    (Ignore ressource sharing at the moment.) I don't expect, that a
    synthesis tool is capable of computing how many of such comparators are
    needed - because at the moment I do not. Do you know it? ;-)
    => Model hardware - don't program VHDL! Model such comparators in an
    explicit way. (You may use for-generate or a suitable nice way, but you
    should define in a clear and exact way, what hardware you want. If you
    do so, the synthesis tool will do it's job.)


    Ralf
     
    Ralf Hildebrandt, Jul 21, 2005
    #2
    1. Advertising

  3. Guest

    Thanks Ralf for your inputs.

    > But this may be not the final solution. In line 49 you compare two
    > vectors, but the length of the vectors is variable. What would that be
    > in hardware? I guess /several/ comparators - one for every vector width.
    > (Ignore ressource sharing at the moment.) I don't expect, that a
    > synthesis tool is capable of computing how many of such comparators are
    > needed - because at the moment I do not. Do you know it? ;-)

    The number of comparators is fixed. Since we always compare two numbers
    of size w bits (so number of comarators = w). It is only the position
    of these numbers in the two arrays that changes as per the 'pick'
    value. The position can not be known before hand because the arrays can
    have randomly any value.

    > => Model hardware - don't program VHDL! Model such comparators in an
    > explicit way. (You may use for-generate or a suitable nice way, but you
    > should define in a clear and exact way, what hardware you want. If you
    > do so, the synthesis tool will do it's job.)

    You are very correct. But I know the number of comparators (same
    reasoning again). What I don't know is which input to feed in these
    caomparators. That is decided by the 'pick' value.

    Is there a known hardware architecture for this last step of
    merge-sort?

    vizziee.
     
    , Jul 23, 2005
    #3
  4. sunny Guest

    > The number of comparators is fixed. Since we always compare two numbers
    > of size w bits (so number of comarators = w). It is only the position
    > of these numbers in the two arrays that changes as per the 'pick'
    > value. The position can not be known before hand because the arrays can
    > have randomly any value.

    If you know the number of comparators, that really solves the problem
    to some extent. Since your hardware is fixed in that case. Perhaps I
    sub-optimal way to solve the problem could be this: The problem with
    the code is the varying position of the numbers (which, as you say, is
    decided by other variable). If you can fix the position in your code,
    the problem gets solved. So after every comparison, you can shift your
    array so that the next number to be compared occupies the first
    position. So you will always compare the first elements of the two
    arrays. However, I don't know how much of the hardware this approach
    will consume. But certainly, it could be the solution.

    > Is there a known hardware architecture for this last step of
    > merge-sort?

    No idea. Even I would like to know one.

    Sunny.
     
    sunny, Jul 23, 2005
    #4
  5. wrote:


    >>But this may be not the final solution. In line 49 you compare two
    >>vectors, but the length of the vectors is variable. What would that be
    >>in hardware? I guess /several/ comparators - one for every vector width.
    >>(Ignore ressource sharing at the moment.) I don't expect, that a
    >>synthesis tool is capable of computing how many of such comparators are
    >>needed - because at the moment I do not. Do you know it? ;-)

    >
    > The number of comparators is fixed. Since we always compare two numbers
    > of size w bits (so number of comarators = w). It is only the position
    > of these numbers in the two arrays that changes as per the 'pick'
    > value. The position can not be known before hand because the arrays can
    > have randomly any value.


    Ok, then you gave yourself the answer: Model a w-bit wide comparator and
    several multiplexers to select the appropriate data.

    Synthesis complains about style like sig(x downto y) where x and y are
    some not-fixed integers, because then the synthesis tool assumes, that x
    and y are independend. You have to model such a selection descriping are
    strong dependency between the two limits like sig(x+8 downto x).
    Sometimes a suitable way of modelling can be found using a for-loop.

    It may be a good idea to split the description of the comparator from
    the one of the mux. This makes it often more clear what exactly must be
    the desired hardware. Furthermore the mux should be described carefully
    using every detail you know.

    Some details: If I look at the following pieces of code

    > 42 for i in 0 to ((2*n)-1) loop

    ....
    > 54 sorted_array((((i+1)*w)-1) downto (i*w))<= ...


    then the obviously the limits of the selected part of the array are
    fixed, but I guess it will be much easier for the synthesis tool, if you
    model this for

    sorted_array(1*w-1 dowto 0*w)<=...
    sorted_array(2*w-1 dowto 1*w)<=...
    ....

    Furthermore you should seperate all these statements into several
    processes. To avoid a lot of typing you may use a for-generate
    statement, but I would do this later, if everything is clear.

    If you separate all these partial vectors you have to model the
    selectors in your code (x1, x2, pick1, pick2...) in a different way, but
    I guess this approach leads to a more clean desription of the desired
    hardware.


    > What I don't know is which input to feed in these
    > caomparators. That is decided by the 'pick' value.


    Yes, I see.
    My comments ignore the behavior of the algorithm. (I have to say, that I
    did not even think about it.) I have just looked at the code and tried
    to see the related hardware, that is described. At the moment I do not
    know it, because the code is complex.

    But I think if you can decribe the selectors of the muxes in a very
    clean way (this means, that you have to know it for yourself and you
    have to model it in a way, that can be understood by the synthesis tool)
    you have the solution. And this:

    > Is there a known hardware architecture for this last step of
    > merge-sort?


    is exactly the point. If I am not wrong we have figured out three main
    parts of this architecture: the comparators, the muxes and the
    mux-selectors (controlled by something we do not know at the moment).

    I suggest to follow this top-down approach: Model every known component
    in an explicit way and then the unknown parts will be more visible.

    Ralf
     
    Ralf Hildebrandt, Jul 23, 2005
    #5
  6. Guest

    sunny wrote:
    > If you know the number of comparators, that really solves the problem
    > to some extent. Since your hardware is fixed in that case. Perhaps I
    > sub-optimal way to solve the problem could be this: The problem with
    > the code is the varying position of the numbers (which, as you say, is
    > decided by other variable). If you can fix the position in your code,
    > the problem gets solved. So after every comparison, you can shift your
    > array so that the next number to be compared occupies the first
    > position. So you will always compare the first elements of the two
    > arrays. However, I don't know how much of the hardware this approach
    > will consume. But certainly, it could be the solution.

    Thanx sunny, I tried your idea and it worked both in simulation as well
    as synthesis. Ofcourse LE usage has gone up exponentially. But probably
    that's the price. Any idea of bringing it down here.
     
    , Jul 26, 2005
    #6
  7. Guest

    Thanx a lot Ralf for your help.

    Ralf Hildebrandt wrote:
    > wrote:
    >
    >
    > then the obviously the limits of the selected part of the array are
    > fixed, but I guess it will be much easier for the synthesis tool, if you
    > model this for
    >
    > sorted_array(1*w-1 dowto 0*w)<=...
    > sorted_array(2*w-1 dowto 1*w)<=...
    > ...

    But won't this be a bit complicated compared to the simple shifting
    that sunny suggested.
    > Furthermore you should seperate all these statements into several
    > processes. To avoid a lot of typing you may use a for-generate
    > statement, but I would do this later, if everything is clear.

    But one process is enough. And why for-generate, a for-loop would also
    do instead in a process.

    > If I am not wrong we have figured out three main
    > parts of this architecture: the comparators, the muxes and the
    > mux-selectors (controlled by something we do not know at the moment).

    Yep...that's right. As for mux-selector logic, its decided by the
    output of comparators because that's what decides the 'pick' value. But
    this architecture is certianly not the optimum one. It consumes lot of
    LEs. Any idea of bringing it down.

    KVM.
     
    , Jul 26, 2005
    #7
  8. wrote:

    > Thanx sunny, I tried your idea and it worked both in simulation as well
    > as synthesis.


    Consider posting the working version.

    > Ofcourse LE usage has gone up exponentially. But probably
    > that's the price. Any idea of bringing it down here.


    Right now there is a full pass on every clock tick.
    Using a process variable this could be slower.
    Maybe 2, 4 or 8 clocks per pass.

    -- Mike Treseler
     
    Mike Treseler, Jul 26, 2005
    #8
  9. wrote:

    > But one process is enough. And why for-generate, a for-loop would also
    > do instead in a process.


    I agree, but this is a matter of
    style, not substance.

    If we exclude tri-state drivers,
    anything that can done by generating
    processes can also be done
    procedurally in a single process.
    These are just alternate descriptions of the
    same hardware.

    A single process thread is easier to
    read for the data_structures+algorithms
    crowd. Multiple processes feels more like
    a schematic netlist.

    -- Mike Treseler
     
    Mike Treseler, Jul 26, 2005
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Saeed Nari

    GL85 synthesizable code

    Saeed Nari, Jul 25, 2003, in forum: VHDL
    Replies:
    2
    Views:
    1,508
    Antti Lukats
    Jul 28, 2003
  2. Stanley
    Replies:
    2
    Views:
    1,680
    Jim Lewis
    Dec 7, 2004
  3. dcreddy1980

    Help in writing synthesizable code??

    dcreddy1980, Dec 16, 2004, in forum: VHDL
    Replies:
    2
    Views:
    572
    bxbxb3
    Dec 18, 2004
  4. Stefan Oedenkoven
    Replies:
    1
    Views:
    533
    zinga
    Jan 4, 2005
  5. Rama
    Replies:
    7
    Views:
    538
Loading...

Share This Page