Parity Check

Discussion in 'VHDL' started by Ed, Sep 30, 2004.

  1. Ed

    Ed Guest

    Hi,

    I have a 36 bit bus for which I need to generate an even parity bit. The
    only way I can think to do this, is to use a loop and check each bit
    individually. Is that the best way to do it? Does anyone know any better
    techniques for generating the parity bit?

    Thanks for any help,
    Ed, Sep 30, 2004
    #1
    1. Advertising

  2. Ed

    Georg Acher Guest

    "Ed" <> writes:
    >Hi,
    >
    >I have a 36 bit bus for which I need to generate an even parity bit. The


    Smells like PCI...

    >only way I can think to do this, is to use a loop and check each bit
    >individually. Is that the best way to do it? Does anyone know any better
    >techniques for generating the parity bit?


    The loop will work and synthesize, but depending on the tool it may get
    quite slow in a chip ;-)

    You can write a function which calculates the parity for 4 bits, and combine the
    9 outputs of 9 functions calls in two more similar stages. The function itself
    maps easily to the usual FPGA logic elements and the cascading gives a balanced
    and short path for all bits.

    --
    Georg Acher,
    http://wwwbode.in.tum.de/~acher
    "Oh no, not again !" The bowl of petunias
    Georg Acher, Oct 1, 2004
    #2
    1. Advertising

  3. Ed

    David Bishop Guest

    Georg Acher wrote:

    > "Ed" <> writes:
    >
    >>Hi,
    >>
    >>I have a 36 bit bus for which I need to generate an even parity bit. The


    Better to do it recursively

    function xor_reduce (arg : std_logic_vector )
    return std_logic is
    variable Upper, Lower : std_logic;
    variable Half : integer;
    variable BUS_int : std_logic_vector ( arg'length - 1 downto 0 );
    variable Result : std_logic;
    begin
    if (arg'LENGTH < 1) then -- In the case of a NULL range
    Result := '0';
    else
    BUS_int := to_ux01 (arg);
    if ( BUS_int'length = 1 ) then
    Result := BUS_int ( BUS_int'left );
    elsif ( BUS_int'length = 2 ) then
    Result := BUS_int ( BUS_int'right ) xor BUS_int ( BUS_int'left);
    else
    Half := ( BUS_int'length + 1 ) / 2 + BUS_int'right;
    Upper := xor_reduce ( BUS_int ( BUS_int'left downto Half ));
    Lower := xor_reduce ( BUS_int ( Half - 1 downto BUS_int'right));
    Result := Upper xor Lower;
    end if;
    end if;
    return Result;
    end;

    This code is from the VHDL-200X-ft packages, and a form of it will
    appear in the VHDL-2005 std_logic_1164 and numeric_std packages.
    http://www.eda.org/vhdl-200x/vhdl-200x-ft/
    David Bishop, Oct 1, 2004
    #3
  4. David Bishop <> writes:

    > Georg Acher wrote:
    >
    >> "Ed" <> writes:
    >>
    >>>Hi,
    >>>
    >>>I have a 36 bit bus for which I need to generate an even parity bit. The

    >
    > Better to do it recursively


    [snip function]

    I can highly recommend this. A for-loop will elaborate into a linear
    chain of XOR gates in Synopsys DC, while the recursive function
    elaborates directly to a balanced binary tree.

    Simulation time, however, will suffer.

    Regards,


    Kai
    --
    Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
    Kai Harrekilde-Petersen, Oct 1, 2004
    #4
  5. Ed

    Ed Guest

    "Ed" <> wrote in message news:cji1df$4lc$...
    > Hi,
    >
    > I have a 36 bit bus for which I need to generate an even parity bit. The
    > only way I can think to do this, is to use a loop and check each bit
    > individually. Is that the best way to do it? Does anyone know any better
    > techniques for generating the parity bit?
    >
    > Thanks for any help,
    >
    >


    Thanks for the replies.

    Yes it is for PCI. Well spotted.

    The following seems like a simple (and I guess quick) way of doing it (din
    is the 36 bit bus and parity is the output):

    int1 <= din(0) xor din(1);
    int2 <= int1 xor din(2);
    int3 <= int2 xor din(3);
    int4 <= int3 xor din(4);
    int5 <= int4 xor din(5);
    int6 <= int5 xor din(6);
    int7 <= int6 xor din(7);
    int8 <= int7 xor din(8);
    int9 <= int8 xor din(9);
    int10 <= int9 xor din(10);
    int11 <= int10 xor din(11);
    int12 <= int11 xor din(12);
    int13 <= int12 xor din(13);
    int14 <= int13 xor din(14);
    int15 <= int14 xor din(15);
    int16 <= int15 xor din(16);
    int17 <= int16 xor din(17);
    int18 <= int17 xor din(18);
    int19 <= int18 xor din(19);
    int20 <= int19 xor din(20);
    int21 <= int20 xor din(21);
    int22 <= int21 xor din(22);
    int23 <= int22 xor din(23);
    int24 <= int23 xor din(24);
    int25 <= int24 xor din(25);
    int26 <= int25 xor din(26);
    int27 <= int26 xor din(27);
    int28 <= int27 xor din(28);
    int29 <= int28 xor din(29);
    int30 <= int29 xor din(30);
    int31 <= int30 xor din(31);
    int32 <= int31 xor din(32);
    int33 <= int32 xor din(33);
    int34 <= int33 xor din(34);
    parity <= int34 xor din(35);

    How does this compare to other ways of doing it? I can't see any real
    disadvantage to doing it this way.

    Thanks again.
    Ed, Oct 1, 2004
    #5
  6. Ed a écrit:

    > The following seems like a simple (and I guess quick) way of doing it (din
    > is the 36 bit bus and parity is the output):
    >
    > int1 <= din(0) xor din(1);
    > int2 <= int1 xor din(2);

    [...]
    > int34 <= int33 xor din(34);
    > parity <= int34 xor din(35);
    >
    > How does this compare to other ways of doing it? I can't see any real
    > disadvantage to doing it this way.


    First, it's very long to type. You could have used a for... loop :eek:)
    Second, this creates a cascade of gates, with a big delay. Delay from
    din(0) to parity is 36 times longer than delay from din(35) to parity.
    Suppose a gate delay of 1ns, you end up with 36ns which is not PCI-33
    compliant.
    Balanced binary trees (such as created by the recursive function given
    earlier) have constant delays for every input.


    --
    ____ _ __ ___
    | _ \_)/ _|/ _ \ Adresse de retour invalide: retirez le -
    | | | | | (_| |_| | Invalid return address: remove the -
    |_| |_|_|\__|\___/
    Nicolas Matringe, Oct 1, 2004
    #6
  7. Ed

    Ed Guest

    "Ed" <> wrote in message news:cji1df$4lc$...
    > Hi,
    >
    > I have a 36 bit bus for which I need to generate an even parity bit. The
    > only way I can think to do this, is to use a loop and check each bit
    > individually. Is that the best way to do it? Does anyone know any better
    > techniques for generating the parity bit?
    >
    > Thanks for any help,
    >
    >


    I've tried using the recursive function, but I get the following error:

    ERROR:Xst:831 - C:/XilinxTest/Test/parity_gen.vhd (Line 29). Recursion
    detected in function 'xor_reduce'.

    So I guess my software (ISE 4.2i) doesn't allow it.

    I've synthesized the following code:

    parity <= (din(35) xor din(34) xor din(33) xor din(32)) xor
    (din(31) xor din(30) xor din(29) xor din(28)) xor
    (din(27) xor din(26) xor din(25) xor din(24)) xor
    (din(23) xor din(22) xor din(21) xor din(20)) xor
    (din(19) xor din(18) xor din(17) xor din(16)) xor
    (din(15) xor din(14) xor din(13) xor din(12)) xor
    (din(11) xor din(10) xor din(9) xor din(8)) xor
    (din(7) xor din(6) xor din(5) xor din(4)) xor
    (din(3) xor din(2) xor din(1) xor din(0));

    And the synthesis report shows:

    Maximum combinational path delay: 10.483ns

    Which I think is good enough for the 33MHz PCI bus.
    Ed, Oct 1, 2004
    #7
  8. Dear Ed,
    Let me tell you my commentary about your question. While I read all posts
    and answers of your questions I was thinking about your problem. I'll give
    you some suggestions:
    1. Because you use WebPack from Xilinx, you should go to the
    http://www.xilinx.com/xlnx/xebiz/de...ondaryNavPick=Design Tools&key=DS-ISE-WEBPACK
    and need to download the lastest version ot the software.
    2. If you wish to use cascade chacking of parity like this:
    int1 <= din(0) xor din(1);
    int2 <= int1 xor din(2);
    int3 <= int2 xor din(3);
    int4 <= int3 xor din(4);
    int5 <= int4 xor din(5);
    int6 <= int5 xor din(6);
    int7 <= int6 xor din(7);
    int8 <= int7 xor din(8);
    int9 <= int8 xor din(9);
    int10 <= int9 xor din(10);
    int11 <= int10 xor din(11);
    int12 <= int11 xor din(12);
    int13 <= int12 xor din(13);
    int14 <= int13 xor din(14);
    int15 <= int14 xor din(15);
    int16 <= int15 xor din(16);
    int17 <= int16 xor din(17);
    int18 <= int17 xor din(18);
    int19 <= int18 xor din(19);
    int20 <= int19 xor din(20);
    int21 <= int20 xor din(21);
    int22 <= int21 xor din(22);
    int23 <= int22 xor din(23);
    int24 <= int23 xor din(24);
    int25 <= int24 xor din(25);
    int26 <= int25 xor din(26);
    int27 <= int26 xor din(27);
    int28 <= int27 xor din(28);
    int29 <= int28 xor din(29);
    int30 <= int29 xor din(30);
    int31 <= int30 xor din(31);
    int32 <= int31 xor din(32);
    int33 <= int32 xor din(33);
    int34 <= int33 xor din(34);
    parity <= int34 xor din(35);

    you should know that if, you are going to put it otu of process, it will
    have behaviour like paralel assignment, and believe me, you and your
    compiler would be confused.

    3. If you decide to use "for loop" your compiler will generate much
    feedbacks, and the circuit will slowly.

    4. If you decide to use recursion, the circuit will be faster, however you
    shoud be 100% sure that every thing work properly.

    I guess that the manner which will give you the best productiveness will
    be recirsion ot the example which (Georg Acher) gave
    you.
    Best Regards
    Ivaylo Krumov
    ivailokroumov, Oct 1, 2004
    #8
  9. Ed

    rickman Guest

    Ed wrote:
    >
    > "Ed" <> wrote in message news:cji1df$4lc$...
    > > Hi,
    > >
    > > I have a 36 bit bus for which I need to generate an even parity bit. The
    > > only way I can think to do this, is to use a loop and check each bit
    > > individually. Is that the best way to do it? Does anyone know any better
    > > techniques for generating the parity bit?
    > >
    > > Thanks for any help,
    > >
    > >

    >
    > I've tried using the recursive function, but I get the following error:
    >
    > ERROR:Xst:831 - C:/XilinxTest/Test/parity_gen.vhd (Line 29). Recursion
    > detected in function 'xor_reduce'.
    >
    > So I guess my software (ISE 4.2i) doesn't allow it.
    >
    > I've synthesized the following code:
    >
    > parity <= (din(35) xor din(34) xor din(33) xor din(32)) xor
    > (din(31) xor din(30) xor din(29) xor din(28)) xor
    > (din(27) xor din(26) xor din(25) xor din(24)) xor
    > (din(23) xor din(22) xor din(21) xor din(20)) xor
    > (din(19) xor din(18) xor din(17) xor din(16)) xor
    > (din(15) xor din(14) xor din(13) xor din(12)) xor
    > (din(11) xor din(10) xor din(9) xor din(8)) xor
    > (din(7) xor din(6) xor din(5) xor din(4)) xor
    > (din(3) xor din(2) xor din(1) xor din(0));
    >
    > And the synthesis report shows:
    >
    > Maximum combinational path delay: 10.483ns
    >
    > Which I think is good enough for the 33MHz PCI bus.


    This looks ok to me. I belive that Georg was saying that you could
    define a 4 input function that would make the code more readable and
    produce the same result. I am not sure that your original code would
    create anything different in the end since the logic is the same in
    either case. Since the tools have the freedom to reorganize
    combinatorial logic the structure of the code is not always reflected in
    the resulting chip level implementation.

    --

    Rick "rickman" Collins


    Ignore the reply address. To email me use the above address with the XY
    removed.

    Arius - A Signal Processing Solutions Company
    Specializing in DSP and FPGA design URL http://www.arius.com
    4 King Ave 301-682-7772 Voice
    Frederick, MD 21701-3110 301-682-7666 FAX
    rickman, Oct 2, 2004
    #9
  10. Kai Harrekilde-Petersen <> writes:
    > I can highly recommend this. A for-loop will elaborate into a linear
    > chain of XOR gates in Synopsys DC, while the recursive function
    > elaborates directly to a balanced binary tree.


    While I agree that recursive solutions are most elegant and *will* be
    synthesized into a binary tree, I found that Synopsys DC is actually
    pretty good at turning loops into trees all by itself. At least for
    low-level functions such as AND, OR, XOR, etc.

    In fact, a couple of months ago, I synthesized an iterative XOR-reduce
    function and got a nice tree of gates in return. Can't remember if I
    had VHDL-Presto (Synopsys' new VHDL compiler) enabled or not and
    whether that made a difference.

    Best regards,
    Marcus
    Marcus Harnisch, Oct 2, 2004
    #10
  11. Marcus Harnisch <> writes:

    > Kai Harrekilde-Petersen <> writes:
    >> I can highly recommend this. A for-loop will elaborate into a linear
    >> chain of XOR gates in Synopsys DC, while the recursive function
    >> elaborates directly to a balanced binary tree.

    >
    > While I agree that recursive solutions are most elegant and *will* be
    > synthesized into a binary tree, I found that Synopsys DC is actually
    > pretty good at turning loops into trees all by itself. At least for
    > low-level functions such as AND, OR, XOR, etc.


    Note that I am talking about what the VHDL code _elaborates_ into, not
    what you get after synthesis. The loop and the recursive XOR_reduce()
    can result in the same logic after synthesis. Back when I looked at
    it, the loop approach would take much more synthesis time and effort
    for DC to turn it into a binary tree, than when DC got the tree served
    on a silver plate.

    But if DC can now elaborate from a loop to a binary tree, that just
    good. And damn well about time, I might add.

    How about muxes? Does "y <= a(sel);" elaborate to a mux these days or
    do I still need to write a case or a for-loop with an exit statement?

    Regards,


    Kai
    --
    Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
    Kai Harrekilde-Petersen, Oct 2, 2004
    #11
  12. Kai,

    Kai Harrekilde-Petersen <> writes:
    >>> A for-loop will elaborate into a linear chain of XOR gates in
    >>> Synopsys DC, while the recursive function elaborates directly to a
    >>> balanced binary tree.

    >>

    > Note that I am talking about what the VHDL code _elaborates_ into, not
    > what you get after synthesis.


    Sorry for misreading that. My bad.

    > But if DC can now elaborate from a loop to a binary tree, that just
    > good. And damn well about time, I might add.


    I didn't say that and I am pretty sure it does not. And IMHO it is not
    supposed to either. The elaboration result is a fairly generic
    representation of the code. If you wrote a loop, you'll basically get
    it. The `compile' command, which does optimization and technology
    mapping, will figure out what it thinks is best suited to achieve the
    goals (constraints).

    > How about muxes? Does "y <= a(sel);" elaborate to a mux these days or
    > do I still need to write a case or a for-loop with an exit statement?


    Don't know and I don't have access to DC at the moment. Never noticed
    anything wrong in that respect. What did you get instead? Around which
    time frame did you find that to be an issue?

    Just curious. Why do you care so much about elaboration results?

    Best regards,
    Marcus
    Marcus Harnisch, Oct 3, 2004
    #12
  13. Marcus Harnisch <> writes:

    > Kai Harrekilde-Petersen <> writes:
    >>>> A for-loop will elaborate into a linear chain of XOR gates in
    >>>> Synopsys DC, while the recursive function elaborates directly to a
    >>>> balanced binary tree.
    >>>

    >> Note that I am talking about what the VHDL code _elaborates_ into, not
    >> what you get after synthesis.

    >
    > Sorry for misreading that. My bad.
    >
    >> But if DC can now elaborate from a loop to a binary tree, that just
    >> good. And damn well about time, I might add.

    >
    > I didn't say that and I am pretty sure it does not. And IMHO it is not
    > supposed to either.


    Why not? it's a perfectly legal representation of the logic, it costs
    no extra area/cells, and will in most cases be the fastest
    implementation you can get. It's only when you DON'T have equal
    input-delays that the binary tree (or ternary or whatever your std
    cell library gives you) is not the optimal solution.

    > The elaboration result is a fairly generic
    > representation of the code. If you wrote a loop, you'll basically get
    > it. The `compile' command, which does optimization and technology
    > mapping, will figure out what it thinks is best suited to achieve the
    > goals (constraints).


    Yeah, but DC is not extracting all the information from the VHDL that
    it could. The MUX example below shows that clearly. (I might add
    that VHDL also limits the amount of information that I can convey to
    DC, but I'm not flogging that beast now).

    >> How about muxes? Does "y <= a(sel);" elaborate to a mux these days or
    >> do I still need to write a case or a for-loop with an exit statement?

    >
    > Don't know and I don't have access to DC at the moment. Never noticed
    > anything wrong in that respect. What did you get instead? Around which
    > time frame did you find that to be an issue?


    My original tests were done around 2000-2001. At the time, a straight

    y <= a(sel);

    would not turn into a nice MUX. I forget what DC did, but it churned
    out a circuit that was bigger and slower than a MUX. And it would
    take the longest time to synthesize.

    A for loop like this:

    for i in sel'range loop
    if i = sel then
    y <= a(i);
    end if
    end loop;

    will result in a chain of 2:1 MUXes. Sure, DC can turn it into a real
    N:1 mux, but it would take 2-3x the synthesis time of this loop:

    for i in sel'range loop
    if i = sel then
    y <= a(i);
    exit;
    end if
    end loop;

    (note the addition of the exit statement). That elaborates to a
    single N:1 MUX in GTECH, and will take up the minimum synthesis time
    and have the best area/timing parameters.

    > Just curious. Why do you care so much about elaboration results?


    Because what the code elaborates into will dictate, to a high degree,
    what DC can get out of it in the long run. Also, if I can write code
    that elaborates directly to what I want, DC will spend less time in
    synthesis and timing optimization.

    And when it takes about 24 hours to synthesize all RTL code for a
    chip, with multiple DC licenses, it's worth paying attention to, IMHO.

    For example, we have blocks that are notorious for spending 2-3 hours
    in the elaboration phase, and a few more in synthesis. That limits
    your turn-around time and effectiveness when trying out vaious options
    during the synthesis or changing the VHDL code to get better (faster
    or smaller logic).

    Last week a colleague and I was discussing how to make the fastest
    1-complement adder that would add 32 16-bit values (TCP and IP
    checksum, it anyone wonders). Turns out that the fastest
    implementation we could come up with includes 16 32-bit popcount()s.
    The obvious implementation is a simple for-loop, but thats results in
    poor timing and long synthesis times. So, out with a bunch of
    Full-adders and make a tree out of that. It takes a bit of time and
    will probably be fairly unreadable, but sometimes you just got to do
    it.

    I've never used AutoLogic-II myself, but colleagues that have used it
    tell me that it was significantly better back then, than what DC have
    even today.


    Regards,


    Kai
    --
    Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
    Kai Harrekilde-Petersen, Oct 3, 2004
    #13
  14. Hello Kai,

    Kai Harrekilde-Petersen <> writes:
    > it's a perfectly legal representation of the logic, it costs
    > no extra area/cells, and will in most cases be the fastest
    > implementation you can get.


    I completely agree. But if I write a loop, it is my damn right to get
    a loop ;-) If I later tell the tool that I want something more
    optimized, fine. I guess we'll have to agree to disagree in that
    respect.

    > Yeah, but DC is not extracting all the information from the VHDL that
    > it could. The MUX example below shows that clearly.


    > A for loop like this:
    >
    > for i in sel'range loop
    > if i = sel then
    > y <= a(i);
    > end if
    > end loop;
    >
    > will result in a chain of 2:1 MUXes. Sure, DC can turn it into a real
    > N:1 mux, but it would take 2-3x the synthesis time of this loop:
    >
    > for i in sel'range loop
    > if i = sel then
    > y <= a(i);
    > exit;
    > end if
    > end loop;
    >
    > (note the addition of the exit statement).


    In the first case, it is more difficult to figure out that there will
    be only one matching case. All values of `i' are unique and the `if'
    is supposed to know that. Never mind that the conditional operator
    would have to be taken into account as well, since otherwise there
    might be multiple matches despite the unique `i'. In that case the
    last one wins so that we'll get a single assignment again. Except if
    there is another assignment involving perhaps a variable. Not
    impossible, but not trivial either.

    In the second case there will be at most one matching case --
    ever. Simple.

    > Also, if I can write code that elaborates directly to what I want,
    > DC will spend less time in synthesis and timing optimization.


    In a lot of cases, my time coming up with super-cool
    pre-obfuscated^Woptimized code, explaining everyone how that stuff is
    supposed to work, verifying that not a single error has crept in,
    possibly better simulation performance, etc., is worth more than a few
    hours of synthesis.

    > So, out with a bunch of Full-adders and make a tree out of that. It
    > takes a bit of time and will probably be fairly unreadable, but
    > sometimes you just got to do it.


    Yes, but I would like to remind you that my original statement
    mentioned basic functions:

    >>>> Marcus Harnisch <> writes:
    >>>> [...] At least for low-level functions such as AND, OR, XOR, etc.


    In cases as simple as these, the time spent on coming up and verifying
    a recursive solution is really not well spent.

    I'd be interested if anyone here has used the generic Wallace tree
    components from Designware. It looks like they are internally used by
    DW itself, but might possibly be used in your own blocks. I remember
    seeing these in the DW documentation but I don't have DC access.

    Best regards,
    Marcus
    Marcus Harnisch, Oct 5, 2004
    #14
  15. Marcus Harnisch <> writes:

    > Kai Harrekilde-Petersen <> writes:
    >> it's a perfectly legal representation of the logic, it costs
    >> no extra area/cells, and will in most cases be the fastest
    >> implementation you can get.

    >
    > I completely agree. But if I write a loop, it is my damn right to get
    > a loop ;-) If I later tell the tool that I want something more
    > optimized, fine. I guess we'll have to agree to disagree in that
    > respect.


    After thinking about it a bit more, I actually would prefer if DC
    turned the loop into an N-input XOR gate (or whatever) at the GTECH
    level. And only during the optimization decide how to implement that
    N-input gate. I think that would satisfy both of us at the same time
    :)

    >> Yeah, but DC is not extracting all the information from the VHDL that
    >> it could. The MUX example below shows that clearly.


    [snip example]

    > In the first case, it is more difficult to figure out that there will
    > be only one matching case. All values of `i' are unique and the `if'
    > is supposed to know that. Never mind that the conditional operator
    > would have to be taken into account as well, since otherwise there
    > might be multiple matches despite the unique `i'. In that case the
    > last one wins so that we'll get a single assignment again. Except if
    > there is another assignment involving perhaps a variable. Not
    > impossible, but not trivial either.


    I consider the case to be trivial, but it might not be trivial to a
    compiler.

    >> Also, if I can write code that elaborates directly to what I want,
    >> DC will spend less time in synthesis and timing optimization.

    >
    > In a lot of cases, my time coming up with super-cool
    > pre-obfuscated^Woptimized code, explaining everyone how that stuff is
    > supposed to work, verifying that not a single error has crept in,
    > possibly better simulation performance, etc., is worth more than a few
    > hours of synthesis.


    It's not a clear-cut case, and sometimes it's worth optimizing the
    code. Sometimes it's not. I just wish that DC was smarter, so I
    could spend more time on the "big issues" rather than spending time to
    obfuscate everyday code snippets in order for DC to understand it.

    >>>>> Marcus Harnisch <> writes:
    >>>>> [...] At least for low-level functions such as AND, OR, XOR, etc.

    >
    > In cases as simple as these, the time spent on coming up and verifying
    > a recursive solution is really not well spent.


    If DC had a good VHDL frontend, yes. IMHO, it doesn't have a good
    VHDL frontend.

    > I'd be interested if anyone here has used the generic Wallace tree
    > components from Designware. It looks like they are internally used by
    > DW itself, but might possibly be used in your own blocks. I remember
    > seeing these in the DW documentation but I don't have DC access.


    Sorry, I don't have any experience with that.


    Kai
    --
    Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
    Kai Harrekilde-Petersen, Oct 6, 2004
    #15
  16. Ed

    Tom Verbeure Guest

    Up to now, I've always been using the xor_reduce function that's part
    of std_logic_misc. I know that this is not an official standard, but
    it's there so why no use it.

    Anyway, after this discussion, I wanted to see what DC actually does. I
    created a block that clocks a 512 input vector, does a xor-reduce and
    outputs it to a FF. I created a clock of 1 GHz (to be sure I wouldn't
    meet timing) and did a synthesis for 3 cases:
    1. xor_reduce of std_logic_misc
    2. a simple for loop
    3. the xor_reduce as suggested above.
    I used a simple 'compile', so no high effort for mapping or area.

    The result for case 1 and case 2 came out identical:
    area: 32142
    negative slack: 1.08

    case 3:
    area: 32627
    negative slack: 1.06

    The results are more or less identical. Contrary to some claims made
    above, DC is smart enough NOT to make a sequential string of XOR
    elements.

    I did the same experiment but with only 128 inputs. In this case, the
    negative slack was identical, but the area for the cases 1 and 2 was
    slightly higher instead of slightly lower.

    Tom
    Tom Verbeure, Oct 11, 2004
    #16
  17. "Tom Verbeure" <> writes:

    > Up to now, I've always been using the xor_reduce function that's part
    > of std_logic_misc. I know that this is not an official standard, but
    > it's there so why no use it.
    >
    > Anyway, after this discussion, I wanted to see what DC actually does. I
    > created a block that clocks a 512 input vector, does a xor-reduce and
    > outputs it to a FF. I created a clock of 1 GHz (to be sure I wouldn't
    > meet timing) and did a synthesis for 3 cases:
    > 1. xor_reduce of std_logic_misc
    > 2. a simple for loop
    > 3. the xor_reduce as suggested above.
    > I used a simple 'compile', so no high effort for mapping or area.
    >
    > The result for case 1 and case 2 came out identical:
    > area: 32142
    > negative slack: 1.08
    >
    > case 3:
    > area: 32627
    > negative slack: 1.06
    >
    > The results are more or less identical. Contrary to some claims made
    > above, DC is smart enough NOT to make a sequential string of XOR
    > elements.


    Either I didn't state myself clearly enough, or you missed my point:
    I've always been saying that DC _can_ do the optimizations. My point
    has been that DC takes extra time to do the optimization.

    Try running the synthesis without a constraint, or a clock of 100nsec,
    and resynthesize. Then go looking at the synthesis result. What I
    have been seeing is that in case 2, DC creates a string of XOR gates,
    and hence one of the bits has a really bad timing.

    Regards,


    Kai
    --
    Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
    Kai Harrekilde-Petersen, Oct 12, 2004
    #17
  18. Ed

    Wallclimber Guest

    > Either I didn't state myself clearly enough, or you missed my point:
    > I've always been saying that DC _can_ do the optimizations.


    Ok. I must have misunderstood.

    > My point has been that DC takes extra time to do the optimization.


    That may be true. Though probably not relevant enough to justify the
    risk of writing error prone recursive code instead of a one-liner from
    a fairly standard library.

    > Try running the synthesis without a constraint, or a clock of 100nsec,
    > and resynthesize. Then go looking at the synthesis result. What I
    > have been seeing is that in case 2, DC creates a string of XOR gates,
    > and hence one of the bits has a really bad timing.


    Once it meets your specified timing constraints, how can it be 'really
    bad timing'? A standard ripple adder has 'really bad timing' compared
    to more exotic configurations, but you won't use those until there's a
    good reason to do so.

    Once my design meets the timing, I care about minimum area (which, in
    this case, is equivalent)

    Tom
    Wallclimber, Oct 14, 2004
    #18
  19. Ed

    yassen

    Joined:
    Oct 6, 2009
    Messages:
    1
    Hi,

    I have not read all the thread, so excuse me if my post is not precise.

    I think I know a simple solution for checking parity:
    Why not using the XOR reduction opeartor?

    input [35:0] din;
    wire parity_bit = ^din;

    or, check here:
    paritycheck.wordpress.com/2007/06/27/coding-even-parity-checking-logic/

    Good luck,
    Yassen
    yassen, Oct 6, 2009
    #19
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. snacktime

    Set parity of a string

    snacktime, Jan 23, 2005, in forum: Python
    Replies:
    8
    Views:
    377
    Peter Hansen
    Jan 26, 2005
  2. Petr Jakes
    Replies:
    2
    Views:
    5,061
    Petr Jakes
    Nov 12, 2005
  3. shakeelsultan

    odd parity checker FSM

    shakeelsultan, Oct 27, 2006, in forum: VHDL
    Replies:
    0
    Views:
    798
    shakeelsultan
    Oct 27, 2006
  4. Herbert Haas

    Parity check of a word

    Herbert Haas, Aug 28, 2005, in forum: C Programming
    Replies:
    15
    Views:
    684
    Krishanu Debnath
    Sep 1, 2005
  5. Replies:
    2
    Views:
    2,286
    Eric Smith
    May 21, 2007
Loading...

Share This Page