[cross-post] path verification

Discussion in 'VHDL' started by alb, Mar 17, 2014.

  1. alb

    alb Guest

    Dear all,

    I have a microcontroller with an FPU which is delivered as an IP (I mean
    the FPU). In order to run at a decent frequency, some of the operations
    are allowed to complete within a certain number of cycles, but the
    main problem is that we do not know how many.

    That said, if we run the synthesis tool without timing constraints on
    those paths, we get a design that is much slower than it could be.
    Multicycle constraints are out of the question because they are hard to
    verify and maintain, so we decided to set false paths and perform
    post-layout sims to extract those values, to be used in the RTL in a
    second iteration.

    There are several reasons why I do not particularly like this approach:

    1. it relies on post-layout sims which are resource consuming
    2. if we change technology we will likely need to do the whole process
    again
    3. we are obliged to perform incremental place&route since an optimized
    implementation (maybe done automatically) may have an impact on our
    delays.

    So far we have not come up with an alternative solution that does not
    imply a redesign (like pipelining, c-slowing, retiming, ...).

    Any ideas/suggestions?

    Al

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is top-posting such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    alb, Mar 17, 2014
    #1

  2. In comp.arch.fpga alb <> wrote:

    > I have a microcontroller with an FPU which is delivered as an IP (I mean
    > the FPU). In order to run at a decent frequency, some of the operations
    > are allowed to complete in within a certain amount of cycles, but the
    > main problem is that we do not know how many.


    So you paid someone for this?

    I am not sure what you mean by "a certain number of clock cycles"
    and "do not know how many".

    If it is all combinatorial, it will complete with some delay, not
    in some number of clock cycles. That is, the delay will not depend
    on any clock you supply. You then have to either be able to run
    the design through timing analysis and see how long that is, or the
    ones you bought it from should tell you.

    Though more usual, the logic should have a signal indicating when
    the result is valid.

    You could run the FPU in the timing tools with a variety of (random)
    inputs and find out how long it takes. Then find the distribution
    of delays, and find a reasonable maximum. It might be data dependent
    and have a long tail. (A post-normalize shifter might depend on the
    number of digits being shifted, and the rare long shifts would have
    to be accounted for.)
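One way to read this suggestion, sketched in Python: collect per-operation delays from whatever source is available (for example back-annotated simulation runs with random operands), take the worst observed delay, and convert it to whole clock cycles at the target frequency. The sample values, the function name and the `margin_ns` parameter are hypothetical and do not come from the thread. Random sampling can miss the rare worst case, so the result is only a lower bound on the real requirement.

```python
import math

def wait_cycles_from_samples(delays_ns, clock_mhz, margin_ns=0.0):
    """Return the whole number of clock cycles that covers the worst observed delay."""
    if not delays_ns:
        raise ValueError("need at least one delay sample")
    period_ns = 1000.0 / clock_mhz
    # Worst case seen so far, plus an optional safety margin (an assumption).
    worst_ns = max(delays_ns) + margin_ns
    return max(1, math.ceil(worst_ns / period_ns))

# Hypothetical samples with a long tail (one rare slow case).
samples_ns = [7.9, 8.1, 8.4, 8.0, 12.3]
print(wait_cycles_from_samples(samples_ns, clock_mhz=100))
```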

    > That said, if we run the synthesis tool without timing constraints on
    > those paths, we have a design that is much slower than can be.
    > Multicycle constraints are out of question because they are hard to
    > verify and maintain, so we decided to set false paths and perform
    > post-layout sims to extract those values to be used in the RTL in a
    > second iteration.


    > There are several reasons why I do not particularly like this approach:


    > 1. it relies on post-layout sims which are resource consuming
    > 2. if we change technology we will likely need to do the whole process
    > again
    > 3. we are obliged to perform incremental place&route since an optimized
    > implementation (maybe done automatically) may have an impact on our
    > delays.


    > So far we have not come out with an alternative solution that is not
    > going to imply redesign (like pipelining, c-slowing, retiming, ...).


    The FPUs that I know of should be pipelined. (Is there a clock input?)
    You shouldn't have to do the pipelining, but you do need to know the
    number of clock cycles (and clock rate) for each operation.

    If the design is encrypted, such that you can't look at it, they
    need to give you enough information to be able to use it.

    -- glen
    glen herrmannsfeldt, Mar 17, 2014
    #2

  3. alb

    alb Guest

    Hi Glen,

    In comp.arch.fpga glen herrmannsfeldt <> wrote:
    []
    >> I have a microcontroller with an FPU which is delivered as an IP (I mean
    >> the FPU). In order to run at a decent frequency, some of the operations
    >> are allowed to complete in within a certain amount of cycles, but the
    >> main problem is that we do not know how many.

    >
    > So you paid someone for this?


    That is correct. Well, it was a development on a European project where
    several universities did something and then we tried to stitch it
    together... The aim was to have a small-footprint embedded
    microcontroller capable of floating-point calculations.

    > I am not sure what you mean by "a certain number of clock cycles"
    > and "do not know how many".


    I admit I was not too clear, let's try again. The IP is the FPU and it
    came fully verified but never validated on the hardware (i.e. no P&R, no
    STA, no back-annotated sim). We built a microcontroller around it and now
    it is time to target the technology for the specific project.

    So at this stage we do not know, considering the target logic, what is
    the logic depth for each operation of the FPU and we do not know how
    many clock cycles we need to wait in order to get the value out at the
    given target frequency.

    > If it is all combinatorial, it will complete with some delay, not
    > in some number of clock cycles. That is, the delay will not depend
    > on any clock you supply. You then have to either be able to run
    > the design through timing analysis and see how long that is, or the
    > ones you bought it from should tell you.


    As you correctly pointed out, the delay does not depend on the clock
    frequency, but it depends on the target technology and final routing. In
    order for the microcontroller to work correctly I need to 'wait' a
    certain number of clock cycles for each specific operation in order to be
    able to sample the result correctly.

    I already know that, at the target frequency, it will take more than one
    cycle to complete most of the operations, therefore my timing analysis
    will fail miserably. Not only that, without releasing some timing
    constraints on some specific path, the synthesis tool will take the
    worst path and extract the max frequency from that [1].

    Regarding the possibility of asking the developer(s), the team has fallen
    apart in the meantime and now we need to chase people around the globe
    to get some info (not easy).

    > Though more usual, the logic should have a signal indicating when
    > the result is valid.


    That is a valid point indeed. I implicitly assumed there is no such
    signal, but it is equally possible that we haven't 'seen it'. In order to
    find something you should look for it...

    >
    > You could run the FPU in the timing tools with a variety (random)
    > inputs and find out how long it takes. Then find the distribution
    > of delays, and find a reasonable maximum. It might be data dependent
    > and have a long tail. (A post-normalize shifter might depend on the
    > number of digits being shifted, and the rare long shifts would have
    > to be accounted for.)


    I'm not sure I'm following. If by timing tools you mean STA, then
    there is no such thing as a 'variety of inputs': the tool is static and
    only calculates delays associated with paths in a graph. What you
    suggest seems more like a post-layout simulation... did I get it wrong?

    Running the STA without constraining the synthesis tool might be
    suboptimal, since the synthesis tool will not have come up with an optimized
    implementation.

    >> That said, if we run the synthesis tool without timing constraints on
    >> those paths, we have a design that is much slower than can be.
    >> Multicycle constraints are out of question because they are hard to
    >> verify and maintain, so we decided to set false paths and perform
    >> post-layout sims to extract those values to be used in the RTL in a
    >> second iteration.

    []
    >
    > The FPUs that I know of should be pipelined. (Is there a clock input?)
    > You shouldn't have to do the pipelining, but you do need to know the
    > number of clock cycles (and clock rate) for each operation.


    The FPU is not pipelined, otherwise I would have known the number of
    clock cycles simply from the depth of the pipe. Am I wrong? Why it is not
    pipelined is a different topic.

    > If the design is encrypted, such that you can't look at it, they
    > need to give you enough information to be able to use it.


    The design is not encrypted but nobody really wanted to dig into those
    details so far. I agree with you that it is probably worth spending some
    effort to understand how it works in detail and come up with a
    solution that fixes the root of the problem rather than relying
    on ad-hoc solutions.

    alb, Mar 18, 2014
    #3
  4. alb

    alb Guest

    forgot to add a note that I refer to in my previous article...

    alb <> wrote:
    []
    > I already know that, at the target frequency, it will take more than one
    > cycle to complete most of the operations, therefore my timing analysis
    > will fail miserably. Not only that, without releasing some timing
    > constraints on some specific path, the synthesis tool will take the
    > worst path and extract the max frequency from that [1].

    []

    [1] I actually do not know the way the synthesis tool works, but it
    seems my simple model pretty much matches what is happening
    alb, Mar 18, 2014
    #4
  5. HT-Lab

    HT-Lab Guest

    On 17/03/2014 15:44, alb wrote:
    > Dear all,
    >
    > I have a microcontroller with an FPU which is delivered as an IP (I mean
    > the FPU). In order to run at a decent frequency, some of the operations
    > are allowed to complete in within a certain amount of cycles, but the
    > main problem is that we do not know how many.
    >
    > That said, if we run the synthesis tool without timing constraints on
    > those paths, we have a design that is much slower than can be.
    > Multicycle constraints are out of question because they are hard to
    > verify and maintain, so we decided to set false paths and perform
    > post-layout sims to extract those values to be used in the RTL in a
    > second iteration.
    >
    > There are several reasons why I do not particularly like this approach:
    >
    > 1. it relies on post-layout sims which are resource consuming
    > 2. if we change technology we will likely need to do the whole process
    > again
    > 3. we are obliged to perform incremental place&route since an optimized
    > implementation (maybe done automatically) may have an impact on our
    > delays.
    >
    > So far we have not come out with an alternative solution that is not
    > going to imply redesign (like pipelining, c-slowing, retiming, ...).
    >
    > Any ideas/suggestions?


    I would suggest you speak to your boss to see if you can spend some
    money on getting a Fishtail Focus license. This tool will automatically
    extract multicycle and false paths from your design. The output is a bunch
    of SDC constraints and assertions (PSL/SVA) for verification.

    http://www.fishtail-da.com/

    Regards,
    Hans.
    www.ht-lab.com


    >
    > Al
    >
    HT-Lab, Mar 19, 2014
    #5
  6. alb

    alb Guest

    Hi Glen, sorry for the delayed reply...been quite busy lately.
    In comp.lang.vhdl glen herrmannsfeldt <> wrote:
    []
    >> I have a microcontroller with an FPU which is delivered as an IP (I mean
    >> the FPU). In order to run at a decent frequency, some of the operations
    >> are allowed to complete in within a certain amount of cycles, but the
    >> main problem is that we do not know how many.

    []
    > Though more usual, the logic should have a signal indicating when
    > the result is valid.


    I have dug a little in the code and found a signal called /ready/ and
    thought I had solved my issues, but then wait a minute, how can you
    implement a ready signal that takes a combinatorial path into account?
    And even so, I need to inform my synthesis tool that those paths are
    either multicycle paths or false paths, otherwise it will try to make them
    fit in a single clock cycle.

    []
    > The FPUs that I know of should be pipelined. (Is there a clock input?)
    > You shouldn't have to do the pipelining, but you do need to know the
    > number of clock cycles (and clock rate) for each operation.


    For a pipelined FPU the signal /ready/ makes much more sense (at least
    that's the only sense I see), but since that is not the case here I'll have to
    find a different way to verify the design.

    > If the design is encrypted, such that you can't look at it, they
    > need to give you enough information to be able to use it.


    I might have found a different approach. Since the FPU is part of an
    embedded microprocessor, I can take advantage of the possibility of
    running a program on it and perform the verification with that.
    My testbench would not generate any particular signals, just the ones
    needed for the embedded processor to run, but the program loaded into it
    will perform the FPU operations and check that they are indeed correct. If
    not, I'll need to incrementally add a clock cycle of delay before fetching
    the result into the output register.
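The incremental search described here can be sketched as a small driver loop. This is only an illustration under stated assumptions: `run_selftest` is a hypothetical callback standing in for one back-annotated simulation run of the self-checking program with a given wait count, and `fake_selftest` is a made-up stand-in so the sketch runs on its own.

```python
def find_min_wait_cycles(run_selftest, max_cycles=16):
    """Increase the wait count until the self-checking program passes."""
    for cycles in range(1, max_cycles + 1):
        if run_selftest(cycles):
            return cycles
    raise RuntimeError(f"self-test still failing at {max_cycles} wait cycles")

# Stand-in for a simulation run: purely for illustration, the FPU result
# is assumed to be stable once the core waits at least 3 cycles.
def fake_selftest(cycles):
    return cycles >= 3

print(find_min_wait_cycles(fake_selftest))
```

A passing run only shows that the operands exercised by the program were sampled correctly, so the count found this way is as good as the program's operand coverage.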

    This approach might be very time consuming, but I see two main advantages:

    1. it's totally agnostic w.r.t. the implementation. I do not need to
    know the details and I can run it for any technology, without the need
    to update my multicycle paths (I still need to keep the false path in
    place though).

    2. it's the simulator that works, not me. Considering how much I'm paid
    per hour, I think it is much less expensive if a stupid machine does the
    job instead of me.

    I have not yet run a full-fledged program within modelsim, but I managed
    to run a simple 'hello world' program in no time.

    Al
    alb, Mar 24, 2014
    #6
  7. rickman

    rickman Guest

    On 3/17/2014 11:44 AM, alb wrote:
    > Dear all,
    >
    > I have a microcontroller with an FPU which is delivered as an IP (I mean
    > the FPU). In order to run at a decent frequency, some of the operations
    > are allowed to complete in within a certain amount of cycles, but the
    > main problem is that we do not know how many.
    >
    > That said, if we run the synthesis tool without timing constraints on
    > those paths, we have a design that is much slower than can be.
    > Multicycle constraints are out of question because they are hard to
    > verify and maintain, so we decided to set false paths and perform
    > post-layout sims to extract those values to be used in the RTL in a
    > second iteration.
    >
    > There are several reasons why I do not particularly like this approach:
    >
    > 1. it relies on post-layout sims which are resource consuming
    > 2. if we change technology we will likely need to do the whole process
    > again
    > 3. we are obliged to perform incremental place&route since an optimized
    > implementation (maybe done automatically) may have an impact on our
    > delays.
    >
    > So far we have not come out with an alternative solution that is not
    > going to imply redesign (like pipelining, c-slowing, retiming, ...).
    >
    > Any ideas/suggestions?
    >
    > Al


    If I understand you correctly, you have a piece of combinatorial logic
    and you need to know how fast it will run in your design. This will
    then let your surrounding circuitry wait some number of clock cycles to
    read the result, giving you a wait that is longer than the delay through
    the logic.

    I think your starting premise that multi-cycle constraints are "out of
    the question" is where you have erred. Multi-cycle constraints are
    exactly what are required and if you don't understand how to use them
    you are not likely to get a good result.

    Post P&R simulation is not a good way to validate timing because it is
    so hard to cover every path through the logic. Static timing analysis
    is the right way to do this and you need to learn to use it properly.

    --

    Rick
    rickman, Apr 2, 2014
    #7
  8. alb

    alb Guest

    Hi Rick,

    rickman <> wrote:
    []
    >> I have a microcontroller with an FPU which is delivered as an IP (I mean
    >> the FPU). In order to run at a decent frequency, some of the operations
    >> are allowed to complete in within a certain amount of cycles, but the
    >> main problem is that we do not know how many.
    >>
    >> That said, if we run the synthesis tool without timing constraints on
    >> those paths, we have a design that is much slower than can be.
    >> Multicycle constraints are out of question because they are hard to
    >> verify and maintain, so we decided to set false paths and perform
    >> post-layout sims to extract those values to be used in the RTL in a
    >> second iteration.

    []
    >
    > If I understand you correctly, you have a piece of combinatorial logic
    > and you need to know how fast it will run in your design. This will
    > then let your surrounding circuitry wait some number of clock cycles to
    > read the result, that give you a longer delay than the delay though the
    > logic.


    Precisely.

    > I think your starting premise that multi-cycle constraints are "out of
    > the question" is where you have erred. Multi-cycle constraints are
    > exactly what are required and if you don't understand how to use them
    > you are not likely to get a good result.


    There are two aspects here to consider:

    1. multicycle constraints need not only a /from/ and /to/ parameter, they also
    need a /through/ parameter. When you have a logic depth of 111 gates you start to
    understand why a multicycle constraint cannot be a sustainable solution.

    2. My experience in setting up multicycle constraints is nearly zero and starting
    off with such an approach on this type of project would be begging for trouble.

    > Post P&R simulation is not a good way to validate timing because it is
    > so hard to cover every path through the logic. Static timing analysis
    > is the right way to do this and you need to learn to use it properly.


    I've read several times on this group about skepticism towards static timing analysis
    when multicycle constraints are in place. I have to search back in the archives to
    really understand the technical motivations, but the bottom line is:

    a. they are difficult to maintain; if the logic path has been optimized the
    constraint does not work anymore
    b. they are difficult to verify; if the path *is not* multicycle you may wrongly
    relax the timing too much and never realize until another optimization takes place
    and your circuit does not work any more.

    If anyone sees a flaw in my points above I'd be glad to be corrected.

    Al

    alb, Apr 3, 2014
    #8
  9. HT-Lab

    HT-Lab Guest

    On 03/04/2014 08:05, alb wrote:

    Hi Al,

    ...
    > I've read several times on this group the skepticism behind static timing analysis
    > when multicycle constraints are in place. I have to search back in the archives to
    > really understand the technical motivations, but the bottom line is:
    >
    > a. is difficult to maintain them; if the logic path has been optimized the
    > constraint does not work anymore


    I think you are confusing propagation (or false path) delay with
    multicycle path delay. A multicycle delay is a synchronous "number of
    clock cycle" based delay, it does not depend on the clock frequency. You
    use this delay if you know your circuit takes n clock cycles to
    propagate the result to the destination register.

    > b. is difficult to verify them;


    You can easily verify them using assertions, see end of the pdf below:

    https://www.synopsys.com/Community/..._pres/2005april/17_SystemVerilog_FishTail.pdf

    As I mentioned in another thread, learn PSL, it is a real eye opener for
    verification.

    > if the path *is not* multicycle you may wrongly
    > relax the timing too much and never realize until another optimization takes place
    > and your circuit does not work any more.


    Not exactly, you will simply not get timing closure and you will
    probably end up using more resources than necessary.

    Regards,
    Hans.
    www.ht-lab.com


    >
    > If anyone sees a flaw in my points above I'd be glad to be corrected.
    >
    > Al
    >
    HT-Lab, Apr 3, 2014
    #9
  10. alb

    alb Guest

    Hi Hans,

    HT-Lab <> wrote:
    [removed text and added reference in square brackets]
    >> a. is difficult to maintain them [multicycle constraints]; if the
    >> logic path has been optimized the constraint does not work anymore

    >
    > I think you are confusing propagation (or false path) delay with
    > multicycle path delay. A multicycle delay is a synchronous "number of
    > clock cycle" based delay, it does not depend on the clock frequency.


    IMHO a multicycle path delay is a propagation delay specified as
    relative to the clock period. Hence it *does* depend on the clock
    frequency, while the propagation through your gates does not (it depends
    on the technology).

    If your path takes 12.3 ns you would have to set a multicycle constraint
    of 2 with a 100MHz clock, but 3 with a 200MHz one.
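The arithmetic behind these two numbers can be checked with a few lines of Python. This is a minimal sketch: the function name is made up for illustration, and the `setup_ns` parameter is an assumption (defaulting to zero) that the sentence above does not mention.

```python
import math

def multicycle_factor(path_delay_ns, clock_mhz, setup_ns=0.0):
    """Smallest N such that path_delay + setup < N * clock_period."""
    if path_delay_ns < 0 or clock_mhz <= 0:
        raise ValueError("delay must be non-negative and clock positive")
    period_ns = 1000.0 / clock_mhz
    # floor + 1 keeps the inequality strict when the delay is an exact
    # multiple of the period.
    return math.floor((path_delay_ns + setup_ns) / period_ns) + 1

for mhz in (100, 200):
    print(mhz, "MHz ->", multicycle_factor(12.3, mhz))
```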

    A false path is a different story. You want to inform your synthesis
    tool that a certain path is never going to be used so do not bother
    optimizing it.

    > You
    > use this delay if you know your circuit takes n clock cycles to
    > propagate the result to the destination register.


    If I know when I will be reading the result on the destination register
    I may relax the time it will take to propagate the result, but on the
    contrary if I want to know when is the earliest moment to go and get the
    result, the multicycle path is of no use.

    >> b. is difficult to verify them;

    >
    > You can easily verify them using assertions, see end of the pdf below:
    >
    > https://www.synopsys.com/Community/..._pres/2005april/17_SystemVerilog_FishTail.pdf
    >
    > As I mentioned in another thread, learn PSL, it is a real eye opener for
    > verification.


    It's in the pipe...a very long one unfortunately ;-) But thanks for the
    pointer.

    Can you verify if a certain path is not violating the setup time of your
    register? Can you verify what is the delay it takes to go from
    register A to register B through some logic?

    >
    > if the path *is not* multicycle you may wrongly
    >> relax the timing too much and never realize until another optimization takes place
    >> and your circuit does not work any more.

    >
    > Not exactly, you will simply not get timing closure and your will
    > probably end up using more resources then necessary.


    Assume a single-cycle path that you set to be multicycle because of a
    mistake in your analysis. The synthesis tool will not know whether your
    multicycle path is correct or wrong, therefore it will relax the timing
    between the selected end points and you will sample the result at the
    wrong time.

    The STA will correctly report that the path fulfills the
    constraint, but the logic will take the result too early. If you decided
    not to run your post-layout sim because you relied on your STA, then you
    are set to find nasty surprises on the bench.
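A toy model of this failure mode, as a sketch only: the same setup inequality is evaluated once with the cycle count written in the constraint (what STA sees) and once with the cycle count the RTL actually waits before sampling. All numbers and names are hypothetical, and setup time defaults to zero for simplicity.

```python
def meets_setup(path_delay_ns, period_ns, cycles, setup_ns=0.0):
    """True if the data arrives before the capture edge `cycles` periods later."""
    return path_delay_ns < cycles * period_ns - setup_ns

period_ns = 10.0   # hypothetical 100 MHz clock
delay_ns = 12.3    # delay the tool ended up with after timing was relaxed

# STA checks the path against the (mistaken) 2-cycle constraint: it passes.
print("STA view:", meets_setup(delay_ns, period_ns, cycles=2))
# The RTL, however, samples the register every cycle: the data is not ready.
print("hardware view:", meets_setup(delay_ns, period_ns, cycles=1))
```

The mismatch is between the constraint and the RTL behaviour, which is why STA alone cannot reveal it.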

    Al
    alb, Apr 3, 2014
    #10
  11. HT-Lab

    HT-Lab Guest

    Hi Al,

    On 03/04/2014 11:48, alb wrote:
    > Hi Hans,
    >
    > HT-Lab <> wrote:
    > [removed text and added reference in square brackets]
    >>> a. is difficult to maintain them [multicycle constraints]; if the
    >>> logic path has been optimized the constraint does not work anymore

    >>
    >> I think you are confusing propagation (or false path) delay with
    >> multicycle path delay. A multicycle delay is a synchronous "number of
    >> clock cycle" based delay, it does not depend on the clock frequency.

    >
    > IMHO a multicycle path delay is a propagation delay specified as
    > relative to the clock period. Hence it *does* depend on the clock
    > frequency, while the propagation through your gates does not (it depends
    > on the technology).


    You still have your terminology wrong, here is an SDC example of a
    typical MCP constraint:

    set_multicycle_path 2 -from reg_alu* -to reg_mult*

    Notice there is no time, just a natural number of clock cycles.

    >
    > If your path takes 12.3 ns you would have to set a multicycle constraint
    > of 2 with a 100MHz clock, but 3 with a 200MHz one.


    You are mixing your constraints. If your combinational path takes 12.3
    ns you set a clock constraint of 81MHz. If you have an MCP in your design
    you are most likely controlling the output register with an enable pin.
    You do not use an MCP to constrain a propagation delay.

    ...
    >
    > Can you verify if a certain path is not violating the setup time of your
    > register? Can you verify what is the delay it takes to go from
    > register A to register B through some logic?


    Not with assertions,

    Regards,
    Hans.
    www.ht-lab.com
    HT-Lab, Apr 3, 2014
    #11
  12. KJ

    KJ Guest

    On Thursday, April 3, 2014 8:33:39 AM UTC-4, HT-Lab wrote:
    > > IMHO a multicycle path delay is a propagation delay specified as
    > > relative to the clock period. Hence it *does* depend on the clock
    > > frequency, while the propagation through your gates does not (it depends
    > > on the technology).

    >
    > You still have your terminology wrong, here is a SDC example of an
    > typical MCP constraint:
    >
    > set_multicycle_path 2 -from reg_alu* -to reg_mult*
    >
    > Notice there is no time, just a natural number of clock cycles.


    The value of '2', though, is computed based on the clock period. Alb already pointed that out earlier in the thread: "If your path takes 12.3 ns you would have to set a multicycle constraint of 2 with a 100MHz clock, but 3 with a 200MHz one."

    Kevin Jennings
    KJ, Apr 3, 2014
    #12
  13. alb

    alb Guest

    Hi Hans,

    HT-Lab <> wrote:
    []
    >> IMHO a multicycle path delay is a propagation delay specified as
    >> relative to the clock period. Hence it *does* depend on the clock
    >> frequency, while the propagation through your gates does not (it depends
    >> on the technology).

    >
    > You still have your terminology wrong, here is a SDC example of an
    > typical MCP constraint:
    >
    > set_multicycle_path 2 -from reg_alu* -to reg_mult*
    >


    I apologize but I did not understand from this example what is wrong in
    my terminology.

    > Notice there is no time, just a natural number of clock cycles.
    >


    reading out loud your MCP constraint:

    'the propagation delay from reg_alu* to reg_mult* has to be smaller than
    2 clock cycles (minus setup time)'

    The notion of time is automatically inferred by your tool since it knows
    the clock period for those particular registers. If the two
    registers are in two different clock domains I doubt you can really set
    a multicycle path constraint (at least it does not make sense to me).

    >> If your path takes 12.3 ns you would have to set a multicycle constraint
    >> of 2 with a 100MHz clock, but 3 with a 200MHz one.

    >
    > You are mixing your constraints. If your combinational path takes 12.3
    > ns you set a clock constraint of 81MHz. If you have a MCP in your design
    > you are most likely controlling the output register with an enable pin.


    I have to find out how much time I need to wait before sampling the
    logic with my output enable. There are several (in the 100s) paths
    between input and output (it's an fpu), therefore I can die under a pile
    of multicycle path constraints.

    > You do not use a MCP to constraint a propagation delay.


    IMHO yes you do. You are telling the synthesis tool that a particular
    path (or branch of a graph) can have a propagation delay:

    Tp < N * clock_period - Tsetup

    rather than the usual:

    Tp < clock_period - Tsetup

    Why would you think the MCP does not constrain the propagation delay?

    Al
    alb, Apr 3, 2014
    #13
  14. HT-Lab

    HT-Lab Guest

    On 03/04/2014 14:30, KJ wrote:
    > On Thursday, April 3, 2014 8:33:39 AM UTC-4, HT-Lab wrote:
    >>> IMHO a multicycle path delay is a propagation delay specified as
    >>> relative to the clock period. Hence it *does* depend on the clock
    >>> frequency, while the propagation through your gates does not (it depends
    >>> on the technology).

    >>
    >> You still have your terminology wrong, here is a SDC example of an
    >> typical MCP constraint:
    >>
    >> set_multicycle_path 2 -from reg_alu* -to reg_mult*
    >>
    >> Notice there is no time, just a natural number of clock cycles.

    >
    > The value of '2' though is computed based on the clock period. Alb already pointed that out earlier in the thread "If your path takes 12.3 ns you would have to set a multicycle constraint of 2 with a 100MHz clock, but 3 with a 200MHz one."
    >


    We are talking about different issues here. My argument is that you
    should not exchange a clock constraint for an MCP one.

    Regards,
    Hans.
    www.ht-lab.com

    > Kevin Jennings
    >
    HT-Lab, Apr 3, 2014
    #14
  15. HT-Lab

    HT-Lab Guest

    On 03/04/2014 14:48, alb wrote:

    Hi Al,

    > Hi Hans,
    >
    > HT-Lab <> wrote:
    > []
    >>> IMHO a multicycle path delay is a propagation delay specified as
    >>> relative to the clock period. Hence it *does* depend on the clock
    >>> frequency, while the propagation through your gates does not (it depends
    >>> on the technology).

    >>
    >> You still have your terminology wrong, here is a SDC example of an
    >> typical MCP constraint:
    >>
    >> set_multicycle_path 2 -from reg_alu* -to reg_mult*
    >>

    >
    > I apologize but I did not understand from this example what is wrong in
    > my terminology.
    >
    >> Notice there is no time, just a natural number of clock cycles.
    >>

    >
    > reading out loud your MCP constraint:
    >
    > 'the propagation delay from reg_alu* to reg_mult* has to be smaller than
    > 2 clock cycles (minus setup time)'
    >
    > Notion of time is automatically inferred by your tool since it knows
    > what is the clock period for those particular registers. If the two
    > registers are in two different clock domains I doubt you can really set
    > a multicycle path constraint (at least it does not make sense to me).
    >
    >>> If your path takes 12.3 ns you would have to set a multicycle constraint
    >>> of 2 with a 100MHz clock, but 3 with a 200MHz one.

    >>
    >> You are mixing your constraints. If your combinational path takes 12.3
    >> ns you set a clock constraint of 81MHz. If you have a MCP in your design
    >> you are most likely controlling the output register with an enable pin.

    >
    > I have to find out how much time I need to wait before sampling the
    > logic with my output enable. There are several (in the 100s) paths
    > between input and output (it's an fpu), therefore I can die under a pile
    > of multicycle path constraints.
    >
    >> You do not use an MCP to constrain a propagation delay.


    Poor choice of words on my part, I should have said you don't use an MCP
    constraint as a clock constraint.

    Regards,
    Hans.
    www.ht-lab.com

    >
    > IMHO yes you do. You are telling the synthesis tool that a particular
    > path (or branch of a graph) can have a propagation delay:
    >
    > Tp < N * clock_period - Tsetup
    >
    > rather than the usual:
    >
    > Tp < clock_period - Tsetup
    >
    > Why would you think the MCP does not constrain the propagation delay?
    >
    > Al
    >
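
    The inequality above, Tp < N * clock_period - Tsetup, can be turned into a small calculation of the smallest usable N. This is only a sketch of the arithmetic: the 12.3 ns delay and the two clock frequencies come from the thread, while the 0.5 ns setup time and the function name `min_multicycle` are assumptions made for illustration.

```python
def min_multicycle(tp_ps: int, clock_period_ps: int, tsetup_ps: int = 0) -> int:
    """Smallest N such that tp < N * clock_period - tsetup (all in picoseconds)."""
    if clock_period_ps <= 0:
        raise ValueError("clock period must be positive")
    # tp < N*T - tsetup  <=>  N > (tp + tsetup) / T, so take the floor and add one.
    return (tp_ps + tsetup_ps) // clock_period_ps + 1


TP_PS = 12_300     # 12.3 ns path delay, the figure quoted in the thread
TSETUP_PS = 500    # assumed setup time, purely illustrative

n_100mhz = min_multicycle(TP_PS, 10_000, TSETUP_PS)  # 100 MHz -> 10 ns period
n_200mhz = min_multicycle(TP_PS, 5_000, TSETUP_PS)   # 200 MHz -> 5 ns period
print(n_100mhz, n_200mhz)  # prints "2 3"
```

    Integer picoseconds are used so the boundary case (delay plus setup exactly equal to N periods) is not blurred by floating-point rounding.
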
    HT-Lab, Apr 3, 2014
    #15
  16. alb

    KJ Guest

    On Thursday, April 3, 2014 9:48:52 AM UTC-4, alb wrote:
    > I have to find out how much time I need to wait before sampling the
    > logic with my output enable. There are several (in the 100s) paths
    > between input and output (it's an fpu), therefore I can die under a pile
    > of multicycle path constraints.


    You should be able to wild card the path sources inside your block and specify exactly the output enable signal. There should be no need to specify each path source explicitly.

    Kevin Jennings
    KJ, Apr 3, 2014
    #16
  17. alb

    rickman Guest

    On 4/3/2014 3:05 AM, alb wrote:
    > Hi Rick,
    >
    > rickman <> wrote:
    > []
    >>> I have a microcontroller with an FPU which is delivered as an IP (I mean
    >>> the FPU). In order to run at a decent frequency, some of the operations
    >>> are allowed to complete in within a certain amount of cycles, but the
    >>> main problem is that we do not know how many.
    >>>
    >>> That said, if we run the synthesis tool without timing constraints on
    >>> those paths, we have a design that is much slower than can be.
    >>> Multicycle constraints are out of question because they are hard to
    >>> verify and maintain, so we decided to set false paths and perform
    >>> post-layout sims to extract those values to be used in the RTL in a
    >>> second iteration.

    > []
    >>
    >> If I understand you correctly, you have a piece of combinatorial logic
    >> and you need to know how fast it will run in your design. This will
    >> then let your surrounding circuitry wait some number of clock cycles to
    >> read the result, that gives you a longer delay than the delay through the
    >> logic.

    >
    > Precisely.
    >
    >> I think your starting premise that multi-cycle constraints are "out of
    >> the question" is where you have erred. Multi-cycle constraints are
    >> exactly what are required and if you don't understand how to use them
    >> you are not likely to get a good result.

    >
    > There are two aspects here to consider:
    >
    > 1. multicycle constraints need not only a /from/ and /to/ parameter, they also
    > need a /through/ parameter. When you have a logic depth of 111 gates you start to
    > understand why a multicycle constraint cannot be a sustainable solution.


    I can't say I follow that. I have only ever specified a from and to
    parameter for a timing constraint. I have never needed to indicate a
    "through" parameter. If you have special sections of the logic that
    need a shorter timing constraint than others, I would expect that to be
    a subset of the from and to, not a special "through" path. Is there
    something unique about your design that a simple from and to spec
    doesn't capture the nuance?


    > 2. My experience in setting up multicycle constraints is nearly zero and starting
    > off with such an approach on this type of project would be begging for troubles.

    How much experience do you have with any of the other approaches you are
    trying? I mean, you are here asking for advice. So clearly there are
    things about each of these approaches you are not familiar with.


    >> Post P&R simulation is not a good way to validate timing because it is
    >> so hard to cover every path through the logic. Static timing analysis
    >> is the right way to do this and you need to learn to use it properly.

    >
    > I've read several times on this group the skepticism behind static timing analysis
    > when multicycle constraints are in place. I have to search back in the archives to
    > really understand the technical motivations, but the bottom line is:
    >
    > a. they are difficult to maintain; if the logic path has been optimized the
    > constraint does not work anymore


    I don't follow that either. It is seldom that any from/to path would be
    optimized away. If it is, it is likely due to an error in your code
    which you will need to fix anyway.


    > b. they are difficult to verify; if the path *is not* multicycle you may wrongly
    > relax the timing too much and never realize until another optimization takes place
    > and your circuit does not work any more.


    ALL timing constraints are difficult to verify... no, make that
    impossible. That has always been one of my complaints about static
    timing analysis, there is no way to verify the constraints other than
    the coverage number which is just a pass/fail sort of thing.


    > If anyone sees a flaw in my points above I'd be glad to be corrected.


    Perhaps I am missing something. ???

    --

    Rick
    rickman, Apr 3, 2014
    #17
  18. alb

    rickman Guest

    On 4/3/2014 10:22 AM, HT-Lab wrote:
    > On 03/04/2014 14:30, KJ wrote:
    >> On Thursday, April 3, 2014 8:33:39 AM UTC-4, HT-Lab wrote:
    >>>> IMHO a multicycle path delay is a propagation delay specified as
    >>>> relative to the clock period. Hence it *does* depend on the clock
    >>>> frequency, while the propagation through your gates does not (it
    >>>> depends
    >>>> on the technology).
    >>>
    >>> You still have your terminology wrong, here is an SDC example of a
    >>> typical MCP constraint:
    >>>
    >>> set_multicycle_path 2 -from reg_alu* -to reg_mult*
    >>>
    >>> Notice there is no time, just a natural number of clock cycles.

    >>
    >> The value of '2' though is computed based on the clock period. Alb
    >> already pointed that out earlier in the thread "If your path takes
    >> 12.3 ns you would have to set a multicycle constraint of 2 with a
    >> 100MHz clock, but 3 with a 200MHz one."
    >>

    >
    > We are talking about different issues here. My argument is that you
    > should not exchange a clock constraint for an MCP one,


    I think you are misreading what is intended. It is assumed there is
    already a clock timing constraint of 100 MHz. That is for the general
    logic in this clock domain. But for a certain section of logic the
    output of the logic is not used for some number of clock cycles that
    will be determined by the delay through the logic which is expected to
    be longer than one clock cycle.

    The OP wants to set this number of clock cycles in the timing
    constraints of that special path to verify that the P&R output will work
    with the timing he has picked. If the timing fails he has the options
    of working to improve the timing in the P&R or changing the logic of the
    register enable to allow more clock cycles for this path.

    In no case would he want to change the timing constraint on the clock
    since that constraint is set by other aspects of his design goals.
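
    The workflow described here (clock period fixed, register-enable cycle count as the only knob) can be sketched as a slack check. The 100 MHz clock is from the thread; the 23.1 ns post-P&R delay, the 0.5 ns setup time and the function names are hypothetical, and negative slack is treated as a violation, which is the usual STA convention.

```python
def setup_slack_ps(tp_ps: int, clock_period_ps: int, enable_cycles: int,
                   tsetup_ps: int = 0) -> int:
    """Setup slack of a path that is sampled enable_cycles clocks after launch."""
    return enable_cycles * clock_period_ps - tsetup_ps - tp_ps


def enable_cycles_needed(tp_ps: int, clock_period_ps: int, enable_cycles: int,
                         tsetup_ps: int = 0) -> int:
    """Keep the chosen cycle count if it meets timing, otherwise raise it.

    The clock period is never touched: it is fixed by the rest of the design.
    """
    n = enable_cycles
    while setup_slack_ps(tp_ps, clock_period_ps, n, tsetup_ps) < 0:
        n += 1  # corresponds to changing the register-enable logic in the RTL
    return n


CLOCK_PERIOD_PS = 10_000   # 100 MHz clock constraint, as assumed in the thread
REPORTED_TP_PS = 23_100    # hypothetical delay reported after place and route
TSETUP_PS = 500            # hypothetical setup time

print(setup_slack_ps(REPORTED_TP_PS, CLOCK_PERIOD_PS, 2, TSETUP_PS))        # prints -3600
print(enable_cycles_needed(REPORTED_TP_PS, CLOCK_PERIOD_PS, 2, TSETUP_PS))  # prints 3
```

    The other option mentioned above, improving the P&R result, would show up here as a smaller reported delay with the cycle count left unchanged.
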

    Do I misunderstand what you are trying to say?

    --

    Rick
    rickman, Apr 3, 2014
    #18
  19. alb

    alb Guest

    Hi Kevin,

    KJ <> wrote:
    >> I have to find out how much time I need to wait before sampling the
    >> logic with my output enable. There are several (in the 100s) paths
    >> between input and output (it's an fpu), therefore I can die under a pile
    >> of multicycle path constraints.

    >
    > You should be able to wild card the path sources inside your block and
    > specify exactly the output enable signal. There should be no need to
    > specify each path source explicitly.


    Imagine an fpu, with two input registers for the operands, one for the
    operator and an output register for the result. The result register is
    the one that will receive the output enable.

    Depending on the operator I will have a different path. If I wildcard
    the path sources then I'm overly constraining and a 'nop' operation will
    take as much as a division operation, which is not what we want.

    Since most of the combinatorial functions are reused several times in
    each operation, the datapath starts to be painfully complicated. That is
    the main reason why I discarded the option to set up multicycle
    constraints.
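
    To put numbers on the wildcard concern, here is a rough sketch comparing one cycle count per operator with a single count covering every source. All delays, the operator names, the 10 ns clock and the setup time are made-up values for illustration, not figures from the actual FPU.

```python
# Hypothetical post-layout delays per FPU operator, in picoseconds.
OP_DELAY_PS = {"nop": 3_000, "add": 14_000, "mul": 27_000, "div": 46_000}
CLOCK_PERIOD_PS = 10_000   # assumed 100 MHz clock
TSETUP_PS = 500            # assumed setup time


def cycles_for(delay_ps: int) -> int:
    """Smallest N with delay < N * period - setup."""
    return (delay_ps + TSETUP_PS) // CLOCK_PERIOD_PS + 1


# One wait count per operator, as a per-operation enable would use.
per_op = {op: cycles_for(d) for op, d in OP_DELAY_PS.items()}

# A single wildcarded value has to cover the slowest operator.
wildcard = max(per_op.values())

# Cycles each operator would wait for no reason under the wildcard value.
wasted = {op: wildcard - n for op, n in per_op.items()}

print(per_op)    # {'nop': 1, 'add': 2, 'mul': 3, 'div': 5}
print(wildcard)  # 5
print(wasted)    # {'nop': 4, 'add': 3, 'mul': 2, 'div': 0}
```
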

    The alternative, though, is not very palatable either. We decided to set
    false paths between the above mentioned registers and let post-par sim
    figure out whether we are in or out with our output enable. The problem
    is that post-par simulation may not cover the whole set of timing
    scenarios the logic will encounter.

    For instance I do not know if a backannotated simulation includes clock
    skew, while AFAIK it should be taken into account in STA. The described
    approach tries to verify timing, but I'm not sure this is really going
    to be risk free.

    Certainly I can add some jitter to my clock within the simulation itself
    to make it more /realistic/ , but I will certainly not cover all the
    cases.

    Considering the target FPGA is an RTAX2000 (~20'000$), we are kind of
    uncomfortable to proceed without a fully consistent picture.

    Al
    alb, Apr 4, 2014
    #19
  20. alb

    alb Guest

    Hi Rick,

    rickman <> wrote:
    []
    > > 1. multicycle constraints need not only a /from/ and /to/ parameter, they also
    > > need a /through/ parameter. When you have a logic depth of 111 gates you start to
    > > understand why a multicycle constraint cannot be a sustainable solution.

    >
    > I can't say I follow that. I have only ever specified a from and to
    > parameter for a timing constraint. I have never needed to indicate a
    > "through" parameter. If you have special sections of the logic that
    > need a shorter timing constraint than others, I would expect that to be
    > a subset of the from and to, not a special "through" path. Is there
    > something unique about your design that a simple from and to spec
    > doesn't capture the nuance?


    Imagine your path between two registers (A and B) is set by another
    register C. The resulting operation is to be stored in register D. If
    you do not set a /through/ clause you will constrain each path with the
    maximum delay, which is not desirable.
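
    A small sketch of why the /through/ clause matters in this picture: grouping paths only by start and end register lumps the fast and slow functional units together, while adding the intermediate block separates them. The register names A, B and D follow the description above; the block names `adder` and `divider` and all delays are hypothetical.

```python
from collections import defaultdict

# (from_reg, through_block, to_reg, delay_ps); blocks and delays are hypothetical.
PATHS = [
    ("A", "adder", "D", 14_000),
    ("B", "adder", "D", 13_500),
    ("A", "divider", "D", 46_000),
    ("B", "divider", "D", 45_200),
]


def worst_delay(paths, key):
    """Worst-case delay of each group of paths, grouped by key(path)."""
    worst = defaultdict(int)
    for path in paths:
        k = key(path)
        worst[k] = max(worst[k], path[3])
    return dict(worst)


# from/to only: every A -> D path is budgeted like the divider path.
from_to = worst_delay(PATHS, key=lambda p: (p[0], p[2]))

# from/through/to: the adder and divider paths get separate budgets.
from_through_to = worst_delay(PATHS, key=lambda p: (p[0], p[1], p[2]))

print(from_to)
print(from_through_to)
```

    With these numbers the from/to grouping reports 46 ns for A to D, although the adder path alone needs only 14 ns.
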

    >
    > > 2. My experience in setting up multicycle constraints is nearly zero and starting
    > > off with such an approach on this type of project would be begging for troubles.
    >
    > How much experience do you have with any of the other approaches you are
    > trying? I mean, you are here asking for advice. So clearly there are
    > things about each of these approaches you are not familiar with.


    I've often done post-par sims, but it was combined with an STA,
    therefore I've always been sure the design was correct as long as STA
    did not report anything fishy *and* post-par sim succeeded.

    Recently I started to look at post-par sims as an additional step which
    is not necessarily required for synchronous logic as long as your input
    constraints are well defined.

    In this case we cannot use STA to do timing analysis and I'm
    uncomfortable.

    >
    > > a. they are difficult to maintain; if the logic path has been optimized the
    > > constraint does not work anymore

    >
    > I don't follow that either. It is seldom that any from/to path would be
    > optimized away. If it is, it is likely due to an error in your code
    > which you will need to fix anyway.


    I certainly was talking about the /through/ clause I mentioned earlier.
    The synthesis tool might optimize away (or maybe rename) certain nets
    and your constraint will not be applicable anymore.

    > > b. they are difficult to verify; if the path *is not* multicycle you may wrongly
    > > relax the timing too much and never realize until another optimization takes place
    > > and your circuit does not work any more.

    >
    > ALL timing constraints are difficult to verify... no, make that
    > impossible. That has always been one of my complaints about static
    > timing analysis, there is no way to verify the constraints other than
    > the coverage number which is just a pass/fail sort of thing.


    That is why you'd be better off if you didn't have them! :)
    alb, Apr 4, 2014
    #20