Storing many 32-bit "parameters"?

Discussion in 'VHDL' started by Frédéric Lochon, Nov 29, 2008.

  1. Hello,

    I have a VHDL project which is implemented in a Virtex5. This Virtex5 is
    connected to a LAN chipset.
    I use this system to communicate with a PC, and more precisely this
    system gets "many" 32-bit parameters.

    And here is the problem: such a high number of 32-bit parameters (more
    than 100) increases the synthesis and implementation time quite
    significantly because of the number of registers inferred.

    So, I'm looking for a way of optimizing such a "high" number of
    registers. Because of how the parameters are acquired in the FPGA
    (through the "slow" LAN compared to an FPGA), I can accept that the
    storage of these parameters (which are used in several different places
    in the design) is not "fast", i.e. if there is a high propagation time
    through the whole design, it's ok.

    In other words, is there a way of storing this many parameters that
    would not degrade the synthesis and implementation times, given that I
    have no performance requirements on the storage?

    I can very easily tolerate up to 1 ms (which is *huge* compared to the
    50 MHz clock) between when the parameters get into the FPGA and when I
    can use them in different VHDL modules (possibly at the same time), but,
    once they are usable, several parameters may be accessed at the same time.


    My first thought was about using some Coregen IP which would reduce (at
    least) synthesis time, for example distributed RAM. But I don't think it
    would be that efficient (I would use something like 4 times more bits
    than necessary).
    I also thought of using a RAM with different input and output widths,
    but the ratio is quite small.

    In the past, I tried using "partitions", but ISE (at least 9.1) doesn't
    handle partitions very well, and was even more buggy.


    If someone has any idea, I'm interested.

    Thanks in advance.

    --
    The equation of life is so complex
    that believing in free will is a good approximation.
    Frédéric Lochon, Nov 29, 2008
    #1

  2. Frédéric Lochon wrote:

    > And here is the problem: such a high number of 32-bit parameters (more
    > than 100) increases the synthesis and implementation time quite
    > significantly because of the number of registers inferred.


    This is the downside to debugging
    by trial and error synthesis.
    It may work fine in small cases,
    but it falls apart in large ones.
    A Virtex5 design might take an hour
    to re-image and test on the bench.
    If I am lucky enough to find the problem
    this way, it will take me another hour
    to test the solution even if the fix
    is one line of code.

    By writing my own synthesis and testbench code,
    I can code and sim a small change in a minute.
    I might run a synthesis once a week, but
    to check Fmax and utilization, not functionality.

    > So, I'm looking for a way of optimizing such a "high" number of
    > registers. Because of how the parameters are acquired in the FPGA
    > (through the "slow" LAN compared to an FPGA), I can accept that the
    > storage of these parameters (which are used in several different places
    > in the design) is not "fast", i.e. if there is a high propagation time
    > through the whole design, it's ok.


    The only thing that synthesizes faster than a register is a wire.
    Changing the design to a ram or shift registers will
    not make much difference to synthesis time.

    > In the past, I tried using "partitions", but ISE (at least 9.1) doesn't
    > handle partitions very well, and was even more buggy.


    Yes, partitions are rarely worth the pain.
    ISE might have synthesis options
    like quartus smart-compile to trade
    some disk space for synthesis time.

    Good luck on your design.

    -- Mike Treseler
    Mike Treseler, Nov 29, 2008
    #2

  3. On Nov 29, 4:16 pm, Frédéric Lochon <> wrote:
    > Hello,
    >
    > I have a VHDL project which is implemented in a Virtex5. This Virtex5 is
    > connected to a LAN chipset.
    > I use this system to communicate with a PC, and more precisely this
    > system gets "many" 32-bit parameters.
    >
    > And here is the problem: such a high number of 32-bit parameters (more
    > than 100) increases the synthesis and implementation time quite
    > significantly because of the number of registers inferred.
    >


    Perhaps I'm misunderstanding you, but you need to receive ~100 32-bit
    parameters, which is 400 bytes. What is wrong with instantiating a
    block RAM in the FPGA and filling it with the parameters from the LAN
    chip?
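
    For illustration, here is a minimal sketch of such a RAM, written so
    that a synthesis tool can infer it rather than needing a Coregen core.
    The entity name, port names and the 128 x 32 size are assumptions, not
    taken from the design discussed here. The read is registered because
    block RAMs have synchronous reads; whether a given tool maps a memory
    this small onto block RAM or distributed RAM depends on the tool and
    its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 128 x 32 parameter RAM: one write port, one read port.
entity param_ram is
  port (
    clk   : in  std_logic;
    we    : in  std_logic;                      -- write enable (LAN side)
    waddr : in  unsigned(6 downto 0);
    wdata : in  std_logic_vector(31 downto 0);
    raddr : in  unsigned(6 downto 0);
    rdata : out std_logic_vector(31 downto 0)   -- valid one clock after raddr
  );
end entity param_ram;

architecture rtl of param_ram is
  type ram_t is array (0 to 127) of std_logic_vector(31 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(waddr)) <= wdata;
      end if;
      -- Registered (synchronous) read: only one word per clock is visible.
      rdata <= ram(to_integer(raddr));
    end if;
  end process;
end architecture rtl;
```

    Note that only one word comes out per clock on the read port, so this
    alone does not give parallel access to several parameters.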

    Eli
    Eli Bendersky, Nov 30, 2008
    #3
  4. Eli Bendersky wrote:
    > On Nov 29, 4:16 pm, Frédéric Lochon <> wrote:
    >> Hello,
    >>
    >> I have a VHDL project which is implemented in a Virtex5. This Virtex5 is
    >> connected to a LAN chipset.
    >> I use this system to communicate with a PC, and more precisely this
    >> system gets "many" 32-bit parameters.
    >>
    >> And here is the problem: such a high number of 32-bit parameters (more
    >> than 100) increases the synthesis and implementation time quite
    >> significantly because of the number of registers inferred.
    >>

    >
    > Perhaps I'm mis-understanding you, but you need to receive ~100 32-bit
    > parameters, which is 400 bytes. What is wrong with instantiating a RAM
    > block of the FPGA and filling the parameters into it from the LAN
    > chip ?
    >


    There's nothing wrong with that. In fact, that is what I do as a first
    step.
    But, as different LAN packets concern different modules, I have to load
    the parameters from the packet RAM and store them in registers.
    This way I can use them at any clock edge without losing time accessing
    them (which would be a real problem), and I can access all parameters at
    the same time.
    If I had used another RAM to store the module-specific parameters, there
    would have been access time considerations, and it would have been
    difficult (if not impossible) to access several parameters at the same time.

    I think that a good trick would have been to have some kind of N-port
    (read-only ports) RAM (most probably distributed-RAM) which, afaik,
    doesn't exist for Virtex5.
    I believe that such an "IP" would most certainly reduce synthesis time
    and probably implementation time.

    --
    The equation of life is so complex
    that believing in free will is a good approximation.
    Frédéric Lochon, Nov 30, 2008
    #4
  5. "Frédéric Lochon" <> wrote in message
    news:4932cc6c$0$32057$...
    > Eli Bendersky wrote:
    >> On Nov 29, 4:16 pm, Frédéric Lochon <> wrote:
    >>> Hello,
    >>>
    >>> I have a VHDL project which is implemented in a Virtex5. This Virtex5 is
    >>> connected to a LAN chipset.
    >>> I use this system to communicate with a PC, and more precisely this
    >>> system gets "many" 32-bit parameters.
    >>>
    >>> And here is the problem: such a high number of 32-bit parameters (more
    >>> than 100) increases the synthesis and implementation time quite
    >>> significantly because of the number of registers inferred.
    >>>

    >>
    >> Perhaps I'm mis-understanding you, but you need to receive ~100 32-bit
    >> parameters, which is 400 bytes. What is wrong with instantiating a RAM
    >> block of the FPGA and filling the parameters into it from the LAN
    >> chip ?
    >>

    >
    > There's nothing wrong with that. In fact, I'm doing it this way in a first
    > step.
    > But, as different LAN packets concern different modules, I have to load
    > them from the packet RAM and store them in registers.
    > This way I can use them at any clock edge without losing time accessing
    > them (which would be a real problem) and I can access all parameters at
    > the same time.
    > If I had used another RAM to store module-specific parameters, there would
    > have been access time considerations and it would have been difficult (if
    > not impossible) to access several parameters at the same time.
    >


    Since the parameters eventually need to be stored in registers in order
    to be used, you don't gain any inherent functionality by storing them in
    a RAM. What the RAM does buy you is decoupling of the parameter load
    from the usage (i.e. new parameters can be loaded while the old ones are
    still in use). You haven't stated whether that is a requirement for your
    application. If it's not, then adding the RAM only makes your life more
    difficult.

    If it is a requirement, and the RAM is therefore needed to hold the
    parameters until they can be used later, then one useful trick is to
    arrange the 100 or so parameter registers as one large shift chain. The
    RAM transfers its data into the start of the chain, which passes it on
    to the next register, and so on. You still have the individual registers
    and the RAM, but now the fanout is about as low as you can get. It's
    easiest if you can load all of the parameters in the chain, and a bit
    more difficult if the loading needs to be selective.
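
    A minimal sketch of such a shift chain, covering only the simple case
    where the whole chain is reloaded. The package, entity and signal names
    and the single shift_en strobe are assumptions made for illustration.
    Loading 100 words takes 100 clocks, i.e. 2 us at 50 MHz, far below the
    1 ms that can be tolerated.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package: names and sizes are illustrative only.
package param_pkg is
  constant N_PARAMS : positive := 100;
  subtype param_t is std_logic_vector(31 downto 0);
  type param_array_t is array (0 to N_PARAMS - 1) of param_t;
end package param_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.param_pkg.all;

entity param_chain is
  port (
    clk      : in  std_logic;
    shift_en : in  std_logic;      -- high for one clock per word read from the RAM
    ram_data : in  param_t;        -- word coming out of the RAM
    params   : out param_array_t   -- every parameter, readable in parallel
  );
end entity param_chain;

architecture rtl of param_chain is
  signal chain : param_array_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if shift_en = '1' then
        -- ram_data drives only the first register; every other
        -- register loads from its immediate neighbour.
        chain(0) <= ram_data;
        for i in 1 to N_PARAMS - 1 loop
          chain(i) <= chain(i - 1);
        end loop;
      end if;
    end if;
  end process;

  params <= chain;
end architecture rtl;
```

    After N_PARAMS shifts, the first word read from the RAM sits at index
    N_PARAMS - 1. While shift_en is pulsing, the outputs hold intermediate
    values, so the modules using them must either tolerate that or wait
    until the load has finished.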

    > I think that a good trick would have been to have some kind of N-port
    > (read-only ports) RAM (most probably distributed-RAM) which, afaik,
    > doesn't exist for Virtex5.


    They're called flip flops.

    > I believe that such an "IP" would most certainly reduce synthesis time


    I doubt it.

    > and probably implementation time.
    >


    As Mike mentioned in his post, simulation using testbenches is the best
    method to date for reducing the amount of time needed to get to a working
    design (along with static timing analysis). Long build times are only a
    problem when you try to debug on the bench/system.

    Kevin Jennings
    KJ, Nov 30, 2008
    #5
  6. Mike Treseler wrote:
    > Frédéric Lochon wrote:
    >
    > The only thing that synthesizes faster than a register is a wire.
    > Changing the design to a ram or shift registers will
    > not make much difference to synthesis time.
    >


    I once noticed a clear difference in synthesis and implementation time
    when a small modification to some VHDL code made XST unable to recognize
    that a signal should be implemented as a block RAM. I know there is a
    big difference between block RAM and registers, and that is why the
    synthesis time differs so much.

    >> In the past, I tried using "partitions", but ISE (at least 9.1)
    >> doesn't handle partitions very well, and was even more buggy.

    >
    > Yes, partitions are rarely worth the pain.
    > ISE might have synthesis options
    > like quartus smart-compile to trade
    > some disk space for synthesis time.
    >


    I think that in the future I will partition my design physically across
    two FPGAs, because my design is part of a versatile test system which
    has one fairly "constant" part and one "variable" part, and the
    "constant" part is the most important one and thus has the biggest
    impact on synthesis time.


    You're right when you say that simulation is the best way to get to a
    working design, but because my design is 90% the same whatever the
    final application, the synthesis time is quite annoying when the
    application is "simple", and this is why I'm interested in reducing the
    synthesis and implementation time.

    --
    The equation of life is so complex
    that believing in free will is a good approximation.
    Frédéric Lochon, Nov 30, 2008
    #6
  7. Frédéric Lochon wrote:

    > I already noticed once a clear difference on synthesis and
    > implementation time when I made a small modification in a vhdl code that
    > made XST incapable of recognizing that a signal should be implemented as
    > a block RAM. I know there is a big difference between block-RAM and
    > registers, and this is why there is such a difference on synthesis time.


    Yes, a block RAM synthesizes faster than the
    same number of registers. However, a block
    RAM by itself does not solve your problem.

    > I think that in the future I will partition my design physically using 2
    > FPGA, because my design is part of a versatile test system which has 1
    > part quite "constant" and 1 part "variable", and the "constant" part is
    > the most important part and thus the part which has the bigger impact on
    > synthesis time.


    Good luck.
    Note that any change to the "constant" part
    is a start-over on partitioning.

    > You're right when you say that simulation is the best way to have a
    > working design, but because my design is at 90% the same whatever the
    > final application, the synthesis time is quite annoying when the
    > application is "simple", and this is why I'm interested in reducing the
    > synthesis and implementation time.


    Everyone is entitled to their own preferences.
    I find bouncing bits on the simulator to
    be an amusing pastime.

    -- Mike Treseler
    Mike Treseler, Dec 1, 2008
    #7