Very fast counter in VirtexII

Discussion in 'VHDL' started by Marty Ryba, Feb 21, 2009.

  1. Marty Ryba

    Marty Ryba Guest

    Hi gang,

    I have an idea for a tweak of my FPGA design that involves essentially
    building a time interval counter. I found that there are some IP cores out
    there that get as much as 100ps resolution, but before I go that route I
    want to experiment with something "free" first, especially since I don't
    need any bells and whistles like embedded bus protocols or programmable
    timers. Neither of the signals I want to time between is synchronous with
    my main clock, so I'm thinking of generating a new DCM just for this purpose
    (I think I have a few left in my XC2V6000-5). Otherwise my fastest clock is
    either 133 MHz or maybe 204.8 MHz coming from an outside clock chip (I might
    be able to goose it to 409.6 MHz).

    My question: is there any good "how to" on writing a counter so that it runs
    at the maximum clock rate for my chip? I perused the Xilinx site, and there
    were some very old articles on fast counters in antique chip architectures;
    they provide OrCAD macros(?), not even VHDL.

    So, do I just naively code the counter and pray that synthesis does the
    right things (I don't need a huge number of bits; my maximum time interval
    is maybe 80 ns), or are there some tricks needed to get optimum clock speed
    (what could I rationally expect in this FPGA?)?

    Thanks for your help,

    Marty
    Marty Ryba, Feb 21, 2009
    #1

  2. "Marty Ryba" <> wrote in
    news:vQHnl.665$:

    > Hi gang,
    >
    > I have an idea for a tweak of my FPGA design that involves
    > essentially
    > building a time interval counter. I found that there are some IP cores
    > out there that get as much as 100ps resolution, but before I go that
    > route I want to experiment with something "free" first, especially
    > since I don't need any bells and whistles like embedded bus protocols
    > or programmable timers. Neither of the signals I want to time between
    > are synchronous with my main clock, so I'm thinking of generating a
    > new DCM just for this purpose (I think I have a few left in my
    > XC2V6000-5). Otherwise my fastest clock is either 133 MHz or maybe
    > 204.8 MHz coming from an outside clock chip (I might be able to goose
    > it to 409.6 MHz).
    >
    > My question is there any good "how to" on writing a counter so that it
    > runs at a maximum clock rate for my chip? I perused the Xilinx site,
    > and there were some very old articles on fast counters in antique chip
    > architectures; they provide OrCAD macros(?); not even VHDL.
    >
    > So, do I just naively code the counter and pray that synthesis does
    > the right things (I don't need a huge number of bits; my maximum time
    > interval is maybe 80 ns), or are there some tricks needed to get
    > optimum clock speed (what could I rationally expect in this FPGA?)?


    The "naive counter" has no chance of giving you a resolution of better
    than a ns with current FPGA technology.

    I'm guessing that most of the IP cores that achieve better than 1ns
    resolution do so by using a wider bus at a lower clock rate, e.g. a 100
    bit bus at 100MHz. You use logic to locate the bit position on the bus
    where a transition occurs. Each bit position in this contrived example
    represents 100ps, and each word represents 10ns.

    There are two basic ways of turning your 1 bit test signal into a wider
    bus:

    1. Use a SERDES. Most modern (larger) FPGAs have these built in, either
    as a true transceiver (with PLLs, CDR, etc.) or as a simple SERDES built
    into the IOB. The most recent FPGAs have on-board SERDES blocks that can
    sample at 100ps intervals.

    2. Use a (different) phase delay for each of the bits, and sample them
    all with the word clock. This has the advantage that the word clock is
    the highest frequency you need, however getting the phase delays right in
    an FPGA might be tricky. (This method is better suited to ASIC
    implementations.)

    There are some tricks you can use that will get you part way to your
    goal:
    - Use both clock edges for sampling. This gives you a 2x speedup (but
    requires a 50% duty cycle clock).
    - Use multiple phases from a DCM or PLL. This can give you a 4x
    speedup.
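
    A minimal sketch of the multi-phase trick, assuming four phases of one
    clock are available (for example from the CLK0, CLK90, CLK180 and CLK270
    outputs of a DCM, which is not instantiated here). The entity and signal
    names are hypothetical. The last stage has only a quarter period to
    retime the 270-degree sample, so a real design would probably retime in
    smaller steps and add metastability protection.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical example: sample one asynchronous input on four clock phases.
    entity phase_sampler is
      port (
        clk0, clk90, clk180, clk270 : in  std_logic;  -- assumed DCM phase outputs
        sig_in                      : in  std_logic;  -- asynchronous input
        samples                     : out std_logic_vector(3 downto 0)  -- bit 0 = earliest sample
      );
    end entity phase_sampler;

    architecture rtl of phase_sampler is
      signal s0, s90, s180, s270 : std_logic := '0';
    begin
      -- One capture flip-flop per phase: four samples per clock period.
      process (clk0)   begin if rising_edge(clk0)   then s0   <= sig_in; end if; end process;
      process (clk90)  begin if rising_edge(clk90)  then s90  <= sig_in; end if; end process;
      process (clk180) begin if rising_edge(clk180) then s180 <= sig_in; end if; end process;
      process (clk270) begin if rising_edge(clk270) then s270 <= sig_in; end if; end process;

      -- Bring the four samples of the previous period into the clk0 domain.
      -- No metastability hardening is shown here.
      process (clk0)
      begin
        if rising_edge(clk0) then
          samples <= s270 & s180 & s90 & s0;
        end if;
      end process;
    end architecture rtl;
    ```

    The position of the first changed bit in samples then gives the edge
    time to within a quarter of the clock period.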

    Regards,
    Allan
    Allan Herriman, Feb 21, 2009
    #2

  3. -jg

    -jg Guest

    On Feb 21, 1:23 pm, "Marty Ryba" <>
    wrote:
    > Hi gang,
    >
    >     I have an idea for a tweak of my FPGA design that involves essentially
    > building a time interval counter. I found that there are some IP cores out
    > there that get as much as 100ps resolution, but before I go that route I
    > want to experiment with something "free" first, especially since I don't
    > need any bells and whistles like embedded bus protocols or programmable
    > timers. Neither of the signals I want to time between are synchronous with
    > my main clock


    Your title says fast counter, but the text says time interval.
    They are not quite the same thing.

    If you want to do precise interval timing, then multi-phase capture
    and/or delay-line capture will give you time resolution finer than the
    clock period.

    What time precision do you actually need?
    E.g. 250 MHz with 4 phases resolves to 1 ns.

    I think I read that some of the very newest FPGAs can self-calibrate
    their delay lines, which saves you the trouble.

    -jg
    -jg, Feb 21, 2009
    #3
  4. Note: Since Optus can't figure out how to run a Newsgroup server, the
    original post hasn't appeared for me ...

    > Marty Ryba wrote in news:vQHnl.665$:
    >
    >> Hi gang,
    >>
    >> I have an idea for a tweak of my FPGA design that involves
    >> essentially
    >> building a time interval counter. I found that there are some IP cores
    >> out there that get as much as 100ps resolution, but before I go that
    >> route I want to experiment with something "free" first, especially
    >> since I don't need any bells and whistles like embedded bus protocols
    >> or programmable timers.


    What sort of resolution and dead time do you need? If you're willing to do a
    bit of legwork with manual place'n'route, and consume a fair bit of
    resources, you can get on the order of 10 ps or so resolution and accuracy
    at the cost of tens of nanoseconds of dead time. See:
    http://www-ppd.fnal.gov/EEDOffice-W/Projects/ckm/comadc/WaveletTDC.ppt
    They use an Altera Cyclone II, but I've implemented a similar thing on a
    Spartan 3E with reasonable success. I don't have good enough testing
    apparatus to properly measure the resolution and accuracy though. And at the
    10 ps level, you've got to think a bit about what's on the outside of the
    FPGA too ...

    The main downside to the Xilinx parts for this purpose is that you've only
    got 4 elements on the carry chain per block, as opposed to 8 in the Altera.
    You can also tweak out the dead time by throwing more resources at it
    (basically loop the end of the carry chain around to the start where it xors
    with the input, then do edge detection along the whole buffer and track the
    edges). Of course, this is so far beyond the point of "supported" that using
    it in a commercial project is debatable, but it's certainly a fun thing to
    play with.

    --
    Michael Brown
    Add michael@ to emboss.co.nz ---+--- My inbox is always open
    Michael Brown, Feb 21, 2009
    #4
  5. gabor

    gabor Guest

    On Feb 20, 10:32 pm, -jg <> wrote:
    > On Feb 21, 1:23 pm, "Marty Ryba" <>
    > wrote:
    >
    > > Hi gang,

    >
    > >     I have an idea for a tweak of my FPGA design that involves essentially
    > > building a time interval counter. I found that there are some IP cores out
    > > there that get as much as 100ps resolution, but before I go that route I
    > > want to experiment with something "free" first, especially since I don't
    > > need any bells and whistles like embedded bus protocols or programmable
    > > timers. Neither of the signals I want to time between are synchronous with
    > > my main clock

    >
    > Your title says fast counter, but the text says time interval.
    > They are not quite the same thing.
    >
    > If you want to do precise interval timing, then multi-phase capture,
    > and/or
    > delay line capture will give you time-domain precisions above the
    > clock frequency.
    >
    > What time-precision do you actually need ?
    > eg 250MHz with 4 phases, resolves to 1ns
    >
    > I think I read the some of the very newest FPGAs can self-calibrate
    > their
    > delay lines, which saves you the trouble
    >
    > -jg


    Virtex 2 doesn't have these structures. However I remember seeing
    appnotes using carry chains as delay elements. You basically
    run your input into the carry chain and then have a flip-flop
    at each stage in the chain all running on the master clock.
    Ideally your output would look like "1110000000" for a single
    transition, allowing you to interpolate between clock cycles.
    I think the original appnote was for a serdes using Virtex E,
    and the carry chain delay was used for phase adjustment
    without the IDELAY elements available in the newer parts.
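
    A minimal sketch of the decoding step only, assuming the flip-flop
    outputs along the chain are already available as a vector called taps
    (the carry chain itself and its placement constraints are not shown, and
    all names here are hypothetical). Counting the ones, rather than looking
    for the first zero, tolerates an occasional out-of-order bit in the
    pattern.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical example: turn a thermometer code such as "1110000000"
    -- into a binary count of how far the edge travelled along the chain.
    entity thermo_decode is
      generic (N : integer range 1 to 255 := 10);  -- number of taps (fits in 8 bits)
      port (
        clk   : in  std_logic;
        taps  : in  std_logic_vector(N - 1 downto 0);  -- registered tap outputs
        count : out unsigned(7 downto 0)               -- number of taps that are '1'
      );
    end entity thermo_decode;

    architecture rtl of thermo_decode is
    begin
      process (clk)
        variable ones : integer range 0 to 255;
      begin
        if rising_edge(clk) then
          ones := 0;
          for i in taps'range loop
            if taps(i) = '1' then
              ones := ones + 1;
            end if;
          end loop;
          count <= to_unsigned(ones, 8);
        end if;
      end process;
    end architecture rtl;
    ```

    The count still has to be scaled by the per-tap delay, which varies with
    process, voltage and temperature and would need calibration.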

    Regards,
    Gabor
    gabor, Feb 22, 2009
    #5
  6. Marty Ryba

    Guest

    On 21 Feb., 01:23, "Marty Ryba" <>
    wrote:
    > Hi gang,
    >
    >     I have an idea for a tweak of my FPGA design that involves essentially
    > building a time interval counter. I found that there are some IP cores out
    > there that get as much as 100ps resolution, but before I go that route I
    > want to experiment with something "free" first, especially since I don't
    > need any bells and whistles like embedded bus protocols or programmable
    > timers. Neither of the signals I want to time between are synchronous with
    > my main clock, so I'm thinking of generating a new DCM just for this purpose
    > (I think I have a few left in my XC2V6000-5). Otherwise my fastest clock is
    > either 133 MHz or maybe 204.8 MHz coming from an outside clock chip (I might
    > be able to goose it to 409.6 MHz).
    >
    > My question is there any good "how to" on writing a counter so that it runs
    > at a maximum clock rate for my chip? I perused the Xilinx site, and there
    > were some very old articles on fast counters in antique chip architectures;
    > they provide OrCAD macros(?); not even VHDL.
    >
    > So, do I just naively code the counter and pray that synthesis does the
    > right things (I don't need a huge number of bits; my maximum time interval
    > is maybe 80 ns), or are there some tricks needed to get optimum clock speed
    > (what could I rationally expect in this FPGA?)?
    >
    > Thanks for your help,
    >
    > Marty


    Hi Marty,
    the general way to get a fast design is to reduce combinatorial logic. For
    counters (or FSMs) that means using shift-register-based designs.
    Depending on the number of clock cycles you want to count, you can
    either design a simple FSM with one-hot encoding, build a
    Johnson counter with a small ripple generator (e.g. an edge-detecting
    monoflop), or use an LFSR structure.
    All of these give you maximum speed.
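
    A minimal sketch of one of these options, a Johnson counter, with
    hypothetical entity and port names. The only logic between flip-flops is
    a single inverter, which is why this style can be clocked fast; the
    price is that N flip-flops give only 2*N states.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical example: N-stage Johnson (twisted-ring) counter.
    entity johnson_counter is
      generic (N : integer range 2 to 64 := 4);  -- N stages give 2*N states
      port (
        clk : in  std_logic;
        rst : in  std_logic;                     -- synchronous reset
        q   : out std_logic_vector(N - 1 downto 0)
      );
    end entity johnson_counter;

    architecture rtl of johnson_counter is
      signal r : std_logic_vector(N - 1 downto 0) := (others => '0');
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          if rst = '1' then
            r <= (others => '0');
          else
            -- Shift left and feed the inverted top bit back into bit 0.
            r <= r(N - 2 downto 0) & not r(N - 1);
          end if;
        end if;
      end process;

      q <= r;
    end architecture rtl;
    ```

    With N = 4 the state runs 0000, 0001, 0011, 0111, 1111, 1110, 1100, 1000
    and then repeats.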

    Have a nice synthesis
    Eilert
    , Feb 23, 2009
    #6
  7. Marty Ryba

    Guest

    On Feb 20, 8:24 pm, "Michael Brown" <> wrote:
    > Note: Since Optus can't figure out how to run a Newsgroup server, the
    > original post hasn't appeared for me ...
    >
    > > Marty Ryba wrote in news:vQHnl.665$:

    >
    > >> Hi gang,

    >
    > >>     I have an idea for a tweak of my FPGA design that involves
    > >>     essentially
    > >> building a time interval counter. I found that there are some IP cores
    > >> out there that get as much as 100ps resolution, but before I go that
    > >> route I want to experiment with something "free" first, especially
    > >> since I don't need any bells and whistles like embedded bus protocols
    > >> or programmable timers.

    >
    > What sort of resolution and dead time do you need? If you're willing to do a
    > bit of legwork with manual place'n'route, and consume a fair bit of
    > resources, you can get in the order of 10 ps or so resolution and accuracy
    > at the cost of 10's of nanoseconds of dead time. See: http://www-ppd.fnal.gov/EEDOffice-W/Projects/ckm/comadc/WaveletTDC.ppt
    > They use an Altera Cyclone II, but I've implemented a similar thing on a
    > Spartan 3E with reasonable success. I don't have good enough testing
    > apparatus to properly measure the resolution and accuracy though. And at the
    > 10 ps level, you've got to think a bit about what's on the outside of the
    > FPGA too ...
    >
    > The main downside to the Xilinx parts for this purpose is that you've only
    > got 4 elements on the carry chain per block, as opposed to 8 in the Altera.
    > You can also tweak out the dead time by throwing more resources at it
    > (basically loop the end of the carry chain around to the start where it xors
    > with the input, then do edge detection along the whole buffer and track the
    > edges). Of course, this is so far beyond the point of "supported" that using
    > it in a commercial project is debatable, but it's certainly a fun thing to
    > play with.
    >
    > --
    > Michael Brown
    > Add michael@ to emboss.co.nz ---+--- My inbox is always open


    Michael -

    Would you be willing to share your design? I'm curious to see what is
    involved.

    John Providenza
    , Feb 23, 2009
    #7
  8. Marty Ryba

    Marty Ryba Guest

    "-jg" <> wrote in message
    news:...
    On Feb 21, 1:23 pm, "Marty Ryba" <>
    wrote:
    > > I have an idea for a tweak of my FPGA design that involves essentially
    > > building a time interval counter. I found that there are some IP cores
    > > out
    > > there that get as much as 100ps resolution, but before I go that route I
    > > want to experiment with something "free" first, especially since I don't
    > > need any bells and whistles like embedded bus protocols or programmable
    > > timers. Neither of the signals I want to time between are synchronous
    > > with
    > > my main clock

    > Your title says fast counter, but the text says time interval.
    > They are not quite the same thing.
    > If you want to do precise interval timing, then multi-phase capture,
    > and/or
    > delay line capture will give you time-domain precisions above the
    > clock frequency.
    > What time-precision do you actually need ?
    > eg 250MHz with 4 phases, resolves to 1ns


    Thanks for the useful tips; it seems the primary approach is to "stretch"
    the signals of interest into fast elements like shift registers and/or carry
    chains, and then count these up at some leisure later (how?). That sounds
    like it takes a lot of resources (e.g., 16 ticks per slice if I use a
    SRL16E). This could explain why some of the papers I've glanced at seem to
    take pretty much an entire chip to make a couple of these high-end delay
    measuring devices. For now, since it seems feasible to run a small (8 bit)
    counter at 204.8 MHz, I'll try that route. 4.883 ns of precision is about
    1.5 meters when you multiply by c, so that's still useful to me. Once I get
    the basic structure figured out I can look at speeding it up. Today I got
    the input logic figured out (what signal is my start condition, and what is
    my stop). Since I'm using these signals to calibrate out differences between
    identical bitstreams on separated boards inside a common chassis, the
    differential delays inside the logic should mostly wash out.
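
    A minimal sketch of such a small start/stop counter, under the
    assumption that both events are rising edges and that clk is the
    204.8 MHz clock; every name here is hypothetical. Both inputs pass
    through identical synchronizers, so their latency cancels in the
    difference and the result is good to about one clock period.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical example: count clock periods from a start edge to a stop edge.
    entity interval_counter is
      port (
        clk      : in  std_logic;             -- counting clock, e.g. 204.8 MHz
        start_in : in  std_logic;             -- asynchronous start (rising edge)
        stop_in  : in  std_logic;             -- asynchronous stop (rising edge)
        ticks    : out unsigned(7 downto 0);  -- periods between the two edges
        valid    : out std_logic              -- one-cycle pulse when ticks updates
      );
    end entity interval_counter;

    architecture rtl of interval_counter is
      signal start_sr, stop_sr : std_logic_vector(2 downto 0) := (others => '0');
      signal running           : std_logic := '0';
      signal cnt               : unsigned(7 downto 0) := (others => '0');
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          -- Bit 0 is the synchronizing stage; bits 2..1 are used for edge detection.
          start_sr <= start_sr(1 downto 0) & start_in;
          stop_sr  <= stop_sr(1 downto 0) & stop_in;
          valid    <= '0';

          if running = '0' then
            if start_sr(2 downto 1) = "01" then   -- rising edge of start
              running <= '1';
              cnt     <= (others => '0');
            end if;
          elsif stop_sr(2 downto 1) = "01" then   -- rising edge of stop
            running <= '0';
            ticks   <= cnt + 1;
            valid   <= '1';
          else
            cnt <= cnt + 1;                       -- wraps after 256 periods
          end if;
        end if;
      end process;
    end architecture rtl;
    ```

    At 204.8 MHz an 80 ns interval is only about 16 periods, so 8 bits
    leaves plenty of headroom.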

    One "newbie" question: I notice you can't use an output pin signal to drive
    internal logic (at least Modelsim barfs on it). I ended up for now declaring
    a signal and copying some of the code that generates that output pin to
    generate my signal as well. Is there a "smarter" way?

    Thanks again,

    Marty
    Marty Ryba, Feb 24, 2009
    #8
  9. Ken Cecka

    Ken Cecka Guest

    Marty Ryba wrote:

    > "-jg" <> wrote in message
    > news:...
    > On Feb 21, 1:23 pm, "Marty Ryba" <>
    > wrote:
    >> > I have an idea for a tweak of my FPGA design that involves essentially
    >> > building a time interval counter. I found that there are some IP cores
    >> > out
    >> > there that get as much as 100ps resolution, but before I go that route
    >> > I want to experiment with something "free" first, especially since I
    >> > don't need any bells and whistles like embedded bus protocols or
    >> > programmable timers. Neither of the signals I want to time between are
    >> > synchronous with
    >> > my main clock

    >> Your title says fast counter, but the text says time interval.
    >> They are not quite the same thing.
    >> If you want to do precise interval timing, then multi-phase capture,
    >> and/or
    >> delay line capture will give you time-domain precisions above the
    >> clock frequency.
    >> What time-precision do you actually need ?
    >> eg 250MHz with 4 phases, resolves to 1ns

    >
    > Thanks for the useful tips; it seems the primary approach is to "stretch"
    > the signals of interest into fast elements like shift registers and/or
    > carry chains, and then count these up at some leisure later (how?). That
    > sounds like it takes a lot of resources (e.g., 16 ticks per slice if I use
    > a SRL16E). This could explain why some of the papers I've glanced at seem
    > to take pretty much an entire chip to make a couple of these high-end
    > delay measuring devices. For now, since it seems feasible to run a small
    > (8 bit) counter at 204.8 MHz, I'll try that route. 4.883 ns of precision
    > is about 1.5 meters when you multiply by c, so that's still useful to me.
    > Once I get the basic structure figured out I can look at speeding it up.
    > Today I got the input logic figured out (what signal is my start
    > condition, and what is my stop). Since I'm using these signals to
    > calibrate out differences between identical bitstreams on separated boards
    > inside a common chassis, the differential delays inside the logic should
    > mostly wash out.
    >
    > One "newbie" question: I notice you can't use an output pin signal to
    > drive internal logic (at least Modelsim barfs on it). I ended up for now
    > declaring a signal and copying some of the code that generates that output
    > pin to generate my signal as well. Is there a "smarter" way?


    You shouldn't need to duplicate any logic; just create an intermediate signal.

    For example this code would fail:

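    LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;

    ENTITY divider IS
      PORT
      (
        clk : IN STD_LOGIC;
        div : OUT STD_LOGIC
      );
    END divider;

    ARCHITECTURE model OF divider IS
    BEGIN
      PROCESS (clk)
      BEGIN
        IF (clk'EVENT) AND (clk = '1') THEN
          -- Error (before VHDL-2008): div is an OUT port and cannot be read.
          div <= NOT div;
        END IF;
      END PROCESS;
    END model;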

    But can be fixed by using an intermediate signal:

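    LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;

    ENTITY divider IS
      PORT
      (
        clk   : IN STD_LOGIC;
        div_o : OUT STD_LOGIC
      );
    END divider;

    ARCHITECTURE model OF divider IS
      -- Internal copy of the output; it can be both read and written.
      -- The initial value keeps simulation from sticking at 'U'.
      SIGNAL div : STD_LOGIC := '0';
    BEGIN
      PROCESS (clk)
      BEGIN
        IF (clk'EVENT) AND (clk = '1') THEN
          div <= NOT div;
        END IF;
      END PROCESS;

      -- Drive the output port from the internal signal.
      div_o <= div;
    END model;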

    Ken

    >
    > Thanks again,
    >
    > Marty
    Ken Cecka, Feb 24, 2009
    #9
  10. OutputLogic

    OutputLogic

    Joined:
    May 20, 2009
    Messages:
    8
    Online LFSR Counter generator

    There is an online tool that generates Verilog code for LFSR counters of any count value, up to 31 bits wide. It's on "http OutputLogic dot com" [sorry, this site doesn't let me post a link in a regular way]
    LFSR counters are much smaller than regular ones and can therefore run faster. The catch is that they count only to a predefined value.
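
    A minimal hand-written sketch of the idea in VHDL (not output from the
    tool mentioned above; the entity name, ports and terminal state are
    assumptions chosen for illustration). It uses a 4-bit LFSR with feedback
    polynomial x^4 + x^3 + 1, which cycles through 15 non-zero states, and
    flags one fixed state as the terminal count.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical example: 4-bit LFSR used as a divide-by-15 counter.
    entity lfsr_counter is
      port (
        clk : in  std_logic;
        rst : in  std_logic;   -- synchronous reset to the seed state
        tc  : out std_logic    -- high for one clock in every 15
      );
    end entity lfsr_counter;

    architecture rtl of lfsr_counter is
      -- The seed must be non-zero: the all-zero state would lock up.
      signal r : std_logic_vector(3 downto 0) := "0001";
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          if rst = '1' then
            r <= "0001";
          else
            -- Shift left; the new bit 0 is the XOR of the two top bits.
            r <= r(2 downto 0) & (r(3) xor r(2));
          end if;
        end if;
      end process;

      -- "1000" is the last state before the sequence returns to the seed.
      tc <= '1' when r = "1000" else '0';
    end architecture rtl;
    ```

    The states do not appear in binary order, so a different count needs a
    different terminal state, which is the "predefined value" restriction.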

    Hope that helps
    OutputLogic, May 22, 2009
    #10
  11. JohnDuq

    JohnDuq

    Joined:
    Dec 9, 2008
    Messages:
    88
    I like Ken's solution of an intermediate signal for accessing the output pin.

    Changing the port definition from OUT to INOUT is an option for some devices too.
    JohnDuq, May 29, 2009
    #11