divider algorithm

Discussion in 'VHDL' started by thunder, May 25, 2012.

  1. thunder

    thunder Guest

    Hello

    I am currently working on a block where i need to divide the 16-bit
    byte address by 5 to generate the 5-byte aligned address for the
    internal RAM.

    I started off by using an actual divider (ie five_byte_ram_addr <=
    byte_ram_adder /5). However, this is expensive in terms of area and
    does not meet timing.

    Can anyone point me to some algorithms which detail how to approximate
    the divider functionality (and in particular in this instance an
    alogorithm which alllows approximation of divide by 5)?

    Thanks in advance.

    JO
    thunder, May 25, 2012
    #1
    1. Advertising

  2. thunder

    Gabor Guest

    thunder wrote:
    > Hello
    >
    > I am currently working on a block where i need to divide the 16-bit
    > byte address by 5 to generate the 5-byte aligned address for the
    > internal RAM.
    >
    > I started off by using an actual divider (ie five_byte_ram_addr <=
    > byte_ram_adder /5). However, this is expensive in terms of area and
    > does not meet timing.
    >
    > Can anyone point me to some algorithms which detail how to approximate
    > the divider functionality (and in particular in this instance an
    > alogorithm which alllows approximation of divide by 5)?
    >
    > Thanks in advance.
    >
    > JO


    For dividing by a constant, the simplest method is to
    multiply by the constant's inverse. If you're using an
    FPGA that has a spare 18 x 18 multiplier you could just
    multiply by 2^16 / 5 and then shift the result right 16.
    Otherwise you'd need to code the multiplier using adders
    where your constant has 50% 1 bits so the size of this
    adder tree would be half that of a general purpose
    multiplier for the same number of bits.

    -- Gabor
    Gabor, May 25, 2012
    #2
    1. Advertising

  3. On Fri, 25 May 2012 05:26:01 -0700, thunder wrote:

    > Hello
    >
    > I am currently working on a block where i need to divide the 16-bit byte
    > address by 5 to generate the 5-byte aligned address for the internal
    > RAM.
    >
    > I started off by using an actual divider (ie five_byte_ram_addr <=
    > byte_ram_adder /5). However, this is expensive in terms of area and does
    > not meet timing.
    >
    > Can anyone point me to some algorithms which detail how to approximate
    > the divider functionality (and in particular in this instance an
    > alogorithm which alllows approximation of divide by 5)?
    >
    > Thanks in advance.
    >
    > JO



    Division by 5 is equivalent to multiplication by 1/5.

    1/5 in binary is 0.001100110011001100 and so on.

    There's a very obvious pattern in that binary number, which suggests a
    shift and add approach.

    Take your number N and multiply it by binary 11 (i.e. add it to itself
    shifted left by one). This will give the "11" part of the pattern.

    Take your N x 11 and multiply that by 10001 (i.e. add it to itself
    shifted right by 4 bits).
    Now we have N x 110011

    Then multiply that by 1000000001
    Now we have N x 11001100110011

    Then multiply that by 10000000000000001
    Now we have N x 110011001100110011001100110011

    etc.

    Stop when you get your desired accuracy. I suspect 3 or 4 two-input
    adders will be needed for your 16 bit argument, but I'm just guessing and
    you should verify this for yourself.

    Take the appropriate slice of the output (to position the binary point at
    the right place).

    Make sure your testbench covers the full input space. (I hope we learned
    something from the Pentium fdiv bug.)

    Regards,
    Allan
    Allan Herriman, May 25, 2012
    #3
  4. thunder

    Brian Davis Guest

    > Can anyone point me to some algorithms which detail how to approximate
    > the divider functionality (and in particular in this instance an
    > alogorithm which alllows approximation of divide by 5)?


    The online "INTEGER DIVISION BY CONSTANTS" addendum to
    "Hacker's Delight" has some relevant material :

    www.hackersdelight.org/divcMore.pdf

    Brian


    On May 25, 8:26 am, thunder <> wrote:
    > Hello
    >
    > I am currently working on a block where i need to divide the 16-bit
    > byte address by 5 to generate the 5-byte aligned address for the
    > internal RAM.
    >
    > I started off by using an actual divider (ie five_byte_ram_addr <=
    > byte_ram_adder /5). However, this is expensive in terms of area and
    > does not meet timing.
    >
    > Can anyone point me to some algorithms which detail how to approximate
    > the divider functionality (and in particular in this instance an
    > alogorithm which alllows approximation of divide by 5)?
    >
    > Thanks in advance.
    >
    > JO
    Brian Davis, May 25, 2012
    #4
  5. thunder

    Rob Gaddi Guest

    On Fri, 25 May 2012 05:26:01 -0700 (PDT)
    thunder <> wrote:

    > Hello
    >
    > I am currently working on a block where i need to divide the 16-bit
    > byte address by 5 to generate the 5-byte aligned address for the
    > internal RAM.
    >
    > I started off by using an actual divider (ie five_byte_ram_addr <=
    > byte_ram_adder /5). However, this is expensive in terms of area and
    > does not meet timing.
    >
    > Can anyone point me to some algorithms which detail how to approximate
    > the divider functionality (and in particular in this instance an
    > alogorithm which alllows approximation of divide by 5)?
    >
    > Thanks in advance.
    >
    > JO


    You've already got 3 votes for reciprocal multiplication, let me throw
    a different one out. Don't need your ram addresses to be divided by
    5. Re-architect something upstream in order to make it to where your
    ram addresses get divided by 8 instead, and you just waste 3 bytes of
    every 8 in the address space. Address space is usually pretty cheap,
    and you can often get away without even having any RAM to back up the
    gaps, just constants.

    I say this because, in a very similar situation, I found myself trying
    to implement a /24 to translate bus addresses into local addresses into
    a very wide RAM. Making the math work was easy, getting the math to
    integrate with the rest of the system without blowing my timing budget
    wasted several days to no avail, before I finally gave up and just
    padded out the space. Padding the space out took me 30 minutes of
    changing the spec, and 30 minutes of coding, worked the first time, and
    that was that.

    --
    Rob Gaddi, Highland Technology -- www.highlandtechnology.com
    Email address domain is currently out of order. See above to fix.
    Rob Gaddi, May 25, 2012
    #5
  6. thunder

    MikeWhy Guest

    "thunder" <> wrote in message
    news:...
    > Hello
    >
    > I am currently working on a block where i need to divide the 16-bit
    > byte address by 5 to generate the 5-byte aligned address for the
    > internal RAM.
    >
    > I started off by using an actual divider (ie five_byte_ram_addr <=
    > byte_ram_adder /5). However, this is expensive in terms of area and
    > does not meet timing.
    >
    > Can anyone point me to some algorithms which detail how to approximate
    > the divider functionality (and in particular in this instance an
    > alogorithm which alllows approximation of divide by 5)?


    Divide by '1' beats everything.

    "Internal RAM" could be almost anything. Make it 40-bits wide if you can.
    Could also stitch 8-bit bram to a 32-bit bram.
    MikeWhy, May 25, 2012
    #6
  7. Hi thunder

    checkout my constant division generator. You can use it as a base for a VHDL implementation. It produces optimized C or generic assembly code for a given constant divisor.

    http://sourceforge.net/projects/kdiv/

    Best regards,
    Nikolaos Kavvadias
    Nikolaos Kavvadias, Jun 5, 2012
    #7
  8. thunder

    Mister_J

    Joined:
    Jun 28, 2012
    Messages:
    8
    I think that the solution from Rob Gaddi is the best one, but if you want to do a division, here is
    how you can do :

    You could use the new fixed point type from fixed_pkg and use the function reciprocal to calculate
    1/5. If you use this function with a non constant then it won't synthetize well because it will do
    the whole calculation in 1 clock cycle.

    If we want to calculate 1/x where x is not a constant, we can use this ideal relation :
    y = 1.0/x => x * y = 1.0 = x * y_m + x * y_(m-1) + ... + x * y_(n+1) + x * y_n = Mult
    where y_k is either 2**k, either 0

    In the reality, Mult <= 1 because sometimes the division is not exact but rounded to low.

    The goal here will be to make Mult the closest possible to 1.0. First we calculate Mult = x * y_m,
    then Mult = x * y_m + x * y_(m-1), then Mult = x * y_m + x * y_(m-1) + x * y_(m-2) and so on until
    k=n. At each iteration, we will try to set y_k = 2**k, but if we see that Mult would become higher
    than 1.0, then we will set y_k to 0. Then at each iteration, Mult will come closer to 1.0 if y_k =
    2**k or stay with its current value if y_k = 0.

    Here is how can be declared x and y in VHDL :
    signal x : UFixed (a downto b); -- for example a = 11, b = -4
    signal y : UFixed (m downto n); -- m = -b-1 = 3; n = -a-1 = -12 => (3 downto -12)


    Here is some pseudo VHDL code :

    First iteration :

    Code:
    Mult_Next_Tmp := x * 2**m; -- We try to set y_m to 2**m
    Mult_Next := Mult_Next_Tmp when Mult_Next_Tmp <= 1.0 else
                 0;
    y(m) <= '1' when Mult_Next_Tmp <= 1.0 else
            '0';
    Second iteration :

    Code:
    Mult_Next_Tmp := Mult + x * 2**(m-1); -- We try to set y_(m-1) to 2**(m-1)
    Mult_Next := Mult_Next_Tmp when Mult_Next_Tmp <= 1.0 else
                 Mult;
    y(m-1) <= '1' when Mult_Next_Tmp <= 1.0 else
              '0';
    Third iteration :

    Code:
    Mult_Next_Tmp := Mult + x * 2**(m-2); -- We try to set y_(m-2) to 2**(m-2)
    Mult_Next := Mult_Next_Tmp when Mult_Next_Tmp <= 1.0 else
                 Mult;
    y(m-2) <= '1' when Mult_Next_Tmp <= 1.0 else
              '0';
    Last iterations :

    (...)


    Jonas
    Last edited: Jun 29, 2012
    Mister_J, Jun 29, 2012
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. kpk
    Replies:
    37
    Views:
    16,648
    Bob Stephens
    Jan 6, 2004
  2. Schmigz

    clk divider

    Schmigz, Feb 9, 2004, in forum: VHDL
    Replies:
    7
    Views:
    5,647
  3. Patrick

    Frequency divider

    Patrick, May 17, 2004, in forum: VHDL
    Replies:
    6
    Views:
    4,682
    Charles Bailey
    May 21, 2004
  4. vhdl divider

    , Jan 7, 2005, in forum: VHDL
    Replies:
    2
    Views:
    1,266
    mgonullu
    Jan 9, 2005
  5. Matt Clement

    MCU clock divider vs. VHDL divider

    Matt Clement, Apr 20, 2006, in forum: VHDL
    Replies:
    3
    Views:
    4,215
    Marcus Harnisch
    Apr 28, 2006
Loading...

Share This Page