divider algorithm

thunder · May 25, 2012

Hello

I am currently working on a block where i need to divide the 16-bit
byte address by 5 to generate the 5-byte aligned address for the
internal RAM.

I started off by using an actual divider (ie five_byte_ram_addr <=
byte_ram_adder /5). However, this is expensive in terms of area and
does not meet timing.

Can anyone point me to some algorithms which detail how to approximate
the divider functionality (and in particular in this instance an
alogorithm which alllows approximation of divide by 5)?

Thanks in advance.

JO

Gabor · May 25, 2012

thunder said:
Hello

I am currently working on a block where i need to divide the 16-bit
byte address by 5 to generate the 5-byte aligned address for the
internal RAM.

I started off by using an actual divider (ie five_byte_ram_addr <=
byte_ram_adder /5). However, this is expensive in terms of area and
does not meet timing.

Can anyone point me to some algorithms which detail how to approximate
the divider functionality (and in particular in this instance an
alogorithm which alllows approximation of divide by 5)?

Thanks in advance.

JO

For dividing by a constant, the simplest method is to
multiply by the constant's inverse. If you're using an
FPGA that has a spare 18 x 18 multiplier you could just
multiply by 2^16 / 5 and then shift the result right 16.
Otherwise you'd need to code the multiplier using adders
where your constant has 50% 1 bits so the size of this
adder tree would be half that of a general purpose
multiplier for the same number of bits.

-- Gabor

Allan Herriman · May 25, 2012

Hello

I am currently working on a block where i need to divide the 16-bit byte
address by 5 to generate the 5-byte aligned address for the internal
RAM.

I started off by using an actual divider (ie five_byte_ram_addr <=
byte_ram_adder /5). However, this is expensive in terms of area and does
not meet timing.

Can anyone point me to some algorithms which detail how to approximate
the divider functionality (and in particular in this instance an
alogorithm which alllows approximation of divide by 5)?

Thanks in advance.

JO

Division by 5 is equivalent to multiplication by 1/5.

1/5 in binary is 0.001100110011001100 and so on.

There's a very obvious pattern in that binary number, which suggests a
shift and add approach.

Take your number N and multiply it by binary 11 (i.e. add it to itself
shifted left by one). This will give the "11" part of the pattern.

Take your N x 11 and multiply that by 10001 (i.e. add it to itself
shifted right by 4 bits).
Now we have N x 110011

Then multiply that by 1000000001
Now we have N x 11001100110011

Then multiply that by 10000000000000001
Now we have N x 110011001100110011001100110011

etc.

Stop when you get your desired accuracy. I suspect 3 or 4 two-input
adders will be needed for your 16 bit argument, but I'm just guessing and
you should verify this for yourself.

Take the appropriate slice of the output (to position the binary point at
the right place).

Make sure your testbench covers the full input space. (I hope we learned
something from the Pentium fdiv bug.)

Regards,
Allan

Brian Davis · May 25, 2012

Can anyone point me to some algorithms which detail how to approximate

the divider functionality (and in particular in this instance an
alogorithm which alllows approximation of divide by 5)?

The online "INTEGER DIVISION BY CONSTANTS" addendum to
"Hacker's Delight" has some relevant material :

www.hackersdelight.org/divcMore.pdf

Brian

Rob Gaddi · May 25, 2012

Hello

I am currently working on a block where i need to divide the 16-bit
byte address by 5 to generate the 5-byte aligned address for the
internal RAM.

I started off by using an actual divider (ie five_byte_ram_addr <=
byte_ram_adder /5). However, this is expensive in terms of area and
does not meet timing.

Can anyone point me to some algorithms which detail how to approximate
the divider functionality (and in particular in this instance an
alogorithm which alllows approximation of divide by 5)?

Thanks in advance.

JO

You've already got 3 votes for reciprocal multiplication, let me throw
a different one out. Don't need your ram addresses to be divided by
5. Re-architect something upstream in order to make it to where your
ram addresses get divided by 8 instead, and you just waste 3 bytes of
every 8 in the address space. Address space is usually pretty cheap,
and you can often get away without even having any RAM to back up the
gaps, just constants.

I say this because, in a very similar situation, I found myself trying
to implement a /24 to translate bus addresses into local addresses into
a very wide RAM. Making the math work was easy, getting the math to
integrate with the rest of the system without blowing my timing budget
wasted several days to no avail, before I finally gave up and just
padded out the space. Padding the space out took me 30 minutes of
changing the spec, and 30 minutes of coding, worked the first time, and
that was that.

MikeWhy · May 25, 2012

thunder said:
Hello

I am currently working on a block where i need to divide the 16-bit
byte address by 5 to generate the 5-byte aligned address for the
internal RAM.

I started off by using an actual divider (ie five_byte_ram_addr <=
byte_ram_adder /5). However, this is expensive in terms of area and
does not meet timing.

Can anyone point me to some algorithms which detail how to approximate
the divider functionality (and in particular in this instance an
alogorithm which alllows approximation of divide by 5)?

Divide by '1' beats everything.

"Internal RAM" could be almost anything. Make it 40-bits wide if you can.
Could also stitch 8-bit bram to a 32-bit bram.

Nikolaos Kavvadias · Jun 5, 2012

Hi thunder

checkout my constant division generator. You can use it as a base for a VHDL implementation. It produces optimized C or generic assembly code for a given constant divisor.

http://sourceforge.net/projects/kdiv/

Best regards,
Nikolaos Kavvadias

Mister_J · Jun 29, 2012

I think that the solution from Rob Gaddi is the best one, but if you want to do a division, here is
how you can do :

You could use the new fixed point type from fixed_pkg and use the function reciprocal to calculate
1/5. If you use this function with a non constant then it won't synthetize well because it will do
the whole calculation in 1 clock cycle.

If we want to calculate 1/x where x is not a constant, we can use this ideal relation :
y = 1.0/x => x * y = 1.0 = x * y_m + x * y_(m-1) + ... + x * y_(n+1) + x * y_n = Mult
where y_k is either 2**k, either 0

In the reality, Mult <= 1 because sometimes the division is not exact but rounded to low.

The goal here will be to make Mult the closest possible to 1.0. First we calculate Mult = x * y_m,
then Mult = x * y_m + x * y_(m-1), then Mult = x * y_m + x * y_(m-1) + x * y_(m-2) and so on until
k=n. At each iteration, we will try to set y_k = 2**k, but if we see that Mult would become higher
than 1.0, then we will set y_k to 0. Then at each iteration, Mult will come closer to 1.0 if y_k =
2**k or stay with its current value if y_k = 0.

Here is how can be declared x and y in VHDL :
signal x : UFixed (a downto b); -- for example a = 11, b = -4
signal y : UFixed (m downto n); -- m = -b-1 = 3; n = -a-1 = -12 => (3 downto -12)

Here is some pseudo VHDL code :

First iteration :

Code:

Mult_Next_Tmp := x * 2**m; -- We try to set y_m to 2**m
Mult_Next := Mult_Next_Tmp when Mult_Next_Tmp <= 1.0 else
             0;
y(m) <= '1' when Mult_Next_Tmp <= 1.0 else
        '0';

Second iteration :

Code:

Mult_Next_Tmp := Mult + x * 2**(m-1); -- We try to set y_(m-1) to 2**(m-1)
Mult_Next := Mult_Next_Tmp when Mult_Next_Tmp <= 1.0 else
             Mult;
y(m-1) <= '1' when Mult_Next_Tmp <= 1.0 else
          '0';

Third iteration :

Code:

Mult_Next_Tmp := Mult + x * 2**(m-2); -- We try to set y_(m-2) to 2**(m-2)
Mult_Next := Mult_Next_Tmp when Mult_Next_Tmp <= 1.0 else
             Mult;
y(m-2) <= '1' when Mult_Next_Tmp <= 1.0 else
          '0';

Last iterations :

(...)

Jonas

comp.lang.vhdl FAQ part 1 of 4: general	0	Jul 8, 2003
Call for Papers: International Conference on Soft Computing and Applications	0	May 21, 2012
Re: Algorithm to solve equation for variable	1	Jun 28, 2003
Is there any tool to describe program algorithm in flowchart and generate code from the chart?	9	Feb 1, 2007
Data type problem in encryption algorithm	11	Nov 25, 2005
Call for Papers Reminder: International Conference on Soft Computingand Applications	0	Jun 24, 2012
Last Call for Papers Reminder (extended): International Conference onSoft Computing and Applications	0	Jul 20, 2010
Synchronization	7	Apr 2, 2009

divider algorithm

thunder

Gabor

Allan Herriman

Brian Davis

Rob Gaddi

MikeWhy

Nikolaos Kavvadias

Mister_J

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads