# Convolution in VHDL

Discussion in 'VHDL' started by Hari, Jan 16, 2004.

1. ### Hari (Guest)

Hi,
How do I get started with performing convolution in VHDL? The
convolution lengths I intend to use are of the order of 16, 32,
128, etc.

Hari

Hari, Jan 16, 2004

2. ### Jonathan Bromley (Guest)

"Hari" <> wrote in message
news:...

> How do I get started with performing convolution in VHDL? The
> convolution lengths I intend to use are of the order of 16, 32,
> 128, etc.

It is pointless to worry about the VHDL until you have
established the architecture you want to use; and you cannot
establish the architecture until you have clearly defined
the problem's constraints.

Do you need the filter order to be variable on-the-fly,
or determined at compile time?

How many system clock cycles are available per data item?

Does the filter have any degree of symmetry, or other
features that would allow you to simplify it in
various ways? (For example, if the filter has
many identical coefficients, it can be much simpler
to convolve with the first differences of the kernel,
and integrate that convolution result.)
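
The first-difference idea can be written out as follows. This is only a sketch in notation of my own (not from the post): an N-tap kernel h[0..N-1] with h[-1] = h[N] = 0, and an input x that is zero before n = 0.

$$
d[k] = h[k] - h[k-1], \qquad k = 0, \dots, N
$$

$$
w[n] = \sum_{k=0}^{N} d[k]\, x[n-k], \qquad y[n] = y[n-1] + w[n], \qquad y[-1] = 0
$$

Because the running sum of d[k] reproduces h[k], integrating w[n] gives the same y[n] as convolving with h directly. For a 4-tap boxcar h = (1, 1, 1, 1) the differences are d = (1, 0, 0, 0, -1), so y[n] = y[n-1] + x[n] - x[n-4]: one subtraction and one accumulation per sample, whatever the boxcar length. The integrator never forgets, so this is safe only with exact (integer or fixed-point) arithmetic, where rounding errors cannot build up.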

What precision do you require in coefficients, data,
result? What do you intend to do about rounding
and truncation of results?

================================

There remains the possibility that I have misunderstood
you, and you simply want to use VHDL as a programming
language to write a simulation model of a convolution
operation, much as you might do in Matlab. That's much
easier:

    package ConvoPack is
      type Real_Vector is array (integer range <>) of real;
    end;

    use work.ConvoPack.all;

    entity Convolver is
      generic (
        kernel: Real_Vector
      );
      port (
        reset : in  boolean;
        data  : in  real;
        result: out real
      );
    end;

    architecture Model of Convolver is
      -- Give the kernel an ascending subscript range,
      -- to make shifting operations easier
      alias NormKernel: Real_Vector(kernel'LOW to kernel'HIGH) is kernel;
    begin
      -- data'TRANSACTION wakes the process on every assignment to data,
      -- even when the new sample equals the previous one
      CanonicalConvolution: process (reset, data'TRANSACTION)
        variable DataPipe: Real_Vector(NormKernel'RANGE) := (others => 0.0);
        variable sum: real;
      begin
        if reset then
          DataPipe := (others => 0.0);
          result <= 0.0;
        else
          sum := 0.0;
          -- Shift the new sample in at the low end of the pipe
          DataPipe := data & DataPipe(DataPipe'LOW to DataPipe'HIGH-1);
          for i in DataPipe'RANGE loop
            sum := sum + DataPipe(i) * NormKernel(i);
          end loop;
          -- Drive the output only here, so it cannot override the reset value
          result <= sum;
        end if;
      end process;
    end;
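
A small testbench is one way to exercise a model like this. The sketch below assumes the Convolver and ConvoPack units above have been compiled into library work. The entity name Convolver_TB, the 4-tap kernel and the 10 ns spacing are placeholders of my own, not part of the original post. It applies a unit impulse followed by zeros, so the output should replay the kernel one tap at a time.

```vhdl
use work.ConvoPack.all;

entity Convolver_TB is
end;

architecture Test of Convolver_TB is
  -- Hypothetical 4-tap kernel, chosen only for illustration
  constant TestKernel: Real_Vector(0 to 3) := (1.0, 2.0, 3.0, 4.0);
  signal reset : boolean := true;
  signal data  : real := 0.0;
  signal result: real;
begin
  DUT: entity work.Convolver
    generic map (kernel => TestKernel)
    port map (reset => reset, data => data, result => result);

  Stimulus: process
  begin
    wait for 10 ns;
    reset <= false;
    wait for 10 ns;
    -- Unit impulse followed by zeros: the output should replay the kernel
    for i in TestKernel'RANGE loop
      if i = TestKernel'LOW then
        data <= 1.0;
      else
        data <= 0.0;  -- repeated zeros still create transactions
      end if;
      wait for 10 ns;
      assert result = TestKernel(i)
        report "impulse response mismatch at tap " & integer'image(i)
        severity error;
    end loop;
    report "impulse response test finished";
    wait;
  end process;
end;
```

The repeated assignments of 0.0 cause no event on data, only transactions, which is why the model is sensitive to data'TRANSACTION rather than to data itself.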

However, if you intend to build hardware, this won't help you much.
--

Jonathan Bromley, Consultant

Jonathan Bromley, Jan 16, 2004

3. ### Hari (Guest)

Hi,
I guess my question was pretty vague. I actually intend to do a
16-point chirp-z transform, and I need an FIR filter to perform the
convolution. I have my filter coefficients as well. There is one clock
cycle per data item. I am not sure how I am going to put everything
together. Even a pointer to a reference might be helpful.

Hari

Hari, Jan 22, 2004

4. ### Guest (Guest)

"Hari" <> wrote in message
news:...

> 16-point chirp-z transform, and I need an FIR filter to
> perform the convolution. I have my filter coefficients
> as well. There is one clock cycle per data item.

Chirp-Z has no simple symmetry, so you need to do all
16 multiply-accumulates in a single cycle. That means
16 multipliers working in parallel, plus the adders to
sum their products.

The only way I know to reduce the amount of hardware is
to use a system (filter) clock that is faster than the
data clock. If the data stream is slow, you may be able
to do this with a Xilinx DCM or perhaps an external PLL.
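
To illustrate the faster-clock option, here is a minimal sketch of a FIR that shares one multiplier across all taps. It is not taken from the thread: the entity name, the data_valid/result_valid strobes, the generic names and the default kernel are all assumptions of mine, and integer_vector requires VHDL-2008. The fast clock must supply at least N+1 rising edges per input sample (17 for a 16-tap kernel).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity SerialMacFir is
  generic (
    coeffs      : integer_vector := (1, 2, 3, 4);  -- hypothetical kernel
    data_width  : positive := 16;
    coeff_width : positive := 16;
    acc_width   : positive := 36   -- data + coeff width plus guard bits
  );
  port (
    clk          : in  std_logic;  -- fast filter clock
    reset        : in  std_logic;  -- synchronous, active high
    data_valid   : in  std_logic;  -- one-cycle strobe per new sample
    data         : in  signed(data_width-1 downto 0);
    result       : out signed(acc_width-1 downto 0);
    result_valid : out std_logic
  );
end;

architecture RTL of SerialMacFir is
  constant N : positive := coeffs'LENGTH;
  alias h : integer_vector(0 to N-1) is coeffs;
  type Sample_Array is array (0 to N-1) of signed(data_width-1 downto 0);
  signal delay_line : Sample_Array := (others => (others => '0'));
  signal acc  : signed(acc_width-1 downto 0) := (others => '0');
  signal tap  : natural range 0 to N-1 := 0;
  signal busy : boolean := false;
begin
  process (clk)
    variable sum : signed(acc_width-1 downto 0);
  begin
    if rising_edge(clk) then
      result_valid <= '0';
      if reset = '1' then
        delay_line <= (others => (others => '0'));
        busy <= false;
      elsif data_valid = '1' then
        -- New sample: shift the delay line and start a fresh pass
        delay_line(0) <= data;
        for i in 1 to N-1 loop
          delay_line(i) <= delay_line(i-1);
        end loop;
        acc  <= (others => '0');
        tap  <= 0;
        busy <= true;
      elsif busy then
        -- One multiply-accumulate per fast clock, shared by all taps
        sum := acc + resize(delay_line(tap) * to_signed(h(tap), coeff_width),
                            acc_width);
        acc <= sum;
        if tap = N-1 then
          result       <= sum;   -- last tap: publish the finished sum
          result_valid <= '1';
          busy         <= false;
        else
          tap <= tap + 1;
        end if;
      end if;
    end if;
  end process;
end;
```

The trade is one multiplier and a small state machine in exchange for a clock roughly N times the sample rate, so it only pays off when the data stream is slow relative to what the device can clock.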

> I am not sure how I am going to put everything
> together. Even a pointer to a reference might be helpful.

The usual homework issue: Tell us how your thinking
is progressing, and you will find lots of people to
help you. Ask as if you have not even tried, and you'll
get lots of insults.

Are you familiar with the difference between canonical
and systolic implementations of a transversal filter?
The latter is much easier to implement, because each
adder has only two data inputs; the canonical form
requires a 16-input adder tree for 16 coefficients.
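
As a rough illustration of a structure in which every adder has only two inputs, here is a sketch of the transposed form of a FIR filter. Whether this is exactly what the post means by "systolic" is an assumption on my part; the entity name, generics and default kernel are hypothetical, and integer_vector requires VHDL-2008.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity TransposedFir is
  generic (
    coeffs      : integer_vector := (1, 2, 3, 4);  -- hypothetical kernel
    data_width  : positive := 16;
    coeff_width : positive := 16;
    acc_width   : positive := 36   -- data + coeff width plus guard bits
  );
  port (
    clk    : in  std_logic;
    reset  : in  std_logic;        -- synchronous, active high
    data   : in  signed(data_width-1 downto 0);
    result : out signed(acc_width-1 downto 0)
  );
end;

architecture RTL of TransposedFir is
  constant N : positive := coeffs'LENGTH;
  alias h : integer_vector(0 to N-1) is coeffs;
  type Acc_Array is array (0 to N-1) of signed(acc_width-1 downto 0);
  signal acc : Acc_Array := (others => (others => '0'));
begin
  process (clk)
    variable product : signed(acc_width-1 downto 0);
  begin
    if rising_edge(clk) then
      if reset = '1' then
        acc <= (others => (others => '0'));
      else
        -- The current sample is broadcast to every multiplier
        for k in 0 to N-1 loop
          product := resize(data * to_signed(h(k), coeff_width), acc_width);
          if k = N-1 then
            acc(k) <= product;               -- far end of the chain
          else
            acc(k) <= acc(k+1) + product;    -- two-input adder per tap
          end if;
        end loop;
      end if;
    end if;
  end process;

  -- acc(0) holds h(0)*x[n] + h(1)*x[n-1] + ... after each clock edge
  result <= acc(0);
end;
```

Each register stage holds a partial sum, so the longest combinational path is one multiplier plus one adder regardless of the number of taps. The cost is that the input sample fans out to all N multipliers.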
--
Jonathan Bromley

Guest, Jan 22, 2004