fast universal compression scheme and its implementation in VHDL

J

Jens Mander

Hi folks,

my name is Jens and I am student of the Technical University Berlin. Through
my course of study in microelectronics VHDL design is becoming my favorite
hobby. My other interests are signal processing and compression in general.
Lately I purchased an FPGA Evalution board second-hand (guess where?) and I
am now starting my first "private" implementations. Just to give you a short
intro... ;-)

I am interested in implementing compression algorithms using VHDL on an
FPGA. I want to build a data transmission system that compresses portions of
the incoming data (not the whole data but "frames" of like 800 bytes)
on-the-fly. In my search for a fast (i.e. real-time capable at a "desired"
data rate of - let's say - 300 MHZ?) "universal" compression scheme I came
across the following stepping stones:

- is there any free example code for compression algorithms available in
VHDL to get an overview and a first impression of implementation complexity?

- what would you think are the most promising algorithms for my purpose
(i.e. when statistics and semantics of the input data are unknown), first of
all I thought of delta encoding, sorted RLE, LZ, ....?

- as the input data is unknown the álgorithm must be lossless, reducing
redundancy (if possible), not irrelevancy. what are the theoretical limits
of "universal" compression, not emphazizing one particular statistical
property (like similar by values in succession)?

- what is meant by the keyword "systolic implementations" and "pipeling" in
that particular context? I came across that very often lately

- what if my code gains different compression ratios for consecutive data
portions? surely a FIFO can decouple input and output rate but eventually
the FIFO will underflow?

Thanks for you help + support in advance, any comments, hints and help is
appreciated!

Bye Jens


P.S.: I'm looking for the standard works "Sayood, Khalid: Introduction to
data compression, Academic Press, 199x or 200x" and/or ". Salomon: Data
Compression, Springer-Verlag, New York, 200x". Are there any sources of an
electronic copy (ps, pfd, etc.) or transcriptions?
 
M

Mike Treseler

Jens said:
- is there any free example code for compression algorithms available in
VHDL to get an overview and a first impression of implementation complexity?
http://www.opencores.org/browse.cgi/by_category

- what would you think are the most promising algorithms for my purpose
(i.e. when statistics and semantics of the input data are unknown), first of
all I thought of delta encoding, sorted RLE, LZ, ....?

Try those. Run some sims. Find out.
- as the input data is unknown the álgorithm must be lossless, reducing
redundancy (if possible), not irrelevancy. what are the theoretical limits
of "universal" compression,

I can imagine an incompressible packet.
- what is meant by the keyword "systolic implementations"

it repeats.

and "pipeling"

parallel processes
- what if my code gains different compression ratios for consecutive data
portions? surely a FIFO can decouple input and output rate but eventually
the FIFO will underflow?

run some sims and find out.

-- Mike Treseler
 
J

Jerry Coffin

Jens Mander wrote:

[ ... ]
- is there any free example code for compression algorithms available in
VHDL to get an overview and a first impression of implementation complexity?

Opencores.org has a JPEG compressor core, though I don't remember
whether it's in VHDL or Verilog (or something else).
- what would you think are the most promising algorithms for my purpose
(i.e. when statistics and semantics of the input data are unknown), firstof
all I thought of delta encoding, sorted RLE, LZ, ....?

It depends a little on the board you're using -- just for example, the
LZ-based compression schemes use a dictionary that can grow large
enough that you _probably_ want to store it in separate memory rather
than on the FPGA proper.

A great deal also depends on what sort of data you expect to be
compressing -- one reason there are many different forms of compression
is that few work equally well for all kinds of data.
- as the input data is unknown the álgorithm must be lossless, reducing
redundancy (if possible), not irrelevancy. what are the theoretical limits
of "universal" compression, not emphazizing one particular statistical
property (like similar by values in succession)?

According to Shannon, normal English text has about one bit of entropy
per character. Starting from the usual ASCII encoding of 8 bits per
character, an optimal output could theoretically reach about 12.5% of
the input size for normal English. This is also the theoretical limit
of compression from Huffman encoding, though in reality it doesn't
normally produce nearly that good of compression (but then again,
nearly nothing else does either).

Most other natural languages seem (amazingly) similar to English in
this respect.

Other inputs will vary over a wide range and often require different
encoding. To use one of your examples above, delta encoding is used
quite frequently on audio data, but rarely for much else.
- what is meant by the keyword "systolic implementations" and "pipeling" in
that particular context? I came across that very often lately

"Systolic" derives from the heartbeat, and in digital design refers to
something that's driven by a clock, but without a bit more information
about how it's being used, it's hard to guess what they mean by it.

Pipelining refers to separating a process into a number of stages, so
the data flows from one stage to the next. The primary idea here is to
allow a higher clock speed by reducing the number of gate delays
between one clock and the next. The major tradeoff is more complex
design.
- what if my code gains different compression ratios for consecutive data
portions?

"What if..." is the wrong question. The right question is more like
"what do I do when..." -- this is not a chance, but an inevitability.
surely a FIFO can decouple input and output rate but eventually
the FIFO will underflow?

Most likely you'll have to include some sort of flow control so when
(for example) your output FIFO is nearly full, you send a signal to
stop the input flow.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,734
Messages
2,569,441
Members
44,832
Latest member
GlennSmall

Latest Threads

Top