Deflating a File

C

Chase Preuninger

One thing I just noticed was that when I deflated a file full or
random bytes it actually increased its size. Just thought that was
kind of neat.
 
J

Joshua Cranmer

Chase said:
One thing I just noticed was that when I deflated a file full or
random bytes it actually increased its size. Just thought that was
kind of neat.

Compression works by collapsing redundant sequences of information into
shorter bit strings, typically (though not always) through some sort of
dictionary system or run-length encoding. True random data cannot be
compressed most of the time; sufficiently pseudorandom data is also not
likely to be compressed well either.

Random data being compressed would be surprising.
 
P

Patricia Shanahan

Chase said:
One thing I just noticed was that when I deflated a file full or
random bytes it actually increased its size. Just thought that was
kind of neat.

The job of a compression algorithm is to map some class of files to
shorter files. Since there is a fixed number of possible files of each
length, to achieve that it also has to map other files to longer files.

Patricia
 
R

Roedy Green

One thing I just noticed was that when I deflated a file full or
random bytes it actually increased its size. Just thought that was
kind of neat.

That's to be expected. You add the overhead without getting ANYTHING
back. A random file by definition can't be compressed.

Zipping a mess of zips buys you a little not because the zip parts
compress, but because the filenames and other uncompressed parts
compress.
 
O

Owen Jacobson

Compression works by collapsing redundant sequences of information into
shorter bit strings, typically (though not always) through some sort of
dictionary system or run-length encoding. True random data cannot be
compressed most of the time; sufficiently pseudorandom data is also not
likely to be compressed well either.

Random data being compressed would be surprising.

*Pseudo*random data can, of course, be compressed to a description
(goedel number? source code? whatever) of the algorithm plus the
initial conditions used to generate the values. This suggests to me
that there might be a fixed (or almost fixed) amount of entropy in
PRNG output, regardless of how many digits of output you have.

Hmm.

-o
 
A

Andreas Leitgeb

Owen Jacobson said:
*Pseudo*random data can, of course, be compressed to a description
(goedel number? source code? whatever) of the algorithm plus the
initial conditions used to generate the values. This suggests to me
that there might be a fixed (or almost fixed) amount of entropy in
PRNG output, regardless of how many digits of output you have.

If it's say 1e1000000000000000000 digits from the PRNG-Output, it's
not all that "regardless", as you'll need some extra bytes to encode
the actual length of the sequence :)
 
A

Arne Vajhøj

Owen said:
*Pseudo*random data can, of course, be compressed to a description
(goedel number? source code? whatever) of the algorithm plus the
initial conditions used to generate the values. This suggests to me
that there might be a fixed (or almost fixed) amount of entropy in
PRNG output, regardless of how many digits of output you have.

True.

But current available compression algorithms does not analyze
data and detect the RNG algorithm. And I am pretty sure that
they will not in the future either.

So in practice the data does not compress.

Arne
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,431
Messages
2,571,678
Members
48,796
Latest member
Greg L.

Latest Threads

Top