Tarfile .bz2

Jordan

When using Python to create a tar.bz2 archive and then opening it with
WinRAR, WinRAR can't tell what the compressed size of each individual
file in the archive is. Is this an issue with WinRAR, or is there
something that needs to be set when making the archive that isn't there
by default?

example archive:

import os, tarfile

# somepath is assumed to be defined elsewhere as the directory to archive
archive = tarfile.open("archive.tar.bz2", "w:bz2")
for thing in os.listdir(somepath):
    # os.path.join means somepath does not need to end in "\\"
    nthing = os.path.join(somepath, thing)
    if os.path.isfile(nthing):
        info = archive.gettarinfo(nthing)
        # open() replaces the Python 2-only file() builtin
        with open(nthing, 'rb') as f:
            archive.addfile(info, f)
archive.close()
 
Martin v. Löwis

Jordan said:
When using Python to create a tar.bz2 archive and then opening it with
WinRAR, WinRAR can't tell what the compressed size of each individual
file in the archive is. Is this an issue with WinRAR, or is there
something that needs to be set when making the archive that isn't there
by default?

I believe it's an issue of the file format (tar.bz2). You don't compress
individual files; you compress the entire tar file. So it is not
meaningful to talk about the compressed size of an individual archive
member: inside the tar file, the members are all stored uncompressed.

Regards,
Martin
 
Wolfgang Draxinger

Jordan said:
When using Python to create a tar.bz2 archive and then opening it with
WinRAR, WinRAR can't tell what the compressed size of each individual
file in the archive is. Is this an issue with WinRAR, or is there
something that needs to be set when making the archive that isn't there
by default?

When compressing a tar archive, all files in the archive are
compressed as a whole, i.e. you can only specify a compression
ratio for the whole archive and not for a single file.

Technically, a tar.bz2 is an aggregation of multiple files
into a single tar file, which is then compressed.

This is different to e.g. PKZip in which each file is compressed
individually and the compressed files are then merged into an
archive.

The first method has better compression ratio, since redundancies
among files are compressed, too, whereas the latter is better if
you need random access to the individual files.
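
The difference shows up in the metadata each format keeps per member. Here is a minimal sketch, assuming three small in-memory files whose names and contents are made up for illustration: zipfile records a compress_size for every entry, while tarfile members only carry their uncompressed size, so a viewer has nothing per-file to display for a tar.bz2.

```python
import io
import tarfile
import zipfile

# Hypothetical sample data; names and contents are placeholders.
FILES = {
    "a.txt": b"hello world " * 200,
    "b.txt": b"spam and eggs " * 300,
    "c.txt": b"tar versus zip " * 100,
}

def build_tar_bz2(files):
    """Pack all files into one tar stream, then bzip2 the whole stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def build_zip(files):
    """Compress each file individually and store it as its own zip entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def tar_member_sizes(blob):
    """Map member name -> uncompressed size (the only size tar records)."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:bz2") as tar:
        return {m.name: m.size for m in tar.getmembers()}

def zip_member_sizes(blob):
    """Map entry name -> (uncompressed size, compressed size)."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {i.filename: (i.file_size, i.compress_size) for i in zf.infolist()}
```

Calling tar_member_sizes(build_tar_bz2(FILES)) gives one number per member, whereas zip_member_sizes(build_zip(FILES)) gives a pair, because only the zip format stores a compressed size per entry.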

Wolfgang Draxinger
 
Jordan

So that would explain why a tar.bz2 archive can't be appended to
wouldn't it... And also explain why winrar was so slow to open it (not
something I mentioned before, but definitely noticed). I had wondered
what it was that made bz2 so much better at compression than zip and
rar. Not really on topic anymore but what's the method for tar.gz? And
even more off the topic, does anyone know a good lossless compression
method for images (mainly .jpg and .png)?

Cheers,
Jordan
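
The append limitation mentioned above can be checked from Python. This is a minimal sketch, assuming two empty archives in a temporary directory (the file names are placeholders): the tarfile module accepts mode "a" for an uncompressed tar but rejects "a:bz2", since new members cannot simply be tacked onto an already-compressed stream.

```python
import os
import tarfile
import tempfile

def try_append(path, mode):
    """Return True if the archive at `path` can be opened with `mode`."""
    try:
        with tarfile.open(path, mode):
            return True
    except (ValueError, tarfile.TarError):
        return False

def demo():
    """Return (plain_tar_appendable, tar_bz2_appendable)."""
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "plain.tar")        # placeholder name
        packed = os.path.join(tmp, "packed.tar.bz2")  # placeholder name
        # Create two empty archives, one uncompressed and one bzip2-compressed.
        for path, mode in ((plain, "w"), (packed, "w:bz2")):
            with tarfile.open(path, mode):
                pass
        return try_append(plain, "a"), try_append(packed, "a:bz2")
```

To add files to an existing tar.bz2, the usual workaround is to read the old archive and write a new one that contains the old members plus the new ones.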
 
Martin v. Löwis

Jordan said:
Not really on topic anymore but what's the method for tar.gz?

It works like .tar.bz2, except that it uses gzip (www.gzip.org)
as the compression library. The underlying compression algorithm
is LZW.

Jordan said:
And even more off the topic, does anyone know a good lossless compression
method for images (mainly .jpg and .png)?

Well, .jpg files are already compressed in a lossy way (.jpg is
inherently lossy); to compress it further, you need to increase
the loss. PNG is also compressed already, see

http://www.mywebsite.force9.co.uk/png/

The compression algorithm inside PNG is zlib (which is the same
as the gzip algorithm). Perhaps you should read the comp.compression
FAQ:

http://www.faqs.org/faqs/compression-faq/
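
Since PNG image data is already a zlib stream, running a general-purpose compressor over a PNG is much like compressing zlib output a second time. Here is a rough sketch of that effect, assuming pseudo-random text built from a made-up vocabulary as a stand-in for real data; it is an illustration only, not a benchmark.

```python
import random
import zlib

def sample_text(n_words=20000, seed=0):
    """Build pseudo-random text from a small, made-up vocabulary."""
    words = ["tar", "archive", "compress", "file", "python", "image",
             "pixel", "stream", "header", "block", "size", "ratio"]
    rng = random.Random(seed)
    return " ".join(rng.choice(words) for _ in range(n_words)).encode("ascii")

def compress_twice(data):
    """Return the sizes of the raw, once-compressed and twice-compressed data."""
    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)  # second pass over already-compressed bytes
    return len(data), len(once), len(twice)
```

The first pass shrinks the text considerably; the second pass has almost no redundancy left to exploit, so its output stays about the same size as its input.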

Regards,
Martin
 
Yu-Xi Lim

Jordan said:
So that would explain why a tar.bz2 archive can't be appended to
wouldn't it... And also explain why winrar was so slow to open it (not
something I mentioned before, but definitely noticed). I had wondered
what it was that made bz2 so much better at compression than zip and
rar. Not really on topic anymore but what's the method for tar.gz? And
even more off the topic, does anyone know a good lossless compression
method for images (mainly .jpg and .png)?

You can get the same effect from RAR and other formats (ACE, 7z) by
using the "Solid Archive" or similar option. Ideally, you'd be
compressing lots of similar files for this to be effective. The actual
compression ratios of RAR and bz2 can be pretty similar when done this way.
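
The standard library cannot write RAR files, so as a stand-in this sketch compares tar.bz2 (which behaves like a solid archive) with zip (which compresses each file separately). The 50 near-identical files and their names are made up for illustration.

```python
import io
import random
import tarfile
import zipfile

def similar_files(count=50, seed=0):
    """Generate files that share one body and differ only in a first line."""
    words = ["solid", "archive", "member", "ratio", "similar", "files",
             "redundancy", "stream", "random", "access"]
    rng = random.Random(seed)
    body = " ".join(rng.choice(words) for _ in range(400)).encode("ascii")
    return {"doc%02d.txt" % i: ("document %d\n" % i).encode("ascii") + body
            for i in range(count)}

def solid_size(files):
    """Size of a tar.bz2: everything is compressed as one stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())

def per_file_size(files):
    """Size of a zip: each member is compressed on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return len(buf.getvalue())
```

With files this similar, solid_size(similar_files()) should come out well below per_file_size(similar_files()), because the solid stream can reuse what it has already seen in earlier files. With unrelated files the gap would be much smaller.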
 
Yu-Xi Lim

Martin said:
Well, .jpg files are already compressed in a lossy way (.jpg is
inherently lossy); to compress it further, you need to increase
the loss. PNG is also compressed already, see

Not really. Stuffit has a JPEG compressor which takes advantage of the
fact that the JPEG algorithm isn't as optimal as it can be. It converts
JPEG images to its own more compact representation which then can be
converted back to JPEG as needed, without any loss. It is sadly not free.
 
Fredrik Lundh

Martin said:
Well, .jpg files are already compressed in a lossy way (.jpg is
inherently lossy); to compress it further, you need to increase
the loss.

or use a better algorithm, such as JPEG 2000 or Microsoft's HD Photo,
which both give better visual quality at lower bit rates (which means
that you can often get by with around half the bits compared to JPEG).

</F>
 
Piet van Oostrum

MvL> It works like .tar.bz2, except that it uses gzip (www.gzip.org)
MvL> as the compression library. The underlying compression algorithm
MvL> is LZW.

No, it uses a compression algorithm based on LZ77 (called DEFLATE).
Therefore gzip was not encumbered by the LZW patent.
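
This can be checked from Python. A small sketch, assuming a gzip stream produced by the standard gzip module: the third byte of the gzip header is the compression-method field, where 8 means DEFLATE, and zlib can decode the same stream when told to expect a gzip wrapper. The helper names are illustrative.

```python
import gzip
import zlib

def gzip_method(blob):
    """Return the compression-method byte of a gzip stream (8 = DEFLATE)."""
    if blob[:2] != b"\x1f\x8b":  # gzip magic number
        raise ValueError("not a gzip stream")
    return blob[2]

def inflate_gzip_with_zlib(blob):
    """Decode a gzip stream with zlib, showing both share the DEFLATE codec."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    return zlib.decompress(blob, 16 + zlib.MAX_WBITS)
```

For example, gzip_method(gzip.compress(b"some data")) returns 8.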
 
