Compression (besides Huffman) and Ruby

  • Thread starter Josef 'Jupp' SCHUGT
  • Start date
J

Josef 'Jupp' SCHUGT

Hi!

I don't want to start a lenghty thread about wether that makes sense
or not. My question is only if such a thing is known to exist. I did
find Huffman but besides that rei, nada, rien, zero, nichts. Anything
else (note that the point is *plain* Ruby, no extension). Otherwise I
will try to implement such an algorithm on my own.

Josef 'Jupp' SCHUGT
 
W

Wesley J Landaker

--Boundary-02=_AL2+/FjRAtxYqwa
Content-Type: text/plain;
charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Description: signed data
Content-Disposition: inline

Hi!

I don't want to start a lenghty thread about wether that makes sense
or not. My question is only if such a thing is known to exist. I did
find Huffman but besides that rei, nada, rien, zero, nichts. Anything
else (note that the point is *plain* Ruby, no extension). Otherwise I
will try to implement such an algorithm on my own.

Well, here's a really bad RLE encoder/decoder. ;)

#!/usr/bin/ruby

$last=3Dnil
$count=3D0

def compress
while gets
$_.each_byte { |c|
if c =3D=3D $last
$count +=3D 1
else
puts "#{$count}:#{$last}";
$last =3D c
$count =3D 1
end
}
end
end

def decompress
while gets
~ /(\d+):(\d+)/
$1.to_i.times { s =3D ' '; s[0] =3D $2.to_i; print s }
end
end

if ARGV.shift =3D=3D '-d'
decompress
else
compress
end

Usually I just do:

require 'zlib'

Since it comes with ruby. =3D)

=2D-=20
Wesley J. Landaker - (e-mail address removed)
OpenPGP FP: 4135 2A3B 4726 ACC5 9094 0097 F0A9 8A4C 4CD6 E3D2


--Boundary-02=_AL2+/FjRAtxYqwa
Content-Type: application/pgp-signature
Content-Description: signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQA/+2LA8KmKTEzW49IRAkvKAJ9O7L6KUfvNQCq3KpwLCOyDMO2wpgCeKIE5
XOeeqH1C0dMPlNDY2yNTVic=
=trBq
-----END PGP SIGNATURE-----

--Boundary-02=_AL2+/FjRAtxYqwa--
 
J

Jamis Buck

Josef said:
Hi!

I don't want to start a lenghty thread about wether that makes sense
or not. My question is only if such a thing is known to exist. I did
find Huffman but besides that rei, nada, rien, zero, nichts. Anything
else (note that the point is *plain* Ruby, no extension). Otherwise I
will try to implement such an algorithm on my own.

Josef 'Jupp' SCHUGT
I've played with the idea of implementing zlib in pure Ruby...
unfortunately, I've never got beyond that. I've got about 30% of a
working tar implementation, but that's not really compression. I'd be
very interested in a pure ruby zlib or bzip2 implementation, if you
decide to write one. ;)

- Jamis
 
J

Jamis Buck

Mark said:
I've got the bare bones of a Ruby implementation of adaptive arithmetic
encoding, using the Rational module. But it's r-e-a-l-l-y s-l-o-w. :)

-Mark

Heh. :) Well, the point of doing it in Ruby isn't to win races, I
guess. It's more to make these things portable. I'm sure a zlib or
bzip2 algorithm implemented in Ruby would be dog slow, too, but the
convenience provided by having them available directly, on any platform,
has to be taken into account.

Is your code in the RAA, perchance? I'd be interested in taking a peek.
 
R

Robert Klemme

Jamis Buck said:
Heh. :) Well, the point of doing it in Ruby isn't to win races, I
guess. It's more to make these things portable. I'm sure a zlib or
bzip2 algorithm implemented in Ruby would be dog slow, too, but the
convenience provided by having them available directly, on any platform,
has to be taken into account.

But wouldn't it bettern then to implement it in C (or use an existing
implementation)? I mean, C *is* portable.

Cheers

robert
 
M

Martin Weber

(... quote snipped ...)
Heh. :) Well, the point of doing it in Ruby isn't to win races, I
guess. It's more to make these things portable. I'm sure a zlib or
bzip2 algorithm implemented in Ruby would be dog slow, too, but the
convenience provided by having them available directly, on any platform,
has to be taken into account.

This doesn't make any sense. When it's possible to have a ansi c compliant
program (the ruby interpreter) which can be compiled on/for a (unix environment
on a) foreign platform, it'll also be possible to compile zlib or whatsoever
there. Probably it's more likely for such libs to be portable than ruby
itself.

I dunno where you people think ruby is running on (aside of your mind :).

Regards,

-Martin
 
J

Jamis Buck

Robert said:
But wouldn't it bettern then to implement it in C (or use an existing
implementation)? I mean, C *is* portable.

Cheers

robert

Very true. Generally, C *is* portable, and in general, such things
should be (and are) written in C. However, C is *not* portable when a
module is not available as a pre-compiled binary for a particular
platform, and the user does not have access to a C compiler (or is not a
programmer and is not confident compiling their own module). In such
cases, the value of simply dropping the module into the Ruby path is
quite high.

Please don't misunderstand me. I am not saying that every C module ever
written for Ruby should have been written *in* Ruby. That would be
ludicrous. I'm simply saying that for some things (compression
algorithms, for instance), there is some value in having them available
*both* as a C modules, *and* as pure Ruby modules.

Consider the case of self-extracting Ruby modules. You simply
distribute the pure-ruby version of the compression library with your
own code, and then anyone can extract your module, without having to go
through some (potentially elaborate) installation scheme. Before I
finally broke down and installed both wxWindows and FOX, I would avoid
programs that used them because I've always found it to be a pain to
install those systems (partly because, in addition to the systems
themselves, you then had to install bindings for your language of
choice). (Until I started using Gentoo, anyway, but that's beside the
point.)

Anyway, I hope that has clarified my position. I'm not trying to start
a flame war on the merits of C modules versus pure Ruby modules (or even
wxWindows and FOX--those were just examples from my own personal
experience), so please don't be too vehement with any opposing opinions. :)
 
R

Robert Klemme

Jamis Buck said:
Very true. Generally, C *is* portable, and in general, such things
should be (and are) written in C. However, C is *not* portable when a
module is not available as a pre-compiled binary for a particular
platform, and the user does not have access to a C compiler (or is not a
programmer and is not confident compiling their own module). In such
cases, the value of simply dropping the module into the Ruby path is
quite high.

Another reasonable conclusion is to include a ruby module for such a basic
thing as (de)compression into the std distribution. At least the cygwin
distribution includes Zlib:

irb(main):001:0> require 'zlib'
=> true
irb(main):002:0> ObjectSpace.each_object( Class ) {|cl| p cl if
/zip|zlib/i =~ cl.name }
Zlib::GzipReader
Zlib::GzipWriter
Zlib::GzipFile::LengthError
Zlib::GzipFile::CRCError
Zlib::GzipFile::NoFooter
Zlib::GzipFile::Error
Zlib::GzipFile
Zlib::Inflate
Zlib::Deflate
Zlib::ZStream
Zlib::VersionError
Zlib::BufError
Zlib::MemError
Zlib::StreamError
Zlib::DataError
Zlib::NeedDict
Zlib::StreamEnd
Zlib::Error
=> 358
irb(main):003:0>

Does that help? :)
Anyway, I hope that has clarified my position. I'm not trying to start
a flame war on the merits of C modules versus pure Ruby modules (or even
wxWindows and FOX--those were just examples from my own personal
experience), so please don't be too vehement with any opposing opinions.
:)

No offense taken.

Kind regards

robert
 
J

Jim Menard

Jamis Buck said:
Please don't misunderstand me. I am not saying that every C module ever
written for Ruby should have been written *in* Ruby. That would be
ludicrous. I'm simply saying that for some things (compression
algorithms, for instance), there is some value in having them available
*both* as a C modules, *and* as pure Ruby modules.

A good point. That's what some Smalltalks do, but in an even more useful
way. Primitives are sometimes implemented in C (or at least in the VM
itself, rather than in pure Smalltalk). If the primitive fails or does not
exist, then alternate Smalltalk code can be run.

For example, this is from Squeak's Object class:

instVarAt: index
"Primitive. Answer a fixed variable in an object. The numbering of the
variables corresponds to the named instance variables. Fail if the
index is not an Integer or is not the index of a fixed variable. Essential.
See Object documentation whatIsAPrimitive."

<primitive: 73>
"Access beyond fixed variables."
^self basicAt: index - self class instSize

The Squeak Object class has a method called "whatIsAPrimitive" whose entire
purpose is to act as documentation. Here is a snippet from the comment:

When the Smalltalk interpreter begins to execute a method which
specifies a primitive response, it tries to perform the primitive
action and to return a result. If the routine in the interpreter for
this primitive is successful, it will return a value and the
expressions in the method will not be evaluated. If the primitive
routine is not successful, the primitive 'fails', and the Smalltalk
expressions in the method are executed instead. These expressions are
evaluated as though the primitive routine had not been called.

Jim
 
J

Josef 'Jupp' SCHUGT

Hi!

* Wesley J Landaker:
require 'zlib'

Since it comes with ruby. =)

Wasn't aware that 1.8 has zlib binding. watashi wa baka desu :->

Josef 'Jupp' SCHUGT
 
J

Joel VanderWerf

Josef said:
Hi!

* Wesley J Landaker:



Wasn't aware that 1.8 has zlib binding. watashi wa baka desu :->

^^^^^^^^^^^^^^^^^^^^
Not if you know how to say that ;)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,581
Members
45,056
Latest member
GlycogenSupporthealth

Latest Threads

Top